Data Management

As a Welsh Government funded programme, GMEP must follow the Joint Codes of Practice for Research (JCoPR), which set out standards for the quality of science and the quality of research processes. Following the JCoPR ensures that research approaches are robust and gives confidence that the processes and procedures used to gather and interpret the results are appropriate, rigorous, repeatable and auditable. CEH is also developing a Quality Management System based on ISO 9001, a globally recognized quality management standard, and GMEP is being managed according to this standard. In accordance with both standards, a number of processes and procedures have been put in place to ensure compliance.

Data Sources

Field Survey

Accurate data collection by field surveyors is crucial to subsequent analyses, so data must be checked as they are recorded. Bespoke field survey software running on ruggedised field computers has been created so that all habitat mapping and feature descriptions are checked against quality control (QC) rules and formats as they are created. This means that we can be confident in the quality of data from the field and can rapidly feed them into automated analyses. A rigorous training programme ensures consistency and accuracy among the highly skilled surveyors.
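The idea of checking each record against QC rules at the moment it is created can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields (square_id, habitat, area_m2), the list of allowed habitat codes, and the function name qc_errors are hypothetical and are not taken from the GMEP field software.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical QC rules for illustration only; the real GMEP rules,
# field names and habitat codes are not given in this document.
ALLOWED_HABITATS = {"woodland", "grassland", "arable", "bog"}


@dataclass
class FeatureRecord:
    square_id: str   # identifier of the survey square (assumed field)
    habitat: str     # habitat code chosen by the surveyor (assumed field)
    area_m2: float   # mapped area in square metres (assumed field)


def qc_errors(record: FeatureRecord) -> List[str]:
    """Return a list of QC failures; an empty list means the record passes."""
    errors = []
    if not record.square_id.strip():
        errors.append("square_id is required")
    if record.habitat not in ALLOWED_HABITATS:
        errors.append(f"unknown habitat code: {record.habitat!r}")
    if record.area_m2 <= 0:
        errors.append("area_m2 must be positive")
    return errors
```

Running such a check at entry time, rather than after upload, lets the surveyor correct a failing record while still standing in the field.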

Modelling

Scientific models are held in version-controlled repositories, with any changes, additions, or deletions in the source code clearly documented. Full descriptions of the models, together with all input parameters, required datasets, and outputs, are maintained on the collaborative project wiki space.

External data sources

Third-party datasets are held under license in an Oracle spatial database, with each project partner accessing the data they require through their own connection. Datasets include Glastir and legacy scheme uptake, hydrology, land boundaries, habitat and land cover, climate, soils, Ordnance Survey mapping, aerial photography, and historic features. Each dataset in the database is fully documented on the project's wiki site.

Procedures and Protocols

Project procedures and protocols are documented, approved and reviewed regularly. A GMEP web-based collaborative project site (SharePoint) is used to manage version control of all documentation relating to the project.

Lab analysis

All chemical analyses undertaken, and all laboratory 'Standard Operating Procedures', meet the requirements of the ISO 17025 standard.

Data storage

All datasets produced and received by the project are stored in a secure, access-controlled Oracle database.

In this way, data are carefully managed, access is provided only when data licenses have been agreed, and multiple users can simultaneously access the most up-to-date datasets for querying and analysis.

The Environmental Information Data Centre (EIDC) is a NERC Data Centre that manages datasets concerned with terrestrial and freshwater science. Depositing datasets with the EIDC Hub ensures that they will be stored securely and remain accessible in the long term. EIDC also provides cataloguing and data citation facilities.

Metadata is held in a central web-based collaborative repository (wiki space). All datasets collected in the field or returned from laboratories, third-party datasets, data processing scripts, and analysis scripts are fully documented by project staff and entered on the wiki. Other project staff can then view the latest datasets, request database access, query the data fields or analysis methods, and download metadata and documentation.

Data Analysis

All processing, data formatting and statistical analyses are completed using well-documented scripts. This ensures that every aspect of the analysis is recorded and that there is a clear audit trail of each step in the process. Scripts and workflows are then held in a protected, version-controlled repository.

In addition to the well-commented scripts used to run the analyses, each specific run of a script that produces results is accompanied by an analysis summary report saved on the collaborative wiki space. This ensures that each set of results has a clear audit trail and can be reproduced by any member of the project team. The reports also link directly to the input and output data used in the analysis and to the associated metadata records.
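One way to make a per-run summary auditable is to record the script version together with checksums of the exact input and output files. The sketch below is only an illustration of that idea under stated assumptions: the function names (sha256_of, run_summary), the summary fields, and the use of SHA-256 checksums are hypothetical and do not describe the format of the GMEP summary reports.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum of a file, so a later reader can confirm it is unchanged."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_summary(script: str, version: str, inputs, outputs) -> dict:
    """Build a summary for one analysis run (hypothetical structure)."""
    return {
        "script": script,
        "script_version": version,  # e.g. a repository revision identifier
        "run_at": datetime.now(timezone.utc).isoformat(),
        "inputs": {str(p): sha256_of(Path(p)) for p in inputs},
        "outputs": {str(p): sha256_of(Path(p)) for p in outputs},
    }


def summary_as_json(summary: dict) -> str:
    """Serialise the summary so it can be attached to a report page."""
    return json.dumps(summary, indent=2, sort_keys=True)
```

If a rerun of the same script version on inputs with the same checksums yields outputs with different checksums, that points to an undocumented change somewhere in the workflow.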

Reporting

Data from GMEP will in future be available in the CEH data catalogue, which contains the data holdings of CEH and of the terrestrial and freshwater ecological community. The catalogue is publicly searchable on the internet and stores INSPIRE-compliant metadata and any associated view and download services.