JRA3 FREE Data - Research Overview

Facilitating the Re-use and Exchange of Experimental Data

Maximising effective data flux between hydraulic laboratories and among experimental models, numerical models and field case studies is key to increasing the impact of environmental hydraulics on understanding and managing climate change adaptation. This JRA will develop new methods for sharing and exchanging experimental results to encourage greater data flux, and will demonstrate how the exchange of data between laboratories, numerical models and the field environment provides comprehensive solutions to climate change adaptation. We will engage with stakeholders at a range of levels to develop robust pathways to impact from physical modelling research, which the environmental hydraulics community can use to ensure that research in experimental facilities is as widely used as possible.

At its core, this JRA will build on the importance of open access to public sector research, which has been recognised by the OECD, the G8, national governments and the European Commission, whose Open Data Strategy for Europe is expected to deliver a €40 billion boost to the EU's economy each year. The HYDRALAB community is an excellent example of open scientific collaboration, but within environmental hydraulics there needs to be greater use of open software, open software architectures and open access publication. Data should be a key element within this mix and, with open data becoming a requirement of funding bodies, our data flux and exchange need to increase. Much more data (and many more data structures) are becoming available from field observations, laboratory experiments and numerical modelling. The science of climate change adaptation will improve most efficiently when the best use is made of the most appropriate data source, which is driven by effective exchange between applications. This advance will require richer meta-data about shared data, which should enable any interested party to find a dataset, judge its suitability and quality, download it and use it, without having to interact directly with the team who captured the data.

The main aims of ‘FREE Data’ are to develop tools and protocols for sharing data that enable efficient flux and exchange with numerical modelling and field case studies. As part of the work we will test and demonstrate these tools and protocols with a range of datasets, starting with relatively simple cases and building up to the most complex and most interdisciplinary datasets that will be generated in HYDRALAB+: those produced by work on ecohydraulics. These aims have the following objectives:

  • To develop data standards and licenses to better facilitate the re-use of data;
  • To develop a novel and flexible data repository to facilitate the exchange of data;
  • To develop more effective knowledge sharing tools for community and stakeholder engagement;
  • To develop novel methods and protocols for interaction and effective data exchange between laboratory and numerical models and field observations.

Achieving these objectives will lead to the creation of a free market in open data, which can be used, reused and redistributed by anyone (OpenDefinition.org).

Task 10.1 Critical review of data flux between laboratory models, numerical models and field case studies

An effective combination of laboratory modelling, numerical modelling and field case study is required for the advancement of environmental hydraulics. Each of the communities is methodologically strong: the quantification possible in both the field and the laboratory has advanced considerably, and the power and effectiveness of numerical simulation have also grown significantly over recent decades. However, the links between the three methodological approaches are weak. This matters because the links between laboratory, field and numerical model are critical for substantive advancement in our understanding of complex systems and thus for interdisciplinary prediction. This critical review will build on previous work by the HYDRALAB group on data standards and sharing and examine the state of the art in the flux of data between laboratory, field and numerical simulations. The review will examine and assess the effectiveness of the validation and verification processes that drive comparisons and confidence in predictions, and will highlight how these processes can be developed and improved. A series of protocols will be developed with the aim of maximising the effectiveness of data flux in the future.

Task 10.2 Data Standards and licenses

The purpose of this work package is to ensure that the data produced is interoperable, so that it can be exchanged between organisations. One of the first tasks of this JRA will be to develop standards appropriate for the ecohydraulic community and guidance on how they should be applied. We will start from the most commonly occurring data types (such as regular time series of real values). Unusual or large (e.g. image-based) datasets may be left in native format, while ecohydraulic test data will present a challenge in terms of the types of data collected. The choice of appropriate meta-data will allow external users to discover and assess the data produced. This work package will consist of three subtasks:

  • Task 10.2.1. Develop common data structures and their technical implementations. This may include the development of a standard data format for common data types (similar to or based on the NetCDF or XTF formats used in the oceanographic community) and software to convert native data to that format. The format must contain appropriate meta-data.
  • Task 10.2.2. Develop appropriate vocabularies and ontologies (such as the Climate and Forecasting, CF, standard names). This may require some consolidation of existing vocabularies.
  • Task 10.2.3. Select an open data license (or licenses) and embargo period. Intellectual property rights vary from country to country, but many jurisdictions grant rights in data that prevent it from being used and distributed without the owner’s permission. Such data only becomes open when it is distributed with an open license (http://opendefinition.org/licenses/). An approved license (or licenses) and a common embargo period, during which the originators have exclusive access to their data before it is made open, will be chosen early on in the project.
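To illustrate what a self-describing record for a regular time series might look like, the sketch below bundles values with the meta-data a third party would need to reuse them. It is only an illustration under stated assumptions: the field names, the `make_record` and `missing_metadata` helpers, the instrument name and the license value are all hypothetical and are not the format or license chosen by the project. JSON is used purely to keep the example dependency-free; the text above points to NetCDF-like formats.

```python
import json

# Hypothetical set of meta-data fields a record must carry to be reusable.
REQUIRED_FIELDS = (
    "standard_name", "units", "start_time",
    "sampling_interval_s", "instrument", "position_m", "license",
)


def make_record(values, standard_name, units, start_time,
                sampling_interval_s, instrument, position_m, license_id):
    """Bundle a regular time series with descriptive meta-data."""
    if sampling_interval_s <= 0:
        raise ValueError("sampling_interval_s must be positive")
    return {
        "metadata": {
            "standard_name": standard_name,      # e.g. a CF standard name
            "units": units,
            "start_time": start_time,            # ISO 8601 string
            "sampling_interval_s": sampling_interval_s,
            "instrument": instrument,
            "position_m": position_m,            # [x, y, z] in facility coordinates
            "license": license_id,               # illustrative value only
        },
        "values": list(values),
    }


def missing_metadata(record):
    """Return the required meta-data fields that the record lacks."""
    metadata = record.get("metadata", {})
    return [field for field in REQUIRED_FIELDS if field not in metadata]


record = make_record(
    values=[0.01, 0.03, -0.02],
    standard_name="sea_surface_height_above_mean_sea_level",
    units="m",
    start_time="2016-01-01T00:00:00Z",
    sampling_interval_s=0.05,
    instrument="wave_gauge_1",                   # hypothetical instrument name
    position_m=[12.5, 0.0, 0.0],
    license_id="CC-BY-4.0",                      # placeholder, not the project's choice
)
serialised = json.dumps(record, indent=2)
```

A conversion tool could call `missing_metadata` before accepting a file, so that incomplete records are rejected at the point of deposit rather than discovered later by a user.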

Task 10.3 Repository

The purpose of this work package is to ensure that data is preserved for future use and can be uniquely identified, in line with the ‘Guidelines on Data Management in Horizon 2020’. Task 10.3.1 will develop rules for depositing publications and data in an open access repository, either by using an existing service such as Zenodo or by developing a distributed solution with a catalogue server linked to local hosting of data. In Task 10.3.2 a range of users will test this infrastructure using example publications and data from the Joint Research Activities and Transnational Access.

In Task 10.3.3 we will develop rules for assigning Digital Object Identifiers (DOIs) to datasets. This allows a dataset to receive a citation in a paper or report. The allocation of DOIs to HYDRALAB+ physical model datasets will support researchers by helping them to find, identify and cite these datasets with confidence. Citation records will provide direct evidence of the use of HYDRALAB+ data outside the HYDRALAB community and will be used as a metric for the success of HYDRALAB+.

Task 10.4 Data flux between the field and the laboratory

Data exchange between in situ field experiments and ex situ laboratory experiments is essential to gain a rich understanding of environmental hydraulics and for the development of effective climate change adaptation strategies. Laboratory experiments are vital as they offer control over boundary conditions and variables and produce generic outcomes. However, aligned field studies are key to ensuring that laboratory investigations and the tested boundary conditions are within the ranges found in prototype systems. Ensuring that the exchange and flux of data between these methodologies is both robust and efficient is a vital step towards informing climate adaptation strategies at the land-water interface.

The purpose of this work package is therefore to develop and standardise methods and protocols for the exchange of data between field observations and laboratories. To establish data flux standards, we will use case studies that require the collection of field and laboratory data and that are of value to work packages 8 and 9 in the other JRAs. By exchanging and sharing data across common problems in both the field and the flume, we intend to demonstrate, and where necessary improve further, the effectiveness of flume studies for climate change adaptation investigations. This will be achieved through a detailed quantification of the extent to which complex boundary conditions (e.g. biota and mixed sediment sizes), changes in forcing (e.g. storminess and sea level rise) and resultant processes in prototype field sites can be readily transposed to the laboratory. Ultimately, this will increase the utility of physical modelling data (both existing and newly generated) for both policy makers and environmental managers, because the exercise will quantify, and possibly reduce, the uncertainties of laboratory model results.

The focus of this task will be suspended sediment transport dynamics in complex environments and under changing forcing (i.e. environmental conditions), measured in the field and modelled in the laboratory. We will investigate the following:

  • The effect of seagrass patchiness on suspended sediment concentration and wave attenuation in the laboratory and at a field prototype (links to 9.3 and 9.4)
  • The impact of fauna on changes to suspended sediment concentration, simulated in the laboratory and in the field (links to 8.4 and 8.5)
  • The impact of suspended sediment concentration on eel grass health and behaviour, and the implications for sediment suspension hydraulics (links to 8.4)
  • The differences between field and flume studies, quantified by applying the same measurement techniques across both, in order to improve and refine data flux (links to 9.1)
  • How best to use the data produced from models, established by engaging with stakeholders, and the robustness of the data produced via the exercise above.

Task 10.5 Data flux between the laboratory and numerical simulation

Effective links between laboratory-based data and numerical models are essential to allow long-term predictions of the influence of climate on complex aquatic systems. The majority of parameterisations within models are based on laboratory data, and successfully simulating laboratory-scale experiments with a model drives the benchmarking process that underpins tests of model skill. Indeed, such comparisons are the basis of the validation and verification process that is pervasive in the numerical modelling community. However, the process of linking (and comparing) laboratory data to numerical models is difficult. Input boundary conditions must match appropriately, and the scaling and resolution of laboratory measurements and their translation to the model time base and/or grid size further complicate this data flux. Moreover, laboratory models developed for one specific purpose could often be used by modellers targeting other objectives if due consideration had been given to base data and parameter quantification during the experiments. We will develop new methods, protocols and standards for the validation and verification of models. We will also develop specific guidance for laboratory modellers so that the use of results in future modelling is explicitly considered during the design phase, ensuring that experiments map to the needs of the modelling community and that duplication of similar experiments is minimised. These protocols will be tested using wave flume data from one participant with numerical models from another.
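As a minimal sketch of the time-base translation mentioned above, the example below linearly interpolates a laboratory time series onto the output times of a numerical model and then computes a root-mean-square error between the two. Both choices are assumptions made for illustration: the task does not prescribe linear interpolation or RMSE, and the function names and sample values are hypothetical.

```python
import math


def interpolate(times, values, t):
    """Linearly interpolate the series (times, values) at time t."""
    if len(times) < 2 or len(times) != len(values):
        raise ValueError("need at least two samples and equal-length inputs")
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the measured time range")
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            weight = (t - t0) / (t1 - t0)
            return values[i - 1] + weight * (values[i] - values[i - 1])


def resample(times, values, new_times):
    """Map a measured series onto a different (e.g. model) time base."""
    return [interpolate(times, values, t) for t in new_times]


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical laboratory measurements sampled every 0.5 s.
lab_times = [0.0, 0.5, 1.0, 1.5, 2.0]
lab_values = [0.0, 1.0, 2.0, 1.0, 0.0]

# Hypothetical model output on a different, irregular time base.
model_times = [0.0, 0.25, 0.75, 2.0]
model_values = [0.0, 1.0, 1.0, 0.0]

lab_on_model_times = resample(lab_times, lab_values, model_times)
error = rmse(lab_on_model_times, model_values)
```

In practice the interpolation scheme and the skill metric would themselves be part of the agreed protocol, since different choices can change the apparent agreement between model and experiment.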

Task 10.6 Knowledge transfer

This work package will develop freely available interactive tools for knowledge transfer to the wider community, consisting of visualisation tools for the display of meta-data and interactive tools for the transfer of expertise. The two tasks are described below.

Task 10.6.1 will develop tools for interrogating data stored in the common format and displaying meta-data. This will allow potential users to display information (such as instrument position) for different tests, helping them to choose suitable data for use or re-use.
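The kind of query such a tool might support can be sketched as follows, assuming a catalogue in which each test lists its instruments with a measured quantity and a position along the flume. The catalogue layout, the field names and the `find_tests` function are hypothetical and serve only to illustrate selecting tests by instrument position.

```python
def find_tests(catalogue, quantity, x_min, x_max):
    """Return the names of tests with an instrument measuring `quantity`
    located between x_min and x_max (inclusive, in metres)."""
    matches = []
    for test in catalogue:
        for instrument in test["instruments"]:
            in_range = x_min <= instrument["x_m"] <= x_max
            if instrument["quantity"] == quantity and in_range:
                matches.append(test["name"])
                break  # one matching instrument is enough for this test
    return matches


# Hypothetical catalogue entries built from dataset meta-data.
catalogue = [
    {"name": "test_A", "instruments": [
        {"quantity": "surface_elevation", "x_m": 5.0},
        {"quantity": "velocity", "x_m": 12.0},
    ]},
    {"name": "test_B", "instruments": [
        {"quantity": "velocity", "x_m": 30.0},
    ]},
]

selected = find_tests(catalogue, "velocity", 10.0, 20.0)
```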

Task 10.6.2 will develop modular interactive tools, which go beyond classical books or scripts and provide easier and more efficient transfer of expertise on subjects such as setting up an experiment or understanding and communicating data. Spreadsheet or web-based tools will be developed because of their ubiquity (see also WP4 and WP5). It is most important that the tools are systematically organised and self-explanatory. They should also offer different levels of presentation and information to address the different user groups, with clear options and links for users who want additional information about the theoretical background of the applied formulae, models, techniques and methodologies. A modular structure will allow for easy updates.

Task 10.8 Management and Dissemination

Clear communication of the project aims, activities, progress and results will be important for engaging with the wider communities targeted: students, early-career researchers, scientists and engineers, industry and policy makers. This task will cover the organisation of progress meetings and workshops (including agendas and minutes), reporting to the EC, the population of web pages (within the HYDRALAB+ website), the monitoring of deliverables, dissemination to the wider international community and the monitoring of publications.

