Over the last ten years the amount of data produced by life science research has increased dramatically. This has created collections of data that require significant resources to maintain. We have also seen growth in the number of small, specialised data resources.
The increasing number and size of data resources means that:
- it can be challenging for researchers to find the data they need
- it has become difficult for researchers to quickly assess the quality of each data resource and choose the best ones for their needs
- it can be problematic to compare and repurpose data because it is often described and formatted in different ways
- there is increasing demand on finite resources, with implications for long-term sustainability.
To address these issues, we need to establish criteria and markers of quality for data resources, identify the key data resources to fund, and make the data resources easier to find and access.
What the Data Platform does
ELIXIR has developed a process to identify European data resources that are of fundamental importance to research in the life sciences and are committed to the long-term preservation of data. These resources are called ELIXIR Core Data Resources.
The Platform also:
- Creates formal indicators and quality criteria for the Core Data Resources. These let researchers choose data resources that follow best practices and best suit their requirements.
- Increases the linkage between publications and the data they are based on, and also between data and other resources. Increasing these linkages helps the data resource providers demonstrate impact and obtain funding.
- Collaborates internationally on improving the long-term sustainability of data resources, notably by exploring alternative funding models.
The Platform is involved in the following Implementation Studies:
- Apple as a Model for Genomic Information Exchange
- Establishment of an ELIXIR Contextual Data Clearinghouse
- Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data
- FAIRness of the current ELIXIR Core resources: Application (and test) of newly available FAIR metrics, and identification of steps to increase interoperability
- Integration and standardization of intrinsically disordered protein data
- Increasing Interoperability between ELIXIR Protein Structure and Sequence Resources and Expanding these Resources with 3D-Models of CATH Domains, built by SWISS-MODEL
- Integrating reference taxonomic databases for metabarcoding and metagenomics identification
- A microbial metabolism resource for Systems Biology
- Mining the proteome: Enabling automated processing and analysis of large-scale proteomics data
- Towards a distributed Ensembl
For all ELIXIR Implementation Studies, including completed ones, see the Implementation Studies page.
The Platform is also involved in:
- Data Resources and Services, Work Package 3 within the ELIXIR-EXCELERATE project. This work includes identifying Core Data Resources and recommending Deposition Databases for life science data.
- Global collaboration: the Platform is part of an international initiative ‘to collectively support...data resources deemed essential to the work of life science researchers, educators, and innovators worldwide’.
Find out more
- Contact Rachel Drysdale (rachel.drysdale[at]elixir-europe[dot]org) for questions about the Platform’s work.
- Learn about the Core Data Resources work:
- Durinx C, McEntyre J, Appel R et al. Identifying ELIXIR Core Data Resources. F1000Research 2017, 5(ELIXIR):2422 (doi: 10.12688/f1000research.9656.2)
- Gabella C, Durinx C and Appel R. Funding knowledgebases: Towards a sustainable funding model for the UniProt use case. F1000Research 2017, 6(ELIXIR):2051 (doi: 10.12688/f1000research.12989.1)
- Learn more about the global data resource initiative.
- See the current Database Services listing. This list is updated as Nodes finalise or review their Service Delivery Plans (see How countries join).