Marine metagenomics is a rapidly expanding field with the potential to provide new insight into the biodiversity of marine organisms. However, this expansion brings a massive increase in metagenomics data: in 2015 the EMBL-EBI metagenomics portal analysed 3 trillion nucleotides, an increase of 98% over the previous year. Without a sustainable data infrastructure and analysis pipelines to manage this data deluge, the impact of marine metagenomics on academic research and industrial innovation will be limited.
In 2014-2015, ELIXIR Norway and EMBL-EBI carried out a pilot action to gain more insight into these challenges and to work towards harmonising their existing metagenomics pipelines: EMBL-EBI’s metagenomics portal and META-pipe, operated by ELIXIR Norway. The two platforms have different foci and specific strengths: EMBL-EBI’s metagenomics portal provides insights into the functional and phylogenetic diversity within any sample, whereas the Norwegian META-pipe focuses on marine bioprospecting. Even so, it is clear that the marine research community benefits greatly from harmonisation of the two platforms.
The main goal of the pilot was to ensure interoperability between the two platforms and to explore the potential of EMBL-EBI Embassy Cloud technology in marine metagenomics. As the number of metagenomics projects and the volume of data increased exponentially during the course of the project, its focus in the final stages shifted towards understanding how to transfer and replicate data, optimise pipeline algorithms, and assess the need for compute and storage capacity.
Over the course of the project, the EMBL-EBI and ELIXIR Norway pipelines were harmonised and enhanced with several new applications. The conclusions also include recommendations for developing ELIXIR services for marine metagenomics in four areas: standardisation of metagenomics data, establishment of marine metagenomics databases, gold-standard pipelines for metagenomics analysis, and exploration of HPC and storage technologies.
The outcomes of the pilot feed into the ELIXIR-EXCELERATE Marine metagenomics Use Case and will help define the requirements and specifications for the ELIXIR metagenomics infrastructure. The results will also be published in ELIXIR Reports, the ELIXIR channel on the F1000Research publication platform.