The data deluge

One of the biggest breakthroughs in the life sciences has been the development of DNA sequencing technologies to reveal the genetic code of life. Knowledge of DNA sequences has become indispensable for:

  • basic biological research
  • diagnostics
  • drug development
  • biotechnology
  • forensic biology
  • systems biology.

High-throughput sequencing technologies now produce billions of bases of nucleotide data per experiment, at a relatively low cost.

Modern sequencers are capable of analysing the equivalent of a human genome every 14 minutes at a cost of US $5,000. This rate is 400 times greater than in the year 2000, when the draft human genome was first published.

It is expected that further developments will make sequencers between 1,000 and 1,000,000 times more productive over the next ten years.

Next-generation DNA sequencing has become so widely used that it is considered a disruptive technology: it is so much better than the technology it replaces that users have difficulty adapting to it.

Making sense of it all

Life science research is becoming increasingly collaborative and complex, using several different technologies to understand organisms and diseases at the systems level. It is a significant challenge to integrate the wide variety of data coming out of life science experiments in meaningful, research-supportive ways.

ELIXIR’s vision for the future is to provide researchers in academia and industry with seamless access to biological information that will revolutionise discovery in the life sciences. This requires integrating data on many levels, from molecular biology to clinical practice.

The enormity of these tasks necessitates cross-border cooperation on a scale unprecedented in biological and biomedical research. It demands a coordinated pooling of resources.

ELIXIR will, for the first time, enable pan-European coordination of all scientific and technical issues related to handling biomolecular data resources.

Data growth at EMBL-EBI

Capacity needed to store biological data at EMBL-EBI, 1996-2010, in terabytes (a terabyte is a million million bytes). This trend is expected not only to persist but to become steeper still, posing a serious challenge to existing bioinformatics infrastructures in Europe.

 

Impact of high-throughput technology on data growth. © 2011 Guy Cochrane, ENA, EMBL-EBI.

Policy regarding use: EMBL press and picture releases, including photographs, graphics, movies and videos, are copyrighted by EMBL. They may be freely reprinted and distributed for non-commercial use via print, broadcast and electronic media, provided that proper attribution to authors, photographers and designers is made.