Friday 30 September 2016

bioinformatics - How much Open Access Data is there in genetics?



Type of data of interest


I would like to consider:




  • Genetic data (SNPs, microsatellites, whole-genome sequencing, RFLP, ...)

  • Genotype-phenotype data (disease-related data, QTL, etc.)

  • Sequence annotation and function

  • Transcriptomic data


I would like to include data on any living thing (including data from fossils), not only human data. To avoid semantic issues, I would leave out epigenetic data.


Question


How much (in bytes) of such data is available in Open Access online?


Difficulties



I realize that arriving at such an estimate might be hard and that the estimate may be very inaccurate. Also, the format used for storing these data will certainly affect the relationship between information content and storage usage. But even a rough order of magnitude, a vague intuition, would already help. Is it a few terabytes, a few petabytes, or even more?
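
To illustrate how much the storage format alone can matter, here is a rough sketch. It assumes a genome of about 3.1 billion bases (an approximate figure for one haploid human genome, used here only for illustration) and compares two idealized encodings: one text character per base versus bases packed into 2 bits each. The function name and constants are hypothetical, and real file formats add headers, quality scores, and compression on top of this.

```python
# Back-of-envelope sketch: storage needed for one genome under two encodings.
# ASSUMPTION: ~3.1 billion bases, an approximate haploid human genome length.
GENOME_LENGTH_BP = 3_100_000_000


def size_bytes(n_bases: int, bits_per_base: int) -> int:
    """Bytes needed to store n_bases at a fixed bits-per-base, rounded up."""
    return (n_bases * bits_per_base + 7) // 8


# One ASCII character (8 bits) per base, as in a plain-text sequence file.
plain_text = size_bytes(GENOME_LENGTH_BP, 8)
# Only A/C/G/T, so 2 bits per base are enough in an idealized packed format.
packed = size_bytes(GENOME_LENGTH_BP, 2)

print(f"plain text:   {plain_text / 1e9:.2f} GB")
print(f"2-bit packed: {packed / 1e9:.2f} GB")
print(f"ratio:        {plain_text / packed:.0f}x")
```

The same sequence information differs in size by a factor of four between these two encodings, which is why an answer in bytes can only be an order-of-magnitude figure.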


I would also welcome details of how you arrived at this estimate. I am particularly interested in what fraction of it is human data (if you happen to get to that level of detail).



