I am looking for a plain-text (unencoded) data file of common and scientific names, covering anywhere from a few hundred to tens of thousands of species, in which I can look up organisms by either name.
Answer
UniProt publishes its controlled vocabulary of species, with both common and scientific names, as a list here.
An example entry:
ACAER E 111511: N=Acanthodactylus erythrurus
C=Spanish fringe-toed lizard
S=Lacerta erythrura
In the example, N is the scientific binomial name (Acanthodactylus erythrurus) and C is the common name (Spanish fringe-toed lizard). ACAER is the ID code, 111511 is the code for the taxonomic node, E means it is a eukaryote, and S is a synonym of either name.
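Since the question is about searching both labels, here is a minimal Python sketch of how entries in this layout could be parsed and searched. It assumes each entry starts with a line of the form `CODE K TAXON: N=...` followed by optional `C=` and `S=` lines, as in the example above. The function names `parse_species` and `search` are made up for illustration, and any header or footer text in the real file is simply skipped rather than handled specially.

```python
import re

# First line of an entry: ID code, kingdom letter, taxon node, then N=<scientific name>.
HEADER = re.compile(r"^(\S+)\s+([A-Z])\s+(\d+):\s*N=(.*)$")
# Continuation lines: C=<common name> or S=<synonym>.
EXTRA = re.compile(r"^\s*([CS])=(.*)$")

# The example entry from the answer, used as test input.
SAMPLE = """ACAER E 111511: N=Acanthodactylus erythrurus
C=Spanish fringe-toed lizard
S=Lacerta erythrura"""


def parse_species(lines):
    """Turn lines in the layout shown above into a list of dicts (hypothetical helper)."""
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip()
        header = HEADER.match(line)
        if header:
            code, kingdom, taxon, name = header.groups()
            current = {
                "code": code,
                "kingdom": kingdom,
                "taxon_id": int(taxon),
                "scientific": name.strip(),
                "common": None,
                "synonyms": [],
            }
            entries.append(current)
            continue
        extra = EXTRA.match(line)
        if extra and current is not None:
            tag, value = extra.groups()
            if tag == "C":
                current["common"] = value.strip()
            else:
                current["synonyms"].append(value.strip())
        else:
            # Anything else (blank lines, header or footer text) ends the current entry.
            current = None
    return entries


def search(entries, term):
    """Case-insensitive substring search over scientific name, common name and synonyms."""
    needle = term.lower()
    hits = []
    for entry in entries:
        names = [entry["scientific"], entry["common"] or ""] + entry["synonyms"]
        if any(needle in name.lower() for name in names):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    for hit in search(parse_species(SAMPLE.splitlines()), "lizard"):
        print(hit["code"], hit["scientific"], hit["common"])
```

To run this against the downloaded list, pass an open file handle to `parse_species` instead of `SAMPLE.splitlines()`. A linear scan like this should be adequate for tens of thousands of entries; for repeated lookups, a dictionary keyed on lowercased names would be the obvious next step.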
The list currently contains 25,336 scientific names, which falls far short of the ~2.5 million species in GBIF, or the tens or hundreds of millions that are estimated to exist. It does, however, represent every organism included in UniProt, which is widely regarded as one of the most comprehensive protein databases available today.