Beginning with this data release, the structure of the metadata in MyGene.info has changed slightly. Previously, users could retrieve the data version of each resource from the “src_version” field of the metadata JSON object, which looked something like this:
"src_version": {
"PantherDB": "2017-12-11",
"cpdb": "34",
"ensembl": "96",
"ensembl_fungi": "43",
"ensembl_genomic_pos_hg19": null,
…
…
"wikipedia": null
}
Similarly, the number of annotations from each resource could be retrieved from the “stats” JSON object:
"stats": {
"total_ensembl_genes": 31358764,
"total_ensembl_genes_mapped_to_entrez": 3209199,
"total_ensembl_only_genes": 7465711,
"total_entrez_genes": 23740786,
"total_genes": 31206497,
"total_species": 25238
}
As of this update, “src_version” has been deprecated and removed. The same information is (and has been for a while now) contained in the nested “src” JSON object, in which each resource has properties such as “code”, “stats”, and “version”, as seen in this example:
"src": {
"PantherDB": {
"code": {
"branch": "v3",
"commit": "2a4aeca",
"folder": "src/plugins/PantherDB",
"repo": "https://github.com/biothings/mygene.info.git",
"url": "https://github.com/biothings/..."
},
"stats": {
"PantherDB": 156054
},
"version": "2017-12-11"
},
…
}
Users who retrieve the latest stats or versions of the data resources in MyGene.info should update their code to read these values from the “src” object instead.
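For code that relied on “src_version”, one possible adjustment is to rebuild the same flat mapping from the nested “src” object. The sketch below is a minimal illustration, not an official client: the helper names `source_versions` and `source_stats` are hypothetical, and `sample_metadata` is a trimmed copy of the example above standing in for the JSON returned by the metadata endpoint.

```python
from typing import Any, Dict, Optional

# Trimmed stand-in for the metadata JSON, shaped like the "src" example above.
# In practice this dict would be the parsed response of the metadata endpoint
# (assumption: fetched and decoded elsewhere, e.g. with an HTTP client).
sample_metadata: Dict[str, Any] = {
    "src": {
        "PantherDB": {
            "code": {"branch": "v3", "commit": "2a4aeca"},
            "stats": {"PantherDB": 156054},
            "version": "2017-12-11",
        }
    }
}


def source_versions(metadata: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Rebuild a src_version-style mapping (resource -> version) from "src"."""
    src = metadata.get("src", {})
    # A resource without a "version" property maps to None.
    return {name: info.get("version") for name, info in src.items()}


def source_stats(metadata: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Collect the per-resource "stats" objects from "src"."""
    src = metadata.get("src", {})
    return {name: info.get("stats", {}) for name, info in src.items()}


print(source_versions(sample_metadata))  # {'PantherDB': '2017-12-11'}
print(source_stats(sample_metadata))     # {'PantherDB': {'PantherDB': 156054}}
```

Using `.get` with defaults keeps the helpers from raising if a resource lacks one of the properties; whether every resource always carries all three properties is not stated here, so the sketch does not assume it.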
While we’re on the subject of the latest stats and versions, these are the most recent updates to the data:
Resource | build_version: "20190421" | build_version: "20190428" |
entrez_accession | 23623571 | 23833656 |
entrez_gene | 23740786 | 23950933 |
entrez_genomic_pos | 2596024 | 2593132 |
entrez_go | 204547 | 204413 |
entrez_refseq | 23586607 | 23796228 |
entrez_retired | 249767 | 250635 |
entrez_unigene | 544868 | 544835 |
generif | 97870 | 97929 |
entrez_ec | 19906 | 19909 |
entrez_genesummary | 27722 | 28153 |
total_ensembl_genes | 31358764 | 31568911 |
total_ensembl_genes_mapped_to_entrez | 3209199 | 3209192 |
total_ensembl_only_genes | 7465711 | 7465713 |
total_entrez_genes | 23740786 | 23950933 |
total_genes | 31206497 | 31416646 |
total_species | 25238 | 25257 |
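To see how much each count changed between the two builds, the differences can be computed directly. The snippet below is a small sketch using a handful of rows copied from the table above; the `counts` dictionary and the `build_deltas` helper are illustrative names, not part of MyGene.info.

```python
from typing import Dict, Tuple

# (build 20190421, build 20190428) counts copied from the table above.
counts: Dict[str, Tuple[int, int]] = {
    "entrez_gene": (23740786, 23950933),
    "entrez_go": (204547, 204413),
    "total_genes": (31206497, 31416646),
    "total_species": (25238, 25257),
}


def build_deltas(counts: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Return new-minus-old for each resource; negative means the count dropped."""
    return {name: new - old for name, (old, new) in counts.items()}


deltas = build_deltas(counts)
for name, delta in deltas.items():
    # The "+" format flag prints an explicit sign for increases and decreases.
    print(f"{name}: {delta:+d}")
```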