As of the most recent data update on April 24th (build version 20180422), the MyGene.info database contains 22,132,511 documents. As a valuable service that has handled over 20 million requests in the last 30 days, MyGene.info was fortunate enough to receive renewed support for improving its offerings.
The MyGene.info landing page will be overhauled with a sleeker, more intuitive, responsive, and cohesive design. The updated landing page will reflect the improvements made to the website's architecture over the last few months, as well as the changes in store for MyGene.info.
What's in store for MyGene.info
MyGene.info will expand to include highly requested annotation sources, such as the species and annotations available from Ensembl Genomes. Ensembl is already one of MyGene.info's 7+ data resources, and contributes annotations for 1.6 million genes in more than 80 species. The inclusion of Ensembl Genomes could add annotations for over 145 million genes from thousands of bacterial, fungal, plant, metazoan, and protist species!
In addition to this large and widely requested resource, MyGene.info will also import gene annotations from several smaller, more specialized data resources, with the goal of making data from all of these resources more Findable, Accessible, Interoperable, and Reusable!
A FAIRly important note
Both the Su Lab and Wu Lab (to whom the renewal grant was awarded) are strong proponents of data re-use and have a strong interest in data FAIRness. How can MyGene.info, MyVariant.info, and both labs' related efforts make existing biological data more FAIR?
MyGene.info makes gene annotation data more Findable by providing a centralized resource that enables simple community contribution. All data sources included in MyGene.info are heavily indexed using existing identifiers, allowing that data to be accessed via our simple search API. Since the data sources are pre-integrated into gene-specific JSON objects in MyGene.info, data from the included resources are standardized in structure and Accessible via our REST-based JSON API.
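To make the search and annotation endpoints a little more concrete, here is a minimal sketch of how request URLs for them might be assembled with the Python standard library. The base URL, the v3 path, the helper names, and the example query are assumptions for illustration and are not taken from this post; check the MyGene.info documentation for the current endpoints and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL and API version; verify against the MyGene.info docs.
BASE_URL = "https://mygene.info/v3"


def query_url(q, **params):
    """Build a search URL, e.g. for finding genes by symbol."""
    return f"{BASE_URL}/query?{urlencode({'q': q, **params})}"


def gene_url(gene_id, fields=None):
    """Build a URL for the JSON annotation object of a single gene."""
    url = f"{BASE_URL}/gene/{gene_id}"
    if fields:
        # Limit the response to the listed fields (comma-separated).
        url += "?" + urlencode({"fields": ",".join(fields)})
    return url


def get_json(url):
    """Fetch a URL and decode the JSON response (needs network access)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

With network access, something like get_json(query_url("symbol:cdk2", species="human")) would run a search, and get_json(gene_url(1017, fields=["symbol", "name"])) would retrieve a trimmed annotation object for one gene. Keeping URL construction separate from fetching makes the request easy to inspect before anything is sent.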
As part of the BioThings APIs, gene annotation data included in MyGene.info will be more Interoperable thanks to the compatibility with Linked Open Data resources using JSON-LD and standard vocabularies. Read more about the value of interoperable data in our recent paper on Cross-linking BioThings APIs via JSON-LD. By allowing community editing of the JSON-LD context files, we'll empower the community to iteratively improve the interoperability of the data.
Lastly, MyGene.info (and its sister BioThings APIs) will continue to help make data more Reusable by providing a high-performance, continuously updated API with no authentication, registration, or usage limits. By providing R and Python client libraries and encouraging the development of third-party clients, MyGene.info increases the accessibility and utility of the data.
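As a rough illustration of the Python client in use, the sketch below looks up gene symbols for a batch of gene IDs. It assumes the third-party mygene package is installed (pip install mygene) and relies on its MyGeneInfo.getgenes method; the wrapper function fetch_symbols and its client parameter are hypothetical names introduced here so the logic can also be exercised with a stand-in object.

```python
def fetch_symbols(gene_ids, client=None):
    """Return a {queried id: gene symbol} mapping for the given gene IDs.

    `client` is any object exposing getgenes(ids, fields=...); when omitted,
    a mygene.MyGeneInfo client is created (assumes `mygene` is installed).
    """
    if client is None:
        import mygene  # third-party client library, imported lazily

        client = mygene.MyGeneInfo()

    # One batch request for all IDs, returning only the "symbol" field.
    hits = client.getgenes(gene_ids, fields="symbol")

    # Skip IDs the service could not resolve (flagged with "notfound").
    return {
        str(hit["query"]): hit.get("symbol")
        for hit in hits
        if not hit.get("notfound")
    }
```

Calling fetch_symbols([1017, 1018]) with no client argument would go through the live service; no API key or registration step appears anywhere in the code, which is the point of the paragraph above.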
Excited about what's in store for MyGene.info? If not, be sure to check out the BioThings paper for a taste of the possibilities that are coming!