Pathway data updated in MyGene.info

by Chunlei Wu

Just a quick highlight of our recent update to all pathway data in the MyGene.info API. Thanks to the excellent pathway/interaction database from ConsensusPathDB, we updated all 10 pathway sub-fields (listed below) for three species (human, mouse and yeast). The underlying data are current as of Jan 2017 (ConsensusPathDB release 32). Here are some more details.

Pathway data from MyGene.info

With the MyGene.info API, users can access 10 pathway-related fields, nested under the "pathway" field:

field in MyGene.info     # of genes with the pathway field
                         previous version    current version
pathway.biocarta         1386                1414
pathway.humancyc         1175                1191
pathway.kegg             16281               16495
pathway.mousecyc         1225                1225
pathway.pharmgkb         878                 877
pathway.pid              2618                2656
pathway.reactome         14797               17170
pathway.smpdb            1047                1064
pathway.wikipathways     11095               11955
pathway.yeastcyc         641                 641

As you can tell, Reactome and WikiPathways are the two still actively updated, while the rest of the pathway DBs are pretty much unchanged. Those DBs are either no longer updated or have closed up due to license restrictions, so their minor changes are mostly due to changes in the underlying gene annotations.

Query examples with pathway data

  • Retrieve specific pathway field(s) for a given gene
  curl 'http://mygene.info/v3/gene/1017?fields=pathway'
  curl 'http://mygene.info/v3/gene/1017?fields=pathway.reactome'
  curl 'http://mygene.info/v3/gene/1017?fields=pathway.reactome,pathway.wikipathways'

Or using our mygene Python client:

   import mygene
   mg = mygene.MyGeneInfo()
   mg.getgene(1017, fields='pathway')
   mg.getgene(1017, fields='pathway.reactome')
   mg.getgene(1017, fields='pathway.reactome,pathway.wikipathways')
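
The pathway sub-fields come back as nested data, so a small helper can flatten them into (id, name) pairs. This is only a sketch: it assumes each sub-field holds either a single object with "id" and "name" keys or a list of such objects, and both `pathway_entries` and the sample document are hypothetical, written by hand for illustration rather than copied from an API response.

```python
def pathway_entries(gene_doc, source):
    """Return (id, name) pairs for one pathway source, e.g. "reactome".

    Assumption: the sub-field is either one {"id", "name"} object or a
    list of them; a missing field yields an empty list.
    """
    entries = gene_doc.get("pathway", {}).get(source, [])
    if isinstance(entries, dict):  # a single pathway, not wrapped in a list
        entries = [entries]
    return [(entry["id"], entry["name"]) for entry in entries]


# Hand-written sample document (illustrative only, not real API output).
sample_doc = {
    "_id": "1017",
    "pathway": {
        "reactome": {
            "id": "R-HSA-3700989",
            "name": "Transcriptional Regulation by TP53",
        }
    },
}

for pathway_id, name in pathway_entries(sample_doc, "reactome"):
    print(pathway_id, name)
```

With the client, the same helper could be applied to the result of mg.getgene(1017, fields='pathway.reactome').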

Batch query is possible using POST:

curl -X POST \
  http://mygene.info/v3/gene \
  -H 'content-type: application/x-www-form-urlencoded' \
  -d 'ids=1017,1018&fields=pathway.reactome'

or

mg.getgenes([1017,1018], fields='pathway.reactome')
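
If neither curl nor the mygene client is at hand, the same batch POST can be put together with Python's standard library. This is a minimal sketch: `build_batch_request` is a hypothetical helper name, while the URL, header and form fields are the ones from the curl example above. It only builds the request; sending it requires network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

GENE_URL = "http://mygene.info/v3/gene"  # same endpoint as the curl example


def build_batch_request(ids, fields):
    """Build (but do not send) a form-encoded POST for a batch gene lookup."""
    # urlencode percent-encodes the comma between ids; a server decoding
    # the form body sees the same comma-separated list as in the curl call.
    body = urlencode({"ids": ",".join(str(i) for i in ids), "fields": fields})
    return Request(
        GENE_URL,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_batch_request([1017, 1018], "pathway.reactome")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))

# To send it (needs network access):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```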

  • Query pathways for matching genes
# Return genes from Reactome pathway "R-HSA-3700989" (Transcriptional Regulation by TP53):
curl 'http://mygene.info/v3/query?q=pathway.reactome.id:R-HSA-3700989'
# or in Python:
mg.query('pathway.reactome.id:R-HSA-3700989')

# By default, the top 10 genes are returned, but you can use the size and from parameters (size and skip in the Python client) to page through all matching genes:
curl 'http://mygene.info/v3/query?q=pathway.reactome.id:R-HSA-3700989&size=500'
# or in Python:
mg.query('pathway.reactome.id:R-HSA-3700989', size=500)

curl 'http://mygene.info/v3/query?q=pathway.reactome.id:R-HSA-3700989&size=50&from=50'
# or in Python:
mg.query('pathway.reactome.id:R-HSA-3700989', size=50, skip=50)

# Return matching genes from any WikiPathways pathway whose name mentions "apoptosis":
curl 'http://mygene.info/v3/query?q=pathway.wikipathways.name:apoptosis&fields=pathway.wikipathways,name,symbol'
# or in Python:
mg.query('pathway.wikipathways.name:apoptosis', fields='pathway.wikipathways,name,symbol')
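
The size/from paging shown above can be wrapped in a loop that keeps requesting pages until every matching gene has been seen. The following is a rough sketch using only the standard library. It assumes the query response is a JSON object with "total" and "hits" keys; `fetch_page` and `iter_hits` are hypothetical helper names, not part of the mygene client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

QUERY_URL = "http://mygene.info/v3/query"  # endpoint used in the examples above


def fetch_page(q, size, offset, fields="symbol"):
    """Fetch one page of results (performs a network call when invoked)."""
    params = urlencode({"q": q, "size": size, "from": offset, "fields": fields})
    with urlopen(f"{QUERY_URL}?{params}") as response:
        return json.load(response)


def iter_hits(q, size=50, fetch=fetch_page):
    """Yield every hit for q, requesting `size` hits per page.

    Assumption: each page is a dict with "hits" (a list) and "total"
    (the overall number of matches).
    """
    offset = 0
    while True:
        page = fetch(q, size, offset)
        hits = page.get("hits", [])
        if not hits:
            break  # nothing left, or an unexpected empty page
        yield from hits
        offset += len(hits)
        if offset >= page.get("total", 0):
            break  # all matches have been yielded


# Usage (needs network access):
#   for hit in iter_hits("pathway.reactome.id:R-HSA-3700989"):
#       print(hit)
```

The fetch function is passed in as a parameter so the paging logic can be tried against canned data without touching the network.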

As always, feel free to reach us at help@mygene.info or @mygeneinfo if you have any questions or feedback.