The new myvariant v0.3.1 Python client was released last week. Together with the recent v0.3.0, it includes these changes:
- a fix for the MyVariantInfo.query method when using the fetch_all parameter, required to address a server upgrade.
- a new as_generator parameter for the MyVariantInfo.getvariants method, which, when True, returns a generator over the results (rather than all results in a list).
- a fix for the get_hgvs_from_vcf helper function when the ALT column has multiple alleles in an input VCF file.
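To illustrate the multi-allelic case behind the last fix: a VCF row can list several comma-separated alleles in its ALT column, and each allele corresponds to its own HGVS id. The sketch below is not the client's implementation. `snv_hgvs_ids` is a hypothetical helper that handles only single-nucleotide substitutions, and the input row (position 41, ALT `T,G`) is made up for illustration.

```python
def snv_hgvs_ids(chrom, pos, ref, alt_column):
    """Hypothetical helper: build one HGVS id per ALT allele (SNVs only)."""
    ids = []
    for alt in alt_column.split(','):
        # Skip anything that is not a single-base substitution; indels
        # need a different HGVS syntax and are out of scope for this sketch.
        if len(ref) != 1 or len(alt) != 1:
            continue
        ids.append('chr{}:g.{}{}>{}'.format(chrom, pos, ref, alt))
    return ids


# A made-up row with two alternate alleles at the same position.
multi_allelic_ids = snv_hgvs_ids('MT', 41, 'C', 'T,G')
print(multi_allelic_ids)
```

The point is only that one input row can expand to several ids, so the number of variants queried can exceed the number of rows in the file.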
To install/upgrade to the newest version:
pip install myvariant -U
To verify you have the latest version installed:
In [1]: import myvariant
In [2]: myvariant.__version__
Out[2]: '0.3.1'
The new as_generator parameter for the MyVariantInfo.getvariants method is handy for iterating through very large lists of variants (e.g. VCF file annotation) without holding the entire list in memory at once. For example, you can annotate the ~4000 variants in this VCF file (the 1000 Genomes VCF file for chrMT) like this:
In [3]: mv = myvariant.MyVariantInfo()
In [4]: vars = []
In [5]: for variant in mv.getvariants(
   ...:         myvariant.get_hgvs_from_vcf('chrMT.vcf'),
   ...:         as_generator=True):
   ...:     vars.append(variant)
   ...:
querying 1-1000...done.
querying 1001-2000...done.
querying 2001-3000...done.
querying 3001-4000...done.
querying 4001-4242...done.
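The generator keeps only one batch (1000 variants) in memory at a time and transparently queries the myvariant server for a new batch once the current one is exhausted. Note that the loop above still collects every result into a list so they can be inspected afterwards; to actually keep memory use low, process each variant inside the loop instead.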
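The batching behaviour can be sketched with a plain Python generator. This is a minimal illustration of the idea, not the client's code: `iter_in_batches` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the server round trip so the example runs without a network connection.

```python
def iter_in_batches(ids, fetch_batch, batch_size=1000):
    """Yield results one at a time, sending `batch_size` ids per request."""
    batch = []
    for _id in ids:
        batch.append(_id)
        if len(batch) == batch_size:
            # Only the current batch is held in memory while it is consumed.
            for result in fetch_batch(batch):
                yield result
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        for result in fetch_batch(batch):
            yield result


batch_sizes = []


def fake_fetch(batch):
    # Hypothetical stand-in for a server query; records each batch size.
    batch_sizes.append(len(batch))
    return [{'query': _id} for _id in batch]


results = list(iter_in_batches(('id%d' % i for i in range(2500)), fake_fetch))
print(batch_sizes)
```

Because both the input ids and the results are consumed lazily, the caller decides whether to accumulate the results (as `list(...)` does here) or handle them one by one.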
Of the 4242 variants parsed from the VCF file, myvariant.info has annotations for 3603, as shown here:
In [6]: vars_found = [v for v in vars if 'notfound' not in v]
In [7]: len(vars_found)
Out[7]: 3603
In [8]: vars_found[0]
Out[8]:
{'_id': 'chrMT:g.41C>T',
 '_score': 1.0,
 'chrom': 'MT',
 'hg19': {'end': 41, 'start': 41},
 'query': 'chrMT:g.41C>T',
 'snpeff': {'ann': {'effect': 'intergenic_region',
                    'putative_impact': 'MODIFIER'}},
 'vcf': {'alt': 'T', 'position': '41', 'ref': 'C'},
 'wellderly': {'alleles': [{'allele': 'C', 'freq': 0.99},
                           {'allele': 'T', 'freq': 0.01}],
               'alt': 'T',
               'chrom': 'MT',
               'genotypes': [{'count': 2, 'freq': 0.01, 'genotype': 'T/T'},
                             {'count': 198, 'freq': 0.99, 'genotype': 'C/C'}],
               'hg19': {'end': 41, 'start': 41},
               'pos': 41,
               'ref': 'C',
               'vartype': 'snp'}}
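Each returned annotation is a nested dictionary, so individual fields can be pulled out with ordinary dictionary access. The sketch below assumes the structure shown in the output above; `wellderly_alt_freq` is a hypothetical helper, and the record is a trimmed copy of that output. Since not every variant is guaranteed to carry a 'wellderly' entry, the helper returns None when the key is missing.

```python
# Trimmed copy of the annotation shown above.
variant = {
    '_id': 'chrMT:g.41C>T',
    'query': 'chrMT:g.41C>T',
    'wellderly': {
        'alleles': [{'allele': 'C', 'freq': 0.99},
                    {'allele': 'T', 'freq': 0.01}],
        'alt': 'T',
        'ref': 'C',
    },
}


def wellderly_alt_freq(doc):
    """Hypothetical helper: frequency of the ALT allele, or None if absent."""
    wellderly = doc.get('wellderly')
    if wellderly is None:
        return None
    for entry in wellderly['alleles']:
        if entry['allele'] == wellderly['alt']:
            return entry['freq']
    return None


freq = wellderly_alt_freq(variant)
print(freq)
```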
Learn more: