Wherefore art thou mouse dbSNP VCF file?

Every once in a while I come across a problem that surprises me. Getting a VCF file for the mouse genome (mm9) is one of those problems. We use the GATK extensively in-house, and it has standardized on the VCF format (rightfully so), so validating, annotating, and recalibrating variants all require a VCF file.

DNANexus Hosts SRA on Google Cloud

One of the cooler announcements out of ICHG/ASHG 2011 in Montreal this year is that DNANexus, a cloud-based NGS analysis platform, will be mirroring the SRA, which is concurrently hosted by NCBI.

Variant Calling on Ion Torrent Data

Variant calling, the detection of SNPs and INDELs, plays a particularly important role in Ion Torrent data because of the platform’s propensity for homopolymer errors. It is especially challenging to separate the insertions and deletions caused by sequencing errors from true differences from the reference sequence. The Ion Torrent analysis pipeline includes a variant calling plugin that assists in identifying SNPs and INDELs. Using SAMtools, the plugin produces a variant sample report with settings adjusted to match the error model of Ion Torrent data. Using recently sequenced E. coli DH10B data from a 316 chip and an artificially mutated E. coli genome, the plugin’s settings were compared with other SAMtools settings to find those that recover the most true variants while avoiding false positives. This mutated genome, along with a genome containing only homopolymer errors, was also used to compare Illumina’s MiSeq technology with Ion Torrent in terms of variant analysis.
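To make the "true variants versus false positives" comparison concrete, here is a minimal sketch of how calls could be scored against the known mutations in an artificially mutated genome. It is not the plugin's code or the method used in the post: the function names (`parse_vcf_lines`, `score_calls`), the contig name `chr`, and the tiny inline VCF records are all made up for illustration, and matching is by exact chromosome, position, REF, and ALT.

```python
def parse_vcf_lines(lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines, skipping headers."""
    variants = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "##"/"#CHROM" header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several comma-separated ALT alleles.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def score_calls(truth, called):
    """Count true positives, false positives, and false negatives by exact match."""
    tp = len(truth & called)
    fp = len(called - truth)
    fn = len(truth - called)
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Hypothetical truth set: the mutations deliberately introduced into the reference.
truth_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr\t100\t.\tA\tG\t.\t.\t.",
    "chr\t250\t.\tC\tT\t.\t.\t.",
    "chr\t400\t.\tGA\tG\t.\t.\t.",
]

# Hypothetical caller output: two real variants plus one spurious insertion.
called_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr\t100\t.\tA\tG\t50\t.\t.",
    "chr\t400\t.\tGA\tG\t30\t.\t.",
    "chr\t777\t.\tT\tTA\t12\t.\t.",
]

truth = parse_vcf_lines(truth_lines)
called = parse_vcf_lines(called_lines)
result = score_calls(truth, called)
print(result)
```

Running the same scoring over the output of each settings combination gives a simple way to rank them. One caveat with exact matching: the same INDEL inside a homopolymer run can be reported at different positions by different callers, so a real comparison would likely need to normalize (for example, left-align) INDELs first.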

