Wherefore art thou mouse dbSNP VCF file?
Posted by Pedja Grujic on Dec 16, 2011
Every once in a while I come across a problem that surprises me. Getting a dbSNP VCF file for the mouse genome (mm9) is one of those problems. We use the GATK extensively internally, and it has standardized around the VCF format (rightfully so), so validating, annotating, and recalibrating variants all require a VCF file.
Up until recently, we had been focusing on human exome re-sequencing, but we now have several mouse exome projects ongoing here at EdgeBio. So we needed an mm9 dbSNP VCF file to integrate into our exome pipeline using the GATK. Easy, right? Not really. I assumed there was one available from the NCBI dbSNP FTP download, since there is one for human. Nope. OK, well then, I can just download the txt files from NCBI and convert them to VCF, right? Nope. They don't have any genotype information. One could also download the associated XML files and try to munge the two files into a VCF. Maybe as a last resort. Maybe. I was familiar with VariantsToVCF from the GATK, a walker that converts several file types to VCF. It includes a BED-to-VCF converter, but no download from the UCSC Table Browser variant track contained the genotypes either. Not very helpful.
I then came across vcfutils.pl, which is part of the samtools package. Its ucscsnp2vcf command can take the raw txt UCSC database dump and convert it to VCF. Worked like a charm. I assume this will work with any genome at UCSC. So, if you need a VCF file for mouse (mm9), you can:
wget http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/snp128.txt.gz
gunzip snp128.txt.gz
vcfutils.pl ucscsnp2vcf snp128.txt > snp128.vcf
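Before wiring the result into a pipeline, it can be worth a quick look at what came out. The following is only a minimal sketch, not part of samtools or the GATK: `vcf_summary` is a hypothetical helper name, and it assumes nothing beyond the standard VCF layout (header lines start with `#`, and the first tab-separated column of each record is the contig name). It reports how many header lines, records, and distinct contigs the file contains, which makes it easy to spot an empty conversion or contig names that do not look like those in your reference.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools or the GATK):
# print a one-line summary of a VCF file.
# Usage: vcf_summary snp128.vcf
vcf_summary() {
    vcf="$1"
    # Header lines start with '#'; every other line is a variant record.
    header_lines=$(grep -c '^#' "$vcf")
    records=$(grep -vc '^#' "$vcf")
    # Count distinct values in column 1 (CHROM) of the records.
    contigs=$(grep -v '^#' "$vcf" | cut -f1 | sort -u | wc -l | tr -d ' ')
    echo "header_lines=$header_lines records=$records contigs=$contigs"
}
```

After sourcing or pasting the function into a shell, `vcf_summary snp128.vcf` prints the three counts on one line. To see the contig names themselves, `grep -v '^#' snp128.vcf | cut -f1 | sort -u` lists them.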
