VCF Bulk Export

This form filters existing VCF files and exports the result in common formats. Most of the filter criteria and many of the formats are provided by VCFtools+.
Choose your VCF File.
The following table contains all the available VCF Files. Choose the one you would like to filter and export by selecting the circle at the beginning of the appropriate row.
Name | Assembly | Number of SNPs
No VCF Files available.
Specify filter criteria.
If you check this checkbox, only SNPs with exactly 2 alleles across all individuals (bi-allelic SNPs) will be kept. For example, in the example data below, SNP Chr3p34567 would be removed.
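As a rough illustration of the bi-allelic check, the sketch below counts the distinct alleles observed across all calls for one SNP. The function name, the use of `None` for missing calls, and the sample genotypes are assumptions made for this example; they are not taken from the form or from the example table. VCFtools exposes `--min-alleles` and `--max-alleles` for this kind of filter, but whether the form uses them is not stated here.

```python
def is_biallelic(calls):
    """Return True if exactly two distinct alleles are observed.

    `calls` is a list of genotype strings such as "AA" or "AG".
    None marks a missing call (an assumed representation).
    """
    alleles = set()
    for call in calls:
        if call is not None:
            alleles.update(call)  # add each allele character of the call
    return len(alleles) == 2


# Hypothetical calls for three SNPs across four germplasm.
print(is_biallelic(["AA", "AG", "GG", None]))  # True: alleles A and G
print(is_biallelic(["AA", "GG", "TT", "AA"]))  # False: three alleles
print(is_biallelic(["AA", "AA", None, "AA"]))  # False: one allele
```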
Only include SNP calls that have at least the specified number of reads to support the call; calls with fewer reads are set to missing data. For example, if you specify 5 for this filter then for SNP Chr2p25678 in the example table below, only the call for Germ4 will be set to missing data.
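The read-depth filter works per call rather than per SNP. A minimal sketch, assuming each call is held as a `(call, depth)` pair and that missing data is represented by `None`; both the function name and the sample values are hypothetical. In VCFtools the comparable option is `--minDP`.

```python
def mask_low_depth(cells, min_depth):
    """Set calls supported by fewer than `min_depth` reads to missing.

    `cells` is a list of (call, depth) tuples for one SNP.
    Returns a list of calls, with None where the call was masked.
    """
    return [call if depth >= min_depth else None for call, depth in cells]


# Hypothetical calls for one SNP: only the last has fewer than 5 reads.
cells = [("AA", 7), ("AG", 5), ("GG", 3)]
print(mask_low_depth(cells, 5))  # ['AA', 'AG', None]
```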
Only include SNP positions with a minor allele frequency greater than or equal to this value. Allele frequency is defined as the number of times an allele appears over all individuals at that site, divided by the total number of non-missing alleles at that site. For example, consider Chr1p12344 in the example table below: the minor allele frequency for A is 2/5=40%, thus if you enter 50% this SNP position will be removed.
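The allele frequency definition above can be sketched as follows. This is an illustration under stated assumptions: calls are diploid strings, `None` marks missing data, and the minor allele is taken to be the second most common allele at the site. The sample calls are hypothetical and chosen to give a 40% minor allele frequency; they are not the rows of the example table.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Frequency of the second most common allele among non-missing alleles."""
    counts = Counter()
    for call in calls:
        if call is not None:
            counts.update(call)  # count each allele character
    if len(counts) < 2:
        return 0.0  # monomorphic or entirely missing site
    total = sum(counts.values())
    minor_count = sorted(counts.values(), reverse=True)[1]
    return minor_count / total


# Hypothetical site: 2 x AA, 3 x GG, 1 missing -> A is 4 of 10 alleles.
calls = ["AA", "AA", "GG", "GG", "GG", None]
maf = minor_allele_frequency(calls)
print(maf)         # 0.4
print(maf >= 0.5)  # False: the site is removed at a 50% threshold
```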
Exclude SNPs with more than this number of missing genotypes over all individuals/germplasm. For example, if you enter 1 for this filter for the example data below, only SNP Chr4p48765 would be removed.
Exclude SNPs whose proportion of missing data is greater than this value. For example, if you enter 25% for this filter then for the example data below, only SNP Chr4p48765 would be removed since it has a missing data frequency of 2/6=33%.
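The two missing-data filters differ only in whether the threshold is a count or a proportion. The sketch below shows both checks on one SNP, assuming `None` marks a missing genotype; the function and parameter names and the sample calls are hypothetical.

```python
def passes_missing_filters(calls, max_missing_count=None,
                           max_missing_fraction=None):
    """Return True if the SNP survives the requested missing-data filters."""
    n_missing = sum(1 for call in calls if call is None)
    fraction = n_missing / len(calls)
    if max_missing_count is not None and n_missing > max_missing_count:
        return False
    if max_missing_fraction is not None and fraction > max_missing_fraction:
        return False
    return True


# Hypothetical SNP with 2 of 6 genotypes missing (about 33%).
calls = ["AA", None, "GG", "AG", None, "GG"]
print(passes_missing_filters(calls, max_missing_count=1))        # False
print(passes_missing_filters(calls, max_missing_fraction=0.25))  # False
print(passes_missing_filters(calls, max_missing_count=2))        # True
```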
Example Table: Example Data for Filter Explanation.
SNP Name | SNP Backbone | SNP Position | Germ1 | Germ2 | Germ3 | Germ4 | Germ5 | Germ6

* The above example is referred to in the description of each filter criterion to help explain how it will affect your data. NOTE: the cell for each SNP by germplasm combination contains the call and the read depth separated by a colon (:). For example, AA:5 means a call of AA with a read depth of 5.
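For readers who want to work with cells in this call:depth notation, a minimal parsing sketch follows. The function name is hypothetical, and the sketch does not cover missing data because the notation for a missing cell is not described here.

```python
def parse_cell(cell):
    """Split a 'CALL:DEPTH' cell such as 'AA:5' into (call, depth)."""
    call, _, depth = cell.partition(":")
    return call, int(depth)


print(parse_cell("AA:5"))  # ('AA', 5)
```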

Pick your Export format.
Select one of the formats listed below and the filtered VCF will be converted accordingly. Keep in mind that formats without quality information give you no way to assess call quality afterwards, so apply stringent filter criteria before exporting to them to ensure you are working with good data.
Format | Has Quality Info? | Description
A/B Format | No | Alleles are coded as A/B based on the parents. This format is only suitable for biparental crosses.
Quality Matrix | Yes | Variant by Germplasm matrix of Read Depth per call.
Variant Call Format (VCF) | Yes | A variant by germplasm matrix with each cell containing a combination of SNP call and quality information. See the Specification for more information.
Haplotype Map (Hapmap) | No | A Hapmap file is a tab-separated values (TSV) format for storing genotypic data. Hapmap format is easier to edit and handle but less informative than VCF format.
Bgzipped VCF | Yes | An archive containing a bgzipped VCF file and a Tabix file. This combination is required by various programs such as the R package VariantAnnotation. See the tabix manual for more information.
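To illustrate what coding alleles "based on the parents" can mean, here is one possible sketch. It assumes two homozygous parents that carry different alleles, labels the first parent's allele A and the second parent's allele B, and treats any other allele as missing. The function name, the `None` convention, and the handling of heterozygous calls are assumptions for this example and may not match the exporter's actual output.

```python
def to_ab(call, parent_a_allele, parent_b_allele):
    """Recode a genotype call relative to two homozygous parents.

    Returns 'AA', 'AB' or 'BB', or None if the call is missing or
    contains an allele carried by neither parent.
    """
    if call is None:
        return None
    mapping = {parent_a_allele: "A", parent_b_allele: "B"}
    try:
        return "".join(sorted(mapping[allele] for allele in call))
    except KeyError:
        return None  # allele not found in either parent


# Hypothetical cross: parent A carries G, parent B carries T.
print(to_ab("GG", "G", "T"))  # AA
print(to_ab("TG", "G", "T"))  # AB
print(to_ab("TT", "G", "T"))  # BB
print(to_ab("CC", "G", "T"))  # None
```

Because the coding depends on exactly two parental alleles, a population with more than two founders cannot be represented this way, which is consistent with the restriction to biparental crosses noted in the table.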
+ The Variant Call Format and VCFtools, Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert Handsaker, Gerton Lunter, Gabor Marth, Stephen T. Sherry, Gilean McVean, Richard Durbin and 1000 Genomes Project Analysis Group, Bioinformatics, 2011.