Using IMPUTE2 with Public Reference Data

IMPUTE2 can use customized reference panels (e.g., SNP genotypes from a fine-mapping study) as well as publicly available reference datasets. In the latter category, we currently recommend using a combination of reference haplotypes from the 1,000 Genomes Project and HapMap Phase 3. The 1,000 Genomes dataset provides wide coverage of the genome, in that it contains many more SNPs than the HapMap (with high enrichment for rare mutations), while HapMap 3 provides deep coverage, in that it contains a greater sampling of chromosomes from human populations. We have designed IMPUTE2 to integrate these wide and deep panels into a single analysis framework, as shown in this example.

To download the data needed to impute from a combined HapMap 3 + 1,000 Genomes reference panel, please click the appropriate link under the Download packages heading below:

HapMap 3 + 1,000 Genomes haplotypes (filtered) -- NCBI Build 36
--HapMap 3 files are from release #2 (Feb 2009)
--1,000 Genomes files are from Pilot 1 genotypes released Mar 2010; phased haplotypes released Jun 2010

Haplotype, legend, sample, and genetic map files
These downloads contain the data needed to impute genotypes using reference panels from HapMap 3 and the 1,000 Genomes Project. Each dataset includes the latest haplotypes from the 1,000 Genomes panel of interest, along with all available HapMap 3 haplotypes, except those present in the relevant 1,000 Genomes panel. We remove these duplicate haplotypes so that the two datasets can be combined without causing "double counting" of haplotypes during imputation. Both sets of haplotypes have also been filtered to remove SNPs with apparent quality issues. When using these combined panels, you should set the

To see an example command that combines HapMap 3 and 1,000 Genomes haplotypes in a single imputation analysis, go here. To see our rationale for using all HapMap 3 haplotypes together, rather than focusing on population-matched subsets, go here. To learn more about our scheme for filtering out low-quality SNPs, go here. If you prefer unfiltered 1,000 Genomes haplotypes, you can download them from here; similarly, you can download unfiltered HapMap 3 haplotypes from here.
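The removal of duplicate haplotypes described above can be illustrated with a small sketch. It assumes a layout in which each row of a haplotype file is one SNP and each individual contributes two consecutive 0/1 columns. The function name, the toy IDs, and the toy data are hypothetical, and this is not the procedure actually used to build the download packages.

```python
def drop_duplicate_haplotypes(hap_rows, sample_ids, exclude_ids):
    """Drop both haplotype columns of every individual in exclude_ids.

    hap_rows:    list of rows (one per SNP); each row is a list of 0/1
                 alleles with two consecutive columns per individual
                 (assumed layout).
    sample_ids:  individual IDs, in the same order as the column pairs.
    exclude_ids: IDs already present in the other reference panel.
    """
    exclude = set(exclude_ids)
    for row in hap_rows:
        if len(row) != 2 * len(sample_ids):
            raise ValueError("each row needs two columns per individual")

    keep_ids = [s for s in sample_ids if s not in exclude]
    # Column indices of the two haplotypes of every retained individual.
    keep_cols = [
        col
        for i, s in enumerate(sample_ids)
        if s not in exclude
        for col in (2 * i, 2 * i + 1)
    ]
    filtered = [[row[col] for col in keep_cols] for row in hap_rows]
    return filtered, keep_ids


# Toy example: three individuals, two SNPs; "ind2" is in both panels.
toy_rows = [
    [0, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
toy_ids = ["ind1", "ind2", "ind3"]
filtered_rows, kept_ids = drop_duplicate_haplotypes(toy_rows, toy_ids, ["ind2"])
```

Dropping the shared individuals from only one of the two panels keeps each chromosome represented once in the combined reference set.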
Download packages (warning: large files)
[CEU] [YRI] [CHB+JPT (coming soon)]

NOTE: When combining datasets in an imputation analysis, you should always take great care
to ensure that they have been aligned to the same strand convention. In this case, we have already
aligned the HapMap 3 and 1,000 Genomes data to the '+' strand of the human reference sequence, and
we have removed SNPs with unresolvable strand flips between panels. Consequently, you only need to
make sure that your own dataset is aligned to the '+' strand before imputing from
the combined panel.
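
As a rough illustration of the strand check described in the note above, the sketch below compares the two alleles of a study SNP with the alleles of the same SNP in a reference panel on the '+' strand. The function name and return labels are hypothetical and are not part of IMPUTE2; the sketch assumes single-base alleles (A, C, G, T) only.

```python
# Hypothetical helper, not part of IMPUTE2: classify a study SNP's strand
# relative to a reference panel that is aligned to the '+' strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def strand_status(study_alleles, ref_alleles):
    """Return 'ambiguous', 'match', 'flip', or 'mismatch' for one SNP.

    Both arguments are (allele0, allele1) tuples of single bases.
    """
    study = {a.upper() for a in study_alleles}
    ref = {a.upper() for a in ref_alleles}

    # A/T and C/G SNPs read the same on both strands, so the alleles
    # alone cannot tell whether the study data are flipped.
    if ref == {COMPLEMENT[a] for a in ref}:
        return "ambiguous"
    if study == ref:
        return "match"
    # Complementing the study alleles recovers the reference alleles,
    # which suggests the study SNP is reported on the '-' strand.
    if {COMPLEMENT[a] for a in study} == ref:
        return "flip"
    return "mismatch"
```

SNPs reported as 'flip' would need their alleles complemented before imputation, while 'ambiguous' and 'mismatch' SNPs need other evidence (for example, allele frequencies) or exclusion.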
While we prefer the reference panels linked above, we recognize that some people may want to download the original, unfiltered HapMap 3 and 1,000 Genomes datasets. These can be obtained below:

1,000 Genomes haplotypes (unfiltered) -- NCBI Build 36
--1,000 Genomes files are from Pilot 1 genotypes released Mar 2010; phased haplotypes released Jun 2010

Haplotype, legend, sample, and genetic map files
These downloads contain the data needed to impute genotypes using reference panels from the 1,000 Genomes Project. The files are unfiltered, in the sense that we have not modified them from the official release versions. These haplotypes are generally of high quality, but they may contain a small fraction of poorly genotyped SNPs. When using one of these panels, you should set the
Download packages (warning: large files)
[CEU] [YRI] [CHB+JPT (coming soon)]

HapMap 3 haplotypes (unfiltered) -- NCBI Build 36
--HapMap 3 files are from release #2 (Feb 2009)

Haplotype, legend, sample, and genetic map files
These downloads contain the data needed to impute genotypes using reference panels from HapMap Phase 3. The files are unfiltered, in the sense that we have only modified them minimally from the official release versions. These haplotypes are generally of high quality, but they may contain a small fraction of poorly genotyped SNPs. In HapMap 3, the most common problem is that an allele will "drop out" of the genotyping assay, thereby making every individual appear homozygous for the same allele. When using this combined panel, you should set the
Download packages (warning: large files)
[ALL PANELS]

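The allele drop-out problem described for the unfiltered HapMap 3 files makes a SNP appear monomorphic across the whole panel, so a simple screen is to look for SNPs at which every haplotype carries the same allele. The sketch below assumes haplotype lines made of whitespace-separated 0/1 alleles, one line per SNP; the function names and toy data are hypothetical, and this is not the filtering scheme used for the filtered packages.

```python
def parse_hap_lines(lines):
    """Turn whitespace-separated 0/1 haplotype lines into lists of ints."""
    return [[int(allele) for allele in line.split()] for line in lines if line.strip()]


def monomorphic_snps(hap_rows):
    """Return the row indices of SNPs where all haplotypes share one allele."""
    return [i for i, row in enumerate(hap_rows) if len(set(row)) == 1]


# Toy panel: four haplotypes at three SNPs (one line per SNP).
toy_lines = [
    "0 1 0 0",
    "1 1 1 1",
    "0 0 0 0",
]
toy_rows = parse_hap_lines(toy_lines)
candidates = monomorphic_snps(toy_rows)
```

A SNP flagged this way may also be genuinely monomorphic in the sampled populations, so the result is only a list of candidates for closer inspection.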
You can also download HapMap Phase 2 haplotypes in the format used by IMPUTE2; to access them, please click here.
We are continually working to distribute the most up-to-date and comprehensive reference datasets available. We will post them here in IMPUTE2 format as we process them.