IMPUTE

IMPUTE is a program for estimating ("imputing") unobserved genotypes in SNP association studies. The program is designed to work seamlessly with the output of the genotype calling program CHIAMO and the population genetic simulator HAPGEN, and it produces output that can be analyzed using the program SNPTEST. There are currently three different versions of the IMPUTE software available for download: version 0.5 implements the methodology described in Marchini et al. (2007); version 1 is essentially the same as version 0.5, with a couple of added features; and version 2 implements a major extension that was introduced in Howie et al. (2009). The situations in which each version of the program can be applied are discussed below.

Version 0.5
Version 1
Version 2
Registration and Updates
References
Contact Information



Version 0.5

IMPUTE v0.5 has now been superseded by IMPUTE v1, although we are keeping the website and software available for posterity. The description of IMPUTE v1 below applies equally to IMPUTE v0.5.

Read more about IMPUTE v0.5

Download IMPUTE v0.5


Version 1



IMPUTE v1 is designed to be used with a reference panel of known haplotypes, such as those provided by the International HapMap Project, and a study sample genotyped at a subset of the SNPs in the reference panel. IMPUTE v1 fills in missing genotypes (shown as red ?'s in the figure above) by extrapolating linkage disequilibrium patterns from the reference panel to the study individuals. This analysis scheme is referred to as Scenario A by Howie et al. (2009) and in the figure above.
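The Scenario A data layout can be illustrated with a small toy sketch. This is not the IMPUTE v1 algorithm, which uses the population genetic model of Marchini et al. (2007); the sketch below is a deliberately simplified stand-in that assumes each study individual carries two whole reference haplotypes (no recombination) observed with a small genotyping error rate. The function name impute_toy, the err parameter, and the example data are all hypothetical.

```python
from itertools import product


def impute_toy(ref_haps, study_geno, err=0.01):
    """Toy genotype imputation (NOT the IMPUTE model).

    ref_haps:   list of reference haplotypes, each a list of 0/1 alleles
                at every SNP in the panel.
    study_geno: one study individual's genotypes as allele counts (0, 1, 2),
                with None at SNPs that were not typed in the study.
    err:        assumed probability that an observed genotype disagrees
                with the pair of haplotypes being considered.

    Returns, for every SNP, [P(g=0), P(g=1), P(g=2)].
    """
    n_snps = len(study_geno)
    probs = [[0.0, 0.0, 0.0] for _ in range(n_snps)]
    total = 0.0

    # Consider every ordered pair of reference haplotypes as a candidate
    # explanation for the study individual.
    for h1, h2 in product(ref_haps, repeat=2):
        # Weight the pair by how well it matches the typed SNPs.
        weight = 1.0
        for a, b, g in zip(h1, h2, study_geno):
            if g is not None:
                weight *= (1.0 - err) if a + b == g else err
        total += weight

        # The pair "votes" for its own genotype at every SNP,
        # including the untyped ones.
        for j in range(n_snps):
            probs[j][h1[j] + h2[j]] += weight

    # Normalise the accumulated weights into probabilities.
    return [[p / total for p in row] for row in probs]


# Hypothetical reference panel: 4 haplotypes at 4 SNPs.
ref_haps = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
# Study individual typed at SNPs 1 and 3 only (the "?" entries are None).
study_geno = [0, None, 2, None]

probs = impute_toy(ref_haps, study_geno)
for j, row in enumerate(probs):
    print(j, [round(p, 4) for p in row])
```

In this toy panel the second SNP always carries the same allele as the first, and the fourth the same as the third, so the observed genotypes at the typed SNPs pull the untyped ones towards genotypes 0 and 2 respectively. A realistic method must also allow the study haplotypes to be mosaics of several reference haplotypes, which this sketch ignores.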

The basic method underlying IMPUTE v1 (which was described by Marchini et al. [2007]) has been widely used to improve power in genome-wide association studies, although until recently the software implementing this method was called IMPUTE v0.X.Y, where X is an integer between 1 and 5. We are still supporting and developing IMPUTE v1, and we expect it to be a useful tool for years to come. The method does have some limitations, however, and these led us to design a major revision of the modeling framework and software; this revised approach is implemented in IMPUTE v2, which is described below.

Read more about IMPUTE v1

Download IMPUTE v1


Version 2



IMPUTE v2 is based on the same population genetic model as IMPUTE v1, but IMPUTE v2 embeds this model in a more flexible statistical framework. This framework allows IMPUTE v2 to increase accuracy (by using more of the information in the data) and to handle a broader variety of imputation datasets.

One important kind of dataset to which IMPUTE v2 can be applied is depicted above. In this example, we still want to impute the missing genotypes in a set of study individuals (those in the bottom panel), but we now have two different reference panels that can inform the imputation: a set of known haplotypes (top panel), as in Scenario A, and a set of unphased genotypes (middle panel) observed at a subset of the SNPs in the haplotype panel. This kind of dataset, which is becoming increasingly common, is referred to as Scenario B by Howie et al. (2009).

IMPUTE v2 is uniquely suited to handle Scenario B in a unified, integrated analysis framework. It can also be applied to other kinds of imputation datasets that pose problems for IMPUTE v1.

Full details of IMPUTE v2, including computational considerations and accuracy comparisons with other imputation programs, are provided in Howie et al. (2009). We are still actively developing the method and would appreciate any feedback that you would care to provide.

Read more about IMPUTE v2

Download IMPUTE v2


Registration and Updates

To sign up for e-mail reminders about updates to all versions of IMPUTE, just fill out the registration form.


References

[1] J. Marchini, B. Howie, S. Myers, G. McVean and P. Donnelly (2007) A new multipoint method for genome-wide association studies via imputation of genotypes. Nature Genetics 39: 906-913

[2] B. N. Howie, P. Donnelly and J. Marchini (2009) A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genetics 5(6): e1000529


Contact Information

If you have any questions regarding the use of these programs, please send an e-mail to both of the following people:

Dr. Bryan Howie (howie <at> stats <dot> ox <dot> ac <dot> uk).
Dr. Jonathan Marchini (marchini <at> stats <dot> ox <dot> ac <dot> uk).

It is a good idea to include a copy of the screen output (which is printed to the ./summary file) with your e-mail to help us identify any problems.