|
These options control some basic processing that the program does to prepare input data for inference. |
Flag | Default | Description |
REQUIRED |
none |
Genomic interval to use for inference, as specified by <lower> and <upper> boundaries in base pair position. The boundaries can be expressed either in long form (e.g.,
IMPUTE2 requires that you specify an analysis interval in order to prevent accidental whole-chromosome analyses. If you want to impute a region larger than 7 Mb (which is not generally recommended), you must activate the |
|
250 kb |
Length of buffer region (in kb) to include on each side of the analysis interval specified by the -int option. SNPs in the buffer regions inform the inference but do not appear in output files (unless you activate the option that includes buffer SNPs in the output). Using a buffer region helps prevent imputation quality from deteriorating near the edges of the analysis interval. Larger buffers may improve accuracy for low-frequency variants (since such variants tend to reside on long haplotype backgrounds) at the cost of longer running times. |
|
Allows the analysis of regions larger than 7 Mb. If this flag is not activated and the analysis interval plus buffer region exceeds 7 Mb, the program will quit with an error. The rationale for this flag is described here. | |
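The size check described above can be estimated before launching a run. The following is a minimal sketch in Python, not IMPUTE2's own code. It assumes that the buffer is counted on both sides of the interval, that 1 kb = 1,000 bp and 7 Mb = 7,000,000 bp, and that the interval length is simply upper minus lower; the program's exact accounting is not stated here. The function names are hypothetical.

```python
MAX_SPAN_BP = 7_000_000  # the 7 Mb limit quoted above (assumed to mean 7,000,000 bp)


def padded_span_bp(lower, upper, buffer_kb=250):
    """Length in bp of the analysis interval plus a buffer on each side.

    Assumption: the buffer is added to both ends of the interval.
    """
    if upper < lower:
        raise ValueError("upper boundary must not be smaller than lower boundary")
    return (upper - lower) + 2 * buffer_kb * 1000


def exceeds_region_limit(lower, upper, buffer_kb=250):
    """True if interval plus buffers is larger than the 7 Mb limit."""
    return padded_span_bp(lower, upper, buffer_kb) > MAX_SPAN_BP


if __name__ == "__main__":
    # A 5 Mb interval with the default 250 kb buffer on each side.
    print(padded_span_bp(20_000_000, 25_000_000))
    print(exceeds_region_limit(20_000_000, 25_000_000))
```

Under these assumptions, a 5 Mb interval with the default buffer spans 5.5 Mb and stays under the limit, whereas a 6.6 Mb interval with the same buffer would not.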
|
Tells the program to include SNPs from the |
|
|
20000 |
"Effective size" of the population (commonly denoted as Ne in the population genetics literature) from which your dataset was sampled. This parameter scales the recombination rates that IMPUTE2 uses to guide its model of linkage disequilibrium patterns. When most imputation runs were conducted with reference panels from HapMap Phase 2, we suggested values of 11418 for imputation from HapMap CEU, 17469 for YRI, and 14269 for CHB+JPT.
Modern imputation analyses typically involve reference panels with greater ancestral diversity, which can make it hard to determine the "ideal" |
|
0.9 |
Threshold for calling genotypes in the -g file. For each individual at each SNP, the program will use the genotype with the maximum probability if that probability exceeds the threshold; otherwise, the genotype will be treated as missing.
NOTE: This threshold applies only to input genotypes. If you want to apply a calling threshold to IMPUTE2's output probabilities, you will have to do it yourself. However, it is usually not a good idea to treat imputation output this way; see the webpage of our association-testing software SNPTEST for better suggestions. |
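The calling rule described above can be illustrated with a short Python sketch. This is not IMPUTE2's implementation: the function name is hypothetical, the genotype probabilities are assumed to arrive as a sequence ordered by genotype index, and a strict "greater than" comparison is assumed because the text says the probability must exceed the threshold.

```python
def call_genotype(probs, threshold=0.9):
    """Return the index of the most probable genotype, or None if missing.

    probs: sequence of genotype probabilities for one individual at one SNP.
    Assumption: the call is kept only if the maximum probability is
    strictly greater than the threshold.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] > threshold else None


if __name__ == "__main__":
    print(call_genotype((0.95, 0.04, 0.01)))  # confident call for genotype 0
    print(call_genotype((0.50, 0.40, 0.10)))  # below threshold, treated as missing
```

A genotype whose best probability falls at or below the threshold is returned as None here, which stands in for "missing" in this sketch.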
|
# of indiv in |
Number of individuals from the -g file to include in the analysis. For example, to impute only the first five individuals, set |
|
Print detailed output about the progress of imputation. By default, IMPUTE2 prints only the number of the current MCMC iteration when performing imputation, but this flag tells it to print more detailed updates. |