========
 README
========

1000 Genomes reference data (official release; March 2010 genotypes; June 2010 haplotypes) for use with IMPUTE version 2 (IMPUTE2). This download includes the following files:

1. Haplotype files (pilot1.jun2010.b36.[panel].chr[chr].official.haps) -- These files are composed entirely of 0's and 1's. Each column is a haplotype, and each row is a SNP. The true alleles (A/C/G/T) underlying the 0/1 coding are shown in the corresponding legend files. Pass the haplotype file to the -h argument of IMPUTE2 on the command line.

2. Legend files (pilot1.jun2010.b36.[panel].chr[chr].official.legend) -- Each file has a header line followed by one row for each line in the corresponding haplotype file. The second column gives the chromosomal position of a SNP; IMPUTE2 uses this information to match SNPs across data panels. The third and fourth columns specify which of a SNP's alleles is coded '0' and which is coded '1' (respectively) in the haplotype file. Pass the legend file to the -l argument of IMPUTE2 on the command line.

3. Sample file (pilot1.jun2010.b36.[panel].samples) -- The entries in this file identify the samples from which the haplotypes in the .haps files were obtained. Each sample ID appears twice in succession, because two haplotypes were obtained from each diploid individual. The sample file is intended mainly for your information, since IMPUTE2 does not use it directly.

4. Genetic map files (genetic_map_chr[chr]_combined_b36.txt) -- Each chromosome has a genetic map file containing LD-based recombination rates in HapMap format. IMPUTE2 requires this file in order to fit its model. Pass the genetic map file to the -m argument of IMPUTE2 on the command line.
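
To illustrate how the haplotype and legend files fit together, the following Python sketch decodes the 0/1 coding back into alleles. It assumes whitespace-separated columns in both files and uses tiny made-up data (the IDs, positions, and alleles below are invented for the example, not taken from the release). The function names are hypothetical and are not part of IMPUTE2.

```python
import io


def read_legend(handle):
    """Return a list of (position, allele0, allele1), skipping the header line."""
    next(handle)  # the first line of a legend file is a header
    records = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        # Column 2 is the position; columns 3 and 4 are the alleles coded 0 and 1.
        records.append((int(fields[1]), fields[2], fields[3]))
    return records


def decode_haplotypes(haps_handle, legend):
    """Yield (position, alleles) per SNP; alleles[i] is the allele on haplotype i."""
    # Row k of the haplotype file corresponds to record k of the legend.
    # Note: zip stops at the shorter input, so a length mismatch is not detected here.
    for (position, allele0, allele1), line in zip(legend, haps_handle):
        codes = line.split()
        yield position, [allele1 if code == "1" else allele0 for code in codes]


# Invented toy data: 2 SNPs, 4 haplotypes (2 diploid individuals).
legend_text = "id position a0 a1\nsnp1 1000 A G\nsnp2 2000 C T\n"
haps_text = "0 1 1 0\n1 1 0 0\n"

legend = read_legend(io.StringIO(legend_text))
decoded = list(decode_haplotypes(io.StringIO(haps_text), legend))
for position, alleles in decoded:
    print(position, " ".join(alleles))
```

With real files, the io.StringIO objects would be replaced by open() calls on the .legend and .haps files for the same panel and chromosome.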

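The -h, -l and -m arguments described above can be combined into a single IMPUTE2 invocation. The Python sketch below only assembles the argument list from the file name patterns in this README; it does not run anything. The executable name "impute2", the panel name, the study genotype file, the interval, and the output name are placeholders chosen for the example. The -g, -int and -o options are IMPUTE2 options that this README does not itself describe, so check them against the IMPUTE2 documentation for your version.

```python
def reference_files(panel, chrom):
    """Build the reference file names for one panel and chromosome."""
    prefix = f"pilot1.jun2010.b36.{panel}.chr{chrom}.official"
    return {
        "haps": prefix + ".haps",
        "legend": prefix + ".legend",
        "map": f"genetic_map_chr{chrom}_combined_b36.txt",
    }


def impute2_args(panel, chrom, study_genotypes, interval, output,
                 executable="impute2"):
    """Return an argument list for IMPUTE2 (assumed executable name)."""
    files = reference_files(panel, chrom)
    start, end = interval
    return [
        executable,
        "-h", files["haps"],      # reference haplotypes
        "-l", files["legend"],    # legend for those haplotypes
        "-m", files["map"],       # genetic map for the chromosome
        "-g", study_genotypes,    # study genotypes (not part of this download)
        "-int", str(start), str(end),  # region to impute, in base pairs
        "-o", output,             # output file name
    ]


# Placeholder values for illustration only.
args = impute2_args("CEU", 22, "study.chr22.gen",
                    (20000000, 25000000), "study.chr22.imputed")
print(" ".join(args))
```

The list form can be passed directly to subprocess.run if you want to launch IMPUTE2 from Python rather than from a shell.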