Platform | File |
Linux (x86_64) Static Executable | We provide two versions built on two different Linux machines: snptest_v2.4.1_Linux_x86_64_static.tgz and snptest_v2.4.1_Linux_x86_64_static2.tgz |
Linux (x86_64) Dynamic Executable | snptest_v2.4.1_Linux_x86_64.tgz |
Linux (i686) | snptest_v2.4.1_Linux_i686_dynamic.tgz snptest_v2.4.1_Linux_i686_static.tgz |
Mac OS X 10.4-10.7.3 Intel | snptest_v2.4.1_MacOSX_Intel.tgz |
Solaris 5.10 (AMD Opterons) | snptest_v2.4.0_Solaris5.10_Opteron.tgz |
SLES 10 (Intel Itanium2) | snptest_v2.4.0_Linux_ia64.tgz |
Windows MS-DOS (Intel) | snptest_v2.4.0_Windows_Intel.tgz |
tar zxvf snptest_v2.4.1_Linux_x86_64.tgz |
./snptest -help |
./snptest \
-summary_stats_only \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out |
id |
SNP ID (taken from input files) |
rsid |
RS ID of the SNP (taken from
input files) |
chromosome |
A 2-letter chromosome
identifier (if SNPTEST
can determine it) or the value NA. See the section
on chromosomes. |
pos |
Base pair position of the SNP |
allele_A
allele_B |
The two alleles at the SNP.
allele_A is coded 0 and allele_B is coded 1. |
average_maximum_posterior_call |
The average maximum posterior probability across all individuals in the sample that are used for the test at each SNP. This is a measure of how much uncertainty there is at each SNP. Samples excluded will be (a) those excluded using the -exclude_samples option, (b) samples with a missing phenotype or covariate relevant to the test, (c) samples without genotypes if the -method threshold option is used, (d) samples where the sum of the genotype probabilities is less than the value set by the option -total_prob_limit (default 0.1). |
info |
A measure of the observed statistical information for the estimate of allele frequency of the SNP using all individuals in the sample that are used for the test at each SNP. This measure has a maximum value of 1, which indicates perfect information. Samples excluded will be (a) those excluded using the -exclude_samples option, (b) samples with a missing phenotype or covariate relevant to the test, (c) samples without genotypes if the -method threshold option is used, (d) samples where the sum of the genotype probabilities is less than the value set by the option -total_prob_limit (default 0.1). |
cohort_1_AA cohort_1_AB cohort_1_BB cohort_1_NULL | Counts of AA, AB, BB and NULL genotypes in the 1st cohort. See Note below which details exactly how genotype counts are calculated in SNPTEST v2. |
cohort_2_AA cohort_2_AB cohort_2_BB cohort_2_NULL | Counts of AA, AB, BB and NULL genotypes for the 2nd cohort (see details above). Subsequent cohorts will be included in a similar way. See Note below which details exactly how genotype counts are calculated in SNPTEST v2. |
all_AA all_AB all_BB all_NULL | Counts of AA, AB, BB and NULL thresholded genotypes across all cohorts. See Note below which details exactly how genotype counts are calculated in SNPTEST v2. |
all_maf |
Minor allele frequency (MAF) combined across all cohorts. |
missing_data_proportion |
The proportion of missing data
across all cohorts. |
controls_AA controls_AB controls_BB controls_NULL | Counts of AA, AB, BB and NULL genotypes across all control cohorts. See Note above which details exactly how genotype counts are calculated in SNPTEST v2. |
cases_AA cases_AB cases_BB cases_NULL | Counts of AA, AB, BB and NULL genotypes across all case cohorts. See Note above which details exactly how genotype counts are calculated in SNPTEST v2. |
cases_maf
controls_maf |
Minor allele frequencies (MAF)
in the controls and cases across all cohorts. |
het_OR het_OR_lower het_OR_upper | Estimated odds ratios and lower
and upper 95% confidence limits for the heterozygote
genotype AB versus the (baseline) AA genotype. |
hom_OR hom_OR_lower hom_OR_upper | Estimated odds ratios and lower
and upper 95% confidence limits for the homozygote genotype
BB versus the (baseline) AA genotype. |
all_OR all_OR_lower all_OR_upper | Estimated allelic odds ratios and lower and upper 95% confidence limits for the B allele versus the (baseline) A allele. |
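The summary columns described above can be illustrated with a small Python sketch. It takes per-sample genotype probability triples for one SNP and computes an average maximum posterior call, an allele-frequency information measure and a minor allele frequency. The function name `snp_summary`, the renormalisation step and the particular information formula (one minus the ratio of missing to complete information for a binomial allele frequency) are assumptions made for illustration; the text above does not state the exact formulas SNPTEST uses.

```python
def snp_summary(probs, total_prob_limit=0.1):
    """Summarise one SNP from per-sample (p_AA, p_AB, p_BB) triples.

    Hypothetical helper; total_prob_limit must be positive.
    """
    # Drop samples whose genotype probabilities sum to less than the limit.
    used = [p for p in probs if sum(p) >= total_prob_limit]
    n = len(used)
    if n == 0:
        return None
    avg_max = sum(max(p) for p in used) / n
    # Assumption: renormalise each triple to sum to 1 before taking dosages.
    norm = [tuple(x / sum(p) for x in p) for p in used]
    dosage = [p[1] + 2 * p[2] for p in norm]   # E[number of B alleles]
    second = [p[1] + 4 * p[2] for p in norm]   # E[(number of B alleles)^2]
    theta = sum(dosage) / (2 * n)              # estimated B allele frequency
    if 0 < theta < 1:
        missing = sum(f - e * e for e, f in zip(dosage, second))
        info = 1 - missing / (2 * n * theta * (1 - theta))
    else:
        info = 1.0  # monomorphic SNP: convention chosen for this sketch
    return {
        "n_used": n,
        "average_maximum_posterior_call": avg_max,
        "info": info,
        "maf": min(theta, 1 - theta),
    }


if __name__ == "__main__":
    certain = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
    print(snp_summary(certain))
```

With fully certain genotypes the information measure in this sketch equals 1, and it falls as the per-sample genotype variance grows.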
-pheno <name> |
This specifies which phenotype you wish to test. The <name> should match one of the phenotypes in the sample file. If the phenotype in the sample file is binary (B) then a case-control test is carried out. If the phenotype in the sample file is continuous (P) then a quantitative trait test (i.e. F-test for a linear model) is carried out. See FILE FORMAT WEBPAGE for more details about how to specify a phenotype in the sample file. If no phenotype is specified then the first phenotype in the sample file is used. |
-frequentist <t1>...<tn> |
This option controls the model you wish to test at each SNP versus a model of no association. The five different models are coded as 1=Additive, 2=Dominant, 3=Recessive, 4=General and 5=Heterozygote. When using this option the output file will have a column for each test that contains the p-value for the test as well as estimates of the model parameters (betas) and their standard errors. SNPTEST codes allele_A as 0 and allele_B as 1 and this defines the meaning of the betas and their standard errors. For example, when using the additive model the beta estimates the increase in log-odds that can be attributed to each copy of allele_B. When a model cannot be fitted to the data the p-value is set to -1. |
-quantile_normalise_phenotypes |
Quantile normalize the
phenotypes. This is done AFTER
samples have been excluded. |
-use_raw_phenotypes |
By default phenotypes are mean
centered and scaled to have variance 1. This feature can be
turned off with this option. |
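The two phenotype transformations above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the use of the sample (n-1) variance for scaling is an assumption, and the rank-to-quantile rule `(rank - 0.5) / n` with ties broken by input order is one common choice rather than a documented SNPTEST detail.

```python
from statistics import NormalDist, mean, stdev


def standardise(values):
    """Mean-centre and scale to variance 1 (sample variance assumed)."""
    m = mean(values)
    sd = stdev(values)
    return [(v - m) / sd for v in values]


def quantile_normalise(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    nd = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = nd.inv_cdf((rank - 0.5) / n)
    return out


if __name__ == "__main__":
    pheno = [3.0, 1.0, 2.0]
    print(standardise(pheno))
    print(quantile_normalise(pheno))
```

Applying the transformation only to the samples that remain after exclusions matches the ordering described for -quantile_normalise_phenotypes.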
-method threshold
|
Use thresholded genotypes. The
calling threshold is controlled by the flag -call_thresh. The default
calling threshold is 0.9. This is the same as the default
option in previous versions. |
-method expected |
Use expected genotype counts
(aka genotype dosages). |
-method score |
Use a missing data likelihood
score test. This is equivalent to the -proper option in previous
versions, except that if the score test experiences problems
at a SNP (usually due to a rare SNP and/or high uncertainty)
then -method em is used for this
SNP. |
-method ml |
Use multiple Newton-Raphson
iterations to estimate the parameters in the missing data
likelihood for the model. |
-method em |
Use an EM algorithm to estimate
the parameters in the missing data likelihood for the model. |
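The threshold and expected approaches listed above can be contrasted with a short Python sketch. The function names are hypothetical, and treating a probability equal to the threshold as a confident call (`>=`) is an assumption; the score, ml and em methods work on the missing data likelihood and are not shown.

```python
LABELS = ("AA", "AB", "BB")


def threshold_call(p, call_thresh=0.9):
    """Return the most probable genotype, or NULL if below the threshold."""
    best = max(range(3), key=lambda k: p[k])
    return LABELS[best] if p[best] >= call_thresh else "NULL"


def expected_dosage(p):
    """Expected count of allele_B after renormalising the probabilities."""
    total = sum(p)
    return (p[1] + 2 * p[2]) / total


if __name__ == "__main__":
    for probs in [(0.95, 0.05, 0.0), (0.5, 0.5, 0.0), (0.0, 0.5, 0.5)]:
        print(probs, threshold_call(probs), expected_dosage(probs))
```

The dosage keeps partial information from uncertain genotypes that thresholding would discard as NULL.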
-renorm |
The methods described above to deal with genotype uncertainty were developed for use with imputed SNPs. This implies that the genotype probabilities will sum to 1. If probabilistic genotype calls from an algorithm like CHIAMO are used then the probabilities might sum to less than one and any left over probability is the probability of a NULL call. The -renorm option renormalizes the genotype probabilities to sum to 1. The default is not to renormalize the probabilities unless the -method expected option is chosen, in which case it is automatically turned on. |
-total_prob_limit
<x>
|
There is an internal lower
limit set on the sum of genotype probabilities. The default
is 0.1. If this threshold is not met then that genotype is
not included in the test. This protects against SNPs with a
high proportion of NULL genotypes. |
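A minimal Python sketch of how -renorm and -total_prob_limit interact for a single genotype is given below. The function name `prepare_genotype` and the order of the two steps (limit check first, then renormalisation) are assumptions made for illustration.

```python
def prepare_genotype(p, renorm=False, total_prob_limit=0.1):
    """Apply the probability limit and optional renormalisation.

    p is a (p_AA, p_AB, p_BB) triple; 1 - sum(p) is the NULL probability.
    Returns None when the genotype is excluded from the test.
    """
    total = sum(p)
    if total < total_prob_limit:
        return None  # too much NULL probability: genotype not used
    if renorm:
        return tuple(x / total for x in p)
    return tuple(p)


if __name__ == "__main__":
    print(prepare_genotype((0.02, 0.03, 0.0)))
    print(prepare_genotype((0.4, 0.4, 0.0)))
    print(prepare_genotype((0.4, 0.4, 0.0), renorm=True))
```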
<phenotype_name(s)> | The name (or names if -mpheno is used) of the phenotypes used in the test. |
<test_type> | frequentist or bayesian |
<genetic_model> | add, dom, rec, gen or het |
<covariate_name(s)> | The name (or names) of the covariates being conditioned upon in the test |
<summary_measure> | One of pvalue, info, beta_X, se_X or log10_bf depending on the column |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-method score \
-frequentist 1 \
-pheno pheno1 |
-bayesian <t1>...<tn> |
This option controls the model you wish to test at each SNP versus a model of no association. The five different models are coded as 1=Additive, 2=Dominant, 3=Recessive, 4=General and 5=Heterozygote. When using this option the output file will have a column for each test that contains the log10 Bayes Factor for the test as well as posterior mean estimates of the model parameters (betas) and their standard errors. SNPTEST codes allele_A as 0 and allele_B as 1 and this defines the meaning of the betas and their standard errors. For example, when using the additive model the beta estimates the increase in log-odds that can be attributed to each copy of allele_B. A Bayes factor will always be calculated at a SNP. |
Model | Linear Predictor | Priors | Default | Coding | Command line option |
Additive | log(p_i/(1-p_i)) = µ + ßG_i | µ ~ N(a0, a1^2), ß ~ N(b0, b1^2) | a0=0, a1=1, b0=0, b1=0.2 | G_i is the additive coding of the SNP, i.e. AA -> 0, AB -> 1, BB -> 2. | -prior_add a0 a1 b0 b1 |
Dominant | log(p_i/(1-p_i)) = µ + ßD_i | µ ~ N(a0, a1^2), ß ~ N(b0, b1^2) | a0=0, a1=1, b0=0, b1=0.5 | D_i is the dominant coding of the SNP, i.e. AA -> 0, AB -> 1, BB -> 1. | -prior_dom a0 a1 b0 b1 |
Recessive | log(p_i/(1-p_i)) = µ + ßR_i | µ ~ N(a0, a1^2), ß ~ N(b0, b1^2) | a0=0, a1=1, b0=0, b1=0.5 | R_i is the recessive coding of the SNP, i.e. AA -> 0, AB -> 0, BB -> 1. | -prior_rec a0 a1 b0 b1 |
General | log(p_i/(1-p_i)) = µ + ßG_i + qH_i | µ ~ N(a0, a1^2), ß ~ N(b0, b1^2), q ~ N(c0, c1^2) | a0=0, a1=1, b0=0, b1=0.2, c0=0, c1=0.5 | G_i is the additive coding of the SNP, i.e. AA -> 0, AB -> 1, BB -> 2. H_i is the heterozygote coding of the SNP, i.e. AA -> 0, AB -> 1, BB -> 0. | -prior_gen a0 a1 b0 b1 c0 c1 |
Heterozygote | log(p_i/(1-p_i)) = µ + ßH_i | µ ~ N(a0, a1^2), ß ~ N(b0, b1^2) | a0=0, a1=1, b0=0, b1=0.5 | H_i is the heterozygote coding of the SNP, i.e. AA -> 0, AB -> 1, BB -> 0. | -prior_het a0 a1 b0 b1 |
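The genotype codings and linear predictors in the table above can be written out as a short Python sketch. The dictionary `CODINGS` and the function `case_probability` are hypothetical names used for illustration; the parameter names `mu`, `beta` and `q` follow the symbols in the table.

```python
import math

# Genotype codings taken from the table above.
CODINGS = {
    "add": {"AA": 0, "AB": 1, "BB": 2},
    "dom": {"AA": 0, "AB": 1, "BB": 1},
    "rec": {"AA": 0, "AB": 0, "BB": 1},
    "het": {"AA": 0, "AB": 1, "BB": 0},
}


def case_probability(genotype, model, mu, beta, q=0.0):
    """Invert the logit link: p = 1 / (1 + exp(-linear predictor))."""
    if model == "gen":
        # The general model combines additive and heterozygote codings.
        eta = (mu + beta * CODINGS["add"][genotype]
               + q * CODINGS["het"][genotype])
    else:
        eta = mu + beta * CODINGS[model][genotype]
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    for g in ("AA", "AB", "BB"):
        print(g, case_probability(g, "add", mu=0.0, beta=0.2))
```

Under the additive model each extra copy of allele_B multiplies the odds by exp(beta), which is the interpretation of beta given for the -frequentist and -bayesian options.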
-t_prior |
Specifies the use of t-distribution priors on the genetic effects. Effectively, this option modifies the priors described in the table above, i.e. the mean and variance of the t-distributions are specified by the options given in the table above, but the normal distribution is replaced by the t-distribution. NOTE: a t-distribution is only used for the genetic effects, i.e. the parameters ß and q in the models above. For example, -bayesian add -t_prior would specify the linear predictor log(p_i/(1-p_i)) = µ + ßG_i and the priors would be µ ~ N(a0, a1^2) and ß ~ t(b0, b1^2, df = 3). |
-t_df
<x> |
The degrees of freedom
parameter of the t-distribution.
The default value is 3. When this parameter is set very
large the prior converges to the normal distribution
prior. |
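The convergence of the t-distribution prior to the normal prior as the degrees of freedom grow can be checked numerically with the Python sketch below. It assumes that b0 and b1 act as the location and scale of the t-distribution; the function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def t_pdf(x, loc, scale, df=3.0):
    """Density of a location-scale t-distribution at x."""
    z = (x - loc) / scale
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi) - math.log(scale))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df))


if __name__ == "__main__":
    # Additive-model defaults from the table: b0 = 0, b1 = 0.2.
    for df in (3, 30, 1e6):
        print(df, t_pdf(1.0, 0.0, 0.2, df), normal_pdf(1.0, 0.0, 0.2))
```

With few degrees of freedom the t prior places much more mass on large effects than the normal prior with the same location and scale.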
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-bayesian 1 \
-method score \
-pheno bin1 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-bayesian 1 \
-method expected \
-pheno pheno1 \
-prior_qt_mean_b 0 \
-prior_qt_V_b 0.02 \
-prior_qt_a 3 \
-prior_qt_b 2 |
Model name | Model | Priors | Command line options needed |
Additive | y_i = ßG_i + e_i, e_i ~ N(0, σ^2) | ß ~ N(b0, Vß σ^2), σ^2 ~ IG(a, b) | -prior_qt_mean_b b0 -prior_qt_V_b Vß -prior_qt_a a -prior_qt_b b |
Dominant | y_i = ßD_i + e_i, e_i ~ N(0, σ^2) | ß ~ N(b0, Vß σ^2), σ^2 ~ IG(a, b) | -prior_qt_mean_b b0 -prior_qt_V_b Vß -prior_qt_a a -prior_qt_b b |
Recessive | y_i = ßR_i + e_i, e_i ~ N(0, σ^2) | ß ~ N(b0, Vß σ^2), σ^2 ~ IG(a, b) | -prior_qt_mean_b b0 -prior_qt_V_b Vß -prior_qt_a a -prior_qt_b b |
General | y_i = ßG_i + qH_i + e_i, e_i ~ N(0, σ^2) | ß ~ N(b0, Vß σ^2), q ~ N(b1, Vq σ^2), σ^2 ~ IG(a, b) | -prior_qt_mean_b b0 -prior_qt_V_b Vß -prior_qt_mean_q b1 -prior_qt_V_q Vq -prior_qt_a a -prior_qt_b b |
Heterozygote | y_i = ßH_i + e_i, e_i ~ N(0, σ^2) | ß ~ N(b0, Vß σ^2), σ^2 ~ IG(a, b) | -prior_qt_mean_b b0 -prior_qt_V_b Vß -prior_qt_a a -prior_qt_b b |
-mean_bf <w1>...<wn> |
Calculate a log10 Bayes factor for a weighted average over the models specified by -bayesian, with weights given by <w1>...<wn>. For example, -bayesian 1 4 -mean_bf 9 1 would calculate a Bayes factor for a weighted average of the additive and general models where the additive model is given weight 9 and the general model is given weight 1. The log10 Bayes factor will be written in a column with the label mean_bf. |
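The weighted average described for -mean_bf can be sketched in Python as below. The sketch assumes the weights are normalised to sum to 1 and that the average is taken on the Bayes factor scale (not the log10 scale) before converting back to log10; neither detail is stated above, and the function name is hypothetical.

```python
import math


def mean_log10_bf(log10_bfs, weights):
    """log10 of the weighted mean of Bayes factors given on the log10 scale."""
    total = sum(weights)
    # Factor out the largest value so that 10**x cannot overflow.
    top = max(log10_bfs)
    acc = sum(w / total * 10 ** (b - top)
              for b, w in zip(log10_bfs, weights))
    return top + math.log10(acc)


if __name__ == "__main__":
    # Additive model weight 9, general model weight 1, as in the example.
    print(mean_log10_bf([1.0, 0.0], [9, 1]))
```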
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-bayesian 1 \
-method expected \
-mpheno pheno1 pheno2 \
-prior_qt_mean_b 0 \
-prior_qt_V_b 0.02 \
-prior_mqt_c 6 \
-prior_mqt_Q 4 |
-cov_names
<name_1> ... <name_n> |
Condition upon the covariates
in the sample files with names name_1,...., name_n. |
-cov_all |
Condition upon all the
covariates in the sample files. |
-cov_all_discrete |
Condition upon all the discrete
covariates (D) in the sample files. |
-cov_all_continuous |
Condition upon all the
continuous covariates (C) in the sample files. |
-condition_on
<snp_1>
<model_1>
...
<snp_n> <model_n> |
Condition upon a list of SNPs
with IDs given by snp_1,...,snp_n. For each SNP a list of models can be supplied; the choices are add, dom, rec, het, or gen. Here "gen" is shorthand for "add het", i.e. condition on additive and heterozygote dosages. If no model is supplied, the default "add" is used. These covariates are internally added to the sample file as continuous (type C) covariates and appear in the covariate summary in the screen output. |
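The argument rules for -condition_on (one or more models per SNP, "gen" as shorthand for "add het", and "add" as the default) can be illustrated with a small Python parser. This is a sketch of the rules as described, not SNPTEST's own parser; the function name is hypothetical, and a SNP whose ID happens to equal a model name would be misread by it.

```python
MODELS = {"add", "dom", "rec", "het", "gen"}


def parse_condition_on(tokens):
    """Turn ['RSID_10', 'add', 'RSID_20', 'gen'] into (snp, models) pairs."""
    parsed = []
    for tok in tokens:
        if tok in MODELS and parsed:
            # "gen" expands to additive plus heterozygote dosages.
            expanded = ["add", "het"] if tok == "gen" else [tok]
            for m in expanded:
                if m not in parsed[-1][1]:
                    parsed[-1][1].append(m)
        else:
            parsed.append((tok, []))
    # A SNP given without any model is conditioned on additively.
    return [(snp, models or ["add"]) for snp, models in parsed]


if __name__ == "__main__":
    print(parse_condition_on(["RSID_10", "add", "RSID_20", "gen"]))
```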
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin2 \
-cov_names cov1 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-cov_names cov3 cov4 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-condition_on RSID_10 add RSID_20 gen |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-range 20000-30000 40000-50000 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-snpid RSID_4 SNPID_7 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-exclude_snps ./example/snps.list |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-exclude_samples ./example/samples.list |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2.gen ./example/cohort2.sample \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 \
-miss_thresh 0.01 |
./snptest \
-data ./example/cohort1.gen ./example/cohort1.sample ./example/cohort2_partial.gen ./example/cohort2.sample \
-overlap \
-o ./example/ex.out \
-frequentist 1 \
-method score \
-pheno bin1 |
-hwe |
This will produce an output file with columns that contain the p-values for an exact test of Hardy-Weinberg equilibrium (HWE) in each cohort. If a test for a binary phenotype is carried out then HWE p-values for all the case individuals and all the control individuals are also reported. |
-chunk
<x>
|
The program works by reading in, analyzing and writing output for chunks of the data at a time. This option is included to control the maximum amount of RAM used by the program at any one time. The default chunk size is 100 SNPs. |
-log
<filename> |
Copy all screen output to the
specified log file. |
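For the -hwe option above, one standard way to carry out an exact test of HWE from genotype counts is sketched below in Python. It enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of outcomes no more likely than the observed one. Whether SNPTEST uses exactly this definition of the p-value is an assumption; the function name is hypothetical.

```python
import math


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact HWE p-value conditional on the observed allele counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(h):
        # log P(h heterozygotes | n, n_a) under Hardy-Weinberg equilibrium
        aa = (n_a - h) // 2
        bb = (n_b - h) // 2
        return (math.lgamma(n + 1) - math.lgamma(aa + 1)
                - math.lgamma(h + 1) - math.lgamma(bb + 1)
                + h * math.log(2)
                + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Feasible heterozygote counts share the parity of n_a.
    logs = {h: log_prob(h) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = logs[n_ab]
    p = sum(math.exp(lp) for lp in logs.values() if lp <= observed + 1e-9)
    return min(1.0, p)


if __name__ == "__main__":
    print(hwe_exact_pvalue(25, 50, 25))
    print(hwe_exact_pvalue(50, 0, 50))
```

In a case-control analysis the same function would be applied separately to the case counts and the control counts.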
qctool \
-g example/cohort1.gen \
-og example/cohort1.bgen \
-force |
./snptest \
-data example/cohort1.bgen example/cohort1.sample \
-o example/ex.out \
-frequentist 1 \
-method score \
-pheno bin2 |
qctool \
-g example/cohort1_#.gen \
-og example/cohort1_#.bgen \
-force |
./snptest \
-data example/cohort1.vcf example/cohort1.sample \
-genotype_field GT \
-o example/ex.out \
-frequentist 1 \
-method score \
-pheno bin2 |
2.4.0 | 13.04.2012 | |
2.3.0 | 16.12.2011 | This release can be found here. |
2.2.0 | 07.12.2010 | This release can be found here. This is a substantial update on the previous version that implements a number of new features. |
2.1.1 | 01.04.2010 | Minor update. This release can be found here. |
2.1.0 | 19.03.2010 | This is a major change to SNPTEST from previous versions. Please read the following carefully. |
1.1.5 | 28.05.2008 | This release can be found here. |