## Population structure and cryptic relatedness in genetic association studies (2009)

Venue: | Statistical Science |

Citations: | 20 - 1 self |

### Citations

1819 | PLINK: A tool set for whole-genome association and population-based linkage analyses - Purcell, Neale, et al. |

1752 | Inference of population structure using multilocus genotype data. Genetics 155(2): 945–959 - Pritchard, Stephens, et al. - 2000 |

455 | Genomic control for association studies
- Devlin, Roeder
- 1999
Citation Context ...ationally fast method for reducing the inflation of test statistics caused by population structure or cryptic relatedness. It can be applied to data of any family structure or none. GC was developed (Devlin and Roeder, 1999) for the Armitage test statistic, which is asymptotically equivalent to a score statistic under logistic regression (Agresti, 2002) and, in the absence of confounding, has an asymptotic $\chi^2_1$ null dist... |

296 |
Categorical Data Analysis, 2nd ed
- Agresti
- 2002
Citation Context ...ber of heterozygote parents $n_a + n_A$, the test statistic $n_a$ has a Binomial($n_a + n_A$, 1/2) null distribution, but McNemar’s statistic $(n_a - n_A)^2 / (n_a + n_A)$ (3.3), which has an approximate $\chi^2_1$ null distribution (Agresti, 2002), is widely used instead. The TDT can be derived from the score test of a logistic regression model in which transmission is the outcome variable, and the parental genotypes are predictors (Dudbridge... |
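
The statistic quoted in this context can be worked through numerically. The following is a minimal sketch, assuming only what the snippet states: `n_a` and `n_A` count transmissions of each allele from heterozygous parents, the exact null is Binomial(n_a + n_A, 1/2), and McNemar's statistic is referred to a chi-square distribution with one degree of freedom. The function names `tdt_mcnemar` and `tdt_exact` are hypothetical, and the two-sided exact p-value (doubling the smaller tail) is one common convention rather than something the source specifies.

```python
from math import comb, erfc, sqrt


def tdt_mcnemar(n_a: int, n_A: int) -> tuple[float, float]:
    """Return McNemar's statistic (n_a - n_A)^2 / (n_a + n_A) and its chi-square(1) p-value."""
    total = n_a + n_A
    if total <= 0:
        raise ValueError("need at least one transmission from a heterozygous parent")
    stat = (n_a - n_A) ** 2 / total
    # Survival function of a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2.0))


def tdt_exact(n_a: int, n_A: int) -> float:
    """Two-sided exact p-value under the Binomial(n_a + n_A, 1/2) null.

    Assumption: the p-value is twice the smaller tail, capped at 1.
    """
    total = n_a + n_A
    lower = sum(comb(total, k) for k in range(0, n_a + 1)) / 2 ** total
    upper = sum(comb(total, k) for k in range(n_a, total + 1)) / 2 ** total
    return min(1.0, 2.0 * min(lower, upper))
```

For example, `tdt_mcnemar(60, 40)` gives a statistic of 4.0, and `tdt_exact(60, 40)` gives the corresponding exact binomial p-value for comparison with the chi-square approximation.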

238 | The correlation between relatives on the supposition of mendelian inheritance - Fisher - 1918 |

236 | Population structure and eigenanalysis - Patterson, Price, et al. |

201 |
A unified mixed-model method for association mapping that accounts for multiple levels of relatedness
- Yu
- 2006
Citation Context ...ciple positive, because of bias arising from estimation of the $p_l$, off-diagonal entries of (2.2) can be negative, a property that has caused some authors to shun such estimators of K (Milligan, 2003; Yu et al., 2006; Zhao et al., 2007). Rousset (2002) also criticized the model underlying (2.1) in the context of certain population genetics models, but did not propose an alternative estimator of genetic covariance... |

183 |
That BLUP is a good thing: The estimation of random effects
- Robinson
- 1991
Citation Context ... predict the phenotype under the null hypothesis, $\hat{y} = \hat{\alpha} + \hat{\delta}$, where $\hat{\delta}$ is the best linear unbiased predictor (BLUP) of $\delta$, which is equivalent to the empirical Bayes estimate for $\delta$ with prior (3.11) (Robinson, 1991). This prediction only needs to be made once for the whole data set. The next step is to use the residuals from the prediction as the outcome in a linear regression, $y - \hat{y} = \mathbf{1}\mu + x\beta + \varepsilon$ (3.12), and te... |

177 | Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM) - Spielman, McGinnis, et al. - 1993 |

161 |
Logistic disease incidence models and case-control studies
- Prentice, Pyke
- 1979
Citation Context ...his is a prospective model, treating case-control status as the outcome, but inferences about β are typically the same as for the retrospective model, which is more appropriate for case-control data (Prentice and Pyke, 1979; Seaman and Richardson, 2004). However, in some settings ascertainment effects are not correctly modeled prospectively, and it is necessary to consider retrospective models of the type $g(E[x_i]) = \alpha +$ ... |

142 | Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics - McCarthy, Abecasis, et al. - 2008 |

103 |
Linkage disequilibrium in humans: models and data
- Pritchard, Przeworski
- 2001
Citation Context ...Figure 3). LD is a double-edged sword: the stronger the LD around a causal variant, the easier it is to detect, because the greater the probability it is in high LD with at least one genotyped marker (Pritchard and Przeworski, 2001). However, in a region of high LD it is hard to fine-map a causal variant because there will be multiple highly-correlated markers each showing a similar strength of association with the phenotype. 1.... |

79 |
Estimators for pairwise relatedness and individual inbreeding coefficients
- Ritland
- 1996
Citation Context ...led study of MLEs under the Jacquard model. These MLEs can be prone to bias when the number of markers is small and can be computationally intensive to obtain particularly from genome-wide data sets (Ritland, 1996; Milligan, 2003). Method of moments estimators (MMEs) are typically less precise than MLEs, but are computationally efficient and can be unbiased if the ancestral allele fractions are known (Milligan... |

72 |
Interpreting principal component analyses of spatial population genetic variation
- Novembre, Stephens
- 2008
Citation Context ... protecting association test statistics from inflation. As for any other regression covariate, there is an argument for only including a PC in the model if it shows an association with the phenotype (Novembre and Stephens, 2008; Lee, Wright and Zou, 2010). Experience seems to suggest that between 2 and 15 PCs are typically sufficient, and in large studies for which $n$ may be several thousand, these will correspond to a small ... |

70 | The power of genomic control - Bacanu, Devlin, et al. |

70 |
An Arabidopsis example of association mapping in structured samples
- Zhao, Aranzana, et al.
- 2007
Citation Context ...ecause of bias arising from estimation of the $p_l$, off-diagonal entries of (2.2) can be negative, a property that has caused some authors to shun such estimators of K (Milligan, 2003; Yu et al., 2006; Zhao et al., 2007). Rousset (2002) also criticized the model underlying (2.1) in the context of certain population genetics models, but did not propose an alternative estimator of genetic covariance in actual populati... |

68 | Use of unlinked genetic markers to detect population stratification in association studies - Pritchard, Rosenberg |

56 | Efficient control of population structure in model organism association mapping. Genetics 178 - Kang, Zaitlen, et al. - 2008 |

56 | Correlation between genetic and geographic structure in Europe - Lao, Lu, et al. - 2008 |

52 | Population stratification and spurious allelic associations - Cardon, Palmer, et al. - 2003 |

52 |
Case-control studies of association in structured or admixed populations
- Pritchard, Donnelly
- 2001
Citation Context ...opulation structure, and assume that the ancestry of each individual is drawn from one or more of the “islands.” Popular software packages include ADMIXMAP (Hoggart et al., 2003) and STRUCTURE/STRAT (Pritchard and Donnelly, 2001; Falush, Stephens and Pritchard, 2003). These approaches model variation in ancestral subpopulation along a chromosome as a Markov process. Stratified tests for association (Clayton, 2007), such as t... |

50 | A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity - Balding, Nichols - 1995 |

41 |
Accurate inference of relationships in Sib-pair linkage studies
- Boehnke, Cox
- 1997
Citation Context ... about kinship inherent from the lengths of genomic regions shared between two individuals from a recent common ancestor (Browning, 2008). Hidden Markov models provide one approach to account for LD (Boehnke and Cox, 1997; Epstein, Duren and Boehnke, 2000). In outbred populations, the IBD status along a pair of chromosomes, one taken from each of a pair of individuals in a sibling, half sib or parent-child relationshi... |

35 | Assessing the impact of population stratification on genetic association studies - Freedman, Reich, et al. - 2004 |

34 |
Demonstrating stratification in a European American population
- Campbell, Ogburn, et al.
- 2005
Citation Context ...d) African Americans and estimated a similar reduction in the smallest p-values. Another study of European-Americans found a SNP in the lactase gene significantly associated with variation in height (Campbell et al., 2005). When the subjects were stratified according to North/West or South/East European ancestry, the association disappeared. Since we expect connections among lactase tolerance, diet and height, the ass... |

32 | Data and theory point to mainly additive genetic variance for complex traits - Hill, Goddard, et al. - 2008 |

32 | An Analysis of
- Baldwin, Steagall
- 1994
Citation Context ...ver, retrospective ascertainment of individuals on the basis of phenotype, as in case-control study designs, is more common in human genetics, and we will focus on such designs here. Linkage studies (Thompson, 2007) form the other major class of study designs in genetic epidemiology. These seek loci at which there is correlation between the phenotype of interest and the pattern of transmission of DNA sequence o... |

29 | An Icelandic example of the impact of population structure on association studies - Helgason, Yngvadóttir, et al. - 2005 |

29 |
Control of confounding of genetic associations in stratified populations
- Hoggart
- 2003
Citation Context ... methods are based on the island model of population structure, and assume that the ancestry of each individual is drawn from one or more of the “islands.” Popular software packages include ADMIXMAP (Hoggart et al., 2003) and STRUCTURE/STRAT (Pritchard and Donnelly, 2001; Falush, Stephens and Pritchard, 2003). These approaches model variation in ancestral subpopulation along a chromosome as a Markov process. Stratifi... |

27 | Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis - Aulchenko, Koning, et al. - 2007 |

27 | Novel case-control test in a founder population identifies P-selectin as an atopy-susceptibility locus - Bourgain, Hoffjan, et al. - 2003 |

24 |
Going the distance: human population genetics in a clinal world. Trends in Genetics 23
- Handley, Manica, et al.
- 2007
Citation Context ...on often occurs in waves and is influenced by geographic and cultural factors. Such processes are expected to lead to clinal patterns of genetic variation rather than a partition into subpopulations (Handley et al., 2007). Modern humans are known to have evolved in Africa with the first wave of human migration from Africa estimated to have been approximately 60,000 years ago. Reflecting this history, current human ge... |

23 |
A geographically explicit genetic model of worldwide human-settlement history
- Liu, Prugnolle, et al.
- 2006
Citation Context ... human migration from Africa estimated to have been approximately 60,000 years ago. Reflecting this history, current human genetic diversity decreases roughly linearly with distance from East Africa (Liu et al., 2006). Within Europe, Lao et al. (2008) found that the first two principal components of genome-wide genetic variation accurately reflect latitude and longitude: there is population structure at a Europe-... |

22 | Improved inference of relationship for pairs of individuals - Epstein, Duren, Boehnke - 2000 |

22 |
Inbreeding and relatedness coefficients: what do they measure? Heredity 88: 371–380
- Rousset
- 2002
Citation Context ...tral allele fractions are known (Milligan, 2003). Under many population genetics models, if two alleles are not IBD, then they are regarded as random draws from some mutation operator or allele pool (Rousset, 2002), which corresponds to the notion of “unrelated.” The kinship coefficient $K_{ij}$ is then a correlation coefficient for variables indicating whether alleles drawn from each of i and j are some given alle... |

21 | The estimation of pairwise relationships - Thompson - 1975 |

20 | On a semiparametric test to detect associations between quantitative traits and candidate genes using unrelated individuals - Zhang - 2003 |

19 | Methods for detection of parent-of-origin effects in genetic studies of case-parents triads - Weinberg - 1999 |

18 | Confounding from cryptic relatedness in case-control association studies. PLoS Genet 1: e32
- Voight, Pritchard
- 2005
Citation Context ...(1999) argued that cryptic relatedness could pose a more serious confounding problem than population structure. A subsequent theoretical investigation of plausible demographic and sampling scenarios (Voight and Pritchard, 2005) showed that the effect of cryptic relatedness in well-designed studies of outbred populations should be negligible, but it can be noticeable for small and isolated populations. Using pedigree and em... |

13 | Estimation of the inbreeding coefficient through use of genomic data - Leutenegger, Prum, et al. - 2003 |

12 | Some methods of estimating the inbreeding coefficient - Li, Horvitz - 1953 |

11 |
Likelihood-based inference for genetic correlation coefficients
- Balding
- 2003
Citation Context ...ncestral population allele fraction p has subpopulation allele fractions that are independent draws from Beta$\left(\frac{1-F}{F}\,p,\ \frac{1-F}{F}\,(1-p)\right)$, where F is Wright’s $F_{ST}$, a measure of population divergence (Balding, 2003). In order to discriminate among the methods, we simulated a high level of population structure, F = 0.1, which is close to between-continent levels of human differentiation; this is larger than is t... |
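
The Beta model quoted in this context can be simulated directly. The following is a minimal sketch, assuming only the parameterization given in the snippet: subpopulation allele fractions are independent draws from a Beta distribution with parameters (1 − F)/F · p and (1 − F)/F · (1 − p). The function name `subpop_allele_fractions` and its arguments are hypothetical, and the sketch does not reproduce the authors' actual simulation.

```python
import random


def subpop_allele_fractions(p: float, F: float, n_subpops: int,
                            rng: random.Random) -> list[float]:
    """Draw subpopulation allele fractions for ancestral fraction p and divergence F."""
    if not (0.0 < p < 1.0 and 0.0 < F < 1.0):
        raise ValueError("p and F must lie strictly between 0 and 1")
    scale = (1.0 - F) / F
    # Beta(scale * p, scale * (1 - p)) has mean p and variance F * p * (1 - p).
    return [rng.betavariate(scale * p, scale * (1.0 - p)) for _ in range(n_subpops)]
```

With the value F = 0.1 mentioned in the snippet and, say, p = 0.3, the draws centre on 0.3 with variance F · p · (1 − p) = 0.021, which follows from the standard mean and variance of a Beta distribution.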

11 |
Equivalence of prospective and retrospective models in the Bayesian analysis of case-control studies
- Seaman, Richardson
- 2004
Citation Context ...l, treating case-control status as the outcome, but inferences about β are typically the same as for the retrospective model, which is more appropriate for case-control data (Prentice and Pyke, 1979; Seaman and Richardson, 2004). However, in some settings ascertainment effects are not correctly modeled prospectively, and it is necessary to consider retrospective models of the type $g(E[x_i]) = \alpha + y_i\beta$ (3.2), where g is typicall... |

11 | Genetic relatedness analysis: modern data and new challenges. Nature Reviews Genetics 7: 771–780 - Weir, Anderson, et al. - 2006 |

10 |
A calculus for statistico genetics
- Cotterman
- 1940
Citation Context ...ed, then just eight identity coefficients (Jacquard, 1970) are required (Figure 4). An assumption of no within-individual IBD (no inbreeding) allows these eight coefficients to be collapsed into two (Cotterman, 1940), specifying probabilities for the two individuals to share exactly one and two alleles IBD. Fig. 4. Schematic illustration of the nine relatedness classes for two individuals,... |

8 | Gm3;5,13,14 and type 2 diabetes mellitus: an association in American Indians with genetic admixture - Knowler - 1988 |

7 |
Population association
- Clayton
- 2001
Citation Context ...Current Address: Institute of Genetics, University College London, 5 Gower Place, London, WC1E 6BT, UK. 1. CONFOUNDING IN GENETIC EPIDEMIOLOGY 1.1 Association and Linkage Genetic association studies (Clayton, 2007) are designed to identify genetic loci at which the allelic state is correlated with a phenotype of interest. The associations of interest are causal, arising at loci whose different alleles have dif... |

6 | A critical evaluation of genomic control methods for genetic association studies - Dadd, Weale, et al. - 2009 |

6 | Control of population stratification by correlation-selected principal components - Lee, Wright, et al. - 2011 |

5 | A kinship-based modification of the armitage trend test to address hidden population structure and small differential genotyping errors - Rakovski, Stram - 2009 |

4 |
Estimation of pairwise identity by descent from dense genetic marker data in a population sample of haplotypes
- Browning
- 2008
Citation Context ...do not take account of LD between markers, nor do they exploit the information about kinship inherent from the lengths of genomic regions shared between two individuals from a recent common ancestor (Browning, 2008). Hidden Markov models provide one approach to account for LD (Boehnke and Cox, 1997; Epstein, Duren and Boehnke, 2000). In outbred populations, the IBD status along a pair of chromosomes, one taken ... |

4 | Genomic control to the extreme. Nat Genet 36:1129–1130 - Devlin, Bacanu, et al. - 2004 |

4 |
Linkage disequilibrium, recombination and selection, in Handbook of statistical genetics, 3rd Edn. eds D
- McVean
- 2007
Citation Context ...librium In a large, panmictic population, and in the absence of selection, pairs of genetic loci that are not tightly linked (close together on a chromosome) are unassociated at the population level (McVean, 2007). Fig. 3. Illustration of the role of linkage disequilibrium in generating phenotypic association with a noncausal genotyped marker due to a tightly-linked ungenotyped causal locus. Such linkage equi... |

4 |
The age of alleles
- Slatkin
- 2002
Citation Context ...t is typically more precise than (2.3). Sharing a rare allele suggests closer kinship than sharing a common allele, because the rare allele is likely to have arisen from a more recent mutation event (Slatkin, 2002). To illustrate the increased precision of (2.2) over (2.3), we simulated 500 genetic data sets comprising 200 idealized cousin pairs (no mutation, and the alleles not IBD from the common grandparent... |

3 |
Structures génétiques des populations
- Jacquard
- 1970
Citation Context ...dividuals requires 15 IBD probabilities, one for each nonempty subset of four alleles, but if we regard the pair of alleles within each individual as unordered, then just eight identity coefficients (Jacquard, 1970) are required (Figure 4). An assumption of no within-individual IBD (no inbreeding) allows these eight coefficients to be collapsed into two (Cotterman, 1940), specifying probabilities for the two in... |

3 | Evaluating bias due to population stratification in case-control association studies of admixed populations - Wang, Localio, et al. - 2004 |

3 | Bias correction with a single null marker for population stratification in candidate gene association studies - Wang, Localio, et al. - 2005 |

3 | Robust genomic control for association studies - Zheng, Freidlin, et al. |

2 |
Family-based association
- Dudbridge
- 2007
Citation Context ...sti, 2002), is widely used instead. The TDT can be derived from the score test of a logistic regression model in which transmission is the outcome variable, and the parental genotypes are predictors (Dudbridge, 2007). In Section 3.3 we outline a test which can exploit between-family as well as within-family information when it is available, while retaining protection from population structure. Tiwari et al. (200... |

2 | Effect of population stratification on case-control association studies. ii. False-positive rates and their limiting behavior as number of subpopulations increases - Gorroochurn, Hodge, et al. - 2004 |

2 | The Mathematics of - Malécot - 1969 |

2 |
Whole genome association
- Morris, Cardon
- 2007
Citation Context ...ossible genome-wide association studies (GWAS) which investigate most of the common genetic variation in a population, and obtain orders of magnitude finer resolution than a comparable linkage study (Morris and Cardon, 2007; Altshuler, Daly and Lander, 2008). GWAS are preferred for detecting common causal variants (say, population fraction > 0.05), which typically have only a weak effect on phenotype, whereas linkage st... |

2 | Y chromosome evidence for Anglo-Saxon mass migration - Thomas - 2002 |

1 |
Inferences from mixed models in quantitative genetics
- Gianola
- 2007
Citation Context ...ariance of a quantitative trait into independent genetic and environmental components, and derivation of the genetic correlation of trait values of a pair of relatives assuming Mendelian inheritance (Gianola, 2007). It is conventional to introduce the 2 in (3.11) because 2K reduces to I in the limiting case of completely unrelated and completely outbred individuals, in which case $h^2$ becomes inestimable. The mo... |

1 | Mapping quantitative trait loci in outbred pedigrees - Höschele - 2007 |

1 | Reply to “Genomic control to the extreme.” Nat. Genet. 36 1129–1130; author reply 1131 - Marchini, Cardon, et al. - 2004 |

1 |
Population admixture and stratification in genetic epidemiology
- McKeigue
- 2007
Citation Context ... ADMIXMAP was primarily designed for admixture mapping, in which the genomes of admixed individuals are scanned for loci at which cases show an excess of ancestry from one of the founder populations (McKeigue, 2007). Because of the limited number of generations since the admixture event, this approach has features in common with linkage as well as association study designs. 3.5 Regression Control Wang, Localio ... |

1 | NHGRI GWAS Catalog (2009). A catalog of published genome-wide association studies. Available at http://www.genome.gov/gwastudies |