TITLE:
Assessing Likelihood of Having False Positives Caused by Population Stratification
AUTHORS:
Renfang Jiang, Jianping Dong
KEYWORDS:
Population Stratification, Case-Control Studies, Linkage Disequilibrium, Genome-Wide Association Studies
JOURNAL NAME:
Open Journal of Genetics, Vol.9 No.1, March 28, 2019
ABSTRACT:
Population stratification is always a concern in association
analysis. There is a debate on the extent of the problem in less extreme
situations (Thomas and Witte [1], Wacholder et al. [2]). Wacholder et al. [3] and Ardlie et al. [4] showed that hidden population structure is not a serious threat to case-control designs. We propose a method for assessing the seriousness of
population stratification before designing an association study. If population stratification
is not a serious problem, one may consider using a case-control study instead of
a family-based design to gain power. In a case-control design, we compare
chi-square statistics from a structured population (a union of two
subpopulations) and a homogeneous population with the same prevalence and
allele frequencies. We provide an explicit formula to calculate the chi-square
statistics from 17 parameters, such as the subpopulation proportions and the allele
frequencies within each subpopulation. We choose these factors because they have the
potential to cause false associations. Each parameter takes a random value in a
chosen range. We then calculate the likelihood of getting opposite conclusions
in the structured and the homogeneous populations. This is the likelihood of
having false positives caused by population stratification. The advantage of
this method is that it provides a cost-effective way to choose between
case-control data and family data before
actually collecting those data. We conclude
that sample size has a significant effect on the likelihood of false positives
caused by population stratification. The larger the sample size, the more
likely a false positive becomes if the population structure is ignored. If
budget constraints keep the sample size below 200, then a case-control
study may be the better choice because of its power.
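
The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' 17-parameter formula, which the abstract does not give. It is a reduced version under stated assumptions: only five parameters (subpopulation proportion, two allele frequencies, two prevalences), no effect of the allele on disease within either subpopulation, equal numbers of cases and controls, an allelic chi-square test with 1 degree of freedom at the 5% level, and uniform parameter ranges chosen purely for illustration. All function names are hypothetical.

```python
import random

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_DF1 = 3.841


def allele_freqs_in_cases_and_controls(w, p1, p2, k1, k2):
    """Expected allele frequency among cases and among controls when two
    subpopulations are pooled (assumption: the allele has no effect on
    disease within either subpopulation).

    w      : proportion of subpopulation 1 in the pooled population
    p1, p2 : allele frequencies in subpopulations 1 and 2
    k1, k2 : disease prevalences in subpopulations 1 and 2
    """
    case_share1 = w * k1 / (w * k1 + (1 - w) * k2)
    ctrl_share1 = w * (1 - k1) / (w * (1 - k1) + (1 - w) * (1 - k2))
    p_case = case_share1 * p1 + (1 - case_share1) * p2
    p_ctrl = ctrl_share1 * p1 + (1 - ctrl_share1) * p2
    return p_case, p_ctrl


def allelic_chi2(p_case, p_ctrl, n):
    """Chi-square of the 2x2 allele-count table built from expected counts,
    with n cases and n controls (2n alleles in each group)."""
    s = p_case + p_ctrl
    if s <= 0 or s >= 2:
        return 0.0
    return 4 * n * (p_case - p_ctrl) ** 2 / (s * (2 - s))


def false_positive_likelihood(n, draws=20000, seed=0):
    """Fraction of random parameter draws for which the structured
    population gives a significant chi-square. Under the assumptions above,
    a homogeneous population with the same overall prevalence and allele
    frequency has equal allele frequencies in cases and controls, so its
    expected chi-square is 0 and the two conclusions disagree."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        w = rng.uniform(0.1, 0.9)      # illustrative range (assumption)
        p1 = rng.uniform(0.05, 0.95)   # illustrative range (assumption)
        p2 = rng.uniform(0.05, 0.95)
        k1 = rng.uniform(0.01, 0.20)   # illustrative range (assumption)
        k2 = rng.uniform(0.01, 0.20)
        p_case, p_ctrl = allele_freqs_in_cases_and_controls(w, p1, p2, k1, k2)
        if allelic_chi2(p_case, p_ctrl, n) > CHI2_CRIT_DF1:
            hits += 1
    return hits / draws


if __name__ == "__main__":
    for n in (100, 200, 1000):
        print(n, false_positive_likelihood(n))
```

Because the chi-square statistic in this sketch scales linearly with the number of cases and controls, the estimated fraction cannot decrease as n grows for a fixed set of parameter draws, which is in line with the abstract's conclusion that larger samples are more exposed to false positives when structure is ignored.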