TFE 2009-2010 (final year project)

Power of genome-wide association studies

Most common human diseases with a genetic component are likely to have complex etiologies. Both family-based and population-based methods have become popular for association analysis, not least because of ever-decreasing genotyping costs.

An important concept in study design and in the interpretation of analysis results is power. This concept is closely related to the concepts of type I and type II error: power is the probability of rejecting the null hypothesis when it is false (one minus the type II error rate), evaluated at a given type I error rate. In the context of genetic association studies, it is the probability that the test statistic indicates that the observed marker loci are near an (unobserved) disease locus when this is indeed the case. Many factors may affect power in an association study, such as disease allele frequency, marker choice, effect size, misclassification errors, phenotype errors, genotype errors, study design, linkage disequilibrium between disease and marker locus, the presence of epistasis, etc. (Gordon and Finch 2005). Whereas type I error can in principle be controlled by setting appropriate thresholds for the test statistic (which is particularly relevant when thousands of markers are being tested for association), it is far less straightforward to control power.
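To make the interplay between significance threshold, sample size and effect size concrete, here is a minimal sketch of an approximate power calculation. It assumes a simple case-control design analysed with a two-sided z-test on allele frequencies, a marker in perfect linkage disequilibrium with the disease locus, and no genotype or phenotype errors. The function name, its parameters and the default threshold of 5e-8 are illustrative choices, not taken from any particular package or from the references of this project.

```python
from math import sqrt
from statistics import NormalDist


def allelic_test_power(p_case, p_control, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided z-test comparing allele frequencies.

    Illustrative assumptions: each individual contributes two independent
    alleles, the normal approximation holds, and the unpooled standard
    error is used under both the null and the alternative hypothesis.
    """
    for p in (p_case, p_control):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")

    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

    # Number of alleles sampled in each group (two per individual).
    alleles_case = 2 * n_cases
    alleles_control = 2 * n_controls

    # Standard error of the difference in sample allele frequencies.
    se = sqrt(p_case * (1 - p_case) / alleles_case
              + p_control * (1 - p_control) / alleles_control)

    # Critical value that fixes the type I error rate at alpha (two-sided).
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Expected size of the test statistic under the alternative.
    shift = abs(p_case - p_control) / se

    # Probability that |Z| exceeds the critical value under the alternative.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical scenario: risk allele frequency 0.35 in cases, 0.30 in controls.
for n in (500, 1000, 2000, 4000):
    power = allelic_test_power(0.35, 0.30, n_cases=n, n_controls=n)
    print(f"n = {n:5d} per group, power = {power:.3f}")
```

When the two allele frequencies are equal, the function returns alpha, which illustrates that the threshold fixes the type I error rate while power still depends on every other quantity in the calculation. Factors such as incomplete linkage disequilibrium or misclassification are not modelled in this sketch.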

Many software packages and scripts are available for pre- and/or post-hoc power calculations. This project aims to review the literature on the latest techniques for computing power under several scenarios, and to search the web for available code for power calculations that can be used in the context of genomic association analysis. There is a clear need for such an overview, together with a checklist of pros and cons for each available tool. Hence, if carried out well, this project can result in a first scientific publication.

See vansteen_power.pdf for references and figures.

Information, Supervisor:

Kristel Van Steen (Kristel.VanSteen@ulg.ac.be)