
A tutorial on conducting genome-wide association studies: Quality control and statistical analysis

Updated: Sep 21, 2021


Abstract

Objectives: Genome-wide association studies (GWAS) have become increasingly popular to identify associations between single nucleotide polymorphisms (SNPs) and phenotypic traits. The GWAS method is commonly applied within the social sciences. However, statistical analyses will need to be carefully conducted and the use of dedicated genetics software will be required. This tutorial aims to provide a guideline for conducting genetic analyses.

Methods: We discuss and explain key concepts and illustrate how to conduct GWAS using example scripts provided through GitHub (https://github.com/MareesAT/GWA_tutorial/). In addition to the illustration of standard GWAS, we will also show how to apply polygenic risk score (PRS) analysis. PRS does not aim to identify individual SNPs but aggregates information from SNPs across the genome in order to provide individual-level scores of genetic risk.

Results: The simulated data and scripts that will be illustrated in the current tutorial provide hands-on practice with genetic analyses. The scripts are based on PLINK, PRSice, and R, which are commonly used, freely available software tools that are accessible for novice users.

Conclusions: By providing theoretical background and hands-on experience, we aim to make GWAS more accessible to researchers without formal training in the field.
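
The idea of a polygenic risk score mentioned in the abstract, aggregating information from many SNPs into one score per individual, can be illustrated with a minimal sketch. This is not the tutorial's own script (those are based on PLINK, PRSice, and R); it is a plain Python illustration that assumes the common formulation of a PRS as a weighted sum of effect-allele counts. The SNP identifiers, effect sizes, and genotypes below are made-up toy values, and real pipelines typically add steps such as clumping and p-value thresholding that are omitted here.

```python
# Toy illustration of a polygenic risk score (PRS) as a weighted sum.
# All SNP IDs, effect sizes, and genotypes are hypothetical example values.

# Per-SNP weights, e.g. effect sizes estimated in a discovery GWAS (assumed).
effect_sizes = {"rs1": 0.12, "rs2": -0.05, "rs3": 0.30}

# Genotypes coded as the number of effect alleles carried (0, 1, or 2).
genotypes = {
    "person_A": {"rs1": 2, "rs2": 0, "rs3": 1},
    "person_B": {"rs1": 0, "rs2": 2, "rs3": 0},
}


def polygenic_score(allele_counts, weights):
    """Sum of (effect size x effect-allele count) over all weighted SNPs."""
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


# One aggregated score per individual.
scores = {
    person: polygenic_score(counts, effect_sizes)
    for person, counts in genotypes.items()
}

for person, score in scores.items():
    print(person, round(score, 3))
```

Each individual ends up with a single number that summarizes their genotype across all included SNPs, which is what distinguishes PRS analysis from the per-SNP tests of a standard GWAS.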

Keywords: GitHub; PLINK; genome-wide association study (GWAS); polygenic risk score (PRS); tutorial.




© Copyright 2016-2022 

Dana-Farber Cancer Institute.

Use of MADCaP is subject to our terms of use and our privacy policy.

MADCaP Network
Dana-Farber Cancer Institute
450 Brookline Avenue
Boston, MA 02215

Supported by Dana-Farber Cancer Institute

MADCaP is grant-funded by the National Cancer Institute
