Polygenic risk prediction: why and when out-of-sample prediction $R^2$ can exceed SNP-based heritability

  • Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

In polygenic score (PGS) analysis, the coefficient of determination ($R^2$) is a key statistic for evaluating efficacy. $R^2$ is the proportion of phenotypic variance explained by the PGS, calculated in a cohort that is independent of the genome-wide association study (GWAS) that provided the estimates of allelic effect sizes. The SNP-based heritability ($h^2_{\mathrm{SNP}}$, the proportion of total phenotypic variance attributable to all common SNPs) is the theoretical upper limit of the out-of-sample prediction $R^2$. However, in real data analyses $R^2$ has been reported to exceed $h^2_{\mathrm{SNP}}$, which occurs in parallel with the observation that $h^2_{\mathrm{SNP}}$ estimates tend to decline as the number of cohorts being meta-analyzed increases. Here, we quantify why and when these observations are expected. Using theory and simulation, we show that if cohort-specific $h^2_{\mathrm{SNP}}$ values are heterogeneous, or if genetic correlations between cohorts are less than one, $h^2_{\mathrm{SNP}}$ estimates can decrease as the number of cohorts being meta-analyzed increases. We derive the conditions under which the out-of-sample prediction $R^2$ will be greater than $h^2_{\mathrm{SNP}}$ and show the validity of our derivations with real data from a binary trait (major depression) and a continuous trait (educational attainment). Our research calls for a better approach to integrating information from multiple cohorts to address issues of between-cohort heterogeneity.
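
The two effects described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's derivation; it is a simplified model under assumptions that are ours: independent standardized SNPs, unit phenotypic variance, effectively infinite GWAS sample sizes (no estimation noise), an equal-weight average of cohort effects, and one common genetic correlation `rg` between every pair of cohorts. The function names `meta_h2` and `target_r2` are hypothetical.

```python
import math


def meta_h2(h2, rg):
    """Heritability implied by an equal-weight average of cohort effects.

    h2 : list of cohort-specific SNP-based heritabilities (assumed known)
    rg : genetic correlation shared by every pair of cohorts (assumption)
    """
    k = len(h2)
    s = [math.sqrt(v) for v in h2]
    # Variance of the averaged effects, summed over SNPs:
    # (1/K^2) * [sum_k h2_k + rg * sum_{j != k} sqrt(h2_j * h2_k)]
    cross = sum(s[i] * s[j] for i in range(k) for j in range(k) if i != j)
    return (sum(h2) + rg * cross) / k**2


def target_r2(h2, rg, t):
    """Out-of-sample R^2 of the meta-analysis PGS in a new sample that
    shares the true effects of cohort t (toy model, no estimation noise)."""
    k = len(h2)
    s = [math.sqrt(v) for v in h2]
    # Covariance between the PGS and the target phenotype
    cov = (h2[t] + rg * sum(s[t] * s[j] for j in range(k) if j != t)) / k
    # R^2 = cov^2 / (var(PGS) * var(phenotype)), with var(phenotype) = 1
    return cov**2 / meta_h2(h2, rg)


if __name__ == "__main__":
    # Equal heritabilities, imperfect genetic correlation:
    # the implied heritability shrinks as more cohorts are averaged.
    for k in (1, 2, 4, 8):
        print(k, round(meta_h2([0.2] * k, 0.5), 4))

    # Heterogeneous heritabilities, perfect genetic correlation:
    # R^2 in the high-heritability cohort exceeds the meta-analysis value.
    h2 = [0.3, 0.1]
    print(round(meta_h2(h2, 1.0), 4), round(target_r2(h2, 1.0, 0), 4))
```

In this toy model, with equal cohort heritabilities the implied value reduces to $h^2 (1 + (K-1) r_g)/K$, which falls toward $h^2 r_g$ as the number of cohorts $K$ grows. With heterogeneous heritabilities, a target sample resembling the most heritable cohort can show an $R^2$ above the heritability implied by the averaged effects.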
Original language: English
Pages (from-to): 1207-1215
Number of pages: 10
Journal: American Journal of Human Genetics
Volume: 110
Issue number: 7
Publication status: Published - 6 Jul 2023
Externally published: Yes

Keywords

  • Explain
  • Human height
  • Large proportion
