Load R Packages
library(tidyverse) # Data wrangling
library(GenomicSEM)
R, GenomicSEM, LDSC, HDL
Genetic correlation (rg) describes the degree to which the genetic influences on two traits overlap: it is the correlation between the genetic effects on one trait and the genetic effects on the other. A positive genetic correlation implies that, on average, genetic variants that increase one trait also increase the other. Conversely, a negative genetic correlation implies that variants that increase one trait tend to decrease the other.
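As a toy illustration of this definition, the sketch below simulates per-variant effects on two hypothetical traits with a chosen correlation and recovers it with `cor()`. The variable names, the number of variants, and the target value of 0.5 are assumptions made for illustration, not values from the analysis in this tutorial.

```r
set.seed(42)

n_variants <- 10000  # hypothetical number of causal variants
rg_target  <- 0.5    # assumed correlation between the two sets of effects

# Effects on trait 1, then effects on trait 2 built to correlate with them
beta_trait1 <- rnorm(n_variants)
beta_trait2 <- rg_target * beta_trait1 +
  sqrt(1 - rg_target^2) * rnorm(n_variants)

# Shared direction of effect gives a positive correlation
rg_positive <- cor(beta_trait1, beta_trait2)

# Flipping the sign of every trait 2 effect gives a negative correlation
rg_negative <- cor(beta_trait1, -beta_trait2)

print(c(positive = rg_positive, negative = rg_negative))
```

Real analyses never observe the true effects directly, which is why LDSC and HDL estimate rg from GWAS summary statistics instead.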
LDSC: Linkage disequilibrium score regression (LDSC) leverages linkage disequilibrium (LD), the non-random association of alleles at different loci, to estimate genetic correlations between two traits. The method rests on the premise that, under polygenicity (many genetic variants each exerting a small effect), single nucleotide polymorphisms (SNPs) that are in LD with more neighbouring variants, and therefore have a higher LD score, tend to show stronger association statistics. For a pair of traits, the product of each SNP's two z-scores is regressed on its LD score, and the slope of that regression estimates the genetic covariance.
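The regression idea can be sketched on simulated data. The example below is a simplified, unweighted version: it draws z-scores whose variances and covariance grow linearly with a made-up LD score, then recovers rg from three ordinary least-squares slopes. All parameter values (`M`, `N1`, `N2`, the heritabilities, and the uniform LD scores) are assumptions chosen for illustration; the real LDSC software additionally uses regression weights and a block jackknife for standard errors.

```r
set.seed(1)

M  <- 50000            # hypothetical number of SNPs
N1 <- 50000            # hypothetical GWAS sample sizes
N2 <- 50000
h2_1 <- 0.5            # assumed SNP heritabilities
h2_2 <- 0.4
rg_true <- 0.6         # assumed true genetic correlation
gcov_true <- rg_true * sqrt(h2_1 * h2_2)

ld_score <- runif(M, 1, 200)  # made-up LD scores

# Expected variance and covariance of the z-scores under the LDSC model
v1  <- 1 + N1 * h2_1 * ld_score / M
v2  <- 1 + N2 * h2_2 * ld_score / M
c12 <- sqrt(N1 * N2) * gcov_true * ld_score / M

# Draw correlated z-scores SNP by SNP
e1 <- rnorm(M)
e2 <- rnorm(M)
z1 <- sqrt(v1) * e1
z2 <- (c12 / sqrt(v1)) * e1 + sqrt(v2 - c12^2 / v1) * e2

# Regress products of z-scores on the LD score
y12 <- z1 * z2
y11 <- z1^2
y22 <- z2^2
slope_12 <- coef(lm(y12 ~ ld_score))[["ld_score"]]
slope_11 <- coef(lm(y11 ~ ld_score))[["ld_score"]]
slope_22 <- coef(lm(y22 ~ ld_score))[["ld_score"]]

# Rescale slopes to genetic covariance and heritabilities, then form rg
gcov_hat <- slope_12 * M / sqrt(N1 * N2)
h2_1_hat <- slope_11 * M / N1
h2_2_hat <- slope_22 * M / N2
rg_hat   <- gcov_hat / sqrt(h2_1_hat * h2_2_hat)

print(rg_hat)
```

The estimate should land close to the assumed value of 0.6, which shows how a slope on LD score carries information about shared genetic effects.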
HDL: High-definition likelihood (HDL) is reported to provide genetic correlation estimates with higher accuracy and precision than LDSC. HDL achieves this by using a full likelihood-based method that leverages LD information across the whole genome, whereas LDSC uses only part of that information.
## Summary statistics
Willer2013ldl = "resources/Willer2013ldl.chrall.CPRA_b37.tsv.gz"
Graham2021ldl = "resources/Graham2021ldl.chrall.CPRA_b37.tsv.gz"
KunkleAD = "resources/Kunkle2019load_stage123.chrall.CPRA_b37.tsv.gz"
BellenguezAD = "resources/Bellenguez2022load.chrall.CPRA_b37.tsv.gz"

## LD Structure
ld_path = "resources/eur_w_ld_chr/"

## HAPMAP3 SNPs
hm3_path = "resources/w_hm3.snplist"
GenomicSEM::munge(
  files = c(Willer2013ldl, Graham2021ldl, KunkleAD, BellenguezAD),
  hm3 = hm3_path,
  trait.names = c("Willer2013ldl", "Graham2021ldl", "KunkleAD", "BellenguezAD"),
  maf.filter = 0.05,
  # Map the column names used in the input files to the names munge expects
  column.names = list(
    SNP = 'DBSNP_ID',
    MAF = 'AF',
    A1 = 'ALT',
    A2 = 'REF',
    effect = 'BETA',
    N = "N"
  ),
  overwrite = FALSE
)
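The later `ldsc()` and `hdl()` calls read files named `<trait name>.sumstats.gz` from the working directory, so it can help to confirm those files exist before moving on. The check below is a small sketch; `trait_names`, `munged_files`, and `missing_files` are names introduced here for illustration.

```r
# Trait names as passed to munge() above
trait_names <- c("Willer2013ldl", "Graham2021ldl", "KunkleAD", "BellenguezAD")

# File names the downstream calls expect in the working directory
munged_files <- paste0(trait_names, ".sumstats.gz")

# Report any expected file that is not present yet
missing_files <- munged_files[!file.exists(munged_files)]
if (length(missing_files) > 0) {
  message("Not found: ", paste(missing_files, collapse = ", "))
} else {
  message("All munged summary statistics are present.")
}
```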
ldsc.covstruct <- GenomicSEM::ldsc(
  traits = c("Willer2013ldl.sumstats.gz", "Graham2021ldl.sumstats.gz", "BellenguezAD.sumstats.gz", "KunkleAD.sumstats.gz"),
  trait.names = c("Willer2013ldl", "Graham2021ldl", "BellenguezAD", "KunkleAD"),
  # Prevalences follow the order of `traits`; NA marks continuous traits
  sample.prev = c(NA, NA, 0.18, 0.37),
  population.prev = c(NA, NA, 0.31, 0.31),
  ld = ld_path,
  wld = ld_path,
  stand = TRUE
)
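The `sample.prev` and `population.prev` arguments allow results for binary traits to be expressed on the liability scale rather than the observed scale. The sketch below shows the standard conversion factor from the liability threshold model, written as a hypothetical helper `liability_factor()`. It illustrates the idea only and is not GenomicSEM's internal code.

```r
# Multiplier that converts an observed-scale heritability for a
# case-control trait to the liability scale.
#   sample_prev:     proportion of cases in the GWAS sample
#   population_prev: prevalence of the trait in the population
liability_factor <- function(sample_prev, population_prev) {
  threshold <- qnorm(1 - population_prev)  # liability threshold
  height    <- dnorm(threshold)            # normal density at the threshold
  (population_prev^2 * (1 - population_prev)^2) /
    (sample_prev * (1 - sample_prev) * height^2)
}

# Example with the prevalences supplied for KunkleAD above
factor_kunkle <- liability_factor(sample_prev = 0.37, population_prev = 0.31)
print(factor_kunkle)
```

Continuous traits need no such conversion, which is why their prevalences are given as `NA`.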
hdl.covstruct <- GenomicSEM::hdl(
  traits = c("Willer2013ldl.sumstats.gz", "Graham2021ldl.sumstats.gz", "BellenguezAD.sumstats.gz", "KunkleAD.sumstats.gz"),
  trait.names = c("Willer2013ldl", "Graham2021ldl", "BellenguezAD", "KunkleAD"),
  sample.prev = c(NA, NA, 0.18, 0.37),
  population.prev = c(NA, NA, 0.31, 0.31),
  LD.path = "resources/UKB_imputed_hapmap2_SVD_eigen99_extraction/",
  method = "piecewise"
)
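Once both models have run, the genetic covariance matrices can be placed on the correlation scale for a side-by-side comparison. The sketch below assumes each result stores its genetic covariance matrix in an element named `S` (that is, `ldsc.covstruct$S` and `hdl.covstruct$S`). To stay runnable without the summary statistics, it uses a small made-up matrix `S_toy` whose values are purely illustrative.

```r
# Made-up 2 x 2 genetic covariance matrix: heritabilities on the diagonal,
# genetic covariance off the diagonal (illustrative values only)
S_toy <- matrix(
  c(0.20, 0.06,
    0.06, 0.10),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("trait1", "trait2"), c("trait1", "trait2"))
)

# Standardise covariances to correlations: rg = cov / sqrt(h2_1 * h2_2)
rg_toy <- cov2cor(S_toy)
print(round(rg_toy, 3))

# With real results the same step would be (assumes an `S` element exists):
# rg_ldsc <- cov2cor(ldsc.covstruct$S)
# rg_hdl  <- cov2cor(hdl.covstruct$S)
```

Comparing the off-diagonal entries of the two resulting matrices shows where the LDSC and HDL estimates agree and where they differ.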