Coding-variant Allelic Series Test

2023-05-01

Data

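To run an allelic series test, four inputs are required: the variant annotations (anno), the covariates (covar), the genotypes (geno), and the phenotype (pheno).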

The example data used below were generated using the DGP function provided with the package. The data set includes 100 subjects, 300 variants, and a continuous phenotype. The true effect sizes follow an allelic series, with magnitudes proportional to c(1, 2, 3) for benign missense variants (BMVs), deleterious missense variants (DMVs), and protein-truncating variants (PTVs), respectively.

set.seed(101)
n <- 100
data <- AllelicSeries::DGP(
  n = n,
  snps = 300,
  beta = c(1, 2, 3) / sqrt(n)
)

# Annotations.
anno <- data$anno
head(anno)
## [1] 0 0 0 1 1 0
# Covariates.
covar <- data$covar
head(covar)
##      int       age sex        pc1        pc2        pc3
## [1,]   1 0.9227292   0 -0.9179036 -1.6327648 -0.1540658
## [2,]   1 0.5888415   0  0.9207933 -1.3529452 -1.3514130
## [3,]   1 1.2530388   1  0.6014231 -1.4208441 -1.1318932
## [4,]   1 0.8227599   0 -0.7964197  1.1976729  0.7436693
## [5,]   1 0.1778293   1 -0.5023568 -0.8110003 -1.7642698
## [6,]   1 1.3477782   0  1.2106348  0.6395943  0.4450291
# Genotypes.
geno <- data$geno
head(geno[,1:5])
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    0    0    0
## [2,]    0    0    0    0    0
## [3,]    0    0    0    0    0
## [4,]    0    0    0    0    0
## [5,]    0    0    0    0    0
## [6,]    0    1    0    0    1
# Phenotype.
pheno <- data$pheno
head(pheno)
## [1] 2.1541823 0.4310216 2.2026698 4.0619662 2.0134196 2.0927631

The example data generated by the preceding code are available under vignettes/vignette_data.
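Before running the test, it can help to confirm that the four inputs agree in shape. The following is a minimal sketch in base R. `check_inputs` is a hypothetical helper, not part of the AllelicSeries package, and the toy data (including the placeholder annotation codes) are simulated only to keep the snippet self-contained.

```r
# Hypothetical helper (not part of AllelicSeries): checks that the
# annotation vector, genotype matrix, phenotype vector, and covariate
# matrix have mutually consistent dimensions.
check_inputs <- function(anno, geno, pheno, covar) {
  stopifnot(
    is.matrix(geno),
    is.matrix(covar),
    length(anno) == ncol(geno),   # One annotation per variant.
    length(pheno) == nrow(geno),  # One phenotype per subject.
    nrow(covar) == nrow(geno)     # One covariate row per subject.
  )
  invisible(TRUE)
}

# Toy inputs laid out like the example data: subjects in rows,
# variants in columns. Annotation codes are placeholders.
set.seed(1)
n_subj <- 10
n_snps <- 4
toy_geno <- matrix(rbinom(n_subj * n_snps, size = 2, prob = 0.1), nrow = n_subj)
toy_anno <- c(0, 0, 1, 1)
toy_covar <- cbind(int = 1, age = rnorm(n_subj))
toy_pheno <- rnorm(n_subj)

ok <- check_inputs(toy_anno, toy_geno, toy_pheno, toy_covar)
```

The same call can be applied to the anno, geno, pheno, and covar objects extracted above. A mismatch, such as an annotation vector that is shorter than the number of genotype columns, stops with an error.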

Running the allelic series test

The COding-variant Allelic Series Test (COAST) is run using the COAST function. By default, p-values for the component tests, as well as the overall omnibus test (p_omni), are returned. Inspection of the component p-values is useful for determining which model(s) drove an association. In the present case, the association was most evident via the baseline count model (p_count).

results <- AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar
)
show(results)
##        p_count          p_ind    p_max_count      p_max_ind    p_sum_count 
##   1.992602e-19   6.426260e-05   1.816140e-08   4.870339e-06   4.915204e-17 
##      p_sum_ind p_allelic_skat         p_omni 
##   2.274531e-06   3.105850e-07   2.381468e-18
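To see which component drove the association, the component p-values can be ranked. The sketch below copies the values from the output shown above into a named vector by hand, so it makes no assumption about the structure of the object returned by COAST. The names `p_components`, `ranked`, and `top_component` are introduced here for illustration only.

```r
# Component p-values copied from the output shown above
# (the omnibus p-value is left out on purpose).
p_components <- c(
  p_count = 1.992602e-19,
  p_ind = 6.426260e-05,
  p_max_count = 1.816140e-08,
  p_max_ind = 4.870339e-06,
  p_sum_count = 4.915204e-17,
  p_sum_ind = 2.274531e-06,
  p_allelic_skat = 3.105850e-07
)

# Rank the components from most to least significant.
ranked <- sort(p_components)
top_component <- names(ranked)[1]
print(top_component)
## [1] "p_count"
```

The ranking agrees with the text: the baseline count model gives the smallest p-value, followed by the sum count model.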

Test options

# Apply the rank-based inverse normal transformation to the phenotype.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  apply_int = TRUE
)
# Include standard SKAT-O tests (all variants; PTVs only) among the components.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  include_orig_skato_all = TRUE,
  include_orig_skato_ptv = TRUE
)
# Analyze a binary phenotype (here, the continuous phenotype dichotomized at 0).
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = 1 * (pheno > 0),
  covar = covar,
  is_pheno_binary = TRUE
)
# Return only the omnibus p-value.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  return_omni_only = TRUE
)
# Use a score test for the component tests.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  score_test = TRUE
)
# Specify the annotation category weights.
AllelicSeries::COAST(
  anno = anno,
  geno = geno,
  pheno = pheno,
  covar = covar,
  weights = c(1, 2, 3)
)
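
For intuition about the weights argument, the sketch below builds a weighted burden score in base R: alleles are counted within each annotation category, and the counts are combined using one weight per category. This illustrates how category weights can enter a burden. It is not the package's internal code, and the annotation coding (0, 1, 2) and all object names are assumptions made for the example.

```r
# Toy genotype matrix: 3 subjects (rows) by 4 variants (columns).
toy_geno <- rbind(
  c(1, 0, 0, 0),
  c(0, 1, 1, 0),
  c(0, 0, 0, 2)
)
# Assumed annotation coding: 0, 1, 2 for three ordered categories.
toy_anno <- c(0, 0, 1, 2)
# One weight per annotation category.
weights <- c(1, 2, 3)

# Allele counts per subject within each category
# (rows: subjects, columns: categories).
categories <- sort(unique(toy_anno))
counts <- sapply(categories, function(a) {
  rowSums(toy_geno[, toy_anno == a, drop = FALSE])
})

# Weighted burden: weighted sum of the per-category counts.
burden <- as.numeric(counts %*% weights)
print(burden)
## [1] 1 3 6
```

With weights of c(1, 2, 3), two alleles in the highest category (subject 3) contribute six times as much to the burden as a single allele in the lowest category (subject 1).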