The function estimates the allelic fraction of the SNVs for a specific profile and add the information to the associated Profile GDS file. The allelic fraction estimation method is adapted to the type of study (DNA or RNA).

estimateAllelicFraction(
  gdsReference,
  gdsProfile,
  currentProfile,
  studyID,
  chrInfo,
  studyType = c("DNA", "RNA"),
  minCov = 10L,
  minProb = 0.999,
  eProb = 0.001,
  cutOffLOH = -5,
  cutOffHomoScore = -3,
  wAR = 9,
  cutOffAR = 3,
  gdsRefAnnot = NULL,
  blockID = NULL,
  verbose = FALSE
)

Arguments

gdsReference

an object of class gds.class (a GDS file), the opened Reference GDS file.

gdsProfile

an object of class gds.class (a GDS file), the opened Profile GDS file.

currentProfile

a character string corresponding to the sample identifier as used in pruningSample function.

studyID

a character string corresponding to the name of the study as used in pruningSample function.

chrInfo

a vector of integer values representing the length of the chromosomes. See 'details' section.

studyType

a character string representing the type of study. The possible choices are: "DNA" and "RNA". The type of study affects the way the estimation of the allelic fraction is done. Default: "DNA".

minCov

a single positive integer representing the minimum required coverage. Default: 10L.

minProb

a single numeric between 0 and 1 representing the probability that the calculated genotype call is correct. Default: 0.999.

eProb

a single numeric between 0 and 1 representing the probability of sequencing error. Default: 0.001.

cutOffLOH

a single numeric representing the cutoff, in log, for the homozygote score to assign a region as LOH. Default: -5.

cutOffHomoScore

a single numeric representing the cutoff, in log, that the SNVs in a block are called homozygote by error. Default: -3.

wAR

a single positive integer representing the size-1 of the window used to compute an empty box. Default: 9.

cutOffAR

a single numeric representing the cutoff, in log score, that the SNVs in a gene are allelic fraction different 0.5 Default: 3.

gdsRefAnnot

an object of class gds.class (a GDS file), the opened Reference SNV Annotation GDS file. This parameter is RNA specific. Default: NULL.

blockID

a character string corresponding to the block identifier in gdsRefAnnot. This parameter is RNA specific. Default: NULL

verbose

a logicial indicating if the function should print message when running. Default: FALSE.

Value

The integer 0L when successful.

Details

The chrInfo parameter contains the length of the chromosomes. The length of the chromosomes can be obtain through the seqlengths library.

As example, for hg38 genome:


if (requireNamespace("GenomeInfoDb", quietly=TRUE) &&
     requireNamespace("BSgenome.Hsapiens.UCSC.hg38", quietly=TRUE)) {
     chrInfo <- GenomeInfoDb::seqlengths(BSgenome.Hsapiens.UCSC.hg38::Hsapiens)[1:25]
}

Author

Pascal Belleau, Astrid Deschênes and Alexander Krasnitz

Examples


## Required library for GDS
library(gdsfmt)

## Path to the demo 1KG GDS file located in this package
dataDir <- system.file("extdata/tests", package="RAIDS")
fileGDS <- file.path(dataDir, "ex1_good_small_1KG.gds")

## Profile GDS file for one profile
fileProfile <- file.path(tempdir(), "ex1.gds")

## Copy the Profile GDS file demo that has been pruned and annotated
## into current directory
file.copy(file.path(dataDir, "ex1_demo_with_pruning_and_1KG_annot.gds"),
                            fileProfile)
#> [1] TRUE

## Open the reference GDS file (demo version)
gds1KG <- snpgdsOpen(fileGDS)

## Profile GDS file for one profile
profileGDS <- openfn.gds(fileProfile, readonly=FALSE)

## Required library for this example to run correctly
if (requireNamespace("GenomeInfoDb", quietly=TRUE) &&
     requireNamespace("BSgenome.Hsapiens.UCSC.hg38", quietly=TRUE)) {

    ## Chromosome length information
    ## chr23 is chrX, chr24 is chrY and chrM is 25
    chrInfo <- GenomeInfoDb::seqlengths(BSgenome.Hsapiens.UCSC.hg38::Hsapiens)[1:25]

    ## Estimate the allelic fraction of the pruned SNVs
    estimateAllelicFraction(gdsReference=gds1KG, gdsProfile=profileGDS,
        currentProfile="ex1", studyID="MYDATA", chrInfo=chrInfo,
        studyType="DNA", minCov=10L, minProb=0.999, eProb=0.001,
        cutOffLOH=-5, cutOffHomoScore=-3, wAR=9, cutOffAR=3,
        gdsRefAnnot=NULL, blockID=NULL)

    ## The allelic fraction is saved in the 'lap' node of Profile GDS file
    ## The 'lap' entry should be present
    profileGDS

    ## Close both GDS files (important)
    closefn.gds(profileGDS)
    closefn.gds(gds1KG)

    ## Remove Profile GDS file (created for demo purpose)
    unlink(fileProfile, force=TRUE)

}