R/allelicFraction.R
estimateAllelicFraction.Rd
The function estimates the allelic fraction of the SNVs for a specific profile and add the information to the associated Profile GDS file. The allelic fraction estimation method is adapted to the type of study (DNA or RNA).
estimateAllelicFraction(
gdsReference,
gdsProfile,
currentProfile,
studyID,
chrInfo,
studyType = c("DNA", "RNA"),
minCov = 10L,
minProb = 0.999,
eProb = 0.001,
cutOffLOH = -5,
cutOffHomoScore = -3,
wAR = 9,
cutOffAR = 3,
gdsRefAnnot = NULL,
blockID = NULL,
verbose = FALSE
)
an object of class gds.class
(a GDS file), the opened Reference GDS file.
an object of class gds.class
(a GDS file), the opened Profile GDS file.
a character
string corresponding to
the sample identifier as used in pruningSample
function.
a character
string corresponding to the name of
the study as
used in pruningSample
function.
a vector
of integer
values representing
the length of the chromosomes. See 'details' section.
a character
string representing the type of study.
The possible choices are: "DNA" and "RNA". The type of study affects the
way the estimation of the allelic fraction is done. Default: "DNA"
.
a single positive integer
representing the minimum
required coverage. Default: 10L
.
a single numeric
between 0 and 1 representing the
probability that the calculated genotype call is correct.
Default: 0.999
.
a single numeric
between 0 and 1 representing the
probability of sequencing error. Default: 0.001
.
a single numeric
representing the cutoff, in log,
for the homozygote score to assign a region as LOH.
Default: -5
.
a single numeric
representing the cutoff, in
log, that the SNVs in a block are called homozygote by error.
Default: -3
.
a single positive integer
representing the size-1 of
the window used to compute an empty box. Default: 9
.
a single numeric
representing the cutoff, in
log score, that the SNVs in a gene are allelic fraction different 0.5
Default: 3
.
an object of class gds.class
(a GDS file), the opened Reference SNV Annotation GDS file.
This parameter is RNA specific.
Default: NULL
.
a character
string corresponding to the block
identifier in gdsRefAnnot
. This parameter is RNA specific.
Default: NULL
a logicial
indicating if the function should print
message when running. Default: FALSE
.
The integer 0L
when successful.
The chrInfo
parameter contains the length of the chromosomes. The
length of the chromosomes can be obtain through the
seqlengths
library.
As example, for hg38 genome:
## Required library for GDS
library(gdsfmt)
## Path to the demo 1KG GDS file located in this package
dataDir <- system.file("extdata/tests", package="RAIDS")
fileGDS <- file.path(dataDir, "ex1_good_small_1KG.gds")
## Profile GDS file for one profile
fileProfile <- file.path(tempdir(), "ex1.gds")
## Copy the Profile GDS file demo that has been pruned and annotated
## into current directory
file.copy(file.path(dataDir, "ex1_demo_with_pruning_and_1KG_annot.gds"),
fileProfile)
#> [1] TRUE
## Open the reference GDS file (demo version)
gds1KG <- snpgdsOpen(fileGDS)
## Profile GDS file for one profile
profileGDS <- openfn.gds(fileProfile, readonly=FALSE)
## Required library for this example to run correctly
if (requireNamespace("GenomeInfoDb", quietly=TRUE) &&
requireNamespace("BSgenome.Hsapiens.UCSC.hg38", quietly=TRUE)) {
## Chromosome length information
## chr23 is chrX, chr24 is chrY and chrM is 25
chrInfo <- GenomeInfoDb::seqlengths(BSgenome.Hsapiens.UCSC.hg38::Hsapiens)[1:25]
## Estimate the allelic fraction of the pruned SNVs
estimateAllelicFraction(gdsReference=gds1KG, gdsProfile=profileGDS,
currentProfile="ex1", studyID="MYDATA", chrInfo=chrInfo,
studyType="DNA", minCov=10L, minProb=0.999, eProb=0.001,
cutOffLOH=-5, cutOffHomoScore=-3, wAR=9, cutOffAR=3,
gdsRefAnnot=NULL, blockID=NULL)
## The allelic fraction is saved in the 'lap' node of Profile GDS file
## The 'lap' entry should be present
profileGDS
## Close both GDS files (important)
closefn.gds(profileGDS)
closefn.gds(gds1KG)
## Remove Profile GDS file (created for demo purpose)
unlink(fileProfile, force=TRUE)
}