R/synthetic.R
syntheticGeno.Rd
The function uses one cancer profile in combination with one 1KG reference profile to generate a synthetic profile that is saved in the Profile GDS file.
When more than one 1KG reference profile is specified, the function recursively generates synthetic profiles for each cancer profile + 1KG reference profile combination.
The number of synthetic profiles generated per combination is set by the number of simulations requested (nbSim).
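The count described above can be sketched in base R. The helper `expectedSyntheticIDs()` below is hypothetical and not part of RAIDS; it assumes the naming pattern `prefix.profileID.referenceID.simulationIndex`, which is inferred only from the example output on this page and may not hold for an empty prefix.

```r
## Hypothetical helper (not a RAIDS function): enumerate the synthetic profile
## names one would expect for a given set of arguments.
## ASSUMPTION: names follow prefix.profileID.referenceID.simulationIndex,
## as suggested by the example output on this page.
expectedSyntheticIDs <- function(profileID, listSampleRef, nbSim=1L,
                                    prefix="synth") {
    ## One row per reference sample and simulation index
    ## (the simulation index varies fastest)
    grid <- expand.grid(sim=seq_len(nbSim), ref=listSampleRef,
                            stringsAsFactors=FALSE)
    paste(prefix, profileID, grid$ref, grid$sim, sep=".")
}

ids <- expectedSyntheticIDs(profileID="ex1",
            listSampleRef=c("HG00243", "HG00150"), nbSim=1L,
            prefix="synthTest")

## Total number of profiles = number of reference samples * nbSim
length(ids)
```

With two reference samples and `nbSim=1L`, the sketch yields two names; with `nbSim=3L` it would yield six.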
syntheticGeno(
gdsReference,
gdsRefAnnot,
fileProfileGDS,
profileID,
listSampleRef,
nbSim = 1L,
prefix = "",
pRecomb = 0.01,
minProb = 0.999,
seqError = 0.001
)
gdsReference: an object of class gds.class (a GDS file), the opened 1KG GDS file.
gdsRefAnnot: an object of class gds.class (a GDS file), the opened 1KG SNV Annotation GDS file.
fileProfileGDS: a character string representing the file name of the Profile GDS file containing the information about the sample. The file must exist.
profileID: a character string representing the unique identifier of the cancer profile.
listSampleRef: a vector of character strings representing the sample identifiers of the selected 1KG reference samples.
nbSim: a single positive integer representing the number of simulations that will be generated per sample + 1KG reference combination. Default: 1L.
prefix: a character string that represents the prefix that will be added to the name of the synthetic profiles generated by the function. Default: "".
pRecomb: a single positive numeric between 0 and 1 that represents the frequency of phase switching in the synthetic profiles. Default: 0.01.
minProb: a single positive numeric between 0 and 1 that represents the probability that the genotype is correct. Default: 0.999.
seqError: a single positive numeric between 0 and 1 representing the sequencing error rate. Default: 0.001.
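The numeric constraints listed above can be checked before calling the function. The following is a minimal sketch in base R; `checkSyntheticArgs()` is a hypothetical helper, not part of RAIDS, and the package performs its own validation. Treating the bounds 0 and 1 as allowed values is an assumption.

```r
## Hypothetical pre-flight checks (not a RAIDS function).
## ASSUMPTION: the bounds 0 and 1 are accepted for the probabilities.
checkSyntheticArgs <- function(nbSim=1L, pRecomb=0.01, minProb=0.999,
                                seqError=0.001) {
    ## TRUE when x is a single, non-missing number in [0, 1]
    isProb <- function(x) {
        is.numeric(x) && length(x) == 1L && !is.na(x) && x >= 0 && x <= 1
    }
    ## Named conditions in stopifnot() require R >= 4.0.0
    stopifnot(
        "nbSim must be a single positive integer" =
            is.numeric(nbSim) && length(nbSim) == 1L && !is.na(nbSim) &&
            nbSim >= 1 && nbSim == as.integer(nbSim),
        "pRecomb must be a single numeric between 0 and 1" = isProb(pRecomb),
        "minProb must be a single numeric between 0 and 1" = isProb(minProb),
        "seqError must be a single numeric between 0 and 1" = isProb(seqError)
    )
    invisible(TRUE)
}

## The documented defaults pass the checks
checkSyntheticArgs()

## An out-of-range value is rejected with an informative message
res <- try(checkSyntheticArgs(pRecomb=1.5), silent=TRUE)
inherits(res, "try-error")
```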
The integer 0L when the function is successful.
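## Required libraries
## (snpgdsOpen() comes from SNPRelate; prepSynthetic() and syntheticGeno()
## come from RAIDS)
library(RAIDS)
library(gdsfmt)
library(SNPRelate)
## Path to the demo 1KG GDS files located in this package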
dataDir <- system.file("extdata/tests", package="RAIDS")
## Profile GDS file (temporary)
fileNameGDS <- file.path(tempdir(), "ex1.gds")
## Copy the Profile GDS file demo that has been pruned and annotated
file.copy(file.path(dataDir, "ex1_demo_with_pruning_and_1KG_annot.gds"),
    fileNameGDS)
#> [1] TRUE
## Information about the synthetic data set
syntheticStudyDF <- data.frame(study.id="MYDATA.Synthetic",
    study.desc="MYDATA synthetic data", study.platform="PLATFORM",
    stringsAsFactors=FALSE)
## Add information related to the synthetic profiles into the Profile GDS
prepSynthetic(fileProfileGDS=fileNameGDS,
    listSampleRef=c("HG00243", "HG00150"), profileID="ex1",
    studyDF=syntheticStudyDF, nbSim=1L, prefix="synthTest",
    verbose=FALSE)
#> [1] 0
## The 1KG files
gds1KG <- snpgdsOpen(file.path(dataDir,
    "ex1_good_small_1KG.gds"))
gds1KGAnnot <- openfn.gds(file.path(dataDir,
    "ex1_good_small_1KG_Annot.gds"))
## Generate the synthetic profiles and add them into the Profile GDS
syntheticGeno(gdsReference=gds1KG, gdsRefAnnot=gds1KGAnnot,
    fileProfileGDS=fileNameGDS, profileID="ex1",
    listSampleRef=c("HG00243", "HG00150"), nbSim=1L,
    prefix="synthTest",
    pRecomb=0.01, minProb=0.999, seqError=0.001)
#> [1] 0
## Open Profile GDS file
profileGDS <- openfn.gds(fileNameGDS)
tail(read.gdsn(index.gdsn(profileGDS, "sample.id")))
#> [1] "NA20872" "NA20906"
#> [3] "NA20875" "ex1"
#> [5] "synthTest.ex1.HG00243.1" "synthTest.ex1.HG00150.1"
## Close GDS files (important)
closefn.gds(profileGDS)
closefn.gds(gds1KG)
closefn.gds(gds1KGAnnot)
## Remove Profile GDS file (created for demo purpose)
unlink(fileNameGDS, force=TRUE)
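
Because the example above stresses that GDS files must be closed, one possible pattern is to wrap the read in a function and register the close with `on.exit()`, so the handle is released even if an error occurs. This is a sketch only: `readSampleIds()` and the file name `onexit_demo.gds` are hypothetical, and the small GDS file is created on the fly so the snippet does not depend on the RAIDS demo data.

```r
library(gdsfmt)

## Hypothetical helper (not a RAIDS function): read the "sample.id" node and
## always close the GDS file, even when reading fails.
readSampleIds <- function(gdsPath) {
    gds <- openfn.gds(gdsPath)
    ## Registered immediately after opening; runs when the function exits
    on.exit(closefn.gds(gds), add=TRUE)
    read.gdsn(index.gdsn(gds, "sample.id"))
}

## Build a tiny throwaway GDS file for the demonstration
tmpGDS <- file.path(tempdir(), "onexit_demo.gds")
demoGDS <- createfn.gds(tmpGDS)
add.gdsn(demoGDS, "sample.id", c("ex1", "synthTest.ex1.HG00243.1"))
closefn.gds(demoGDS)

## Read the identifiers; the file is closed when the function returns
sampleIds <- readSampleIds(tmpGDS)

## Remove the throwaway file
unlink(tmpGDS, force=TRUE)
```

The same wrapper idea would apply to the Profile GDS file used in the example, where a forgotten `closefn.gds()` call can leave the file locked for later steps.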