This function adds entries related to synthetic profiles to a Profile GDS file. The entries cover two types of information: the synthetic study and the synthetic profiles.

The study information is appended to the Profile GDS file "study.list" node. The "study.platform" entry is always set to 'Synthetic'.

The profile information, for all selected synthetic profiles, is appended to the Profile GDS file "study.annot" node. Both the "source" and the "sample.type" entries are always set to 'Synthetic'.

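The synthetic profiles are assigned unique names by joining four components with periods: the prefix, the profile identifier (the data.id of the profile), the reference profile identifier (an element of listSampleRef) and the simulation number (1 to nbSim). For example: synthetic.ex1.HG00243.1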
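As a minimal sketch of this naming scheme, the names could be built as shown below. It assumes the components are simply joined with periods and that simulations are numbered within each reference profile. The input values are hypothetical and the ordering is an assumption, not necessarily the one used internally by the function.

```r
## Hypothetical inputs, mirroring the values used in the Examples section
prefix <- "synthetic"
profileID <- "ex1"
listSampleRef <- c("HG00243", "HG00150")
nbSim <- 2L

## One name per (reference profile, simulation number) pair
## The ordering (simulations grouped by reference) is an assumption
syntheticNames <- paste(prefix, profileID,
        rep(listSampleRef, each=nbSim),
        rep(seq_len(nbSim), times=length(listSampleRef)), sep=".")

syntheticNames
```

With nbSim set to 2L and two reference profiles, this gives four names, from synthetic.ex1.HG00243.1 to synthetic.ex1.HG00150.2.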

prepSynthetic(
  fileProfileGDS,
  listSampleRef,
  profileID,
  studyDF,
  nbSim = 1L,
  prefix = "",
  verbose = FALSE
)

Arguments

fileProfileGDS

a character string representing the file name of the Profile GDS file containing the information about the reference profiles used to generate the synthetic profiles.

listSampleRef

a vector of character strings representing the identifiers of the selected 1KG profiles that will be used as references to generate the synthetic profiles.

profileID

a character string representing the profile identifier present in the fileProfileGDS that will be used to generate synthetic profiles.

studyDF

a data.frame containing the information about the study associated with the analysed sample(s). The data.frame must have these two columns: "study.id" and "study.desc". Both columns must contain character strings (not factors). Other columns, such as "study.platform", can be present but are not used.

nbSim

a single positive integer representing the number of simulations per combination of sample and 1KG reference. Default: 1L.

prefix

a single character string representing the prefix that is added to the name of each synthetic profile. The prefix enables the creation of multiple synthetic profiles using the same combination of sample and 1KG reference. Default: "".

verbose

a logical indicating if messages should be printed to show the progress of the different steps in the function. Default: FALSE.
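
The requirements on the studyDF and nbSim arguments can be verified before calling the function. The following is only a sketch with placeholder values. The checks are illustrative and are not claimed to match the validation done inside the function.

```r
## Placeholder study information with the two mandatory columns
studyDF <- data.frame(study.id="MYDATA.Synthetic",
        study.desc="MYDATA synthetic data", stringsAsFactors=FALSE)

## Hypothetical number of simulations
nbSim <- 1L

## Both mandatory columns must be present and stored as character strings
stopifnot(all(c("study.id", "study.desc") %in% colnames(studyDF)),
        is.character(studyDF$study.id),
        is.character(studyDF$study.desc))

## nbSim must be a single positive integer
stopifnot(is.numeric(nbSim), length(nbSim) == 1L, nbSim >= 1,
        nbSim == as.integer(nbSim))
```

Setting stringsAsFactors=FALSE explicitly keeps the columns as character strings in R versions older than 4.0.0, where data.frame() converted strings to factors by default.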

Value

0L when successful.

Author

Pascal Belleau, Astrid Deschênes and Alexander Krasnitz

Examples


## Required library
library(gdsfmt)

## Path to the demo files included in this package
dataDir <- system.file("extdata/tests", package="RAIDS")

## Temporary Profile GDS file
fileNameGDS <- file.path(tempdir(), "ex1.gds")

## Copy the Profile GDS file demo that has been pruned and annotated
file.copy(file.path(dataDir, "ex1_demo_with_pruning_and_1KG_annot.gds"),
                 fileNameGDS)
#> [1] TRUE

## Information about the synthetic data set
syntheticStudyDF <- data.frame(study.id="MYDATA.Synthetic",
        study.desc="MYDATA synthetic data", study.platform="PLATFORM",
        stringsAsFactors=FALSE)

## Add information related to the synthetic profiles into the Profile GDS
prepSynthetic(fileProfileGDS=fileNameGDS,
        listSampleRef=c("HG00243", "HG00150"), profileID="ex1",
        studyDF=syntheticStudyDF, nbSim=1L, prefix="synthetic",
        verbose=FALSE)
#> [1] 0

## Open Profile GDS file
profileGDS <- openfn.gds(fileNameGDS)

## The synthetic profiles should be added in the 'study.annot' entry
tail(read.gdsn(index.gdsn(profileGDS, "study.annot")))
#>                     data.id case.id sample.type diagnosis    source
#> 154                 NA20908 NA20908   Reference Reference      IGSR
#> 155                 NA20872 NA20872   Reference Reference      IGSR
#> 156                 NA20906 NA20906   Reference Reference      IGSR
#> 157                 NA20875 NA20875   Reference Reference      IGSR
#> 158 synthetic.ex1.HG00243.1 HG00243   Synthetic    Cancer Synthetic
#> 159 synthetic.ex1.HG00150.1 HG00150   Synthetic    Cancer Synthetic
#>             study.id
#> 154          Ref.1KG
#> 155          Ref.1KG
#> 156          Ref.1KG
#> 157          Ref.1KG
#> 158 MYDATA.Synthetic
#> 159 MYDATA.Synthetic

## The synthetic study information should be added to
## the 'study.list' entry
tail(read.gdsn(index.gdsn(profileGDS, "study.list")))
#>           study.id                          study.desc        study.platform
#> 1           MYDATA                         Description              PLATFORM
#> 2          Ref.1KG Unrelated samples from 1000 Genomes GRCh38 1000 genotypes
#> 3 MYDATA.Synthetic               MYDATA synthetic data             Synthetic

## Close GDS file (important)
closefn.gds(profileGDS)

## Remove Profile GDS file (created for demo purpose)
unlink(fileNameGDS, force=TRUE)