This function merges all the genotyping files associated with one specific sample into a single file. The merged VCF file is saved in the specified directory and is named after the sample. It is also compressed with bzip2 (".bz2" extension). The function merges the files for every sample present in the input directory.
groupChr1KGSNV(pathGenoChr, pathOut)
pathGenoChr
a character string representing the path where the genotyping files for each sample and chromosome are located. The path must contain sub-directories (one per chromosome), and the genotyping files must be present in those sub-directories. The path must exist.
pathOut
a character string representing the path where the merged genotyping files for each sample will be created. The path must exist.
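The requirements on the two paths can be checked before calling the function. The following is a minimal sketch in base R. checkGenoLayout() is a hypothetical helper that is not part of RAIDS, and the sub-directory names (chr1, chr2) and the file name (sampleA.txt) are placeholders chosen only for this demonstration.

```r
## Hypothetical helper (not part of RAIDS): checks that both paths exist and
## that the input path contains non-empty sub-directories
checkGenoLayout <- function(pathGenoChr, pathOut) {
    ## Both paths must exist before the call
    stopifnot(dir.exists(pathGenoChr), dir.exists(pathOut))
    ## One sub-directory per chromosome is expected
    chrDirs <- list.dirs(pathGenoChr, full.names=TRUE, recursive=FALSE)
    if (length(chrDirs) == 0) {
        stop("No sub-directories found in ", pathGenoChr)
    }
    ## Count the files present in each sub-directory
    nbFiles <- vapply(chrDirs, function(d) length(list.files(d)),
                        integer(1))
    names(nbFiles) <- basename(chrDirs)
    if (any(nbFiles == 0L)) {
        stop("Empty sub-directory: ",
                paste(names(nbFiles)[nbFiles == 0L], collapse=", "))
    }
    nbFiles
}

## Demonstration on a placeholder layout created in a temporary directory
demoDir <- file.path(tempdir(), "demoLayout")
for (chr in c("chr1", "chr2")) {
    dir.create(file.path(demoDir, chr), recursive=TRUE, showWarnings=FALSE)
    file.create(file.path(demoDir, chr, "sampleA.txt"))
}
layout <- checkGenoLayout(pathGenoChr=demoDir, pathOut=demoDir)
unlink(demoDir, recursive=TRUE, force=TRUE)
```

The helper returns a named integer vector with the number of files found in each sub-directory, which can be inspected before running the merge.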
The integer 0L when successful or FALSE if not.
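Because R coerces FALSE to 0 in comparisons, a test such as res == 0 cannot tell the two documented return values apart. A minimal sketch of a stricter check follows. isMergeSuccess() is a hypothetical helper and is not part of RAIDS.

```r
## Hypothetical helper (not part of RAIDS): success only when the result is
## exactly the integer 0L
isMergeSuccess <- function(res) {
    identical(res, 0L)
}

## FALSE == 0L evaluates to TRUE, so "==" is not a reliable test here
looseCheck <- (FALSE == 0L)

## identical() compares both type and value
okSuccess <- isMergeSuccess(0L)
okFailure <- isMergeSuccess(FALSE)
```

In this sketch, okSuccess is TRUE and okFailure is FALSE, while looseCheck is TRUE, which is why identical() is used instead of "==".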
## Path to the demo vcf files in this package
dataDir <- system.file("extdata", package="RAIDS")
pathGenoTar <- file.path(dataDir, "demoGenoChr", "demoGenoChr.tar")
## Path where the per-chromosome files will be located
pathGeno <- file.path(tempdir(), "tempGeno")
dir.create(pathGeno, showWarnings=FALSE)
## Untar the file that contains the VCF files for 3 samples split by
## chromosome (one directory per chromosome)
untar(tarfile=pathGenoTar, exdir=pathGeno)
## The output VCF files are created in the same directory as
## the split VCF files (pathGeno)
## The output files must not already exist
if (!file.exists(file.path(pathGeno, "NA12003.csv.bz2")) &&
        !file.exists(file.path(pathGeno, "NA12004.csv.bz2")) &&
        !file.exists(file.path(pathGeno, "NA12005.csv.bz2"))) {

    ## Return 0L when successful
    ## The files "NA12003.csv.bz2", "NA12004.csv.bz2" and
    ## "NA12005.csv.bz2" must not already be present in pathGeno
    groupChr1KGSNV(pathGenoChr=pathGeno, pathOut=pathGeno)

    ## Validate that files have been created
    file.exists(file.path(pathGeno, "NA12003.csv.bz2"))
    file.exists(file.path(pathGeno, "NA12004.csv.bz2"))
    file.exists(file.path(pathGeno, "NA12005.csv.bz2"))
}
#> [1] TRUE
## Remove temporary directory
unlink(pathGeno, recursive=TRUE, force=TRUE)