R/processStudy.R
computePCAMultiSynthetic.Rd
The function projects the synthetic profiles onto existing principal component axes generated using the reference 1KG profiles. The reference profiles used to generate the synthetic profiles have previously been removed from the set of reference profiles.
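To make the idea of projecting onto existing axes concrete, here is a minimal base R sketch. It is not the RAIDS implementation: it runs prcomp() on a made-up numeric matrix, and the names refGeno and synGeno are hypothetical. The axes are computed from the reference rows only; the new rows are then centred with the reference means and multiplied by the reference loadings.

```r
## Toy data: 20 reference profiles and 5 synthetic profiles, 8 variables each
## (hypothetical values, for illustration only)
set.seed(42)
refGeno <- matrix(rnorm(20 * 8), nrow=20,
    dimnames=list(paste0("ref", 1:20), paste0("snp", 1:8)))
synGeno <- matrix(rnorm(5 * 8), nrow=5,
    dimnames=list(paste0("syn", 1:5), paste0("snp", 1:8)))

## PCA computed on the reference profiles only
pcaRef <- prcomp(refGeno, center=TRUE, scale.=FALSE)

## Projection: centre the new profiles with the reference means, then
## multiply by the reference loadings (rotation matrix)
synCentered <- sweep(synGeno, 2, pcaRef$center, FUN="-")
synProjected <- synCentered %*% pcaRef$rotation

## The manual projection agrees with predict() on the prcomp object
stopifnot(isTRUE(all.equal(synProjected, predict(pcaRef, newdata=synGeno))))
dim(synProjected)
```

Because the synthetic rows never contribute to the axes, their coordinates are directly comparable to those of the reference profiles.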
computePCAMultiSynthetic(
gdsProfile,
listPCA,
sampleRef,
studyIDSyn,
verbose = FALSE
)
gdsProfile
an object of class gds.class (a GDS file), an opened Profile GDS file.
listPCA
a list containing the PCA object generated with the 1KG reference profiles (excluding the ones used to generate the synthetic data set) in an entry called "pca.unrel".
sampleRef
a vector of character strings representing the identifiers of the 1KG reference profiles that have been used to generate the synthetic profiles that are going to be analysed here. The sub-continental identifiers are used as names for the vector.
studyIDSyn
a character string corresponding to the study identifier. The study identifier must be present in the Profile GDS file.
verbose
a logical indicating if messages should be printed to show the different steps in the function. Default: FALSE.
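The sampleRef argument is a named character vector: the values are the reference profile identifiers and the names are their sub-continental populations. A minimal sketch of building such a vector from a table follows, assuming a hypothetical data frame refInfo with columns sample.id and pop (these names are illustrative and not part of the package).

```r
## Hypothetical annotation table: one reference profile per population
refInfo <- data.frame(
    sample.id=c("HG00246", "HG00325", "NA12751"),
    pop=c("GBR", "FIN", "CEU"),
    stringsAsFactors=FALSE)

## Values are the profile identifiers, names are the populations
sampleRef <- setNames(refInfo$sample.id, refInfo$pop)

## Basic checks before passing the vector to the function
stopifnot(is.character(sampleRef), !is.null(names(sampleRef)),
    !anyDuplicated(sampleRef))
sampleRef[["CEU"]]
```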
a list containing 3 entries:
a vector of character strings representing the identifiers of the synthetic profiles that have been projected onto the 1KG PCA.
a matrix of numeric with the eigenvectors of the 1KG reference profiles used to generate the PCA.
a matrix of numeric with the eigenvectors of the synthetic profiles projected onto the 1KG PCA.
## Required libraries
library(RAIDS)
library(gdsfmt)
library(SNPRelate)  ## provides snpgdsOpen()
## Loading demo PCA on subset of 1KG reference dataset
data(demoPCA1KG)
## Directory containing the demo Profile GDS file, shipped with this package
dataDir <- system.file("extdata/demoKNNSynthetic", package="RAIDS")
## Name of the synthetic study
studyID <- "MYDATA.Synthetic"
## Reference profiles used to generate the synthetic profiles
samplesRM <- c("HG00246", "HG00325", "HG00611", "HG01173", "HG02165",
    "HG01112", "HG01615", "HG01968", "HG02658", "HG01850", "HG02013",
    "HG02465", "HG02974", "HG03814", "HG03445", "HG03689", "HG03789",
    "NA12751", "NA19107", "NA18548", "NA19075", "NA19475", "NA19712",
    "NA19731", "NA20528", "NA20908")
## Sub-continental population of each reference profile, in the same order
names(samplesRM) <- c("GBR", "FIN", "CHS","PUR", "CDX", "CLM", "IBS",
    "PEL", "PJL", "KHV", "ACB", "GWD", "ESN", "BEB", "MSL", "STU", "ITU",
    "CEU", "YRI", "CHB", "JPT", "LWK", "ASW", "MXL", "TSI", "GIH")
## Open the Profile GDS file
gdsProfile <- snpgdsOpen(file.path(dataDir, "ex1.gds"))
## Project the synthetic profiles onto the 1KG PCA
results <- computePCAMultiSynthetic(gdsProfile=gdsProfile,
    listPCA=demoPCA1KG,
    sampleRef=samplesRM, studyIDSyn=studyID, verbose=FALSE)
## The eigenvectors for the synthetic profiles
head(results$eigenvector)
#> [,1] [,2] [,3] [,4] [,5]
#> 1.ex1.HG00246.1 0.06469191 -0.004653002 -0.088967175 -0.060291452 0.031812132
#> 1.ex1.HG00325.1 0.09115431 -0.004743912 -0.071907454 0.044100229 -0.041522503
#> 1.ex1.HG00611.1 0.01381479 -0.017912275 0.062489822 0.003760931 0.061062765
#> 1.ex1.HG01173.1 0.02193600 0.016635981 -0.123902643 -0.104948054 0.021495661
#> 1.ex1.HG02165.1 0.15626481 -0.003083060 -0.004768404 0.100910618 -0.002478422
#> 1.ex1.HG01112.1 0.07699686 -0.015195235 -0.102489655 -0.088759431 -0.031143964
#> [,6] [,7] [,8] [,9] [,10]
#> 1.ex1.HG00246.1 -0.087084898 -0.002319659 -0.01327109 -0.06600164 0.011422042
#> 1.ex1.HG00325.1 0.092420817 -0.150333993 0.04064972 0.06185469 0.045882592
#> 1.ex1.HG00611.1 0.006946086 0.037328105 0.01327867 -0.03629313 0.015773551
#> 1.ex1.HG01173.1 0.026114397 -0.045510755 -0.02727912 0.12362088 0.005003337
#> 1.ex1.HG02165.1 -0.213193715 -0.205975416 -0.33167160 0.01689627 0.415307845
#> 1.ex1.HG01112.1 0.140457360 -0.129621812 0.18058981 0.09950699 -0.058262338
#> [,11] [,12] [,13] [,14] [,15]
#> 1.ex1.HG00246.1 0.1135730541 0.120776205 0.05104489 0.10009894 0.08226945
#> 1.ex1.HG00325.1 -0.0213545024 0.017612760 -0.12065696 0.11956275 -0.05519839
#> 1.ex1.HG00611.1 0.0002207075 0.008803511 0.06544166 0.02462076 0.00500031
#> 1.ex1.HG01173.1 -0.0877004088 -0.023906580 -0.11278874 -0.02636037 -0.09290133
#> 1.ex1.HG02165.1 0.1492485496 -0.094433444 -0.10994232 0.06307073 0.06593791
#> 1.ex1.HG01112.1 -0.1773609745 -0.105883583 -0.15020718 0.04382861 -0.04644355
## Close Profile GDS file (important)
closefn.gds(gdsProfile)
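The row names printed above look like 1.ex1.HG00246.1, with the reference profile identifier in the third dot-separated field. Assuming that pattern holds (it is inferred from this output only, not from a documented naming scheme), a sketch that recovers the population behind each synthetic profile from the names of the reference vector could be:

```r
## Row names copied from the output above; in practice they would come
## from rownames(results$eigenvector)
synID <- c("1.ex1.HG00246.1", "1.ex1.HG00325.1", "1.ex1.HG00611.1")

## Subset of the named reference vector used in the example
samplesRM <- c(GBR="HG00246", FIN="HG00325", CHS="HG00611")

## Assumption: the third dot-separated field is the reference identifier
refID <- vapply(strsplit(synID, ".", fixed=TRUE),
    function(x) x[3], character(1))

## Look up the population whose reference profile seeded each synthetic one
synPop <- names(samplesRM)[match(refID, samplesRM)]
data.frame(synID=synID, refID=refID, pop=synPop)
```

A synthetic identifier whose third field is not in the reference vector would get NA as its population, which makes unexpected names easy to spot.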