This function generates a PCA using the known reference profiles. Then, it projects the specified profile onto the PCA axes.

computePCARefSample(
  gdsProfile,
  currentProfile,
  studyIDRef = "Ref.1KG",
  np = 1L,
  algorithm = c("exact", "randomized"),
  eigenCount = 32L,
  missingRate = NaN,
  verbose = FALSE
)
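The algorithm argument in the usage above defaults to a vector holding both choices. In R, this idiom is commonly resolved with match.arg(), which selects the first element when the argument is left untouched. The helper below is hypothetical and only illustrates that idiom; that computePCARefSample() resolves the argument this way internally is an assumption.

```r
## Hypothetical helper mimicking the 'algorithm' argument of the usage above
resolveAlgorithm <- function(algorithm = c("exact", "randomized")) {
    ## match.arg() returns the first choice when the argument is left at
    ## its default and raises an error for a value outside the choices
    match.arg(algorithm)
}

## Default: first choice
defaultChoice <- resolveAlgorithm()

## Explicit choice
fastChoice <- resolveAlgorithm("randomized")

## An unknown value is rejected
badChoice <- try(resolveAlgorithm("fast"), silent = TRUE)
```

Passing a single string, as in algorithm="randomized", is therefore the expected way to override the default.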

Arguments

gdsProfile

an object of class gds.class, an opened Profile GDS file.

currentProfile

a single character string representing the profile identifier.

studyIDRef

a single character string representing the study identifier.

np

a single positive integer representing the number of CPUs that will be used. Default: 1L.

algorithm

a character string representing the algorithm used to calculate the PCA. The 2 choices are "exact" (traditional exact calculation) and "randomized" (fast PCA with randomized algorithm introduced in Galinsky et al. 2016). Default: "exact".

eigenCount

a single integer indicating the number of eigenvectors that will be in the output of the snpgdsPCA function; if eigenCount <= 0, then all eigenvectors are returned. Default: 32L.

missingRate

a numeric value representing the missing rate threshold above which the SNVs are discarded; only the SNVs with a missing rate "<= missingRate" are retained in snpgdsPCA; if NaN, no missing rate threshold is applied. Default: NaN.

verbose

a logical indicating if messages should be printed to show the progress of the different steps in the function. Default: FALSE.
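The effect of the missingRate argument can be sketched in base R. The helper keepSNVs() and the toy genotype matrix below are hypothetical and exist only for illustration; the actual filtering is done inside snpgdsPCA and may differ in its details.

```r
## Hypothetical helper: 'geno' is a genotype matrix with SNVs in rows and
## profiles in columns; NA marks a missing genotype
keepSNVs <- function(geno, missingRate = NaN) {
    ## NaN means that no missing rate threshold is applied
    if (is.nan(missingRate)) {
        return(rep(TRUE, nrow(geno)))
    }
    ## Fraction of profiles with a missing genotype, for each SNV
    rate <- rowMeans(is.na(geno))
    ## Only the SNVs with a rate "<= missingRate" are retained
    rate <= missingRate
}

## Toy data: 3 SNVs x 4 profiles (made-up values)
geno <- matrix(c( 0,  1,  2, 1,
                 NA,  1,  0, 2,
                 NA, NA, NA, 1), nrow = 3, byrow = TRUE)

## No threshold: every SNV is kept
keptAll <- keepSNVs(geno)

## Threshold of 0.25: the third SNV (75% missing) is discarded
keptFiltered <- keepSNVs(geno, missingRate = 0.25)
```

The second SNV has a missing rate of exactly 0.25 and is kept, because the comparison is inclusive.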

Value

a list containing 3 entries:

sample.id

a character string representing the unique identifier of the analyzed profile.

eigenvector.ref

a numeric matrix representing the eigenvectors of the reference profiles.

eigenvector

a numeric matrix representing the eigenvectors of the analyzed profile.
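The relation between the two returned matrices can be illustrated with a conceptual sketch. It uses prcomp() from base R on synthetic numeric data as a stand-in; it is not the SNPRelate-based computation performed by this function, and all object names are hypothetical.

```r
set.seed(1)

## Synthetic stand-ins (not real genotypes): 20 reference profiles and
## 1 analyzed profile, each described by 50 numeric features
ref <- matrix(rnorm(20 * 50), nrow = 20,
              dimnames = list(paste0("ref", 1:20), NULL))
profile <- matrix(rnorm(50), nrow = 1, dimnames = list("ex1", NULL))

## The PCA is computed on the reference profiles only
pcaRef <- prcomp(ref, center = TRUE, scale. = FALSE)

## Number of axes to retain (plays the role of eigenCount)
nbPC <- 5L

## Coordinates of the reference profiles on the retained axes
eigenvectorRef <- pcaRef$x[, seq_len(nbPC), drop = FALSE]

## The analyzed profile is centered with the reference means and
## projected onto the same axes; it does not influence the axes
projected <- predict(pcaRef, newdata = profile)
eigenvectorProfile <- projected[, seq_len(nbPC), drop = FALSE]

dim(eigenvectorRef)
dim(eigenvectorProfile)
```

In this sketch, the reference matrix has one row per reference profile and the projected matrix has a single row named after the analyzed profile, which mirrors the layout of the eigenvector output shown in the Examples section.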

References

Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ, Price AL. Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia. Am J Hum Genet. 2016 Mar 3;98(3):456-72. doi: 10.1016/j.ajhg.2015.12.022. Epub 2016 Feb 25.

Author

Pascal Belleau, Astrid Deschênes and Alexander Krasnitz

Examples


## Required libraries
## (snpgdsOpen() comes from SNPRelate, closefn.gds() from gdsfmt)
library(gdsfmt)
library(SNPRelate)

## The demo Profile GDS file is located in this package
dataDir <- system.file("extdata/demoAncestryCall", package="RAIDS")

## Open the Profile GDS file
gdsProfile <- snpgdsOpen(file.path(dataDir, "ex1.gds"))

## Project a profile onto a PCA generated using reference profiles
## The reference profiles come from 1KG
resPCA <- computePCARefSample(gdsProfile=gdsProfile,
    currentProfile=c("ex1"), studyIDRef="Ref.1KG", np=1L, verbose=FALSE)
resPCA$sample.id
#> [1] "ex1"
resPCA$eigenvector
#>            [,1]      [,2]       [,3]        [,4]        [,5]        [,6]
#> ex1 -0.03917926 0.0290796 -0.1861643 -0.05760641 -0.01053691 -0.08274071
#>          [,7]       [,8]         [,9]     [,10]      [,11]       [,12]
#> ex1 0.0777924 -0.2437205 -0.008855972 0.2156765 -0.1139829 -0.08007963
#>          [,13]    [,14]     [,15]      [,16]    [,17]      [,18]     [,19]
#> ex1 -0.1452985 0.233155 0.5753156 -0.1938115 0.504467 -0.8293339 0.5437238
#>          [,20]      [,21]      [,22]     [,23]      [,24]      [,25]     [,26]
#> ex1 -0.1480745 0.03492421 -0.2146903 0.1610501 -0.3487348 -0.2806519 0.4095053
#>          [,27]     [,28]     [,29]      [,30]      [,31]      [,32]
#> ex1 -0.1480394 -1.001517 0.2316207 -0.3235428 -0.3843232 -0.3291498

## Close the GDS files (important)
closefn.gds(gdsProfile)