The function computes the area under the receiver operating characteristic curve (AUROC) of the ancestry inferences for fixed values of D and K, using the inferred ancestry results from the synthetic profiles. The AUROC is computed for each super-population separately and for all the results together.

computeSyntheticROC(
  matKNN,
  matKNNAncestryColumn,
  pedCall,
  pedCallAncestryColumn,
  listCall = c("EAS", "EUR", "AFR", "AMR", "SAS")
)

Arguments

matKNN

a data.frame containing the inferred ancestry results for fixed values of D and K. One of the column names of the data.frame must correspond to the matKNNAncestryColumn argument.

matKNNAncestryColumn

a character string representing the name of the column that contains the inferred ancestry for the specified synthetic profiles. The column must be present in the matKNN argument.

pedCall

a data.frame containing the super-population information from the 1KG GDS file for the profiles used to generate the synthetic profiles. The data.frame must contain a column named as specified by the pedCallAncestryColumn argument. The row names must correspond to the sample identifiers (mandatory).

pedCallAncestryColumn

a character string representing the name of the column that contains the known ancestry for the reference profiles in the Reference GDS file. The column must be present in the pedCall argument.

listCall

a vector of character strings representing the list of all possible ancestry assignments. Default: c("EAS", "EUR", "AFR", "AMR", "SAS").
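
The requirements on the arguments above can be checked before the call. The following is a minimal sketch, assuming a hypothetical helper named checkROCInputs() that is not part of RAIDS; the toy data.frames and their sample identifiers are invented for illustration.

```r
## Hypothetical helper (not part of RAIDS): checks the documented
## requirements on the inputs of computeSyntheticROC()
checkROCInputs <- function(matKNN, matKNNAncestryColumn,
                            pedCall, pedCallAncestryColumn) {
    ## Both inputs must be data.frame objects
    stopifnot(is.data.frame(matKNN), is.data.frame(pedCall))
    ## Each ancestry column must be present in its data.frame
    stopifnot(matKNNAncestryColumn %in% colnames(matKNN))
    stopifnot(pedCallAncestryColumn %in% colnames(pedCall))
    invisible(TRUE)
}

## Toy inputs; the identifiers and values are invented for illustration
toyKNN <- data.frame(D=c(5, 5), K=c(6, 6), SuperPop=c("EUR", "AFR"),
    stringsAsFactors=FALSE)
toyPed <- data.frame(superPop=c("EUR", "AFR"),
    row.names=c("sampleA", "sampleB"), stringsAsFactors=FALSE)

okInputs <- checkROCInputs(toyKNN, "SuperPop", toyPed, "superPop")
```

The helper stops with an error on the first unmet requirement and returns TRUE invisibly otherwise.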

Value

a list containing 3 entries:

matAUROC.All

a data.frame containing the AUROC for all the ancestry results.

matAUROC.Call

a data.frame containing the AUROC information for each super-population.

listROC.Call

a list containing the output from the roc function for each super-population.
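
To illustrate what a per-super-population AUROC means, the sketch below computes a one-vs-rest AUC in base R with the rank-based (Mann-Whitney) formula. It is not the package's implementation: the helper aucRank(), the toy ancestry vectors, and the choice of a 0/1 indicator as predictor are all assumptions made for illustration.

```r
## Rank-based (Mann-Whitney) AUC for a binary outcome; tied scores
## receive mid-ranks (hypothetical helper, not part of RAIDS)
aucRank <- function(labels, scores) {
    nPos <- sum(labels == 1)
    nNeg <- sum(labels == 0)
    r <- rank(scores)
    (sum(r[labels == 1]) - nPos * (nPos + 1) / 2) / (nPos * nNeg)
}

## Toy known and inferred ancestries, invented for illustration
known <- c("EUR", "EUR", "AFR", "AFR", "EAS", "EAS")
inferred <- c("EUR", "AFR", "AFR", "AFR", "EAS", "EUR")

## One-vs-rest: each super-population in turn is the positive class
aucByPop <- vapply(c("EUR", "AFR", "EAS"), function(pop) {
    aucRank(as.integer(known == pop), as.integer(inferred == pop))
}, numeric(1))
```

With a 0/1 predictor, this AUC equals the mean of sensitivity and specificity for the super-population treated as the positive class.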

Author

Pascal Belleau, Astrid Deschênes and Alexander Krasnitz

Examples


## Loading demo dataset containing pedigree information for synthetic
## profiles and known ancestry of the profiles used to generate the
## synthetic profiles
data(pedSynthetic)

## Loading demo dataset containing the inferred ancestry results
## for the synthetic data
data(matKNNSynthetic)

## The inferred ancestry results for the synthetic data using
## values of D=5 and K=6
matKNN <- matKNNSynthetic[matKNNSynthetic$K == 6 & matKNNSynthetic$D == 5, ]

## Compile statistics from the
## synthetic profiles for fixed values of D and K
results <- RAIDS:::computeSyntheticROC(matKNN=matKNN,
    matKNNAncestryColumn="SuperPop",
    pedCall=pedSynthetic, pedCallAncestryColumn="superPop",
    listCall=c("EAS", "EUR", "AFR", "AMR", "SAS"))

results$matAUROC.All
#>   pcaD K   ROC.AUC ROC.CI  N NBNA
#> 1    5 6 0.6883929      0 52    0
results$matAUROC.Call
#>   pcaD K Call         L       AUC         H
#> 1    5 6  EAS 0.5197913 0.6904762 0.8611611
#> 2    5 6  EUR 0.4807257 0.6547619 0.8287981
#> 3    5 6  AFR 0.8168697 0.9154135 1.0000000
#> 4    5 6  AMR 0.4009287 0.5681818 0.7354350
#> 5    5 6  SAS 0.4729463 0.6404762 0.8080061
results$listROC.Call
#> $EAS
#> 
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#> 
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.6905
#> 95% CI: 0.5198-0.8612 (DeLong)
#> 
#> $EUR
#> 
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#> 
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.6548
#> 95% CI: 0.4807-0.8288 (DeLong)
#> 
#> $AFR
#> 
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#> 
#> Data: predMat[, j] in 38 controls (fCur 0) < 14 cases (fCur 1).
#> Area under the curve: 0.9154
#> 95% CI: 0.8169-1 (DeLong)
#> 
#> $AMR
#> 
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#> 
#> Data: predMat[, j] in 44 controls (fCur 0) < 8 cases (fCur 1).
#> Area under the curve: 0.5682
#> 95% CI: 0.4009-0.7354 (DeLong)
#> 
#> $SAS
#> 
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#> 
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.6405
#> 95% CI: 0.4729-0.808 (DeLong)
#>
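
The per-super-population table can be filtered like any data.frame. The sketch below rebuilds the Call, L, AUC and H columns from the values printed for results$matAUROC.Call above, so that it runs without the package; the variable names aurocCall, weakCalls and bestCall are illustrative only.

```r
## Values copied from the printed matAUROC.Call output above;
## L and H match the bounds of the 95% CI reported by each ROC object
aurocCall <- data.frame(
    Call=c("EAS", "EUR", "AFR", "AMR", "SAS"),
    L=c(0.5197913, 0.4807257, 0.8168697, 0.4009287, 0.4729463),
    AUC=c(0.6904762, 0.6547619, 0.9154135, 0.5681818, 0.6404762),
    H=c(0.8611611, 0.8287981, 1.0000000, 0.7354350, 0.8080061),
    stringsAsFactors=FALSE)

## Super-populations whose confidence interval includes 0.5 (chance level)
weakCalls <- aurocCall$Call[aurocCall$L <= 0.5]

## Super-population with the highest AUC
bestCall <- aurocCall$Call[which.max(aurocCall$AUC)]
```

In a real session, the same two expressions can be applied directly to results$matAUROC.Call.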