A data.frame containing the information related to synthetic profiles. The ancestry of the profiles used to generate the synthetic profiles must be present.
Source: R/RAIDS.R (pedSynthetic.Rd)
The object is a data.frame with 7 columns. The row names of the data.frame must be the profile unique identifiers.
data(pedSynthetic)
A data.frame containing the information about the synthetic profiles. The row names of the data.frame correspond to the profile unique identifiers. The data.frame contains 7 columns:
data.id: a character string representing the unique synthetic profile identifier.
case.id: a character string representing the unique identifier of the profile that was used to generate the synthetic profile.
sample.type: a character string representing the type of profile.
diagnosis: a character string representing the diagnosis of the profile that was used to generate the synthetic profile.
source: a character string representing the source of the synthetic profile.
study.id: a character string representing the name of the study with which the synthetic profile is associated.
superPop: a character string representing the super population of the profile that was used to generate the synthetic profile.
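The column layout described above can be checked with a few lines of base R. The following is a minimal sketch, not part of the package: the toy rows, the identifiers, and the helper isValidPed() are all hypothetical, and using data.id as the row names is an assumption based on the description of the row names.

```r
## Hypothetical toy rows: every identifier and value below is invented
## for illustration and does not come from the pedSynthetic dataset
pedToy <- data.frame(
    data.id = c("toy.syn.1", "toy.syn.2"),
    case.id = c("toyCase1", "toyCase2"),
    sample.type = c("Synthetic", "Synthetic"),
    diagnosis = c("C", "C"),
    source = c("Synthetic", "Synthetic"),
    study.id = c("toy.study", "toy.study"),
    superPop = c("AFR", "EUR"),
    stringsAsFactors = FALSE
)
## Assumption: the row names are the synthetic profile identifiers
rownames(pedToy) <- pedToy$data.id

## The 7 documented columns, in the documented order
expectedCols <- c("data.id", "case.id", "sample.type", "diagnosis",
    "source", "study.id", "superPop")

## Hypothetical helper: TRUE when the object has the documented layout,
## only character columns, and unique row names
isValidPed <- function(ped) {
    is.data.frame(ped) &&
        identical(colnames(ped), expectedCols) &&
        all(vapply(ped, is.character, logical(1))) &&
        !anyDuplicated(rownames(ped))
}

isValidPed(pedToy)
```

The same helper could be applied to the demo dataset after data(pedSynthetic), assuming its columns are stored as character vectors as documented.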
This dataset can be used to test the computeSyntheticROC function.
computeSyntheticROC: for calculating the AUROC of the inferences for specific values of D and K, using the inferred ancestry results from the synthetic profiles.
## Loading demo dataset containing pedigree information for synthetic
## profiles
data(pedSynthetic)
## Loading demo dataset containing the inferred ancestry results
## for the synthetic data
data(matKNNSynthetic)
## Retain one K and one D value
matKNN <- matKNNSynthetic[matKNNSynthetic$D == 5 & matKNNSynthetic$K == 4, ]
## Compile statistics from the
## synthetic profiles for fixed values of D and K
results <- RAIDS:::computeSyntheticROC(matKNN=matKNN,
    matKNNAncestryColumn="SuperPop",
    pedCall=pedSynthetic, pedCallAncestryColumn="superPop",
    listCall=c("EAS", "EUR", "AFR", "AMR", "SAS"))
results$matAUROC.All
#>   pcaD K   ROC.AUC ROC.CI  N NBNA
#> 1    5 4 0.6227679      0 52    0
results$matAUROC.Call
#>   pcaD K Call         L       AUC         H
#> 1    5 4  EAS 0.4807257 0.6547619 0.8287981
#> 2    5 4  EUR 0.4064737 0.5666667 0.7268596
#> 3    5 4  AFR 0.8168697 0.9154135 1.0000000
#> 4    5 4  AMR 0.3743226 0.5056818 0.6370411
#> 5    5 4  SAS 0.3609393 0.5047619 0.6485845
results$listROC.Call
#> $EAS
#>
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#>
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.6548
#> 95% CI: 0.4807-0.8288 (DeLong)
#>
#> $EUR
#>
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#>
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.5667
#> 95% CI: 0.4065-0.7269 (DeLong)
#>
#> $AFR
#>
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#>
#> Data: predMat[, j] in 38 controls (fCur 0) < 14 cases (fCur 1).
#> Area under the curve: 0.9154
#> 95% CI: 0.8169-1 (DeLong)
#>
#> $AMR
#>
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#>
#> Data: predMat[, j] in 44 controls (fCur 0) < 8 cases (fCur 1).
#> Area under the curve: 0.5057
#> 95% CI: 0.3743-0.637 (DeLong)
#>
#> $SAS
#>
#> Call:
#> roc.formula(formula = fCur ~ predMat[, j], ci = TRUE, quiet = TRUE)
#>
#> Data: predMat[, j] in 42 controls (fCur 0) < 10 cases (fCur 1).
#> Area under the curve: 0.5048
#> 95% CI: 0.3609-0.6486 (DeLong)
#>
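The per-call output above reports controls and cases for each super population, which suggests a one-vs-rest comparison. The following base R sketch illustrates that idea on invented data. It is an assumption about the general technique, not the implementation used by computeSyntheticROC (whose printed calls indicate roc() from pROC): the helper oneVsRestAUC() is hypothetical, and it computes the AUC with the rank-based Mann-Whitney formula.

```r
## Hypothetical toy data: true super population of the source profile and
## the inferred call for six synthetic profiles (values are invented)
truth <- c("AFR", "AFR", "EUR", "EAS", "EUR", "AFR")
call <- c("AFR", "AFR", "EUR", "EUR", "EAS", "EUR")

## Hypothetical helper: one-vs-rest AUC for a single super population,
## using the Mann-Whitney rank formula (ties receive mid-ranks)
oneVsRestAUC <- function(truth, call, pop) {
    isCase <- truth == pop              ## TRUE = case, FALSE = control
    pred <- as.integer(call == pop)     ## 1 when the inferred call is pop
    nCase <- sum(isCase)
    nCtrl <- sum(!isCase)
    r <- rank(pred)
    (sum(r[isCase]) - nCase * (nCase + 1) / 2) / (nCase * nCtrl)
}

## One AUC per super population present in the toy data
vapply(c("AFR", "EUR", "EAS"),
    function(pop) oneVsRestAUC(truth, call, pop), numeric(1))
```

With real data, the case counts per super population could be read from table(pedSynthetic$superPop). Confidence intervals such as the DeLong intervals shown above are not covered by this sketch.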