Apply a mask to a table of copy number events.

A mask is applied to the amplified or deleted segments tabulated in `segTable`. Whether a segment is masked depends on the portion of the segment that is covered by the mask. For each segment to be masked, a position within it is chosen at random, the flanking segments are extended to that position, and the segment is flagged as masked in the returned value.

applyCNPmask(
  segTable,
  chrom,
  startPos,
  endPos,
  startProbe,
  endProbe,
  eventIndex,
  maskTable,
  maskChrom,
  maskStart,
  maskEnd,
  maskIndex,
  minCover = 1,
  indexVals = c(-1, 1)
)

Arguments

segTable	a `matrix` or a `data.frame` with columns named or enumerated by the values of `chrom, startPos, endPos, startProbe, endProbe, eventIndex`.
chrom	a `character` string specifying the name of the column in `segTable` that tabulates the (integer) chromosome number for each segment.
startPos	a `character` string or `integer` specifying the name or number of the column in `segTable` that tabulates the (integer) genomic start coordinate of each segment.
endPos	a `character` string or `integer` specifying the name or number of the column in `segTable` that tabulates the (integer) genomic end coordinate of each segment.
startProbe	a `character` string specifying the name of the column in `segTable` that tabulates the (integer) start position of each segment in internal units, such as probe numbers for data of CGH microarray origin.
endProbe	a `character` string specifying the name of the column in `segTable` that tabulates the (integer) end position of each segment in internal units, such as probe numbers for data of CGH microarray origin.
eventIndex	a `character` string giving the name of the column in `segTable` where the copy number variation status of each segment is tabulated.
maskTable	a `matrix` or a `data.frame` with columns named or enumerated as given by `maskChrom, maskStart, maskEnd, maskIndex` and with rows corresponding to genomic intervals that comprise the mask.
maskChrom	a `character` string or `integer` specifying the name or number of the column in `maskTable` that tabulates the chromosome number of the intervals comprising the mask.
maskStart	a `character` string or `integer` specifying the name or number of the column in `maskTable` that tabulates the genomic start coordinates of the intervals comprising the mask.
maskEnd	a `character` string or `integer` specifying the name or number of the column in `maskTable` that tabulates the genomic end coordinates of the intervals comprising the mask.
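maskIndex	a `character` string or `integer` specifying the name or number of the column in `maskTable` that tabulates the copy number event status of the intervals comprising the mask, using values that correspond to those in `eventIndex`.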
minCover	a `numeric` value specifying the minimal portion of the segment that must be covered by the mask in order to trigger masking. Default: `1`.
indexVals	a `numeric` `vector` of length 2 specifying the two values in `maskIndex` to be matched with values in `eventIndex` to determine the events that are to be masked. Default: `c(-1, 1)`.

Value

A `matrix` with the same number of rows as `segTable` and with the following three columns:

StartProbe a `numeric`, used as an integer, giving the start position of each segment after masking.
EndProbe a `numeric`, used as an integer, giving the end position of each segment after masking.
toremove a `numeric`, used as an integer, whose value is 1 if the segment is masked and 0 otherwise.
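
One way the returned matrix might be used downstream is sketched here. The tables `segs` and `masked` are made-up stand-ins, not package data, and the coordinates stored for a masked row are not specified in this page, so the sketch drops such rows without reading their coordinates.

```r
# Toy segment table in probe units (hypothetical values, not package data)
segs <- data.frame(start = c(1, 31, 71),
                   end = c(30, 70, 100),
                   eventIndex = c(0, 1, 0))

# Toy stand-in for the matrix returned by applyCNPmask: the middle
# segment is flagged and its neighbours have been extended to probe 50
masked <- cbind(StartProbe = c(1, 31, 51),
                EndProbe = c(50, 70, 100),
                toremove = c(0, 1, 0))

# Keep only the segments that were not masked
keep <- masked[, "toremove"] == 0
cleaned <- segs[keep, ]

# Replace the probe coordinates with the post-masking ones
cleaned$start <- masked[keep, "StartProbe"]
cleaned$end <- masked[keep, "EndProbe"]

cleaned
```

The row count of `masked` matches that of `segs`, which is what makes the logical index `keep` usable on both objects.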

Details

Masking is performed separately for each value in indexVals. Segments (rows of segTable) with that value of eventIndex are examined for coverage by mask intervals with that same value of maskIndex in maskTable. If the coverage is at least minCover, the segment is slated for masking, while its flanking segments are extended to a random point within the segment being masked.
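
The decision rule and the flank extension described above can be illustrated with a minimal sketch. This is not the package's implementation: `coverage_fraction` is a hypothetical helper, and the sketch assumes inclusive integer coordinates, mask intervals that do not overlap each other, and a right flank that starts one unit after the random position.

```r
# Hypothetical helper, not part of CNprep: fraction of one segment that is
# covered by mask intervals on the same chromosome
coverage_fraction <- function(seg_start, seg_end, mask_start, mask_end) {
  # Length of the overlap of each mask interval with the segment
  overlap <- pmin(seg_end, mask_end) - pmax(seg_start, mask_start) + 1
  # Intervals that miss the segment give a non-positive length; clip to 0
  overlap <- pmax(overlap, 0)
  sum(overlap) / (seg_end - seg_start + 1)
}

# Toy segment spanning positions 101..200 and three toy mask intervals
mask_start <- c(50, 150, 300)
mask_end <- c(110, 169, 400)
covered <- coverage_fraction(101, 200, mask_start, mask_end)

covered           # 0.3: 10 + 20 covered positions out of 100
covered >= 0.25   # TRUE: would be masked with minCover = 0.25
covered >= 1      # FALSE: not masked with the default minCover = 1

# Flank extension for three consecutive toy segments in probe units,
# where the middle one (probes 41..60) is the segment being masked
start_probe <- c(1, 41, 61)
end_probe <- c(40, 60, 100)

# Pick a random position inside the masked segment
candidates <- start_probe[2]:end_probe[2]
cut <- candidates[sample.int(length(candidates), 1)]

end_probe[1] <- cut        # left flank now ends at the random position
start_probe[3] <- cut + 1  # right flank starts just after it (assumption)
```

With the default `minCover = 1` only segments lying entirely inside the mask qualify, which is why the example in this page lowers `minCover` to trigger masking on partially covered segments.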

Author

Alexander Krasnitz

Examples


## Load datasets
data(segexample)
data(ratexample)
data(normsegs)
data(cnpexample)

## Create a table with segment information (table of copy number events)
segtable <- CNpreprocessing(segall = segexample[segexample[,"ID"] == "WZ1",],
    ratall = ratexample, idCol = "ID", startCol = "start", endCol = "end",
    chromCol = "chrom", bpStartCol = "chrom.pos.start", 
    bpEndCol = "chrom.pos.end", blsize = 50, minJoin = 0.25, cWeight = 0.4,
    bsTimes = 50, chromRange = 1:22, 
    modelNames = "E", normalLength = normsegs[,1], 
    normalMedian = normsegs[,2])

## Add an eventIndex column to segtable that identifies the 
## amplification (marked as 1) and deletion (marked as -1) events
eventIndex <- rep(0, nrow(segtable))
eventIndex[segtable[,"marginalprob"] < 1e-4 & segtable[,"negtail"] > 0.999 & 
    segtable[,"mediandev"] < 0] <- -1
eventIndex[segtable[,"marginalprob"] < 1e-4 & segtable[,"negtail"] > 0.999 &
    segtable[,"mediandev"] > 0] <- 1
segtable <- cbind(segtable, eventIndex)

## Create a mask table using amplification and deletion regions as input
namps17 <- cnpexample[cnpexample[,"copy.num"] == "amp",]
aCNPmask <- makeCNPmask(imat=namps17, chromCol=2, startCol=3, 
    endCol=4, nProf=1203, uThresh=0.02, dThresh=0.008)
ndels17 <- cnpexample[cnpexample[,"copy.num"] == "del",]
dCNPmask <- makeCNPmask(imat=ndels17, chromCol=2, startCol=3, 
    endCol=4, nProf=1203, uThresh=0.02, dThresh=0.008)
maskTable <- rbind(cbind(aCNPmask, cnpindex=1), 
    cbind(dCNPmask, cnpindex=-1))

## Apply a mask to a table of copy number events
myCNPtable <- applyCNPmask(segTable=segtable, chrom="chrom",
    startPos="chrom.pos.start", endPos="chrom.pos.end", 
    startProbe="start", endProbe="end", eventIndex="eventIndex",
    maskTable=maskTable, maskChrom="chrom", maskStart="start", 
    maskEnd="end", maskIndex="cnpindex", minCover=0.005,
    indexVals=c(-1, 1))

## Show some results
tail(myCNPtable)
#>       StartProbe EndProbe toremove
#> [85,]      79696    81843        0
#> [86,]      81844    81854        0
#> [87,]      81855    81873        0
#> [88,]      81874    82039        0
#> [89,]      82040    82055        0
#> [90,]      82056    83055        0