A mask is applied to amplified or deleted segments as tabulated in segTable. Whether a segment is masked depends on the portion of that segment covered by the mask. For each segment to be masked, a position is chosen at random within it, the flanking segments are extended to that position, and the segment is flagged as masked in the returned value.

applyCNPmask(
  segTable,
  chrom,
  startPos,
  endPos,
  startProbe,
  endProbe,
  eventIndex,
  maskTable,
  maskChrom,
  maskStart,
  maskEnd,
  maskIndex,
  minCover = 1,
  indexVals = c(-1, 1)
)

Arguments

segTable

a matrix or a data.frame with columns named or enumerated by the values of chrom, startPos, endPos, startProbe, endProbe, eventIndex.

chrom

a character string specifying the name for the column in segTable tabulating the (integer) chromosome number for each segment.

startPos

a character string or integer specifying the name or number of the column in segTable that tabulates the (integer) genomic start coordinate of each segment.

endPos

a character string or integer specifying the name or number of the column in segTable that tabulates the (integer) genomic end coordinate of each segment.

startProbe

a character string specifying the name of the column in segTable that tabulates the (integer) start position of each segment in internal units, such as probe numbers for data of CGH microarray origin.

endProbe

a character string specifying the name of the column in segTable that tabulates the (integer) end position of each segment in internal units, such as probe numbers for data of CGH microarray origin.

eventIndex

a character string giving the name of a column in segTable where copy number variation status of the segments is tabulated.

maskTable

a matrix or a data.frame with columns named or enumerated as given by maskChrom, maskStart, maskEnd, maskIndex and with rows corresponding to genomic intervals that comprise the mask.

maskChrom

a character string or integer specifying the name or number of the column in maskTable that tabulates the chromosome number of the intervals comprising the mask.

maskStart

a character string or integer specifying the name or number of the column in maskTable that tabulates the genomic start coordinates of the intervals comprising the mask.

maskEnd

a character string or integer specifying the name or number of the column in maskTable that tabulates the genomic end coordinates of the intervals comprising the mask.

maskIndex

a character string or integer specifying the name or number of the column in maskTable that tabulates the copy number event status of each mask interval, coded with the same values as eventIndex.

minCover

a numeric value specifying the minimal portion of the segment that must be covered by the mask in order to trigger masking. Default: 1.

indexVals

a numeric vector of length 2 specifying the two values in maskIndex to be matched with values in eventIndex to determine the events that are to be masked. Default: c(-1, 1).
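To make the column-name arguments and the role of indexVals more concrete, here is a minimal sketch in base R. The two tables and all their values are invented for illustration, and the pairing step only mimics the matching described for indexVals; it is not taken from the package source.

```r
# Toy inputs; every value below is invented for illustration.
segTable <- data.frame(
  chrom      = c(1, 1, 1, 2),
  startPos   = c(1000, 5001, 9001, 1000),
  endPos     = c(5000, 9000, 20000, 8000),
  startProbe = c(1, 11, 21, 41),
  endProbe   = c(10, 20, 40, 60),
  eventIndex = c(0, 1, 0, -1)   # 1 = amplification, -1 = deletion, 0 = neither
)
maskTable <- data.frame(
  chrom    = c(1, 2),
  start    = c(5500, 500),
  end      = c(8500, 900),
  cnpindex = c(1, -1)           # same coding as eventIndex
)
indexVals <- c(-1, 1)

# For each value in indexVals, collect the segments and the mask intervals
# that carry that value; only these are compared with one another.
candidates <- lapply(indexVals, function(v) {
  list(
    segRows  = which(segTable$eventIndex == v),
    maskRows = which(maskTable$cnpindex == v)
  )
})
names(candidates) <- as.character(indexVals)
```

With these tables, the corresponding call would pass chrom = "chrom", eventIndex = "eventIndex", maskChrom = "chrom", maskStart = "start", maskEnd = "end" and maskIndex = "cnpindex".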

Value

a matrix with the same number of rows as segTable and the following three columns:

  • StartProbe a numeric, used as integer, for the start position of the segments after masking.

  • EndProbe a numeric, used as integer, for the end position of the segments after masking.

  • toremove a numeric vector, used as integer, whose values are 1 if the segment is masked and 0 otherwise.
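One plausible way to consume the returned matrix is to drop the rows flagged in toremove and keep the updated probe boundaries. The sketch below uses a hand-made stand-in for the return value, so the numbers (including the boundaries shown for the masked row) are assumptions rather than real output.

```r
# Toy stand-in for the matrix returned by applyCNPmask(); values are invented.
# Row 2 is flagged as masked, and its neighbours have been extended to meet
# at probe 14/15.
masked <- cbind(
  StartProbe = c(1, 11, 15),
  EndProbe   = c(14, 20, 40),
  toremove   = c(0, 1, 0)
)

# Keep only the unmasked segments and their post-masking boundaries.
keepRows <- masked[, "toremove"] == 0
kept <- masked[keepRows, c("StartProbe", "EndProbe"), drop = FALSE]
```

Because the result has one row per row of segTable, the same logical vector (keepRows here) could also be used to subset segTable itself.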

Details

Masking is performed separately for each value in indexVals. Segments (rows of segTable) with that value of eventIndex are examined for coverage by mask intervals with that same value of maskIndex in maskTable. If the coverage is at least minCover, the segment is slated for masking, while its flanking segments are extended to a random point within the segment being masked.
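The coverage test and the flank extension described above can be sketched in base R as follows. The helper coverFraction is hypothetical, the coordinates are invented, and the conventions used (inclusive coordinates, non-overlapping mask intervals, the exact breakpoint rule) are assumptions made for illustration, not the package's documented behaviour.

```r
# Hypothetical helper: fraction of the segment [segStart, segEnd] covered by
# mask intervals on the same chromosome. Assumes inclusive coordinates and
# mask intervals that do not overlap one another.
coverFraction <- function(segStart, segEnd, maskStart, maskEnd) {
  overlap <- pmin(segEnd, maskEnd) - pmax(segStart, maskStart) + 1
  sum(pmax(overlap, 0)) / (segEnd - segStart + 1)
}

# A 1000-base segment overlapped by two mask intervals (100 + 200 bases).
frac <- coverFraction(1001, 2000,
                      maskStart = c(900, 1801),
                      maskEnd   = c(1100, 2500))
minCover <- 0.25
maskIt <- frac >= minCover   # the segment is slated for masking

# Flank extension in probe units, assuming the masked segment (row k) has a
# neighbour on each side. A random breakpoint is drawn within the masked
# segment; the left neighbour ends there and the right neighbour starts
# immediately after it.
probes <- cbind(StartProbe = c(1, 11, 21), EndProbe = c(10, 20, 40))
k <- 2
if (maskIt) {
  bp <- sample(seq(probes[k, "StartProbe"] - 1, probes[k, "EndProbe"]), 1)
  probes[k - 1, "EndProbe"]   <- bp
  probes[k + 1, "StartProbe"] <- bp + 1
}
```

With the default minCover = 1, the same segment would not be masked, since only 30% of it is covered.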

Author

Alexander Krasnitz

Examples

## Load datasets
data(segexample)
data(ratexample)
data(normsegs)
data(cnpexample)

## Create a table with segment information (table of copy number events)
segtable <- CNpreprocessing(segall = segexample[segexample[,"ID"] == "WZ1",],
    ratall = ratexample, idCol = "ID", startCol = "start", endCol = "end",
    chromCol = "chrom", bpStartCol = "chrom.pos.start",
    bpEndCol = "chrom.pos.end", blsize = 50, minJoin = 0.25, cWeight = 0.4,
    bsTimes = 50, chromRange = 1:22, modelNames = "E",
    normalLength = normsegs[,1], normalMedian = normsegs[,2])

## Add an eventIndex column to segtable that identifies the
## amplification (marked as 1) and deletion (marked as -1) events
eventIndex <- rep(0, nrow(segtable))
eventIndex[segtable[,"marginalprob"] < 1e-4 & segtable[,"negtail"] > 0.999 &
    segtable[,"mediandev"] < 0] <- -1
eventIndex[segtable[,"marginalprob"] < 1e-4 & segtable[,"negtail"] > 0.999 &
    segtable[,"mediandev"] > 0] <- 1
segtable <- cbind(segtable, eventIndex)

## Create a mask table using amplification and deletion regions as input
namps17 <- cnpexample[cnpexample[,"copy.num"] == "amp",]
aCNPmask <- makeCNPmask(imat=namps17, chromCol=2, startCol=3, endCol=4,
    nProf=1203, uThresh=0.02, dThresh=0.008)
ndels17 <- cnpexample[cnpexample[,"copy.num"] == "del",]
dCNPmask <- makeCNPmask(imat=ndels17, chromCol=2, startCol=3, endCol=4,
    nProf=1203, uThresh=0.02, dThresh=0.008)
maskTable <- rbind(cbind(aCNPmask, cnpindex=1), cbind(dCNPmask, cnpindex=-1))

## Apply a mask to a table of copy number events
myCNPtable <- applyCNPmask(segTable=segtable, chrom="chrom",
    startPos="chrom.pos.start", endPos="chrom.pos.end",
    startProbe="start", endProbe="end", eventIndex="eventIndex",
    maskTable=maskTable, maskChrom="chrom", maskStart="start",
    maskEnd="end", maskIndex="cnpindex", minCover=0.005,
    indexVals=c(-1, 1))

## Show some results
tail(myCNPtable)
#>       StartProbe EndProbe toremove
#> [85,]      79696    81843        0
#> [86,]      81844    81854        0
#> [87,]      81855    81873        0
#> [88,]      81874    82039        0
#> [89,]      82040    82055        0
#> [90,]      82056    83055        0