Merged
22 changes: 7 additions & 15 deletions R/bambu-extendAnnotations-utilityExtend.R
@@ -112,7 +112,7 @@ filterTranscriptsByAnnotation <- function(rowDataCombined, annotationGrangesList
} else if(is.null(NDR)) {
NDR <- 0.5
}
-filterSet <- (rowDataCombined$NDR <= NDR | rowDataCombined$readClassType == "equal:compatible")
+filterSet <- ((!is.na(rowDataCombined$NDR) & rowDataCombined$NDR <= NDR) | rowDataCombined$readClassType == "equal:compatible")
lowConfidenceTranscripts <- combindRowDataWithRanges(
Comment on lines +115 to 116
Copilot AI Apr 16, 2026

This change alters the edge-case semantics for NDR = 1 vs NDR = 0.999 by excluding transcripts with NDR = NA (subset / min.sampleNumber-filtered). There isn’t currently a regression test covering this behavior (i.e., that NDR = 1 no longer re-includes subset/low-confidence transcripts, while NDR = 0.999 matches previous results). Consider adding a test that runs isore.extendAnnotations() with NDR = 1 and asserts subset/low-confidence transcripts are still excluded (e.g., via metadata(... )$lowConfidenceTranscripts / counts).
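
The edge case described above can be reproduced on a toy table. This is a minimal base R sketch, not Bambu code: the data frame `toy`, its column `txScoreFiltered`, and all values are hypothetical stand-ins for `rowDataCombined`, and the `"novel"` and `"subset"` labels are illustrative only.

```r
# Hypothetical stand-in for rowDataCombined (values invented for illustration)
toy <- data.frame(
    txScoreFiltered = c(FALSE, FALSE, TRUE),  # TRUE mimics maxTxScore == -1
    NDR = c(0.2, 0.8, NA),
    readClassType = c("novel", "novel", "subset"),
    stringsAsFactors = FALSE
)
threshold <- 1

# Old encoding: filtered transcripts carried NDR = 1, so NDR <= 1 let them pass
oldNDR <- ifelse(toy$txScoreFiltered, 1, toy$NDR)
oldSet <- oldNDR <= threshold | toy$readClassType == "equal:compatible"

# New encoding: filtered transcripts carry NA; the is.na() guard maps NA to FALSE
newSet <- (!is.na(toy$NDR) & toy$NDR <= threshold) |
    toy$readClassType == "equal:compatible"

# Without the guard, the NA propagates into the logical filter
unguarded <- toy$NDR <= threshold | toy$readClassType == "equal:compatible"

# oldSet    is c(TRUE, TRUE, TRUE)
# newSet    is c(TRUE, TRUE, FALSE)
# unguarded is c(TRUE, TRUE, NA)
```

With a threshold of 0.999 the first two rows behave the same under both encodings, so only the `NDR = 1` case changes, which is the behavior a regression test would need to pin down.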

rowDataCombined[!filterSet,],
exonRangesCombined[!filterSet])
@@ -224,7 +224,7 @@ calculateNDROnTranscripts <- function(combinedTranscripts, useTxScore = FALSE){
} else {
combinedTranscripts$NDR <- calculateNDR(combinedTranscripts$maxTxScore, equal)
}
-combinedTranscripts$NDR[combinedTranscripts$maxTxScore==-1] <- 1
+combinedTranscripts$NDR[combinedTranscripts$maxTxScore==-1] <- NA
Copilot AI Apr 16, 2026

Changing filtered transcripts' NDR from 1 to NA means downstream thresholding logic must be NA-safe. In setNDR() (same file), the includeRef = FALSE branch builds toRemove/toAdd without !is.na(...), so transcripts with NDR = NA in metadata(extendedAnnotations)$lowConfidenceTranscripts can yield NA logical indices and break or mis-subset GRanges/GRangesList objects. Update those conditions to treat NA as FALSE (mirror the includeRef = TRUE branch by adding !is.na(mcols(...)$NDR) guards).
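
A small sketch of why the guard matters when the logical vector is used as an index. It uses plain base R vectors rather than GRanges/GRangesList objects; the transcript names (including `ENST0001`), NDR values, and threshold are hypothetical. Base R returns `NA` elements for `NA` indices, whereas S4-based containers may instead raise an error, so this only illustrates the mechanism.

```r
# Hypothetical transcript names and NDR values (not real Bambu output)
txName <- c("BambuTx1", "BambuTx2", "BambuTx3", "ENST0001")
ndr <- c(0.1, 0.9, NA, NA)
threshold <- 0.5
prefix <- "Bambu"

# Unguarded, same shape as the prefix-based conditions: NA & TRUE evaluates to NA
toRemoveRaw <- ndr > threshold & grepl(prefix, txName)

# Guarded: an NA NDR is treated as FALSE, so the transcript is left where it is
toRemove <- !is.na(ndr) & ndr > threshold & grepl(prefix, txName)

# Indexing with the unguarded vector injects an NA entry into the result
keptRaw <- txName[!toRemoveRaw]

# Indexing with the guarded vector keeps every transcript that is not removed
kept <- txName[!toRemove]

# toRemoveRaw is c(FALSE, TRUE, NA, FALSE)
# toRemove    is c(FALSE, TRUE, FALSE, FALSE)
# keptRaw     is c("BambuTx1", NA, "ENST0001")
# kept        is c("BambuTx1", "BambuTx3", "ENST0001")
```

Note that `NA & FALSE` is `FALSE`, so the prefix check alone protects names that do not match `prefix`; the problem is limited to prefix-matching transcripts whose NDR is `NA`.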

Collaborator

I have a similar comment to Copilot's: if NDR is set to NA, all subsequent commands that use NDR need to be NA-aware. This needs to be checked to make sure every downstream command still passes.

Collaborator

Other than this, different scenarios have been tested, and all work as expected:
- Testing isore.extendAnnotations: the reported set of transcripts is corrected as expected with respect to subset transcripts being kept even if rm.subsetTx = TRUE and under all preset filtering thresholds.
- Testing the whole bambu run: quantification results are affected, because the final set of reported transcripts changes.

One more point: this fix does not only address the override of return.subsetTx and min.SampleNumber; it also affects the other base filters. Consider making this more explicit.

return(combinedTranscripts)
}

@@ -827,15 +827,14 @@ addGeneIdsToReadClassTable <- function(readClassTable, distTable,
#' @description This function train a model for use on other data
#' @param extendedAnnotations A GRangesList object produced from bambu(quant = FALSE) or rowRanges(se)
#' @param NDR The maximum NDR for novel transcripts to be in extendedAnnotations (0-1). If not provided a recommended NDR is calculated.
-#' @param includeRef A boolean which if TRUE will also filter out reference annotations based on their NDR
#' @param prefix A string which determines which transcripts are considered novel by bambu and will be filtered (by default = 'Bambu')
#' @param baselineFDR a value between 0-1. Bambu uses this FDR on the trained model to recommend an equivilent NDR threshold to be used for the sample. By default, a baseline FDR of 0.1 is used. This does not impact the analysis if an NDR is set.
#' @param defaultModels a bambu trained model object that bambu will use when fitReadClassModel==FALSE or the data is not suitable for training, defaults to the pretrained model in the bambu package
#' Output - returns a similiar GRangesList object with entries swapped into or out of metadata(extendedAnnotations)$lowConfidenceTranscripts
#' @details
#' @return extendedAnnotations with a new NDR threshold
#' @export
-setNDR <- function(extendedAnnotations, NDR = NULL, includeRef = FALSE, prefix = 'Bambu', baselineFDR = 0.1, defaultModels2 = defaultModels){
+setNDR <- function(extendedAnnotations, NDR = NULL, prefix = 'Bambu', baselineFDR = 0.1, defaultModels2 = defaultModels){
#Check to see if the annotations/gtf are dervived from Bambu
if(is.null(mcols(extendedAnnotations)$NDR)){
warning("Annotations were not extended by Bambu (or the wrong prefix was provided). NDR can not be set")
@@ -852,17 +851,10 @@ setNDR <- function(extendedAnnotations, NDR = NULL, includeRef = FALSE, prefix =
message("Recommending a novel discovery rate (NDR) of: ", NDR)
}

-#If reference annotations should be filtered too (note that reference annotations with no read support arn't filtered)
-if(includeRef){
-toRemove <- (!is.na(mcols(extendedAnnotations)$NDR) & mcols(extendedAnnotations)$NDR > NDR)
-toAdd <- !is.na(mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$NDR) &
-mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$NDR <= NDR
-} else {
-toRemove <- (mcols(extendedAnnotations)$NDR > NDR &
-grepl(prefix, mcols(extendedAnnotations)$TXNAME))
-toAdd <- (mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$NDR <= NDR &
-grepl(prefix, mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$TXNAME))
-}
+toRemove <- (mcols(extendedAnnotations)$NDR > NDR &
+grepl(prefix, mcols(extendedAnnotations)$TXNAME))
+toAdd <- (mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$NDR <= NDR &
+grepl(prefix, mcols(metadata(extendedAnnotations)$lowConfidenceTranscripts)$TXNAME))

temp <- c(metadata(extendedAnnotations)$lowConfidenceTranscripts[!toAdd], extendedAnnotations[toRemove])
extendedAnnotations <- c(extendedAnnotations[!toRemove], metadata(extendedAnnotations)$lowConfidenceTranscripts[toAdd])
4 changes: 4 additions & 0 deletions R/bambu.R
@@ -163,6 +163,10 @@ bambu <- function(reads, annotations = NULL, genome = NULL, NDR = NULL,
if(mode == "fusion"){
NDR <- 1
fusionMode <- TRUE
+if(is.null(opt.discovery)) opt.discovery <- list()
+opt.discovery$remove.subsetTx <- FALSE
+opt.discovery$min.readCount <- 1
+opt.discovery$min.sampleNumber <- 0
}
if(mode == "debug"){
verbose <- TRUE