Rna.normalize_reads
missionbio.mosaic.rna.Rna.normalize_reads
- Rna.normalize_reads(correct_background: bool = False, negative_control_genes: Optional[Sequence[str]] = None, bg_method: str = 'max', min_reads_per_cell: int = 50, min_cells_detec_gene: float = 0.01, max_fraction: float = 0.5, exclude_highly_expressed: bool = False, use_subtraction: bool = True) -> None
Normalize raw RNA counts
The function performs a multi-step normalization:
1. Filters cells based on total read counts.
2. Filters genes based on detection rate across cells.
3. Normalizes counts using per-cell size factors (median scaling).
4. Optionally excludes highly expressed genes from size factor computation.
5. Optionally estimates background from negative control genes and performs subtraction-based correction.
6. Returns a log1p-transformed normalized matrix mapped back to the full gene/cell space.
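The steps above can be sketched in plain NumPy. This is an illustrative approximation, not the library's implementation: the function name `normalize_reads_sketch`, the cells-by-genes layout of `counts`, the size factor definition (cell total divided by the median cell total), the per-cell background estimate, clipping at zero after subtraction, and filling filtered-out cells and genes with zeros are all assumptions made for the example.

```python
import numpy as np


def normalize_reads_sketch(
    counts,
    gene_ids,
    negative_control_genes=("BFP", "RFP", "EGFP"),
    correct_background=False,
    bg_method="max",
    min_reads_per_cell=50,
    min_cells_detec_gene=0.01,
    max_fraction=0.5,
    exclude_highly_expressed=False,
):
    """Hypothetical re-creation of the documented steps (cells x genes input)."""
    counts = np.asarray(counts, dtype=float)
    gene_ids = np.asarray(gene_ids)
    full = np.zeros_like(counts)  # assumption: filtered entries stay at 0

    # 1. Keep cells with enough total reads.
    cell_mask = counts.sum(axis=1) >= min_reads_per_cell
    if not cell_mask.any():
        return full
    kept = counts[cell_mask]

    # 2. Keep genes detected (non-zero) in enough of the retained cells.
    gene_mask = (kept > 0).mean(axis=0) >= min_cells_detec_gene
    kept = kept[:, gene_mask]

    # 3-4. Per-cell size factors by median scaling, optionally ignoring
    # genes that take more than max_fraction of any cell's reads.
    source = kept
    if exclude_highly_expressed:
        row_totals = np.maximum(kept.sum(axis=1, keepdims=True), 1.0)
        high = (kept / row_totals > max_fraction).any(axis=0)
        source = kept[:, ~high]
    totals = source.sum(axis=1)
    median_total = np.median(totals) or 1.0  # guard against a zero median
    size_factors = totals / median_total
    size_factors[size_factors == 0] = 1.0  # guard against empty cells
    norm = kept / size_factors[:, None]

    # 5. Optional subtraction of a per-cell background statistic.
    if correct_background:
        stat = {"mean": np.mean, "median": np.median, "max": np.max}[bg_method]
        ctrl = np.isin(gene_ids[gene_mask], list(negative_control_genes))
        if not ctrl.any():
            raise ValueError("no negative control genes left after filtering")
        background = stat(norm[:, ctrl], axis=1)
        norm = np.clip(norm - background[:, None], 0.0, None)

    # 6. log1p-transform and map back to the full cell/gene space.
    full[np.ix_(cell_mask, gene_mask)] = np.log1p(norm)
    return full
```

With three cells whose totals are 100, 50, and 10 reads, the third cell falls below the default `min_reads_per_cell` of 50 and its row stays at zero, while the first two cells are scaled to a common depth before the log transform.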
- Parameters:
- correct_background: bool
Whether to correct the background using negative control genes.
- negative_control_genes: Optional[Sequence[str]]
List of negative control genes to use for background correction. Default: [‘BFP’, ‘RFP’, ‘EGFP’]
- bg_method: {‘mean’, ‘median’, ‘max’}
Statistic used to compute background from negative control genes.
- min_reads_per_cell: int
Minimum total reads per cell required to be retained.
- min_cells_detec_gene: float
Minimum fraction of filtered cells in which a gene must be detected (non-zero) to be retained.
- max_fraction: float
Maximum fraction of total cell reads for a gene to be considered “highly expressed.”
- exclude_highly_expressed: bool
Whether to exclude highly expressed genes from size factor computation.
- use_subtraction: bool
Whether to perform background subtraction (division not implemented).
- Raises:
- ValueError
If any negative control gene is not present in the assay IDs.
If bg_method is not one of {‘mean’, ‘median’, ‘max’}.
If min_reads_per_cell is less than 1.
If none of the negative control genes are present after filtering.
- NotImplementedError
If use_subtraction is False (division-based correction not yet implemented).
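The up-front argument checks listed above could look roughly like the following. This is a hedged sketch: the helper name `check_normalize_args`, the order of the checks, and the choice to validate the control genes only when `correct_background` is true are assumptions, not documented behavior. The "none of the negative control genes are present after filtering" case is omitted because it depends on the filtered data rather than on the arguments alone.

```python
from typing import Sequence


def check_normalize_args(
    assay_ids: Sequence[str],
    negative_control_genes: Sequence[str],
    correct_background: bool = False,
    bg_method: str = "max",
    min_reads_per_cell: int = 50,
    use_subtraction: bool = True,
) -> None:
    """Hypothetical validator mirroring the documented exceptions."""
    if bg_method not in {"mean", "median", "max"}:
        raise ValueError(f"bg_method must be 'mean', 'median' or 'max', got {bg_method!r}")
    if min_reads_per_cell < 1:
        raise ValueError("min_reads_per_cell must be at least 1")
    if not use_subtraction:
        raise NotImplementedError("division-based correction is not implemented")
    if correct_background:  # assumption: controls are only checked when used
        known = set(assay_ids)
        missing = [g for g in negative_control_genes if g not in known]
        if missing:
            raise ValueError(f"negative control genes not in assay IDs: {missing}")
```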