DnaAssignment.label_cells
missionbio.mosaic.algorithms.dna_assignment.DnaAssignment.label_cells
- DnaAssignment.label_cells(dna: _Assay, filter_variants: bool = True, check_valid: bool = True, min_fraction_genotyped: float = 0.3, min_fraction_match: float = 0.7) → ndarray
Assign labels to cells based on the given truth.
This uses the demultiplexing module to assign the labels. It is a probability-based model that takes allele dropout (ADO) into account. The more variants are available, the more likely the assignment is to be correct.
- Parameters:
- dna: _Assay
- filter_variants: bool
Whether to filter the variants before assigning the labels
- check_valid: bool
Whether to check if the assignment is valid. A warning is raised if the assignment might be inaccurate.
- min_fraction_genotyped: float
The minimum fraction of variants that must be genotyped in a cell. Only variants present in both the DNA and the truth data are included when computing this fraction. These variants must also be genotyped in at least min_fraction_genotyped of the cells.
- min_fraction_match: float
The minimum fraction of matches required between the truth genotype and the genotype of the cell. Mismatches are weighted: a mismatch for a HET call is weighted at 0.75, a missing call at 0.15, and any other mismatch at 1. The fraction mismatched is the sum of these weights divided by the number of genotyped variants. The fraction matched is its complement (1 minus the fraction mismatched) and should be greater than this value. Setting this lower than 0.65 or greater than 0.85 is not recommended.
- Raises:
- ValueError:
When a given cell is not found in the database
Notes
This modifies the label row_attr of the assay
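Example
The weighting described for min_fraction_match can be sketched as follows. This is an illustration only, not the library's implementation. The function name `fraction_matched` and the genotype codes (0 = WT, 1 = HET, 2 = HOM, 3 = missing) are assumptions, as is the reading that "a mismatch for a HET call" refers to the cell's call being HET. Following the wording above, missing calls add weight to the numerator but are excluded from the denominator.

```python
import numpy as np

# Assumed genotype encoding, for illustration only.
WT, HET, HOM, MISSING = 0, 1, 2, 3


def fraction_matched(cell, truth):
    """Weighted fraction of matches between one cell and a truth genotype.

    Hypothetical helper: mirrors the description of min_fraction_match,
    not the actual code used by label_cells.
    """
    cell = np.asarray(cell)
    truth = np.asarray(truth)

    missing = cell == MISSING
    mismatch = (cell != truth) & ~missing

    weights = np.zeros(cell.shape, dtype=float)
    weights[mismatch] = 1.0                    # any other mismatch
    weights[mismatch & (cell == HET)] = 0.75   # mismatch where the cell call is HET (assumption)
    weights[missing] = 0.15                    # missing call in the cell

    n_genotyped = np.count_nonzero(~missing)
    if n_genotyped == 0:
        # No genotyped variants: treat the cell as unmatched (assumption).
        return 0.0

    fraction_mismatched = weights.sum() / n_genotyped
    return 1.0 - fraction_mismatched


# One HOM mismatch, one HET mismatch and one missing call out of five variants.
score = fraction_matched([WT, HET, HOM, MISSING, HET], [WT, HET, WT, HOM, HOM])
passes = score > 0.7  # compare against the default min_fraction_match
```

In this sketch the example cell scores roughly 0.525, so it would fall below the default threshold of 0.7.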