Ge.get_target_modifications
missionbio.mosaic.ge.Ge.get_target_modifications
- Ge.get_target_modifications(target: str) → DataFrame
Get the target modifications DataFrame for the specified target. Identifies the variants on allele-1 and allele-2 of the selected target in every cell.
Returns a DataFrame with the following 8 columns:
allele1: A comma (,) separated list of variants on allele-1 of the cell
allele2: A comma (,) separated list of variants on allele-2 of the cell
dp: DP value for the cell as calculated by GATK
gq: GQ value for the cell as calculated by GATK
af1: Allele frequency for the variants on allele-1 of the cell
af2: Allele frequency for the variants on allele-2 of the cell
simple_label: A simple label for the cell based on the allele1 and allele2 values. Has possible values:
“INDEL” - At least one of the alleles contains an INDEL
“NO INDEL” - Neither of the alleles contains an INDEL
descriptive_label: A descriptive label for the cell based on the allele1 and allele2 values. Has possible values:
“NO INDEL” - Neither of the alleles contains an INDEL
“Mono-allelic INDEL” - Exactly one of the alleles contains an INDEL
“Heterozygous Bi-allelic INDEL” - Both alleles contain an INDEL, but they are different INDELs
“Homozygous Bi-allelic INDEL” - Both alleles contain an INDEL, and they are the same INDEL
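The label definitions above can be sketched in plain Python. This is only an illustration of the rules as described here, not the library's implementation. It assumes variant IDs of the form `chrom:pos:ref/alt` and treats a variant as an INDEL when the ref and alt lengths differ. The helper names (`is_indel`, `indels`, `labels`) are hypothetical.

```python
def is_indel(variant: str) -> bool:
    """Return True if ref and alt differ in length.

    Assumption: variant IDs look like 'chrom:pos:ref/alt'.
    """
    ref, alt = variant.rsplit(":", 1)[1].split("/")
    return len(ref) != len(alt)


def indels(allele):
    """Return the set of INDEL variants in a comma-separated allele string.

    None or an empty string is treated as "no variants".
    """
    if not allele:
        return set()
    variants = (v.strip() for v in allele.split(","))
    return {v for v in variants if v and is_indel(v)}


def labels(allele1, allele2):
    """Return (simple_label, descriptive_label) for one cell."""
    a, b = indels(allele1), indels(allele2)
    if not a and not b:
        return "NO INDEL", "NO INDEL"
    if not a or not b:
        return "INDEL", "Mono-allelic INDEL"
    if a == b:
        return "INDEL", "Homozygous Bi-allelic INDEL"
    return "INDEL", "Heterozygous Bi-allelic INDEL"
```

For example, under these assumptions `labels("chr1:100:A/AT", "chr1:105:GC/G")` yields the heterozygous bi-allelic label, because both alleles carry an INDEL but the INDELs differ.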
The index of the DataFrame is the cell barcode.
If the cell has a homozygous alternate genotype, the values in allele1 match the values in allele2, and the af1 and af2 columns hold the same value.
If the cell does not have sufficient genotyping information for the target, the values in allele1 and allele2 are None, and the values in dp, gq, af1 and af2 are 0.
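Because cells without sufficient genotyping information carry None alleles and zeroed metrics, it is usually worth separating them before summarizing. The sketch below mirrors the documented row layout with plain dictionaries keyed by barcode, so it runs without the library. The barcodes, variant IDs, and numeric values are made up, and `split_by_genotype` is a hypothetical helper.

```python
# Made-up rows shaped like the documented columns (barcode -> column values).
rows = {
    "CELL-A": {"allele1": "chr1:100:A/AT", "allele2": "chr1:100:A/AT",
               "dp": 40, "gq": 99, "af1": 100, "af2": 100},
    "CELL-B": {"allele1": "chr1:100:A/AT", "allele2": "chr1:105:GC/G",
               "dp": 35, "gq": 80, "af1": 52, "af2": 48},
    # Insufficient genotyping information: None alleles, zeroed metrics.
    "CELL-C": {"allele1": None, "allele2": None,
               "dp": 0, "gq": 0, "af1": 0, "af2": 0},
}


def split_by_genotype(rows):
    """Split barcodes into (genotyped, missing) based on the allele columns."""
    genotyped, missing = [], []
    for barcode, row in rows.items():
        if row["allele1"] is None and row["allele2"] is None:
            missing.append(barcode)
        else:
            genotyped.append(barcode)
    return genotyped, missing


genotyped, missing = split_by_genotype(rows)
```

With a real result, the same check would be applied to the allele1 and allele2 columns of the returned DataFrame rather than to dictionaries.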
- Parameters:
- target : str
The target name for which the modifications are to be retrieved.
- Returns:
- pd.DataFrame
DataFrame containing the target modifications for the specified target.