COMPASS#

missionbio.mosaic.algorithms.compass.COMPASS

class COMPASS(sample: Sample, somatic_variants: Sequence, min_prob_diff: float = 0.9)#

A wrapper to run COMPASS.

COMPASS is available on GitHub at cbg-ethz/COMPASS. It must be installed, and the wrapper expects its command to be named PHYLOGENY. To use a different command name, change the command_name class variable.
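Before constructing the wrapper, it can help to confirm that the executable is reachable. The following is a minimal sketch and not part of the Mosaic API: `find_command` is a hypothetical helper built on the standard library's `shutil.which`, and the alternative command name shown in the trailing comment is only an illustrative assumption.

```python
import shutil


def find_command(command_name="PHYLOGENY"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(command_name)


path = find_command()
if path is None:
    print("PHYLOGENY was not found on PATH.")
else:
    print(f"Found the COMPASS executable at {path}")

# If the binary was installed under another name, the text above says the
# class variable can be changed. Assumed usage (requires missionbio.mosaic):
#
#     from missionbio.mosaic.algorithms.compass import COMPASS
#     COMPASS.command_name = "COMPASS"  # hypothetical executable name
```

Changing the class variable affects every instance created afterwards, so it is probably best done once, right after the import.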

Snippets#

>>> import missionbio.mosaic as ms
>>> from missionbio.mosaic.algorithms.compass import COMPASS
>>> sample = ms.load_example_dataset("2 PBMC mix")
>>> compass = COMPASS(sample, somatic_variants=["chr4:55599436:T/C", "chr5:170837457:A/G"])
>>> compass.run()

Note

The somatic variants must be present in the Dna assay and formatted as chrom:pos:ref/alt, for example chr4:55599436:T/C.
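The format requirement in the note can be checked before calling the wrapper. This is a rough sketch under stated assumptions: `parse_variant` and `VARIANT_RE` are hypothetical names, the pattern only assumes a non-empty chromosome, an integer position, and non-empty ref and alt alleles, and it does not verify that the variant exists in the Dna assay.

```python
import re

# Matches "chrom:pos:ref/alt", e.g. "chr4:55599436:T/C".
# Assumption: ref and alt are non-empty and contain neither ":" nor "/".
VARIANT_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<pos>\d+):(?P<ref>[^:/]+)/(?P<alt>[^:/]+)$"
)


def parse_variant(variant):
    """Split a variant id into (chrom, pos, ref, alt) or raise ValueError."""
    match = VARIANT_RE.match(variant)
    if match is None:
        raise ValueError(f"Not in chrom:pos:ref/alt format: {variant!r}")
    return match["chrom"], int(match["pos"]), match["ref"], match["alt"]


# The two variants used in the snippet above.
for variant in ["chr4:55599436:T/C", "chr5:170837457:A/G"]:
    print(parse_variant(variant))
```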

To show the estimated phylogeny:

>>> compass.plot_tree()

If the clones were also identified using group_by_genotype or the VariantSubcloneTable workflow, the labels from the two methods can be compared. The crosstab() method returns the comparison as a DataFrame, and the crosstabmap() method plots that DataFrame as a heatmap:

>>> # Set labels
>>> sample.dna.group_by_genotype(compass.somatic_variants, max_ado_score=0.8)
>>> # Compare with COMPASS labels
>>> sample.dna.crosstabmap(compass.labels_).show()
>>> # The mapping of the node in labels_ to the genotype
>>> compass.node_descriptions()
{'2': 'KIT (c.2484+78T-C)', '1': 'NPM1 (c.847-74A-G)<br>CNLOH: KIT, NPM1'}
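To make the comparison above concrete, here is a small sketch of what a cross-tabulation of two labelings computes: the number of cells for each pair of labels. It uses only the standard library and is not Mosaic's implementation; `crosstab_counts` is a hypothetical helper, and the label values are invented for illustration.

```python
from collections import Counter


def crosstab_counts(labels_a, labels_b):
    """Count how many cells carry each (label_a, label_b) pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both labelings must cover the same cells.")
    return Counter(zip(labels_a, labels_b))


# Made-up labels for five cells, one entry per cell in the same order.
genotype_labels = ["WT", "WT", "Clone A", "Clone A", "Clone A"]
compass_labels = ["0", "0", "1", "1", "Ambiguous"]

counts = crosstab_counts(genotype_labels, compass_labels)
for (genotype_label, compass_label), n_cells in sorted(counts.items()):
    print(f"{genotype_label:>8} | {compass_label:>9} | {n_cells}")
```

When the two methods agree, most cells fall into a single pair per clone; cells spread over several pairs point to clones the methods resolve differently.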

Variables#

somatic_variants

The somatic variants used to find the clones

The outputs are stored in the attributes whose names end with an underscore (_). They are also described in run().

nodes_

DataFrame with the node assignments for each cell

probability_

DataFrame with the probability of assignment of each cell to each node

node_genotypes_

DataFrame with the genotype of each node

tree_

The phylogenetic tree as a dictionary

labels_

The labels assigned for each cell based on the node assignments, probabilities, and min_prob_diff

Functions#

run([kwargs])

Process the sample using COMPASS.

plot_tree()

Generate a tree plot for the phylogeny

node_descriptions([percent])

Generate the description for each node

Prepare to run COMPASS for the sample

Parameters:
sample : Sample

The sample to process with COMPASS

somatic_variants : Sequence

The list of variants used to find the clones

min_prob_diff : float [0, 1]

The minimum difference in assignment probability between nodes required for a cell to be assigned a label. If the difference is less than min_prob_diff, the cell is labeled as “Ambiguous”.
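The thresholding rule for min_prob_diff can be sketched for a single cell as follows. This is an illustration, not the library's code: `assign_label` is a hypothetical function, and it assumes that the "difference" is taken between the two most probable nodes, which the description above does not state explicitly.

```python
def assign_label(probabilities, min_prob_diff=0.9):
    """Pick a node label for one cell, or "Ambiguous" if the call is unclear.

    probabilities maps node name -> probability of assignment for the cell.
    Assumption: the difference is measured between the top two nodes.
    """
    if not probabilities:
        raise ValueError("At least one node probability is required.")

    # Rank the nodes from most to least probable.
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    best_node, best_prob = ranked[0]
    second_prob = ranked[1][1] if len(ranked) > 1 else 0.0

    if best_prob - second_prob < min_prob_diff:
        return "Ambiguous"
    return best_node


print(assign_label({"1": 0.97, "2": 0.03}))  # clear margin between nodes
print(assign_label({"1": 0.60, "2": 0.40}))  # margin below the threshold
```

With the default of 0.9, only cells whose best node dominates strongly receive a label; lowering min_prob_diff labels more cells at the cost of less certain assignments.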