Dna.heatmap
missionbio.mosaic.dna.Dna.heatmap
- Dna.heatmap(attribute: Union[str, ndarray, DataFrame], splitby: Union[str, ndarray] = 'label', features: Optional[Sequence] = None, x_groups: Optional[Sequence] = None, bars_order: Optional[Sequence] = None, convolve: float = 0, title: str = '', override: bool = False) -> FigureWidget
Extends heatmap() by setting DNA-specific colorscales.
Heatmap of all barcodes and ids.
Hint
The heatmap is interactive. Clicking on a column selects the id corresponding to it. The selected ids can be accessed through selected_ids.
Example
Create a heatmap of all barcodes and ids.
>>> import missionbio.mosaic as ms
>>> sample = ms.load_example_dataset("2 PBMC mix")
>>> sample.protein.heatmap("NSP")
By default, the barcodes and ids are clustered using clustered_barcodes() and clustered_ids(). The groups of cells (obtained from the labels) are clustered, and within each group the cells are clustered again to make patterns in the data easier to identify.
Note
The clustering within each group is disabled when convolve is non-zero.
>>> sample.protein.heatmap("NSP", convolve=20)  # Smoothed data with no clustering within groups
When the data is convolved, the objective is to maximize the difference between the signal and the noise. This works best when the cells within each group are in random order. If they were clustered within groups, there would be patches of high and low signal, which would make smoothing the data through convolution less effective.
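The effect of cell order on smoothing can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper smooth_group, its moving-average window, and the rounding rule are assumptions made only for illustration. Following the description of convolve, the window is taken as a percentage of the smallest group's size.

```python
import random
from statistics import mean


def smooth_group(values, convolve, smallest_group_size):
    """Moving-average smoothing of one group's values (hypothetical helper).

    convolve is a percentage in [0, 100] of the smallest group's size.
    """
    if not 0 <= convolve <= 100:
        raise ValueError("convolve must be between 0 and 100")
    if convolve == 0:
        return list(values)  # no convolution
    if convolve == 100:
        return [mean(values)] * len(values)  # mean per label
    # Assumed window rule: a share of the smallest group, at least one cell.
    window = max(1, round(smallest_group_size * convolve / 100))
    half = window // 2
    return [
        mean(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def spread(values):
    """Range of the values, used here as a crude measure of patchiness."""
    return max(values) - min(values)


# One group of 100 cells in which half of the cells carry the signal.
clustered = [0.0] * 50 + [1.0] * 50
shuffled = clustered[:]
random.Random(0).shuffle(shuffled)

clustered_spread = spread(smooth_group(clustered, 20, 100))
shuffled_spread = spread(smooth_group(shuffled, 20, 100))
```

In this toy group the clustered order keeps a patch of low and a patch of high signal after smoothing (a spread of 1.0), while the shuffled order yields a smaller spread, which is the behavior the paragraph above describes.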
However, the order of the barcodes and ids can be customized using the bars_order and features parameters. These can also be used to show a subset of the barcodes and ids.
>>> bars = sample.protein.clustered_barcodes("NSP", subcluster=True, optimal_ordering=True)
>>> sample.protein.heatmap("NSP", bars_order=bars)
optimal_ordering ensures that the barcodes are ordered such that the distance between neighboring barcodes is minimized. It is disabled by default because it can be slow for large datasets.
- Parameters:
- attribute : str / np.ndarray / pd.DataFrame
An attribute with the shape equal to the shape of the assay. Uses get_attribute() to retrieve the values constrained by row.
- splitby : str / np.ndarray, default LABEL
Whether to order the barcodes based on the given labels or not. Only applicable when bars_order is None. Uses get_attribute() to retrieve the values constrained by row. The shape must be equal to (#cells).
- features : list
The ids that are to be shown. This also sets the order in which the ids are shown.
- x_groups : list
The group of each feature. Ticks are shown only for the unique groups. To avoid mislabeling the ticks, features must be provided when x_groups is provided.
- bars_order : list
The barcodes that are to be plotted. The order in the plot is the same as the order in the list. Passing this sets splitby to None.
- convolve : float, in [0, 100]
The percentage of barcodes from the label with the fewest barcodes that is used to average out the signal. If 0, then no convolution is performed, and if 100, the mean per label is returned. Only applicable when splitby is not None.
- title : str
The title to be added to the plot.
- override : bool
Passed to clustered_barcodes() and clustered_ids() for cases when there are more than 1,000 ids.
- Returns:
- fig : plotly.graph_objects.FigureWidget
- Raises:
- ValueError
Raised in the following cases:
  - When convolve is below 0 or above 100.
  - When the number of ids is too large to hierarchically cluster the cells and features is not provided.
  - When x_groups is provided but features is not.
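The documented error cases can be summarized as a small argument check. This is only a sketch: validate_heatmap_args, n_ids, and max_ids are hypothetical names, the limit of 1,000 ids is taken from the description of override, and the assumption that override=True bypasses the size check is not confirmed by this page.

```python
def validate_heatmap_args(convolve=0, features=None, x_groups=None,
                          n_ids=0, max_ids=1000, override=False):
    """Raise ValueError for the cases listed under Raises (hypothetical helper)."""
    if not 0 <= convolve <= 100:
        raise ValueError("convolve must be between 0 and 100")
    if x_groups is not None and features is None:
        raise ValueError("features must be provided when x_groups is provided")
    # Assumption: override=True allows clustering more than max_ids ids.
    if features is None and n_ids > max_ids and not override:
        raise ValueError(
            "too many ids to cluster hierarchically; "
            "pass features or set override=True"
        )


# Valid combination: smoothing with an explicit feature list and groups.
validate_heatmap_args(convolve=20, features=["a", "b"], x_groups=["g1", "g1"])
```

Checking the arguments up front like this fails fast, before any expensive hierarchical clustering would start.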