Protein.heatmap

missionbio.mosaic.protein.Protein.heatmap

Protein.heatmap(attribute: Union[str, ndarray, DataFrame], splitby: Union[str, ndarray] = 'label', features: Optional[Sequence] = None, x_groups: Optional[Sequence] = None, bars_order: Optional[Sequence] = None, convolve: float = 0, title: str = '', override: bool = False) -> FigureWidget

Heatmap of all barcodes and ids.

Hint

The heatmap is interactive. Clicking on a column selects the corresponding id. The selected ids can be accessed through selected_ids.

Example

Create a heatmap of all barcodes and ids.

>>> import missionbio.mosaic as ms
>>> sample = ms.load_example_dataset("2 PBMC mix")
>>> sample.protein.heatmap("NSP")

By default, the barcodes and ids are clustered using clustered_barcodes() and clustered_ids(). The groups of cells (obtained from the labels) are clustered first, and the cells within each group are then clustered again, which makes patterns in the data easier to identify.

Note

The clustering within each group is disabled when convolve is non-zero.

>>> sample.protein.heatmap("NSP", convolve=20)  # Smoothed data with no clustering within groups

When the data is convolved, the objective is to maximize the difference between the signal and the noise. This is done most effectively when the cells are sorted randomly. If they were clustered within groups, there would be patches of high and low signal, which would make smoothing the data through convolution less effective.
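The smoothing idea can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the helpers window_size and smooth_group, the rounding rule, and the truncation of windows at the edges are all assumptions made for illustration. The window is derived from convolve as a percentage of the smallest group's size, following the description of the convolve parameter, and a moving average is taken over the randomly ordered cells of one group.

```python
import random
from statistics import mean


def window_size(convolve, group_sizes):
    """Hypothetical helper: turn a percentage into a number of cells,
    measured against the smallest group."""
    if not 0 <= convolve <= 100:
        raise ValueError("convolve must be within [0, 100]")
    # At least one cell, so convolve=0 leaves the values unchanged.
    return max(1, round(min(group_sizes) * convolve / 100))


def smooth_group(values, window):
    """Moving average over one group's cells for a single feature.
    Windows are truncated at the end of the group (an assumption)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window // 2)
        hi = min(len(values), lo + window)
        out.append(mean(values[lo:hi]))
    return out


random.seed(0)
# One group of 200 cells: a constant signal of 5.0 plus Gaussian noise.
cells = [5.0 + random.gauss(0, 1) for _ in range(200)]
random.shuffle(cells)  # random order within the group

window = window_size(20, group_sizes=[200, 350])  # 20% of the smallest group
smoothed = smooth_group(cells, window)

# The spread of the smoothed values is narrower than that of the raw values.
raw_spread = max(cells) - min(cells)
smooth_spread = max(smoothed) - min(smoothed)
```

Because the noise is spread evenly over randomly ordered cells, each window averages it out, and the smoothed values sit closer to the underlying signal.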

However, the order of the barcodes and ids can be customized using the bars_order and features parameters. These can also be used to show a subset of the barcodes and ids.

>>> bars = sample.protein.clustered_barcodes("NSP", subcluster=True, optimal_ordering=True)
>>> sample.protein.heatmap("NSP", bars_order=bars)

optimal_ordering ensures that the barcodes are ordered such that the distance between neighboring barcodes is minimized. This is disabled by default as it can be slow for large datasets.
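A small sketch can make "distance between neighboring barcodes" concrete. The neighbor_cost helper and the toy profiles below are hypothetical and purely illustrative. They do not reproduce the library's clustering; they only show the quantity that such an ordering aims to keep small.

```python
from math import dist


def neighbor_cost(order, profiles):
    """Sum of Euclidean distances between consecutive barcodes in `order`."""
    return sum(
        dist(profiles[a], profiles[b])
        for a, b in zip(order, order[1:])
    )


# Toy expression profiles (two features per barcode), purely illustrative.
profiles = {
    "bc1": (0.0, 0.1),
    "bc2": (0.2, 0.0),
    "bc3": (5.0, 5.1),
    "bc4": (5.2, 4.9),
}

interleaved = ["bc1", "bc3", "bc2", "bc4"]  # similar barcodes kept apart
grouped = ["bc1", "bc2", "bc3", "bc4"]      # similar barcodes adjacent

cost_interleaved = neighbor_cost(interleaved, profiles)
cost_grouped = neighbor_cost(grouped, profiles)
```

The grouped order has the lower cost, which is why similar barcodes end up next to each other in the heatmap.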

Parameters:
attribute : str / np.ndarray / pd.DataFrame

An attribute with the shape equal to the shape of the assay. Uses get_attribute() to retrieve the values constrained by row.

splitby : str / np.ndarray, default LABEL

The labels by which the barcodes are grouped and ordered. Only applicable when bars_order is None. Uses get_attribute() to retrieve the values constrained by row. The shape must be equal to (#cells).

features : list

The ids that are to be shown. This also sets the order in which the ids are shown.

x_groups : list

The group of each feature. Ticks are shown only for the unique groups. To avoid mislabeling the ticks, features must be provided when x_groups is provided.

bars_order : list

The barcodes that are to be plotted. The order in the plot is the same as the order in the list. Passing this sets splitby to None.

convolve : float [0, 100]

The percentage of barcodes from the label with the fewest barcodes that is used to average out the signal. If 0, then no convolution is performed, and if 100, the mean per label is returned. Only applicable when splitby is not None.

title : str

The title to be added to the plot.

override : bool

Passed to clustered_barcodes() and clustered_ids() for cases when there are more than 1,000 ids.

Returns:
fig : plotly.graph_objects.FigureWidget
Raises:
ValueError

Raised in the following cases.

  • When convolve is below 0 or above 100.

  • When the number of ids is too large to hierarchically cluster the cells and features is not provided.

  • When x_groups is provided but features is not.
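
Two of the documented error conditions, those for convolve and x_groups, can be summarized as a small argument check. This is a sketch, not the library's code: check_heatmap_args and raises_value_error are hypothetical helpers, and the error messages are invented for illustration.

```python
def check_heatmap_args(convolve=0, features=None, x_groups=None):
    """Hypothetical pre-flight check mirroring two documented ValueError cases."""
    if not 0 <= convolve <= 100:
        raise ValueError("convolve must be within [0, 100]")
    if x_groups is not None and features is None:
        raise ValueError("features must be provided when x_groups is provided")


def raises_value_error(**kwargs):
    """Return True if the check rejects the given arguments."""
    try:
        check_heatmap_args(**kwargs)
    except ValueError:
        return True
    return False


# Out-of-range convolve and x_groups without features are both rejected.
bad_convolve = raises_value_error(convolve=150)
bad_groups = raises_value_error(x_groups=["T", "B"])
# Supplying features alongside x_groups passes the check.
ok = not raises_value_error(
    convolve=20, features=["CD3", "CD19"], x_groups=["T", "B"]
)
```

Running a check of this kind before plotting surfaces an invalid argument combination early.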

