Plot customizations#

Objective

This vignette contains various snippets of code
that show how plots and data can be customized
to one's requirements.

The h5 file used in this notebook can be found here

import missionbio.mosaic as ms

sample = ms.load_example_dataset('3 cell mix')
Loading, <_io.BytesIO object at 0x7fdca010d3b0>
Loaded in 0.4s.

All the interactive plotting functions return a plotly figure. In case the layout or the color
scheme is not suitable for your data type, they can be changed before creating the final figure.

The colors for the plots are stored either in the individual traces or in the layout attributes of the plotly figure.
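
To check where a given figure keeps its colors, one option is to convert it to a plain dictionary with plotly's `to_dict()` and search it for color-related keys. The helper below is a hypothetical sketch and not part of Mosaic. It works on any nested dictionary, so the small hand-written `example` stands in for the output of `fig.to_dict()`.

```python
def find_color_paths(node, path=""):
    """Return the dotted path of every key whose name contains 'color'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if "color" in str(key):
                found.append(child)
            # Keep descending: 'coloraxis' itself contains 'colorscale'
            found.extend(find_color_paths(value, child))
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            found.extend(find_color_paths(value, f"{path}[{index}]"))
    return found


# Hand-written stand-in for fig.to_dict() (an assumption for illustration)
example = {
    "data": [{"type": "scatter", "marker": {"color": "#1f77b4"}}],
    "layout": {"coloraxis": {"colorscale": [(0.0, "#3b4d73"), (1.0, "#000000")]}},
}

paths = find_color_paths(example)
print(paths)
```

Paths that start with `data` are stored in a trace, and paths that start with `layout` are stored in the layout.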

Mosaic also contains a list of colors that can be used to customize the plots.

# Plot the first few colors
import seaborn as sns
sns.palplot(ms.COLORS[:21])

Reproducible UMAPs#

UMAPs rely on an initial randomization, which leads to a different projection every time. To fix this, pass random_state to the run_umap method.

# Pass a random state to fix the umap

sample.dna.run_umap(attribute='AF', random_state=42)
sample.dna.scatterplot('umap', 'label')
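
The effect of a fixed seed can be illustrated without UMAP itself. This sketch uses Python's standard `random` module as a stand-in for the random initialization. `initial_layout` is a hypothetical helper, not a Mosaic or UMAP function.

```python
import random


def initial_layout(n_points, seed=None):
    """Draw random 2D starting coordinates; a fixed seed makes them repeatable."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_points)]


first = initial_layout(5, seed=42)
second = initial_layout(5, seed=42)

# Same seed, same starting coordinates
print(first == second)  # True
```

Without a seed, each call starts from different coordinates, which is why an unseeded projection changes between runs.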

Updating the colors for the DNA plots#

Heatmap and scatterplot#

# The default color scale for NGT is a monochromatic blue color scheme.

fig = sample.dna.heatmap('NGT')
fig
# In case of the DNA heatmap and scatterplot the colors are
# stored in the layout.coloraxis.colorscale attribute.

# This value must be updated to customize the plot.

fig.layout.coloraxis.colorscale
((0.0, '#3b4d73'),
 (0.25, '#3b4d73'),
 (0.25, '#78a3bc'),
 (0.5, '#78a3bc'),
 (0.5, '#d7ecee'),
 (0.75, '#d7ecee'),
 (0.75, '#000000'),
 (1.0, '#000000'))
# Assuming these are new desired colors
# NGT=0 (WT) - blue
# NGT=1 (HET) - orange
# NGT=2 (HOM) - red
# NGT=3 (missing) - black

wt_col = ms.COLORS[0]
het_col = ms.COLORS[1]
hom_col = ms.COLORS[2]
miss_col = ms.COLORS[20]

sns.palplot([wt_col, het_col, hom_col, miss_col])
# Update the coloraxis to make a plot with the new colors

new_colors = [(0 / 4, wt_col), (1 / 4, wt_col),
              (1 / 4, het_col), (2 / 4, het_col),
              (2 / 4, hom_col), (3 / 4, hom_col),
              (3 / 4, miss_col), (4 / 4, miss_col)]

fig.layout.coloraxis.colorscale = new_colors
fig
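
The hand-written list above follows a regular pattern: each color appears twice, once at the start and once at the end of its band. A small helper can generate that pattern for any number of categories. `discrete_colorscale` is a hypothetical name, not a Mosaic or plotly function, and the hex values are placeholders.

```python
def discrete_colorscale(colors):
    """Build a stepped colorscale with one equal-width band per color."""
    n = len(colors)
    if n == 0:
        raise ValueError("at least one color is required")
    scale = []
    for i, color in enumerate(colors):
        # Repeating the color at both edges of the band prevents blending
        scale.append((i / n, color))
        scale.append(((i + 1) / n, color))
    return scale


# Placeholder colors for WT, HET, HOM and missing
ngt_scale = discrete_colorscale(["#1f77b4", "#ff7f0e", "#d62728", "#000000"])
for stop in ngt_scale:
    print(stop)
```

The result has the same structure as `new_colors` and could be assigned to `fig.layout.coloraxis.colorscale` in the same way.
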
# The same method can be used to update scatterplots that are colored by NGT

fig = sample.dna.scatterplot('umap', colorby='NGT', features=sample.dna.ids()[:4])
fig.layout.coloraxis.colorscale = new_colors
fig

Label colors#

The colors in the heatmap now conflict with the colors of the labels. To customize the label colors, change the palette.

# This is the current palette

sample.dna.get_palette()
{'Jurkat': '#1f77b4',
 'KG-1': '#ff7f0e',
 'Mixed': '#c7c7c7',
 'TOM-1': '#d62728'}
# Update this palette. It is not required to use the built-in colors;
# any hexadecimal color can be passed.

new_palette = {
    'Jurkat': ms.COLORS[3],
    'KG-1': ms.COLORS[4],
    'Mixed': '#c7c7c7',  # Use hexadecimal colors
    'TOM-1': ms.COLORS[5]
}


sample.dna.set_palette(new_palette)
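
When there are many labels, the palette dictionary can be built programmatically instead of by hand. The sketch below is a hypothetical helper, not part of Mosaic. It assumes colors are given as 6-digit hexadecimal strings and uses placeholder color values.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def build_palette(labels, colors, overrides=None):
    """Map each label to a color, cycling through `colors` if there are too few."""
    palette = {label: colors[i % len(colors)] for i, label in enumerate(labels)}
    # Explicit overrides win over the automatic assignment
    palette.update(overrides or {})
    bad = {label: color for label, color in palette.items() if not HEX_COLOR.match(color)}
    if bad:
        raise ValueError(f"not 6-digit hex colors: {bad}")
    return palette


palette = build_palette(
    ["Jurkat", "KG-1", "Mixed", "TOM-1"],
    ["#2ca02c", "#9467bd", "#8c564b", "#e377c2"],  # placeholder colors
    overrides={"Mixed": "#c7c7c7"},
)
print(palette)
```

The returned dictionary has the same shape as `new_palette` above.
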
# Make the heatmap with the new colors

fig = sample.dna.heatmap('NGT')
fig.layout.coloraxis.colorscale = new_colors
fig

CNV heatmaps#

Scale the cnv heatmap to an appropriate size#

The CNV heatmaps often contain too many genes or amplicons to fit in the default layout.
This is usually not an issue when they are interactive, but when they are exported as static images
it hinders the ability to interpret them.

Mosaic has the option to convert interactive plotly figures to static matplotlib figures.

# Scale the figure width and plot as a static image.
# Double click on the plot to zoom-in and improve the resolution

import missionbio.mosaic.utils as mutils

fig = sample.cnv.heatmap('ploidy', features='genes')
fig.layout.width = 1600
mutils.static_fig(fig, figsize=(20, 20))

Updating the colorscale#

If the color scale gets skewed toward high ploidy values, a maximum value can be imposed to generate a more interpretable heatmap.

The colorscale can also be changed as desired. A list of color scales can be found in the plotly documentation.

# The plots can also be smoothed using a moving average with the convolve parameter

fig = sample.cnv.heatmap('ploidy', features='genes', convolve=3)
fig
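
To give a feel for what smoothing with a window of 3 does, here is a generic centered moving average in plain Python. It illustrates the general technique only. How Mosaic implements `convolve` internally, including how it treats the edges, is not shown in this vignette, and `moving_average` is a hypothetical helper.

```python
def moving_average(values, window):
    """Centered moving average; edge positions average the values available."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# One noisy amplicon (7.0) surrounded by ploidy values of 1.0
ploidy = [1.0, 1.0, 7.0, 1.0, 1.0]
smoothed = moving_average(ploidy, 3)
print(smoothed)  # [1.0, 3.0, 3.0, 3.0, 1.0]
```

The single outlier is spread over its neighbors, which reduces amplicon-level noise at the cost of blurring sharp boundaries.
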
# Change the color scale to "magma" - other suitable options might be "viridis", "plasma", "blues", "blues_r"...
fig.layout.coloraxis.colorscale = 'magma'

# Update the separating lines to be black
for shape in fig.layout.shapes:
    shape.line.color = '#000000'

# Set the minimum value to 0 and maximum value of ploidy to 2
fig.layout.coloraxis.cmax = 2
fig.layout.coloraxis.cmin = 0

fig
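
Setting `cmin` and `cmax` does not alter the data. It only changes how values are mapped onto the colorscale, and values outside the range take the color at the nearest end. The sketch below mimics that mapping in plain Python. `colorscale_position` is a hypothetical helper, not a plotly function.

```python
def colorscale_position(value, cmin, cmax):
    """Return where `value` lands on the colorscale, from 0.0 to 1.0."""
    fraction = (value - cmin) / (cmax - cmin)
    # Values beyond the limits are pinned to the ends of the scale
    return min(1.0, max(0.0, fraction))


positions = [colorscale_position(v, cmin=0, cmax=2) for v in [0, 1, 2, 6]]
print(positions)  # [0.0, 0.5, 1.0, 1.0]
```

With `cmax = 2`, a ploidy of 6 is drawn in the same color as a ploidy of 2, so a few high values no longer compress the rest of the scale.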

Custom multi assay plot#

The large number of amplicons in the CNV assay can take over the sample-level heatmap and make the plot uninterpretable. Moreover, the panel might contain non-differentiating variants and proteins. These can be dropped before making the final heatmap.

# This method resets all the assays to their original values, before any filtering

sample.reset()
# Filter the CNV with amplicons only from the relevant genes

import numpy as np

genes = sample.cnv.col_attrs['gene_name'].copy()
relevant_ids = np.isin(genes, ['EZH2', 'TET2'])

sample.cnv = sample.cnv[:, relevant_ids]
fig = sample.heatmap(clusterby='dna', sortby='protein', flatten=False)

# Update the width of the plot [See the section on CNV heatmaps]
fig.layout.width = 1600

# Change the CNV colorscale [See the section on CNV heatmaps]
fig.data[2].zmax = 2
fig.data[2].zmin = 0
fig.data[2].colorscale = 'magma'

# Updating the ticktexts to show the gene names instead
fig.layout.xaxis3.ticktext = sample.cnv.col_attrs['gene_name'].copy()

# Show as a static plot
mutils.static_fig(fig, figsize=(20, 20))
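
The column filtering used above relies on a boolean mask built with `np.isin`. The mechanics can be seen on a small NumPy array, independent of Mosaic. The gene list and the matrix below are made-up placeholders, not values from the example dataset.

```python
import numpy as np

# One gene name per column (placeholder values)
genes = np.array(["EZH2", "EZH2", "TP53", "TET2", "KRAS"])

# Two rows (cells) by five columns (amplicons)
values = np.arange(10).reshape(2, 5)

# True where the column belongs to a gene of interest
mask = np.isin(genes, ["EZH2", "TET2"])

# Keep all rows, but only the masked columns
subset = values[:, mask]

print(mask)    # [ True  True False  True False]
print(subset)
```

Indexing with `[:, mask]` keeps every row and drops the columns where the mask is `False`, which is the same pattern as `sample.cnv[:, relevant_ids]`.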