# Plot customizations

<b>Objective</b>

This vignette contains various snippets of code<br>
that show how plots and data can be customized<br>
to one's requirements.

The h5 file used in this notebook can be found [here](https://github.com/MissionBio/mosaic-jupyter/tree/master/customizations).

In [None]:
import missionbio.mosaic as ms

sample = ms.load_example_dataset('3 cell mix')

All the interactive plotting functions return a [plotly figure](https://plotly.com/python/figure-structure/). If the layout or the color<br>
scheme is not suitable for your data, it can be changed before displaying the final figure.<br>

The colors for the plots are stored either in the individual traces or in the layout
attributes of the plotly figure.<br>

Mosaic also contains a list of colors that can be used to customize the plots.

In [None]:
# Plot the first few colors
import seaborn as sns
sns.palplot(ms.COLORS[:21])

### Reproducible UMAPs

UMAPs rely on an initial randomization, which leads to a different projection every time. To fix the projection, pass `random_state` to the `run_umap` method.

In [None]:
# Pass a random state to fix the umap

sample.dna.run_umap(attribute='AF', random_state=42)
sample.dna.scatterplot('umap', 'label')
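
The effect of a fixed seed can be illustrated without Mosaic. The sketch below is not UMAP itself. It uses Python's standard `random` module and a hypothetical `initial_layout` helper to show that a seeded generator yields the same starting coordinates on every run, which is the property `random_state` relies on.

```python
import random


def initial_layout(n_points, seed=None):
    """Return n_points random 2D starting coordinates (hypothetical helper)."""
    rng = random.Random(seed)  # A private generator; the global state is untouched
    return [(rng.random(), rng.random()) for _ in range(n_points)]


# The same seed reproduces the same layout
first = initial_layout(5, seed=42)
second = initial_layout(5, seed=42)
print(first == second)

# A different seed gives a different layout
other = initial_layout(5, seed=7)
print(first == other)
```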

### Updating the colors for the DNA plots

#### Heatmap and scatterplot

In [None]:
# The default color scale for NGT is a monochromatic blue color scheme.

fig = sample.dna.heatmap('NGT')
fig

In [None]:
# In case of the DNA heatmap and scatterplot the colors are
# stored in the layout.coloraxis.colorscale attribute.

# This value must be updated to customize the plot.

fig.layout.coloraxis.colorscale

In [None]:
# Assuming these are new desired colors
# NGT=0 (WT) - blue
# NGT=1 (HET) - orange
# NGT=2 (HOM) - red
# NGT=3 (missing) - black

wt_col = ms.COLORS[0]
het_col = ms.COLORS[1]
hom_col = ms.COLORS[2]
miss_col = ms.COLORS[20]

sns.palplot([wt_col, het_col, hom_col, miss_col])

In [None]:
# Update the coloraxis to make a plot with the new colors

new_colors = [(0 / 4, wt_col), (1 / 4, wt_col),
              (1 / 4, het_col), (2 / 4, het_col),
              (2 / 4, hom_col), (3 / 4, hom_col),
              (3 / 4, miss_col), (4 / 4, miss_col)]

fig.layout.coloraxis.colorscale = new_colors
fig
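
The list of `(position, color)` pairs above follows a regular pattern, so it can be generated for any number of categories. The following is a minimal sketch using a hypothetical `discrete_colorscale` helper (not part of Mosaic) and placeholder hex colors. Each color is repeated at both ends of its interval so that the scale changes in steps rather than as a gradient.

```python
def discrete_colorscale(colors):
    """Build a stepped colorscale: each color fills an equal share of [0, 1]."""
    n = len(colors)
    scale = []
    for i, color in enumerate(colors):
        # Repeating the color at both ends of its interval gives a hard edge
        scale.append((i / n, color))
        scale.append(((i + 1) / n, color))
    return scale


# Placeholder hex colors; in the notebook these would be wt_col, het_col, ...
example = discrete_colorscale(['#1f77b4', '#ff7f0e', '#d62728', '#000000'])
for stop in example:
    print(stop)
```

With four colors this produces the same eight stops as the hand-written `new_colors` list, at positions 0, 0.25, 0.5, 0.75 and 1.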

In [None]:
# The same method can be used to update scatterplots that are colored by NGT

fig = sample.dna.scatterplot('umap', colorby='NGT', features=sample.dna.ids()[:4])
fig.layout.coloraxis.colorscale = new_colors
fig

#### Label colors

Now the colors in the heatmap conflict with the colors of the labels. To resolve this, the label palette can be changed.

In [None]:
# This is the current palette

sample.dna.get_palette()

In [None]:
# Update this palette. It is not required to use the built-in colors.
# Any hexadecimal color can be passed.

new_palette = {
    'Jurkat': ms.COLORS[3],
    'KG-1': ms.COLORS[4],
    'Mixed': '#c7c7c7',  # Use hexadecimal colors
    'TOM-1': ms.COLORS[5]
}


sample.dna.set_palette(new_palette)
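
If a color is only available as an RGB triple, it can be converted to a hexadecimal string before it is added to the palette. This is a minimal sketch using a hypothetical `rgb_to_hex` helper, which is not part of Mosaic.

```python
def rgb_to_hex(r, g, b):
    """Convert 0-255 RGB components to a '#rrggbb' string (hypothetical helper)."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError(f'RGB component out of range: {value}')
    return f'#{r:02x}{g:02x}{b:02x}'


# The grey used for the 'Mixed' label
print(rgb_to_hex(199, 199, 199))  # prints #c7c7c7
```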

In [None]:
# Make the heatmap with the new colors

fig = sample.dna.heatmap('NGT')
fig.layout.coloraxis.colorscale = new_colors
fig

### CNV heatmaps

#### Scale the CNV heatmap to an appropriate size

Often the CNV heatmaps contain too many genes or amplicons to fit in the default layout.<br>
This is usually not an issue when they are interactive, but when they are exported as static images<br>
it hinders the ability to interpret them.

Mosaic has the option to convert interactive plotly figures to static [matplotlib](https://matplotlib.org/) figures.

In [None]:
# Scale the figure width and plot as a static image.
# Double click on the plot to zoom in and improve the resolution

import missionbio.mosaic.utils as mutils

fig = sample.cnv.heatmap('ploidy', features='genes')
fig.layout.width = 1600
mutils.static_fig(fig, figsize=(20, 20))

#### Updating the colorscale

If the color scale gets skewed toward high ploidy values, a maximum value can be imposed to generate a more interpretable heatmap.

The colorscale can also be changed as desired. A list of color scales can be found in the [plotly documentation](https://plotly.com/python/builtin-colorscales/).

In [None]:
# The plots can also be smoothed using a moving average with the convolve parameter

fig = sample.cnv.heatmap('ploidy', features='genes', convolve=3)
fig
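
To see what a window of 3 does to the values, here is a minimal sketch of a moving average in plain Python. How Mosaic treats the edges of each row is not described here, so the `moving_average` helper below is an assumption that averages complete windows only.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values (complete windows only)."""
    if window < 1 or window > len(values):
        raise ValueError('window must be between 1 and len(values)')
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# A single outlying value (the 8) is damped by its neighbours
ploidy = [2, 2, 8, 2, 2]
print(moving_average(ploidy, 3))  # [4.0, 4.0, 4.0]
```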

In [None]:
# Change the color scale to "magma" - other suitable options might be "viridis", "plasma", "blues", "blues_r"...
fig.layout.coloraxis.colorscale = 'magma'

# Update the separating lines to be black
for shape in fig.layout.shapes:
    shape.line.color = '#000000'

# Set the minimum ploidy on the color scale to 0 and the maximum to 2
fig.layout.coloraxis.cmax = 2
fig.layout.coloraxis.cmin = 0

fig
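
Setting `cmin` and `cmax` does not alter the data. Plotly maps values outside the range to the end colors of the scale. The sketch below mimics that mapping with a hypothetical `colorscale_position` helper, which is for illustration only and is not a plotly or Mosaic function.

```python
def colorscale_position(value, cmin, cmax):
    """Return where `value` lands on a colorscale, as a fraction in [0, 1]."""
    fraction = (value - cmin) / (cmax - cmin)
    # Values beyond the limits share the first or last color
    return min(max(fraction, 0.0), 1.0)


for ploidy in [0, 1, 2, 6]:
    print(ploidy, colorscale_position(ploidy, cmin=0, cmax=2))
```

A ploidy of 6 lands on the same position as a ploidy of 2, so a few extreme values no longer stretch the scale.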

### Custom multi assay plot

The number of amplicons in the CNV assay often dominates the sample-level heatmap, making the plot uninterpretable. Moreover, the panel might contain non-differentiating variants and proteins. These can be dropped before making the final heatmap.

In [None]:
# This method resets all the assays to the values they had before any filtering

sample.reset()

In [None]:
# Keep only the CNV amplicons from the relevant genes

import numpy as np

genes = sample.cnv.col_attrs['gene_name'].copy()
relevant_ids = np.isin(genes, ['EZH2', 'TET2'])

sample.cnv = sample.cnv[:, relevant_ids]

In [None]:
fig = sample.heatmap(clusterby='dna', sortby='protein', flatten=False)

# Update the width of the plot [See the section on CNV heatmaps]
fig.layout.width = 1600

# Change the CNV colorscale [See the section on CNV heatmaps]
fig.data[2].zmax = 2
fig.data[2].zmin = 0
fig.data[2].colorscale = 'magma'

# Update the tick text to show the gene names instead
fig.layout.xaxis3.ticktext = sample.cnv.col_attrs['gene_name'].copy()

# Show as a static plot
mutils.static_fig(fig, figsize=(20, 20))