Changelog#
v3.6.0#
Release date: 2024-08-05
Added#
filter_somatic_variants() for automatic filtering of pathogenic somatic variants.
dna.Dna.assign_from_truth() to label the cells for a known set of clones.
protein.Protein.cluster_and_label() to find all protein clusters and label them based on the provided truth. This function can be used to identify novel cell types.
protein.Protein.label_sticky_cells() to mark cells which are likely to be sticky.
protein.Protein.assign_from_truth() to label the cells for a known set of cell types. By default, it labels the PBMC subtypes.
protein.Protein.truth() to convert cluster signatures to a truth that can be used for assign_from_truth().
Ability to pass an external control to compute_ploidy().
read_depth_dependence(), a plot to quickly visualize the need for and effectiveness of NSP normalization.
Option to load a subset of the assays in the h5 file.
No error is raised when h5 files with unknown assays are loaded.
Raise an error when the number of variants to annotate is more than 1,000. This is a safeguard to prevent incorrect API calls.
Varsome annotations are stored locally and will not be fetched again unless the local file is deleted.
copy parameter to get_attribute() to return a view of the data instead of a copy.
Option to pass order of labels to ridgeplot().
ADO score is formatted more conveniently in the workflows.variant_subclone_table.VariantSubcloneTable workflow. If it’s 0, then it’s shown as “-” and if <0.05 then it’s shown as “~0.0”.
Ability to rename a sample using rename().
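The ADO score display rule described above can be sketched as a small formatter. This is only an illustration, not the workflow's actual code: the function name format_ado_score is hypothetical, and showing all other scores with one decimal place is an assumption inferred from the “~0.0” notation.

```python
def format_ado_score(score: float) -> str:
    """Format an ADO score for display (hypothetical helper, not part of mosaic)."""
    if score == 0:
        # An exact zero is shown as a dash.
        return "-"
    if score < 0.05:
        # Small non-zero scores would round to 0.0, so they are marked as approximate.
        return "~0.0"
    # Assumption: every other score is shown with one decimal place.
    return f"{score:.1f}"


formatted = [format_ado_score(s) for s in (0, 0.01, 0.37)]
print(formatted)
```

Keeping the zero case separate from the small non-zero case lets a reader of the table distinguish "no dropout observed" from "dropout too small to show at this precision".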
Changed#
sample.Sample.name is now a property, and cannot be set. It returns a value according to the current sample_name metadata.
assay._Assay.title is now a property, and cannot be set. It returns a value according to the current sample_name metadata.
Behavior of default_label in assay._Assay.set_labels(). When default_label is None, only the labels of the provided barcodes are updated.
normalized_counts in compute_ploidy() is no longer used. The read_counts layer is used directly.
ANNOTATION_COLUMNS constant was moved to missionbio.annotation.constants.
Use pynndescent instead of scikit-learn to speed up nearest neighbors calculation during graph-community clustering. Results will not be backwards compatible.
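The read-only property behavior noted for sample.Sample.name and assay._Assay.title can be illustrated with a minimal sketch. The class SampleSketch and its metadata dictionary are hypothetical stand-ins, not the mosaic classes; the sketch only assumes that the name is derived from a sample_name metadata entry.

```python
class SampleSketch:
    """Hypothetical stand-in for a sample object; not the mosaic class."""

    def __init__(self, sample_name: str):
        # Assumption: the name lives in a metadata mapping under "sample_name".
        self.metadata = {"sample_name": sample_name}

    @property
    def name(self) -> str:
        # Computed on every access, so it always reflects the current metadata.
        return self.metadata["sample_name"]


sample = SampleSketch("patient-1")
sample.metadata["sample_name"] = "patient-1-relapse"
print(sample.name)

try:
    sample.name = "other"  # no setter is defined, so assignment fails
    assignment_failed = False
except AttributeError:
    assignment_failed = True
print("assignment rejected:", assignment_failed)
```

Code that previously assigned to the attribute directly would need to change the underlying metadata instead, for example through the rename() function listed under Added.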
Fixed#
Ordering of the barcodes in the heatmap when a subset of the variants are used.
Fetching of CNV amplicon gene names for regions where ensembl returns an incomplete response.
Allow custom grouping of amplicons for cnv.CNV.heatmap() by passing amplicons to features and x_groups values.
v3.4.0#
Release date: 2024-04-01
Added#
Support to pass x_groups to signaturemap() and heatmap().
Support to pass variant filters to load().
positions(), amplicon_performance(), and panel_uniformity() to quickly get amplicon positions, performance and panel uniformity.
Option to hide columns in the variants table of the VariantSubcloneTable workflow.
Ability to filter variants through the GUI in the VariantSubcloneTable workflow.
override parameter for the heatmap() function which is simply passed to clustered_ids and clustered_barcodes.
The first column of the subclone table is frozen.
Mandate features when x_groups is provided in heatmap().
An appropriate error is raised when any cell has 0 total reads when running NSP.
An appropriate error is raised when the annotation API is not available.
Changed#
Increased the vertical spacing between the graph and the fishplot from 0 to 0.1.
The plotting functions in missionbio.mosaic.plotting were moved to missionbio.plotting.
missionbio.algorithms.nsp was moved to missionbio.demultiplex.protein.nsp.
Unpinned scikit-learn and hdbscan as their latest versions are compatible with each other.
scikit-learn>1.3.1 is installed by default which results in slightly different NSP calls due to changes to its Gaussian mixture model.
Fixed#
Load the whitelist variants correctly when filter_variants=True is passed to load().
Null values of DANN score are shown as empty cells instead of º.
name_id_by_pos() does not filter the amplicons.
Lineplot in plot_ploidy() does not connect the medians with a line when using genes+amplicons or positions+amplicons.
The violin plot range is fixed to (0, 100) for the AF and GQ layers in VariantSubcloneTable.
Violin plots generated using violinplot() are equally spaced when split by labels.
Fix resetting of selected_bars when scatterplots are created.
rename_labels() allows swapping of labels.
Fishplot does not disappear when a clone and its parent both have 0 cells at some timepoint.
v3.1.1#
Release date: 2023-09-25
Added#
Relaxed missionbio.h5 requirement to >=4.13.0,<6
Changed#
Disable autouploading of tagged packages to anaconda.
Removed check for h5 file compatibility with H5Reader.
Fixed#
The whitelist option in load() correctly loads exact matches of variants.
v3.1.0#
Release date: 2023-09-13
Added#
The order of the names in the legend matches the order of the traces in the ridgeplot.
Option to pass any sequence type to get_attribute() besides np.ndarray. This includes list, tuple, and range.
features parameter to signature() which allows grouping across ids, just like splitby allows grouping across cells.
    The features option in signaturemap() allows plotting using grouped data from signature().
Support for hg38 along with all species available through Ensembl in get_annotations().
Support for hg38 in get_annotations().
Sped up NSP by 2x by using statsmodels for the KDE and using spherical covariance with kmeans++ initialization for the GMM parameters.
ANSP: Approximate NSP for protein normalization. It runs in constant time for large datasets.
get_attribute() also accepts dataframes.
heatmap() can plot arbitrary dataframes as long as they have the expected number of cells.
TreeGraph now supports html tags like <br>, <b>, and <span> in the descriptions.
Changed#
Use latest python 3.8 in installer instead of 3.8.0
Fixed#
The title of clone_vs_analyte() plot does not overlap with the DNA heatmap.
The x-axis label order for CNV in the clone_vs_analyte() plot matches the order of the points in the data shown.
NGT layer not modified after running filter_variants().
“Last modified” timestamp does not change when loading an H5 file.
jitter parameter in NSP works.
Failure of VariantSubcloneTable when all the variant calls are filtered.
Pinned hdbscan to v0.8.29. Higher versions (>=0.8.30,<=0.8.33) have runtime issues.
heatmap() and signaturemap() execute successfully when “cnv” is passed before “dna”.
Fix y-compression of TreeGraph by checking the upwards and downwards movement of only the highest and lowest nodes respectively.
Updated#
Switched from using the deprecated JupyterDash to the built-in Jupyter support in Dash v2.11. (Documentation)
jupyter_client from <8 to >=8.1.0 as the ThreadedZMQStream error is fixed in it. (Changelog)
v3.0.1#
Release date: 2023-06-20
Added#
assay._Assay.crosstab() to wrap pandas.crosstab for ease of use with mosaic.
assay._Assay.crosstabmap() to create heatmaps of the output of assay._Assay.crosstab().
assay._Assay.hierarchical_cluster() to get the hierarchical clustering order of the rows of a DataFrame.
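For readers unfamiliar with the function being wrapped, the following sketch shows plain pandas.crosstab on two label vectors. It does not call assay._Assay.crosstab(), whose signature is not described here; the clone and cell-type labels are made-up illustration data.

```python
import pandas as pd

# Made-up per-cell labels: one clone label and one cell-type label per cell.
clones = pd.Series(["A", "A", "B", "B", "B"], name="clone")
cell_types = pd.Series(["T", "B", "T", "T", "B"], name="cell_type")

# Count how many cells fall into each (clone, cell_type) combination.
table = pd.crosstab(clones, cell_types)
print(table)
```

The result is a DataFrame with one row per clone and one column per cell type, which is the kind of table a heatmap such as crosstabmap() would presumably visualize.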
Changed#
Updated matplotlib dependency from <=3.2.2 to >=3.4.0.
Fixed#
assay._Assay.heatmap() subclustering performed when convolve=0. It was disabled by default.
Custom typography.css used in workflows is included in the package data.
Setting labels using dictionaries in assay._Assay.set_labels().
v3.0.0#
Release date: 2023-06-16
Added#
A wrapper for COMPASS.
New variant filters that account for missing data.
Recipe and instructions for building installers.
plot_kind parameter to dna.Dna.group_by_genotype() to change the type of plot shown.
filter_cells to io.load() which loads only the intersection algorithm cells.
Progress bar to io.load().
algorithms.nsp.NSP and algorithms.nsp.ExpressionProfile to modularize the NSP code.
x_groups to assay._Assay.heatmap() to group the x-axis by a given list of ids.
Simplify and speed up assay._Assay.heatmap() by removing duplicate data. (By using plots.heatmap.Heatmap)
assay._Assay.convolve() to convolve the data that was earlier performed in the Heatmap.
Configuration options accessible via Config:
    ms.Config.Colorscale.Dna to change the default color palette for all DNA plots.
    ms.Config.Colorscale.Cnv to change the default color palette for all CNV plots.
    ms.Config.Colorscale.Protein to change the default color palette for all Protein plots.
Custom divergent colorscale for Cnv Ploidy heatmaps.
Option to return indices instead of barcodes in assay._Assay.clustered_barcodes().
sample.Sample.common_barcodes() to get the common barcodes across assays.
Add subcluster parameter to assay._Assay.clustered_barcodes() to prevent clustering within the labels.
Option to pass n-dimensional arrays as splitby in assay._Assay.clustered_barcodes().
Option to fetch a subset of the assays in sample.Sample.assays() using the names parameter.
sample.Sample.clustered_barcodes() to hierarchically cluster using multiple assays.
Multiple options added to sample.Sample.heatmap() to sort the assays, barcodes, and the features.
assay._Assay.signature() accepts a splitby parameter to get the signature for each unique label in splitby.
Improvements to assay._Assay.signaturemap():
    Labels and ids are clustered by default.
    Option to pass a list of labels to assay._Assay.signaturemap() to order the labels.
    The default features option for cnv.Cnv.signaturemap() is set to positions.
Option to copy the labels and palette together by passing an assay._Assay() to assay._Assay.set_labels().
assay._Assay.heatmap() sets subcluster=False when calculating the barcode order when convolve is provided.
Varsome URLs as hyperlinks on the variant name in the VariantSubcloneTable.
Add percentage of cells and amplicons present to the CopyNumberWorkflow.
dna.Dna.mutated_cells() to get the number of cells with at least 1 mutation in each given clone. This is used in sample.Sample.signaturemap().
Changed#
apply_filter changed to filter_variants in io.load().
SubcloneTree and SubcloneTreeGraph classes are renamed to Tree and TreeGraph respectively.
show_plot changed to return_plot in dna.Dna.group_by_genotype().
plots.heatmap.Heatmap splits the vertical and horizontal lines on the main heatmap into two traces.
The default value of vaf_het in dna.Dna.filter_variants() changed from 35 to 30.
Flattened sample.Sample.heatmap() option has been removed. A more customizable version is available under the sample.Sample.signaturemap() function.
The constant constants.COLORS was changed to have unique values:
    The grey values at the 10th, 20th, 30th, ... positions were modified to be unique.
    The black (#000000) value was moved from the 20th position to the last position.
Fixed#
Get indexes maintains the order as per find_list when there are duplicates in the find_list and order_using_find_list is True.
DANN score in the variants subclone table is shown correctly for saved h5 files.
Overlapping of text in phylogeny trees.
Error in multiprocessing when fetching gene_names for CNV by adding a max_workers parameter and using threads instead of processes.
Missing clone is ignored when finding ADO sisters.
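The gene-name fix above replaces processes with threads and exposes a max_workers limit. A minimal sketch of that general pattern with the standard library follows. The function fetch_gene_name is a hypothetical placeholder for a network lookup, and nothing here reflects how mosaic actually queries Ensembl.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_gene_name(region: str) -> str:
    """Hypothetical stand-in for an I/O-bound lookup of a gene name."""
    return f"gene_for_{region}"


def fetch_gene_names(regions, max_workers=4):
    # Threads suit I/O-bound lookups and avoid the pickling and process
    # start-up requirements that come with a process pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs.
        return list(pool.map(fetch_gene_name, regions))


names = fetch_gene_names(["chr1:100-200", "chr2:300-400"], max_workers=2)
print(names)
```

Capping max_workers also limits how many requests hit a remote service at once, which may matter when the service rate-limits clients.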
Removed#
Functions to convert legacy loom files to h5 files: io._loom_to_h5, io._update_file.
Functions to read data from csv files: io._merge_files, io._cnv_raw_counts, io._protein_raw_counts.
Function to merge h5 files: io._merge.
show_plot from protein.Protein.normalize_reads(). The same plot can be created in plotly using algorithms.nsp.NSP.plot().
show_plot from protein.Protein.get_signal_profile(). The same plot can be created in plotly using algorithms.nsp.ExpressionProfile.plot().
protein.Protein.get_signal_profile function. It can be executed using algorithms.nsp.ExpressionProfile.fit() if needed.
protein.Protein.get_scaling_factor function. It can be executed using algorithms.nsp.NSP.scaling_factor() if needed.