Protein.signature

Protein.signature(attribute: Union[str, numpy.ndarray], kind: str = 'median', nan_value: Optional[float] = None) -> pandas.core.frame.DataFrame

The chosen signature for each cluster and feature.

Generate feature signatures for each cluster/feature pair across all barcodes using the supplied assay and layer.

Parameters
attribute : Union[str, np.ndarray]

Name of the layer or row attribute to be evaluated. Uses _Assay.get_attribute() constrained by row to retrieve the values.

kind : ["median", "mode", "std", "mean"]

The kind of signature to return. Defaults to "median".

nan_value : Optional[float]

The value in the matrix that is to be converted to NaN. NaN values are removed before the signatures are calculated.

Returns
pd.DataFrame

The index holds the clusters and the columns hold the features.
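The effect of nan_value and the layout of the returned frame can be imitated with plain NumPy and pandas. The following is a minimal sketch under stated assumptions, not Mosaic's implementation: the toy matrix, the cluster labels, and the feature names var1 and var2 are all hypothetical, and rows are assumed to be barcodes.

```python
import numpy as np
import pandas as pd

# Hypothetical toy layer: 5 barcodes (rows) x 2 features (columns).
# The value 3 stands in for "missing", as in the NGT example.
layer = np.array(
    [
        [0, 3],
        [1, 2],
        [3, 2],
        [2, 3],
        [2, 1],
    ],
    dtype=float,
)
labels = np.array(["A", "A", "A", "B", "B"])  # hypothetical cluster label per barcode
features = ["var1", "var2"]                    # hypothetical feature ids

# Step 1: convert every occurrence of nan_value to NaN.
nan_value = 3
masked = np.where(layer == nan_value, np.nan, layer)

# Step 2: compute the per-cluster median; pandas skips NaN by default.
toy = pd.DataFrame(masked, columns=features).groupby(labels).median()
print(toy)
```

The resulting frame has one row per cluster and one column per feature, which matches the layout described under Returns.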

Notes

  1. The signature of values that are all NaN is NaN.

  2. The median of an even number of values is the average of the two middle values.

  3. When there are multiple modes, the lowest value is returned.

  4. The standard deviation of a single point is NaN.
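The four rules above can be reproduced on a single cluster/feature vector with pandas. This is an illustrative sketch only: toy_signature is a hypothetical helper, not part of Mosaic, and the use of the sample standard deviation (ddof=1) is an assumption chosen because it yields NaN for a single point.

```python
import numpy as np
import pandas as pd


def toy_signature(values, kind="median"):
    """Hypothetical stand-in for one cluster/feature cell; not Mosaic code."""
    s = pd.Series(values, dtype=float).dropna()  # NaN values are removed first
    if s.empty:
        return np.nan                 # rule 1: all NaN gives NaN
    if kind == "median":
        return s.median()             # rule 2: even count averages the middle pair
    if kind == "mean":
        return s.mean()
    if kind == "mode":
        return s.mode().min()         # rule 3: ties resolve to the lowest value
    if kind == "std":
        return s.std()                # rule 4: ddof=1 (assumed) gives NaN for one point
    raise ValueError(f"unknown kind: {kind!r}")


print(toy_signature([np.nan, np.nan]))
print(toy_signature([1, 2, 3, 4]))
print(toy_signature([2, 2, 1, 1, 3], kind="mode"))
print(toy_signature([5.0], kind="std"))
```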

Examples

To remove all missing NGT values before calculating the median NGT, call:

>>> import missionbio.mosaic as ms
>>> sample = ms.load_example_dataset("3 cell mix")
>>> sample.dna.signature("NGT", nan_value=3)

To compute the standard deviation of AF where DP is not 0:

>>> sample.dna.signature("AF_MISSING", kind="std", nan_value=-50)