Protein.ga

missionbio.mosaic.protein.Protein.ga

Protein.ga(attribute: Union[str, ndarray, DataFrame], constraint: Optional[str] = None, features: Optional[Sequence] = None, copy: bool = True) -> DataFrame

Retrieve any attribute in the assay.

Returns a pd.DataFrame that is a row attribute, a column attribute, a layer, or the passed attribute itself.

Parameters:
attribute:

The attribute to be searched for in the assay. If it is a DataFrame, a copy of the DataFrame is returned with only the columns corresponding to the features.

features:

If the attribute is a layer or a column attribute, the subset of features to select.

constraint:
One of the following is accepted.
  1. None

    Constraint is auto-determined.

  2. ‘row’ / ‘r’

    The first dimension must be equal to the number of cells.

  3. ‘col’ / ‘c’

    The second dimension must be equal to the number of ids or the number of features given in the input.

  4. ‘row+col’ / ‘rc’

    The dimension must be exactly (number of cells, number of ids). The layers have this shape.
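
The shape rules above can be sketched as a small check. This is only an illustration: `satisfies` and `ALIASES` are hypothetical names, not part of missionbio.mosaic, and the sketch assumes the attribute is already two-dimensional and that no `features` subset was given.

```python
# Hypothetical helper for illustration; not part of missionbio.mosaic.
ALIASES = {"r": "row", "c": "col", "rc": "row+col"}


def satisfies(shape, constraint, n_cells, n_ids):
    """Return True if a 2D shape meets the documented constraint."""
    constraint = ALIASES.get(constraint, constraint)
    rows, cols = shape
    if constraint == "row":
        # First dimension must equal the number of cells.
        return rows == n_cells
    if constraint == "col":
        # Second dimension must equal the number of ids.
        return cols == n_ids
    if constraint == "row+col":
        # Both dimensions must match, as they do for layers.
        return (rows, cols) == (n_cells, n_ids)
    raise ValueError(f"Unknown constraint: {constraint!r}")


# An assay with 100 cells and 10 ids.
print(satisfies((100, 10), "rc", 100, 10))  # layer-shaped array
print(satisfies((100, 1), "r", 100, 10))    # one value per cell
print(satisfies((100, 1), "c", 100, 10))    # second dimension is 1, not 10
```

The first two calls satisfy their constraints; the third does not, which is the situation in which the real method would attempt a reshape or raise.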

copy: bool

Whether to return a copy of the attribute or not. Default is True. If copy is False, modifying the values of the returned dataframe also modifies the assay.

Returns:
pd.DataFrame

The array of the attribute with the given name found in the assay layer, row attributes, or col attributes in that order. If a constraint is provided, the array is reshaped appropriately if possible, otherwise the best possible constraint for the given input is determined. The columns and index are named based on the barcodes, ids, or sequential integers depending on the constraint.
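
The lookup order described above (layers first, then row attributes, then column attributes) can be pictured with plain dictionaries. This is a sketch under assumptions: `find_attribute` and the three dictionaries are hypothetical stand-ins, and the attribute names and values are made up for illustration.

```python
# Hypothetical stand-ins for an assay's storage; names are illustrative only.
def find_attribute(name, layers, row_attrs, col_attrs):
    """Search layers, then row attributes, then column attributes."""
    stores = (("layer", layers), ("row", row_attrs), ("col", col_attrs))
    for kind, store in stores:
        if name in store:
            return kind, store[name]
    # Mirrors the documented ValueError for a missing attribute.
    raise ValueError(f"Attribute {name!r} not found")


layers = {"counts": [[1, 2], [3, 4]], "label": [[0, 0], [0, 0]]}
row_attrs = {"barcode": ["cell-1", "cell-2"], "label": ["a", "b"]}
col_attrs = {"id": ["feature-1", "feature-2"]}

print(find_attribute("barcode", layers, row_attrs, col_attrs))
# "label" exists in both layers and row_attrs; the layer is found first.
print(find_attribute("label", layers, row_attrs, col_attrs)[0])
```

A name that appears in more than one store resolves to the earliest store in the order, which is why the second call reports a layer.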

Raises:
ValueError

When the attribute is not found or when the constraint is not satisfied.

TypeError

When the attribute is neither a str nor an np.ndarray.

Notes

If the array can be reshaped into the shape that the constraint expects, no error is raised. For example, an assay with 100 barcodes and 10 ids has shape (100, 10). Fetching the attribute 'barcode' with the constraint 'col' does not raise an error; it returns a 10x10 dataframe of barcodes.

>>> import missionbio.mosaic as ms
>>> assay = ms.load_example_dataset("3 cell mix").dna[:100, :10]
>>> assay.shape
(100, 10)
>>> attr = assay.get_attribute('barcode', constraint='col')
>>> attr.shape
(10, 10)

Possible expected behavior

>>> assay = assay[:, :9]
>>> assay.shape
(100, 9)
>>> attr = assay.get_attribute('barcode', constraint='col')
ValueError - 'The given attribute does not have the expected shape nor could be reshaped appropriately.'
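
The difference between the two cases above comes down to arithmetic: 100 values fold evenly into rows of 10, but not into rows of 9. The sketch below assumes the reshape is a plain fold of the flat values into rows whose length equals the number of ids; `reshape_for_col` is a hypothetical helper and the library's actual logic may differ.

```python
# Hypothetical helper; assumes a simple fold into rows of length n_ids.
def reshape_for_col(n_values, n_ids):
    """Return the shape obtained by folding n_values into rows of n_ids."""
    if n_values % n_ids != 0:
        raise ValueError(
            "The given attribute does not have the expected shape "
            "nor could be reshaped appropriately."
        )
    return (n_values // n_ids, n_ids)


print(reshape_for_col(100, 10))  # (10, 10)

try:
    reshape_for_col(100, 9)
except ValueError as err:
    print("ValueError:", err)
```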

