Truth.ewm

Contents

Truth.ewm#

missionbio.demultiplex.dna.truth.Truth.ewm

Truth.ewm(com: float | None = None, span: float | None = None, halflife: float | TimedeltaConvertibleTypes | None = None, alpha: float | None = None, min_periods: int | None = 0, adjust: bool_t = True, ignore_na: bool_t = False, axis: Axis = 0, times: str | np.ndarray | DataFrame | Series | None = None, method: str = 'single') ExponentialMovingWindow#

Provide exponentially weighted (EW) calculations.

Exactly one of com, span, halflife, or alpha must be provided if times is not provided. If times is provided, halflife and one of com, span or alpha may be provided.

Parameters:
comfloat, optional

Specify decay in terms of center of mass

\(\alpha = 1 / (1 + com)\), for \(com \geq 0\).

spanfloat, optional

Specify decay in terms of span

\(\alpha = 2 / (span + 1)\), for \(span \geq 1\).

halflifefloat, str, timedelta, optional

Specify decay in terms of half-life

\(\alpha = 1 - \exp\left(-\ln(2) / halflife\right)\), for \(halflife > 0\).

If times is specified, a timedelta convertible unit over which an observation decays to half its value. Only applicable to mean(), and halflife value will not apply to the other functions.

New in version 1.1.0.

alphafloat, optional

Specify smoothing factor \(\alpha\) directly

\(0 < \alpha \leq 1\).

min_periodsint, default 0

Minimum number of observations in window required to have a value; otherwise, result is np.nan.

adjustbool, default True

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings (viewing EWMA as a moving average).

  • When adjust=True (default), the EW function is calculated using weights \(w_i = (1 - \alpha)^i\). For example, the EW moving average of the series [\(x_0, x_1, ..., x_t\)] would be:

\[y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ... + (1 - \alpha)^t x_0}{1 + (1 - \alpha) + (1 - \alpha)^2 + ... + (1 - \alpha)^t}\]
  • When adjust=False, the exponentially weighted function is calculated recursively:

\[\begin{split}\begin{split} y_0 &= x_0\\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t, \end{split}\end{split}\]
ignore_nabool, default False

Ignore missing values when calculating weights.

  • When ignore_na=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

  • When ignore_na=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

axis{0, 1}, default 0

If 0 or 'index', calculate across the rows.

If 1 or 'columns', calculate across the columns.

For Series this parameter is unused and defaults to 0.

timesstr, np.ndarray, Series, default None

New in version 1.1.0.

Only applicable to mean().

Times corresponding to the observations. Must be monotonically increasing and datetime64[ns] dtype.

If 1-D array like, a sequence with the same shape as the observations.

Deprecated since version 1.4.0: If str, the name of the column in the DataFrame representing the times.

methodstr {‘single’, ‘table’}, default ‘single’

New in version 1.4.0.

Execute the rolling operation per single column or row ('single') or over the entire object ('table').

This argument is only implemented when specifying engine='numba' in the method call.

Only applicable to mean()

Returns:
ExponentialMovingWindow subclass

See also

rolling

Provides rolling window calculations.

expanding

Provides expanding transformations.

Notes

See Windowing Operations for further usage details and examples.

Examples

>>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})
>>> df
     B
0  0.0
1  1.0
2  2.0
3  NaN
4  4.0
>>> df.ewm(com=0.5).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
>>> df.ewm(alpha=2 / 3).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213

adjust

>>> df.ewm(com=0.5, adjust=True).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
>>> df.ewm(com=0.5, adjust=False).mean()
          B
0  0.000000
1  0.666667
2  1.555556
3  1.555556
4  3.650794

ignore_na

>>> df.ewm(com=0.5, ignore_na=True).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.225000
>>> df.ewm(com=0.5, ignore_na=False).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213

times

Exponentially weighted mean with weights calculated with a timedelta halflife relative to times.

>>> times = ['2020-01-01', '2020-01-03', '2020-01-10', '2020-01-15', '2020-01-17']
>>> df.ewm(halflife='4 days', times=pd.DatetimeIndex(times)).mean()
          B
0  0.000000
1  0.585786
2  1.523889
3  1.523889
4  3.233686

< Class Truth