gpmap.stats¶
Statistical utilities for unbiased estimators. v1 had two bugs (unbiased_var ignored its axis kwarg, plus dead coverage() stub). v2 honors axis and uses the Bessel c4 correction for unbiased standard deviation.
c4_correction¶
Bessel c4 factor for unbiased standard deviation estimation. Numerically stable via lgamma for every n.
n <= 1: returns1.0(no correction possible).n > 170: useslgammato avoid factorial overflow.
unbiased_std¶
Unbiased standard deviation: np.std(x, ddof=1) / c4(n) where n = x.shape[axis] or x.size if axis is None.
unbiased_var¶
Unbiased variance: unbiased_std(x, axis) ** 2.
v1 bug fixed
v1 had unbiased_var(x, axis=None) that always hardcoded axis=1 internally. v2 honors the axis kwarg correctly.
unbiased_sterror¶
Unbiased standard error: unbiased_std(x, axis) / sqrt(n).
corrected_std and corrected_sterror¶
def corrected_std(biased_std: float | np.ndarray, n_samples: int) -> float | np.ndarray
def corrected_sterror(biased_sterror: float | np.ndarray, n_samples: int) -> float | np.ndarray
Given a biased (ddof=0 style) estimator and the sample count, recover the c4-corrected unbiased estimator. corrected_sterror is a thin alias for corrected_std; both apply the same algebraic correction.
Use case¶
The simulator subclasses store per-genotype n_replicates. When you want to recover an unbiased per-genotype standard error from a biased measurement, pass it through corrected_std:
import numpy as np
from gpmap.stats import corrected_std
biased = np.std(samples, ddof=0)
unbiased = corrected_std(biased, n_samples=samples.size)
For an array of biased stds, the function broadcasts.