spcal.dists

Distributions and related code.

spcal.dists.lognormal

Lognormal distribution.

spcal.dists.lognormal.cdf(x: ndarray, mu: float, sigma: float) → ndarray

Cumulative distribution function of a log-normal distribution.

Parameters:

x – x values
mu – mean of underlying normal distribution
sigma – shape parameter

Returns:

CDF at all x
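As an illustration of the formula behind this function, the log-normal CDF can be written as the normal CDF of log(x). The following is a minimal scalar sketch using only the standard library; the name lognormal_cdf is hypothetical, and this is not necessarily how spcal evaluates it (spcal works on arrays and has its own erf approximation).

```python
import math


def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a log-normal distribution at a single point x (illustrative)."""
    if x <= 0.0:
        return 0.0  # a log-normal variable is strictly positive
    # Standardise log(x), then apply the normal CDF written in terms of erf.
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# The median of a log-normal distribution is exp(mu), so this is close to 0.5.
print(lognormal_cdf(math.exp(1.5), mu=1.5, sigma=0.4))
```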

spcal.dists.lognormal.pdf(x: ndarray, mu: float, sigma: float) → ndarray

Probability density function of a log-normal distribution.

Parameters:

x – x values
mu – mean of underlying normal distribution
sigma – shape parameter

Returns:

PDF at all x

spcal.dists.lognormal.quantile(quantile: ndarray, mu: float, sigma: float) → ndarray

Quantile (inverse CDF) function of a log-normal distribution.

Parameters:

quantile – values at which to evaluate
mu – mean of underlying normal distribution
sigma – shape parameter

Returns:

value of the quantile function at each element of quantile
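The log-normal quantile is the exponential of the corresponding normal quantile. A minimal scalar sketch of that relationship, using statistics.NormalDist from the standard library in place of spcal's own standard_quantile (the name lognormal_quantile is hypothetical):

```python
import math
from statistics import NormalDist


def lognormal_quantile(q: float, mu: float, sigma: float) -> float:
    """Quantile of a log-normal distribution for 0 < q < 1 (illustrative)."""
    z = NormalDist().inv_cdf(q)  # standard normal quantile
    return math.exp(mu + sigma * z)  # shift, scale, then exponentiate


# The 0.5 quantile is the median, exp(mu).
print(lognormal_quantile(0.5, mu=1.0, sigma=0.5))
```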

spcal.dists.normal

Normal distribution.

spcal.dists.normal.cdf(x: ndarray, mu: float = 0.0, sigma: float = 1.0) → ndarray

Cumulative distribution function of a normal distribution.

Parameters:

x – values
mu – mean
sigma – standard deviation

Returns:

CDF at all x

spcal.dists.normal.erf(x: float | ndarray) → float | ndarray

Error function approximation.

The maximum error is 1.5e-7 [1].

Parameters:

x – value

Returns:

approximation of error function
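A maximum error of 1.5e-7 is the bound quoted for the well-known rational approximation of Abramowitz and Stegun (formula 7.1.26). Whether spcal uses exactly this formula is an assumption; the sketch below only shows what such an approximation looks like, with the hypothetical name erf_approx.

```python
import math

# Constants of Abramowitz and Stegun formula 7.1.26.
P = 0.3275911
A = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)


def erf_approx(x: float) -> float:
    """Approximate erf(x) with an absolute error of about 1.5e-7."""
    sign = -1.0 if x < 0.0 else 1.0  # erf is an odd function
    x = abs(x)
    t = 1.0 / (1.0 + P * x)
    # Horner evaluation of a1*t + a2*t**2 + ... + a5*t**5.
    poly = 0.0
    for a in reversed(A):
        poly = (poly + a) * t
    return sign * (1.0 - poly * math.exp(-x * x))


# Compare against the standard library implementation.
print(abs(erf_approx(0.5) - math.erf(0.5)))
```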

References

spcal.dists.normal.erfinv(x: float | ndarray) → float | ndarray

The inverse error function.

The maximum error is approximately 1.061e-9.

Parameters:

x – input, in the range -1 to 1

Returns:

inverse error function at x
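The inverse error function and the standard normal quantile are linked by the identity erfinv(x) = Φ⁻¹((x + 1) / 2) / √2. The sketch below illustrates that identity with the standard library; it is not spcal's approximation, and the name erfinv_via_quantile is hypothetical.

```python
import math
from statistics import NormalDist


def erfinv_via_quantile(x: float) -> float:
    """Inverse error function for -1 < x < 1, via the normal quantile."""
    # Map x from (-1, 1) to a probability in (0, 1), then rescale by sqrt(2).
    return NormalDist().inv_cdf((x + 1.0) / 2.0) / math.sqrt(2.0)


# Round trip: erf(erfinv(x)) should reproduce x.
print(math.erf(erfinv_via_quantile(0.3)))
```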

spcal.dists.normal.pdf(x: ndarray, mu: float = 0.0, sigma: float = 1.0) → ndarray

Probability density function of a normal distribution.

Parameters:

x – values
mu – mean
sigma – standard deviation

Returns:

PDF at all x

spcal.dists.normal.quantile(x: ndarray, mu: float = 0.0, sigma: float = 1.0) → ndarray

Quantile (inverse-CDF) function of a normal distribution.

Parameters:

x – quantile values
mu – mean
sigma – standard deviation

Returns:

quantile at all x

spcal.dists.normal.standard_quantile(p: float | ndarray) → float | ndarray

Approximation of the standard normal quantile.

The maximum error is 1.5e-9 [2].

Parameters:

p – quantile, in the range 0 to 1

Returns:

quantile of the standard normal at p

References

spcal.dists.poisson

Poisson distribution.

spcal.dists.poisson.cdf(k: ndarray, lam: float) → ndarray

Poisson cumulative distribution function.

\(\sum_{j=0}^{\lfloor k \rfloor} \text{PMF}(j, \lambda)\)

Parameters:

k – index values, integer
lam – expected rate of occurrences

Returns:

CDF at all k
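The sum above can be written out directly. A minimal scalar sketch, assuming the standard Poisson PMF λ^j e^{-λ} / j!; the names poisson_pmf and poisson_cdf are hypothetical and the direct sum is only practical for small k.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def poisson_cdf(k: float, lam: float) -> float:
    """Poisson probability of at most floor(k) events."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(j, lam) for j in range(math.floor(k) + 1))


# Non-integer k is floored, so both calls give the same value.
print(poisson_cdf(2, 2.0), poisson_cdf(2.7, 2.0))
```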

spcal.dists.poisson.pdf(k: ndarray, lam: float) → ndarray

Poisson probability mass function.

\(\frac{\lambda^k e^{-\lambda}}{k!}\)

Parameters:

k – index values, integer
lam – expected rate of occurrences

Returns:

PMF at all k

spcal.dists.poisson.quantile(q: float, lam: float) → int

Poisson quantile function.
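Under the conventional definition, the Poisson quantile is the smallest integer k whose CDF reaches q. The sketch below assumes that definition (the source does not spell it out) and accumulates the PMF term by term; the name poisson_quantile is hypothetical, and the loop is only suitable for moderate lam where exp(-lam) does not underflow.

```python
import math


def poisson_quantile(q: float, lam: float) -> int:
    """Smallest k with CDF(k) >= q, for 0 <= q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    k = 0
    pmf = math.exp(-lam)  # PMF at k = 0
    cdf = pmf
    # Stop once the CDF reaches q, or the PMF has underflowed to zero.
    while cdf < q and pmf > 0.0:
        k += 1
        pmf *= lam / k  # recurrence: PMF(k) = PMF(k - 1) * lam / k
        cdf += pmf
    return k


print(poisson_quantile(0.5, 1.0))
```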

spcal.dists.util

Compound-Poisson calculations.

spcal.dists.util.compound_poisson_lognormal_quantile_approximation(q: float, lam: float, mu: float, sigma: float) → float

Approximation of a compound Poisson-Lognormal quantile.

Calculates the zero-truncated quantile of the distribution by approximating the log-normal sum for each value k given by the Poisson distribution. The CDF is calculated for each log-normal and weighted by the Poisson PMF at k. The quantile is taken from the sum of the weighted CDFs.

The error is below 5% for lam < 100.0 and sigma < 0.5.

Parameters:

q – quantile
lam – mean of the Poisson distribution
mu – log mean of the log-normal distribution
sigma – log stddev of the log-normal distribution

Returns:

the q-th quantile of the compound Poisson-Lognormal
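The procedure described above can be sketched as follows. This is an illustrative, stdlib-only reading of the description, not spcal's implementation: the helper names, the Fenton-Wilkinson moment matching for each k, the cut-off k_max and the bisection used to invert the CDF are all assumptions.

```python
import math
from statistics import NormalDist


def fenton_wilkinson(n: int, mu: float, sigma: float) -> tuple[float, float]:
    """Log-normal (mu, sigma) matching the mean and variance of n iid sums."""
    var = math.log((math.exp(sigma**2) - 1.0) / n + 1.0)
    return math.log(n) + mu + 0.5 * (sigma**2 - var), math.sqrt(var)


def cpln_zero_truncated_cdf(x, lam, mu, sigma, k_max):
    """Poisson-weighted sum of log-normal CDFs, conditioned on k >= 1."""
    total = 0.0
    pmf = math.exp(-lam)  # Poisson PMF at k = 0
    for k in range(1, k_max + 1):
        pmf *= lam / k  # Poisson PMF at k
        mu_k, sigma_k = fenton_wilkinson(k, mu, sigma)
        total += pmf * NormalDist(mu_k, sigma_k).cdf(math.log(x))
    return total / (1.0 - math.exp(-lam))  # remove the mass at k = 0


def cpln_quantile_sketch(q: float, lam: float, mu: float, sigma: float) -> float:
    """Zero-truncated quantile found by bisection on the summed CDF."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must satisfy 0 < q < 1")
    k_max = int(lam + 10.0 * math.sqrt(lam) + 20.0)  # assumed cut-off
    lo, hi = 1e-12, 1.0
    while cpln_zero_truncated_cdf(hi, lam, mu, sigma, k_max) < q and hi < 1e300:
        hi *= 2.0  # grow the bracket until it contains the quantile
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cpln_zero_truncated_cdf(mid, lam, mu, sigma, k_max) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(cpln_quantile_sketch(0.95, lam=5.0, mu=1.0, sigma=0.4))
```

For very small lam almost every non-zero event is a single log-normal draw, so the result approaches the plain log-normal quantile.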

The quantile of a compound Poisson-Lognormal distribution.

Interpolates a lookup table built from a simulation of 1e10 zero-truncated values. The table spans lambda values from 0.01 to 100.0, sigma values from 0.25 to 0.95 and zero-truncated quantiles from 1e-3 to 1.0 - 1e-7. The maximum error is approximately 0.2%.

Parameters:

q – quantile
lam – mean of the Poisson distribution
mu – log mean of the log-normal distribution
sigma – log stddev of the log-normal distribution

Returns:

the q-th quantile of the compound Poisson-Lognormal

spcal.dists.util.extract_compound_poisson_lognormal_parameters(x: ndarray, mask: ndarray | None = None) → ndarray

Finds the parameters of compound-Poisson-lognormal distributed data, x.

\[\begin{split}N &\sim Poisson(\lambda) \\ X &\sim Lognormal(\mu, \sigma) \\ Y &= \sum_{n=1}^{N} X_{n}\end{split}\]

The value of \(\lambda\) is extracted using the fraction of zeros in x, \(P(0)\).

\[\lambda = -\log{P(0)}\]

The expected value and variance of the underlying lognormal are extracted from the mean and variance of x.

\[\begin{split}E(Y) &= \lambda E(X) \\ V(Y) &= \lambda E(X^2)\end{split}\]

Parameters \(\mu\) and \(\sigma\) are then extracted using the method of moments.

Parameters:

x – raw ICP-ToF signal of shape (samples, features)
mask – mask of valid values, defaults to all non-nan

Returns:

array of […, (lambda, mu, sigma)]
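The method-of-moments step can be worked through with the standard log-normal moments, E(X) = exp(μ + σ²/2) and E(X²) = exp(2μ + 2σ²), which give σ² = log(E(X²) / E(X)²) and μ = log E(X) - σ²/2. The sketch below applies the equations above to summary statistics rather than to an array x; the function name and its arguments are hypothetical, and spcal's handling of masks and multiple features is not reproduced.

```python
import math


def cpln_parameters_from_moments(
    p_zero: float, mean: float, variance: float
) -> tuple[float, float, float]:
    """Recover (lambda, mu, sigma) from P(0), E(Y) and V(Y)."""
    lam = -math.log(p_zero)  # lambda = -log P(0)
    ex = mean / lam  # E(X)   = E(Y) / lambda
    ex2 = variance / lam  # E(X^2) = V(Y) / lambda
    sigma2 = math.log(ex2 / ex**2)  # log-normal: E(X^2) / E(X)^2 = exp(sigma^2)
    mu = math.log(ex) - 0.5 * sigma2  # log-normal: E(X) = exp(mu + sigma^2 / 2)
    return lam, mu, math.sqrt(sigma2)


# Build exact moments from known parameters, then recover them.
lam, mu, sigma = 0.5, 1.0, 0.4
p_zero = math.exp(-lam)
mean = lam * math.exp(mu + sigma**2 / 2.0)
variance = lam * math.exp(2.0 * mu + 2.0 * sigma**2)
print(cpln_parameters_from_moments(p_zero, mean, variance))
```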

spcal.dists.util.extract_compound_poisson_lognormal_parameters_iterative(x: ndarray, alpha: float = 1e-05, dilation: int = 50, max_iters: int = 100, iter_eps: float = 0.01, bounds: ndarray | None = None) → tuple[float, float, float]

Finds the parameters of compound-Poisson-lognormal distributed data, x.

Parameters are found iteratively using extract_compound_poisson_lognormal_parameters: a threshold based on the current parameters is set, then the parameters are extracted again. This is repeated until either the threshold or both µ and σ no longer change. Parameters can be confined using the bounds argument, which is useful for reducing iterations in samples with many particles. By default only σ is bounded, between 0.2 and 1.0.

Parameters:

x – data
alpha – alpha value to use during thresholding
dilation – number of points to remove around detected peaks
max_iters – maximum number of iterations
iter_eps – smallest change in threshold allowed
bounds – array of shape (3, 2) of parameter bounds

Returns:

lam, mu, sigma

spcal.dists.util.sum_iid_lognormals(n: int | ndarray, mu: float, sigma: float, method: str = 'Fenton-Wilkinson') → tuple[float | ndarray, float | ndarray]

Sum of n independent, identically distributed log-normal random variables.

The sum is approximated by another log-normal distribution, defined by the returned parameters. By default, the Fenton-Wilkinson approximation is used for good right-tail accuracy [3].

Parameters:

n – int or array of ints
mu – log mean of the underlying distributions
sigma – log stddev of the underlying distributions
method – approximation to use, ‘Fenton-Wilkinson’ or ‘Lo’

Returns:

mu, sigma of the log-normal approximation
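The Fenton-Wilkinson approximation chooses the log-normal whose mean and variance equal those of the sum. For n iid terms this gives σ_s² = log((exp(σ²) - 1) / n + 1) and μ_s = log(n) + μ + (σ² - σ_s²) / 2. A minimal scalar sketch of that moment matching, with the hypothetical name fenton_wilkinson_sum; the 'Lo' method and array inputs are not covered, and spcal's own code may differ in detail.

```python
import math


def fenton_wilkinson_sum(n: int, mu: float, sigma: float) -> tuple[float, float]:
    """Log-normal parameters approximating the sum of n iid log-normals."""
    sigma2_sum = math.log((math.exp(sigma**2) - 1.0) / n + 1.0)
    mu_sum = math.log(n) + mu + 0.5 * (sigma**2 - sigma2_sum)
    return mu_sum, math.sqrt(sigma2_sum)


# The approximating log-normal is narrower in log space than a single term.
print(fenton_wilkinson_sum(10, mu=1.0, sigma=0.5))
```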

References

spcal.dists.util.zero_trunc_quantile(lam: ndarray | float, y: ndarray | float) → ndarray | float

Returns the zero-truncated Poisson quantile.

Parameters:

lam – Poisson rate parameter(s)
y – quantile(s) of the non-truncated distribution

Returns:

quantile(s) of the zero-truncated dist
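One plausible reading of this conversion, stated here as an assumption rather than as spcal's code: a Poisson distribution has probability exp(-λ) at zero, so removing that mass and renormalising maps a quantile y to (y - exp(-λ)) / (1 - exp(-λ)), clipped at zero. The name zero_trunc_quantile_sketch is hypothetical.

```python
import math


def zero_trunc_quantile_sketch(lam: float, y: float) -> float:
    """Map a quantile of the full distribution to the zero-truncated one."""
    p0 = math.exp(-lam)  # Poisson probability of zero events
    # Quantiles at or below p0 fall inside the removed mass, so clip to 0.
    return max((y - p0) / (1.0 - p0), 0.0)


print(zero_trunc_quantile_sketch(1.0, 0.9))
```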