Hurdle Distribution#

Modifier

This is not a distribution per se, but a modifier of univariate distributions.

The hurdle distribution is a mixture that combines a point mass at zero with a base distribution truncated at zero, so the base distribution contributes only non-zero values. It models data produced by two processes: one that decides whether a value is zero and another that generates the non-zero values. Hurdle models are commonly used in econometrics for data with excess zeros, such as healthcare costs, insurance claims, and counts of events.
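The two-process description can be illustrated with a short simulation. This is a minimal sketch that uses only the Python standard library, not the PreliZ API; the helper name `sample_hurdle_exponential` and the choice of an Exponential base distribution are assumptions made for illustration.

```python
import random


def sample_hurdle_exponential(psi, rate, rng):
    """Draw one value: 0 with probability 1 - psi, else an Exponential(rate) draw."""
    if rng.random() < psi:
        # Second process: a non-zero value from the base distribution.
        return rng.expovariate(rate)
    # First process: the observation is a structural zero.
    return 0.0


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_hurdle_exponential(psi=0.7, rate=0.5, rng=rng) for _ in range(20_000)]

# The share of zeros should be close to 1 - psi = 0.3.
zero_fraction = sum(d == 0.0 for d in draws) / len(draws)
print(f"fraction of zeros: {zero_fraction:.3f}")
```

Because the Exponential distribution is continuous, essentially all zeros in the sample come from the first process, which is what distinguishes a hurdle model from the base distribution alone.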

Key properties and parameters#

Parameters:

  • dist : (PreliZ distribution) Univariate PreliZ distribution which will be truncated at zero.

  • psi : (float) Expected proportion of values that come from the base distribution, i.e. the non-zero values (0 < psi < 1)

Probability Density Function (PDF)#

Given a base distribution with parameters \(\theta\), cumulative distribution function (CDF), and probability density/mass function (PDF), the density of a Hurdle distribution is:

\[\begin{split} f(x \mid \psi, \theta) = \left\{ \begin{array}{ll} (1 - \psi), & \text{if } x = 0 \\ \psi \frac{\text{PDF}(x \mid \theta)} {1 - \text{CDF}(\epsilon \mid \theta)}, & \text{if } x \neq 0 \end{array} \right. \end{split}\]

where \(\psi\) is the expected proportion of the base distribution and \(\epsilon\) is the machine precision for continuous distributions or 0 for discrete ones.
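As a worked instance of the formula for a discrete base distribution, the sketch below evaluates a hurdle Poisson mass function with \(\epsilon = 0\), so that \(\text{CDF}(0 \mid \theta)\) equals the Poisson mass at zero. It is written in plain Python rather than with PreliZ, and the function names `poisson_pmf` and `hurdle_poisson_pmf` are hypothetical.

```python
import math


def poisson_pmf(x, mu):
    """Poisson probability mass at a non-negative integer x."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def hurdle_poisson_pmf(x, psi, mu):
    """Hurdle mass function with a Poisson base and epsilon = 0."""
    if x < 0:
        return 0.0
    if x == 0:
        # Point mass at zero.
        return 1.0 - psi
    # For a discrete base, CDF(0 | mu) is just the base mass at zero.
    base_cdf_at_zero = poisson_pmf(0, mu)
    return psi * poisson_pmf(x, mu) / (1.0 - base_cdf_at_zero)


psi, mu = 0.8, 2.0
# The masses should sum to 1; terms beyond x = 100 are negligible for mu = 2.
total = sum(hurdle_poisson_pmf(x, psi, mu) for x in range(100))
print(f"P(x = 0) = {hurdle_poisson_pmf(0, psi, mu):.3f}, total mass = {total:.6f}")
```

Dividing by \(1 - \text{CDF}(\epsilon \mid \theta)\) renormalizes the truncated base distribution, so the non-zero values carry total mass \(\psi\) and the zero carries \(1 - \psi\).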

The following figure shows the difference between a Gamma distribution and a HurdleGamma, with the same parameters for the base distribution (Gamma).

Hurdle Distribution PDF

Cumulative Distribution Function (CDF)#

\[\begin{split} F(x \mid \psi, \theta) = \left\{ \begin{array}{ll} 0, & \text{if } x < 0 \\ 1 - \psi, & \text{if } x = 0 \\ 1 - \psi + \psi \cdot \frac{\text{CDF}(x \mid \theta) - \text{CDF}(\epsilon \mid \theta)}{1 - \text{CDF}(\epsilon \mid \theta)}, & \text{if } x > 0 \end{array} \right. \end{split}\]

Hurdle Distribution CDF
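The piecewise CDF can be sketched for a continuous base distribution as follows. This example assumes an Exponential base and takes \(\epsilon\) to be `sys.float_info.epsilon`; both choices, as well as the function names, are illustrative assumptions and not taken from the PreliZ implementation.

```python
import math
import sys

# Assumption: "machine precision" is taken as the double-precision epsilon.
EPS = sys.float_info.epsilon


def exponential_cdf(x, rate):
    """CDF of an Exponential(rate) distribution."""
    return -math.expm1(-rate * x) if x > 0 else 0.0


def hurdle_exponential_cdf(x, psi, rate):
    """Hurdle CDF with an Exponential base truncated at EPS."""
    if x < 0:
        return 0.0
    if x == 0:
        # All of the mass at or below zero is the point mass.
        return 1.0 - psi
    base_at_eps = exponential_cdf(EPS, rate)
    truncated = (exponential_cdf(x, rate) - base_at_eps) / (1.0 - base_at_eps)
    # Clamp so that values between 0 and EPS never drop below the point mass.
    return 1.0 - psi + psi * max(0.0, truncated)


for x in (-1.0, 0.0, 2.0, 50.0):
    print(x, hurdle_exponential_cdf(x, psi=0.8, rate=0.5))
```

The CDF jumps from 0 to \(1 - \psi\) at zero and then rises toward 1 as the truncated base distribution accumulates its share \(\psi\) of the mass.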

See also

Related Distributions:

  • Mixture - A distribution modifier that combines two or more distributions with weights.

  • ZeroInflatedPoisson - A distribution that extends the Poisson distribution by adding a mechanism to account for excess zeros, often observed in count data.

  • Zero-Inflated Binomial Distribution - A distribution that combines the Binomial distribution with a zero-inflation component, useful for modeling count data with an excess of zeros.

  • Zero-Inflated Negative Binomial Distribution - A distribution that combines the Negative Binomial distribution with a zero-inflation component, modeling scenarios with overdispersion and an excess of zeros.
