Hurdle Distribution
This is not a distribution per se, but a modifier of univariate distributions.
The hurdle distribution is a two-part mixture that combines a point mass at zero with a base distribution truncated to exclude zero. The base distribution can be continuous or discrete. It models data produced by two processes: one that determines whether an observation is zero, and another that generates the non-zero values. Hurdle distributions are commonly used in econometrics to model data with excess zeros, such as healthcare costs, insurance claims, and counts of events.
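The two-process description can be sketched as a small sampler. This is an illustrative, standard-library-only sketch, not PreliZ's implementation: the function names `poisson_draw` and `sample_hurdle_poisson` are hypothetical, and a Poisson base distribution is assumed purely for simplicity.

```python
import math
import random


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate with Knuth's multiplication method."""
    limit = math.exp(-mu)
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def sample_hurdle_poisson(psi, mu, size, seed=0):
    """Hypothetical helper: draw `size` values from a hurdle-Poisson.

    With probability 1 - psi the value is zero (first process).
    With probability psi the value comes from a Poisson(mu) truncated
    to exclude zero (second process), obtained here by rejection.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        if rng.random() >= psi:
            draws.append(0)  # the observation does not clear the hurdle
            continue
        while True:  # reject zeros so every zero comes from the hurdle
            k = poisson_draw(rng, mu)
            if k > 0:
                draws.append(k)
                break
    return draws


draws = sample_hurdle_poisson(psi=0.7, mu=2.0, size=20000, seed=1)
zero_fraction = draws.count(0) / len(draws)
```

With these settings, `zero_fraction` should land close to `1 - psi`, because the truncated base distribution never produces a zero.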
Key properties and parameters
Parameters:
dist: (PreliZ distribution) Univariate PreliZ distribution which will be truncated at zero.
psi: (float) Expected proportion of the base distribution (0 < psi < 1)
Probability Density Function (PDF)
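Given a base distribution with parameters \(\theta\), cumulative distribution function (CDF), and probability density/mass function (PDF), the density of a Hurdle distribution is:
\[
f(x \mid \psi, \theta) =
\begin{cases}
1 - \psi & \text{if } x = 0 \\
\psi \, \dfrac{\text{PDF}(x \mid \theta)}{1 - \text{CDF}(\epsilon \mid \theta)} & \text{if } x \neq 0
\end{cases}
\]
where \(\psi\) is the expected proportion of the base distribution and \(\epsilon\) is the machine precision for continuous distributions and 0 for discrete ones.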
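As a minimal sketch of how such a density can be evaluated, the block below computes a hurdle probability mass function for a discrete base distribution, where \(\epsilon = 0\). It assumes the usual hurdle construction: mass \(1 - \psi\) at zero, and \(\psi\) times the base PDF renormalized by \(1 - \text{CDF}(\epsilon)\) elsewhere. The Poisson base and the names `poisson_pmf` and `hurdle_poisson_pmf` are illustrative assumptions and are not part of the PreliZ API.

```python
import math


def poisson_pmf(k, mu):
    """Probability mass of a Poisson(mu) base distribution at integer k >= 0."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def hurdle_poisson_pmf(k, psi, mu):
    """Hypothetical helper: hurdle-Poisson mass at integer k.

    Zero receives mass 1 - psi. Positive values receive psi times the
    base mass, divided by 1 - CDF(epsilon), the base mass left after
    zero is truncated away.
    """
    if k < 0:
        return 0.0
    if k == 0:
        return 1.0 - psi
    # For a discrete base, epsilon = 0, so CDF(epsilon) equals pmf(0).
    cdf_at_epsilon = poisson_pmf(0, mu)
    return psi * poisson_pmf(k, mu) / (1.0 - cdf_at_epsilon)


# The masses over the support should add up to one (up to truncation of the tail).
total_mass = sum(hurdle_poisson_pmf(k, psi=0.7, mu=2.0) for k in range(60))
```

The renormalization by `1.0 - cdf_at_epsilon` is what keeps the total probability at one: the positive part always carries exactly `psi`, regardless of how much mass the base distribution originally placed at zero.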
The following figure shows the difference between a Gamma distribution and a HurdleGamma, with the same parameters for the base distribution (Gamma).
Cumulative Distribution Function (CDF)
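As a sketch under stated assumptions, not necessarily the exact form PreliZ implements: for a base distribution with non-negative support, a point mass \(1 - \psi\) at zero, and the remaining mass \(\psi\) spread over the base distribution truncated at \(\epsilon\), accumulating the probability gives a CDF of the following form. The symbol \(F\) is introduced here only for illustration.

```latex
F(x \mid \psi, \theta) =
\begin{cases}
0 & \text{if } x < 0 \\
1 - \psi & \text{if } 0 \le x \le \epsilon \\
1 - \psi + \psi \, \dfrac{\text{CDF}(x \mid \theta) - \text{CDF}(\epsilon \mid \theta)}{1 - \text{CDF}(\epsilon \mid \theta)} & \text{if } x > \epsilon
\end{cases}
```

As \(x \to \infty\) the fraction tends to 1, so \(F\) tends to \((1 - \psi) + \psi = 1\), as a CDF should. For discrete base distributions \(\epsilon = 0\), so the middle case reduces to \(x = 0\).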
See also
Related Distributions:
Mixture - A distribution modifier that combines two or more distributions with weights.
ZeroInflatedPoisson - A distribution that extends the Poisson distribution by adding a mechanism to account for excess zeros, often observed in count data.
Zero-Inflated Binomial Distribution - A distribution that combines the Binomial distribution with a zero-inflation component, useful for modeling count data with an excess of zeros.
Zero-Inflated Negative Binomial Distribution - A distribution that combines the Negative Binomial distribution with a zero-inflation component, useful for modeling overdispersed count data with an excess of zeros.