Distribution Parameter(s) Natural parameter(s) Inverse parameter mapping Base measure $ h(x) $ Sufficient statistic $ T(x) $ Log-partition $ A(\boldsymbol\eta) $ Log-partition $ A(\boldsymbol\theta) $
Bernoulli distribution p $ \ln\frac{p}{1-p} $ $ \frac{1}{1+e^{-\eta}} = \frac{e^\eta}{1+e^{\eta}} $ $ 1 $ $ x $ $ \ln (1+e^{\eta}) $ $ -\ln (1-p) $
binomial distribution
with known number of trials n
p $ \ln\frac{p}{1-p} $ $ \frac{1}{1+e^{-\eta}} = \frac{e^\eta}{1+e^{\eta}} $ $ {n \choose x} $ $ x $ $ n \ln (1+e^{\eta}) $ $ -n \ln (1-p) $
Poisson distribution λ $ \ln\lambda $ $ e^\eta $ $ \frac{1}{x!} $ $ x $ $ e^{\eta} $ $ \lambda $
negative binomial distribution
with known number of failures r
p $ \ln p $ $ e^\eta $ $ {x+r-1 \choose x} $ $ x $ $ -r \ln (1-e^{\eta}) $ $ -r \ln (1-p) $
exponential distribution λ $ -\lambda $ $ -\eta $ $ 1 $ $ x $ $ -\ln(-\eta) $ $ -\ln\lambda $
Pareto distribution
with known minimum value xₘ
α $ -\alpha-1 $ $ -1-\eta $ $ 1 $ $ \ln x $ $ -\ln (-1-\eta) + (1+\eta) \ln x_{\mathrm m} $ $ -\ln \alpha - \alpha \ln x_{\mathrm m} $
Weibull distribution
with known shape k
λ $ -\frac{1}{\lambda^k} $ $ (-\eta)^{-\frac{1}{k}} $ $ x^{k-1} $ $ x^k $ $ -\ln(-\eta) -\ln k $ $ k\ln\lambda -\ln k $
Laplace distribution
with known mean μ
b $ -\frac{1}{b} $ $ -\frac{1}{\eta} $ $ 1 $ $ |x-\mu| $ $ \ln\left(-\frac{2}{\eta}\right) $ $ \ln 2b $
chi-squared distribution ν $ \frac{\nu}{2}-1 $ $ 2(\eta+1) $ $ e^{-\frac{x}{2}} $ $ \ln x $ $ \ln \Gamma(\eta+1)+(\eta+1)\ln 2 $ $ \ln \Gamma\left(\frac{\nu}{2}\right)+\frac{\nu}{2}\ln 2 $
normal distribution
with known variance σ²
μ $ \frac{\mu}{\sigma} $ $ \sigma\eta $ $ \frac{e^{-\frac{x^2}{2\sigma^2}}}{\sqrt{2\pi}\sigma} $ $ \frac{x}{\sigma} $ $ \frac{\eta^2}{2} $ $ \frac{\mu^2}{2\sigma^2} $
normal distribution μ,σ² $ \begin{bmatrix} \dfrac{\mu}{\sigma^2} \\[10pt] -\dfrac{1}{2\sigma^2} \end{bmatrix} $ $ \begin{bmatrix} -\dfrac{\eta_1}{2\eta_2} \\[15pt] -\dfrac{1}{2\eta_2} \end{bmatrix} $ $ \frac{1}{\sqrt{2\pi}} $ $ \begin{bmatrix} x \\ x^2 \end{bmatrix} $ $ -\frac{\eta_1^2}{4\eta_2} - \frac12\ln(-2\eta_2) $ $ \frac{\mu^2}{2\sigma^2} + \ln \sigma $
lognormal distribution μ,σ² $ \begin{bmatrix} \dfrac{\mu}{\sigma^2} \\[10pt] -\dfrac{1}{2\sigma^2} \end{bmatrix} $ $ \begin{bmatrix} -\dfrac{\eta_1}{2\eta_2} \\[15pt] -\dfrac{1}{2\eta_2} \end{bmatrix} $ $ \frac{1}{\sqrt{2\pi}x} $ $ \begin{bmatrix} \ln x \\ (\ln x)^2 \end{bmatrix} $ $ -\frac{\eta_1^2}{4\eta_2} - \frac12\ln(-2\eta_2) $ $ \frac{\mu^2}{2\sigma^2} + \ln \sigma $
inverse Gaussian distribution μ,λ $ \begin{bmatrix} -\dfrac{\lambda}{2\mu^2} \\[15pt] -\dfrac{\lambda}{2} \end{bmatrix} $ $ \begin{bmatrix} \sqrt{\dfrac{\eta_2}{\eta_1}} \\[15pt] -2\eta_2 \end{bmatrix} $ $ \frac{1}{\sqrt{2\pi}x^{\frac{3}{2}}} $ $ \begin{bmatrix} x \\[5pt] \dfrac{1}{x} \end{bmatrix} $ $ -2\sqrt{\eta_1\eta_2} -\frac12\ln(-2\eta_2) $ $ -\frac{\lambda}{\mu} -\frac12\ln\lambda $
gamma distribution α,β $ \begin{bmatrix} \alpha-1 \\ -\beta \end{bmatrix} $ $ \begin{bmatrix} \eta_1+1 \\ -\eta_2 \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x \\ x \end{bmatrix} $ $ \ln \Gamma(\eta_1+1)-(\eta_1+1)\ln(-\eta_2) $ $ \ln \Gamma(\alpha)-\alpha\ln\beta $
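gamma distribution k,θ $ \begin{bmatrix} k-1 \\[5pt] -\dfrac{1}{\theta} \end{bmatrix} $ $ \begin{bmatrix} \eta_1+1 \\[5pt] -\dfrac{1}{\eta_2} \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x \\ x \end{bmatrix} $ $ \ln \Gamma(\eta_1+1)-(\eta_1+1)\ln(-\eta_2) $ $ \ln \Gamma(k)+k\ln\theta $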
inverse gamma distribution α,β $ \begin{bmatrix} -\alpha-1 \\ -\beta \end{bmatrix} $ $ \begin{bmatrix} -\eta_1-1 \\ -\eta_2 \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x \\ \frac{1}{x} \end{bmatrix} $ $ \ln \Gamma(-\eta_1-1)-(-\eta_1-1)\ln(-\eta_2) $ $ \ln \Gamma(\alpha)-\alpha\ln\beta $
scaled inverse chi-squared distribution ν,σ² $ \begin{bmatrix} -\dfrac{\nu}{2}-1 \\[10pt] -\dfrac{\nu\sigma^2}{2} \end{bmatrix} $ $ \begin{bmatrix} -2(\eta_1+1) \\[10pt] \dfrac{\eta_2}{\eta_1+1} \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x \\ \frac{1}{x} \end{bmatrix} $ $ \ln \Gamma(-\eta_1-1)-(-\eta_1-1)\ln(-\eta_2) $ $ \ln \Gamma\left(\frac{\nu}{2}\right)-\frac{\nu}{2}\ln\frac{\nu\sigma^2}{2} $
beta distribution α,β $ \begin{bmatrix} \alpha - 1 \\ \beta - 1 \end{bmatrix} $ $ \begin{bmatrix} \eta_1 + 1 \\ \eta_2 + 1 \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x \\ \ln (1-x) \end{bmatrix} $ $ \ln \Gamma(\eta_1+1) + \ln \Gamma(\eta_2+1) - \ln \Gamma(\eta_1+\eta_2+2) $ $ \ln \Gamma(\alpha) + \ln \Gamma(\beta) - \ln \Gamma(\alpha+\beta) $
multivariate normal distribution μ,Σ $ \begin{bmatrix} \boldsymbol\Sigma^{-1}\boldsymbol\mu \\[5pt] -\frac12\boldsymbol\Sigma^{-1} \end{bmatrix} $ $ \begin{bmatrix} -\frac12\boldsymbol\eta_2^{-1}\boldsymbol\eta_1 \\[5pt] -\frac12\boldsymbol\eta_2^{-1} \end{bmatrix} $ $ (2\pi)^{-\frac{k}{2}} $ $ \begin{bmatrix} \mathbf{x} \\[5pt] \mathbf{x}\mathbf{x}^\mathrm{T} \end{bmatrix} $ $ -\frac{1}{4}\boldsymbol\eta_1^{\rm T}\boldsymbol\eta_2^{-1}\boldsymbol\eta_1 - \frac12\ln\left|-2\boldsymbol\eta_2\right| $ $ \frac12\boldsymbol\mu^{\rm T}\boldsymbol\Sigma^{-1}\boldsymbol\mu + \frac12 \ln |\boldsymbol\Sigma| $
categorical distribution (variant 1) p₁,...,pₖ

where $ \textstyle\sum_{i=1}^k p_i=1 $
$ \begin{bmatrix} \ln p_1 \\ \vdots \\ \ln p_k \end{bmatrix} $ $ \begin{bmatrix} e^{\eta_1} \\ \vdots \\ e^{\eta_k} \end{bmatrix} $

where $ \textstyle\sum_{i=1}^k e^{\eta_i}=1 $
$ 1 $ $ \begin{bmatrix} [x=1] \\ \vdots \\ {[x=k]} \end{bmatrix} $ $ 0 $ $ 0 $
categorical distribution (variant 2) p₁,...,pₖ

where $ \textstyle\sum_{i=1}^k p_i=1 $
$ \begin{bmatrix} \ln p_1+C \\ \vdots \\ \ln p_k+C \end{bmatrix} $ $ \begin{bmatrix} e^{\eta_1-C} \\ \vdots \\ e^{\eta_k-C} \end{bmatrix} = $

$ \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_k}}{\sum_{i=1}^{k}e^{\eta_i}} \end{bmatrix} $

where $ \textstyle\sum_{i=1}^k e^{\eta_i}=e^C $

$ 1 $ $ \begin{bmatrix} [x=1] \\ \vdots \\ {[x=k]} \end{bmatrix} $ $ \ln \left(\sum_{i=1}^{k} e^{\eta_i}\right) = C $ $ C $
categorical distribution (variant 3) p₁,...,pₖ

where $ p_k = 1 - \textstyle\sum_{i=1}^{k-1} p_i $
$ \begin{bmatrix} \ln \dfrac{p_1}{p_k} \\[10pt] \vdots \\[5pt] \ln \dfrac{p_{k-1}}{p_k} \\[15pt] 0 \end{bmatrix} = $

$ \begin{bmatrix} \ln \dfrac{p_1}{1-\sum_{i=1}^{k-1}p_i} \\[10pt] \vdots \\[5pt] \ln \dfrac{p_{k-1}}{1-\sum_{i=1}^{k-1}p_i} \\[15pt] 0 \end{bmatrix} $
$ \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_k}}{\sum_{i=1}^{k}e^{\eta_i}} \end{bmatrix} = $

$ \begin{bmatrix} \dfrac{e^{\eta_1}}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_{k-1}}}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \\[15pt] \dfrac{1}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \end{bmatrix} $

$ 1 $ $ \begin{bmatrix} [x=1] \\ \vdots \\ {[x=k]} \end{bmatrix} $ $ \ln \left(\sum_{i=1}^{k} e^{\eta_i}\right) = \ln \left(1+\sum_{i=1}^{k-1} e^{\eta_i}\right) $ $ -\ln p_k = -\ln \left(1 - \sum_{i=1}^{k-1} p_i\right) $
multinomial distribution (variant 1)
with known number of trials n
p₁,...,pₖ

where $ \textstyle\sum_{i=1}^k p_i=1 $
$ \begin{bmatrix} \ln p_1 \\ \vdots \\ \ln p_k \end{bmatrix} $ $ \begin{bmatrix} e^{\eta_1} \\ \vdots \\ e^{\eta_k} \end{bmatrix} $

where $ \textstyle\sum_{i=1}^k e^{\eta_i}=1 $
$ \frac{n!}{\prod_{i=1}^{k} x_i!} $ $ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} $ $ 0 $ $ 0 $
multinomial distribution (variant 2)
with known number of trials n
p₁,...,pₖ

where $ \textstyle\sum_{i=1}^k p_i=1 $
$ \begin{bmatrix} \ln p_1+C \\ \vdots \\ \ln p_k+C \end{bmatrix} $ $ \begin{bmatrix} e^{\eta_1-C} \\ \vdots \\ e^{\eta_k-C} \end{bmatrix} = $

$ \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_k}}{\sum_{i=1}^{k}e^{\eta_i}} \end{bmatrix} $

where $ \textstyle\sum_{i=1}^k e^{\eta_i}=e^C $

$ \frac{n!}{\prod_{i=1}^{k} x_i!} $ $ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} $ $ n\ln \left(\sum_{i=1}^{k} e^{\eta_i}\right) = nC $ $ nC $
multinomial distribution (variant 3)
with known number of trials n
p₁,...,pₖ

where $ p_k = 1 - \textstyle\sum_{i=1}^{k-1} p_i $
$ \begin{bmatrix} \ln \dfrac{p_1}{p_k} \\[10pt] \vdots \\[5pt] \ln \dfrac{p_{k-1}}{p_k} \\[15pt] 0 \end{bmatrix} = $

$ \begin{bmatrix} \ln \dfrac{p_1}{1-\sum_{i=1}^{k-1}p_i} \\[10pt] \vdots \\[5pt] \ln \dfrac{p_{k-1}}{1-\sum_{i=1}^{k-1}p_i} \\[15pt] 0 \end{bmatrix} $
$ \begin{bmatrix} \dfrac{e^{\eta_1}}{\sum_{i=1}^{k}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_k}}{\sum_{i=1}^{k}e^{\eta_i}} \end{bmatrix} = $

$ \begin{bmatrix} \dfrac{e^{\eta_1}}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \\[10pt] \vdots \\[5pt] \dfrac{e^{\eta_{k-1}}}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \\[15pt] \dfrac{1}{1+\sum_{i=1}^{k-1}e^{\eta_i}} \end{bmatrix} $

$ \frac{n!}{\prod_{i=1}^{k} x_i!} $ $ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} $ $ n\ln \left(\sum_{i=1}^{k} e^{\eta_i}\right) = n\ln \left(1+\sum_{i=1}^{k-1} e^{\eta_i}\right) $ $ -n\ln p_k = -n\ln \left(1 - \sum_{i=1}^{k-1} p_i\right) $
Dirichlet distribution α₁,...,αₖ $ \begin{bmatrix} \alpha_1-1 \\ \vdots \\ \alpha_k-1 \end{bmatrix} $ $ \begin{bmatrix} \eta_1+1 \\ \vdots \\ \eta_k+1 \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \ln x_1 \\ \vdots \\ \ln x_k \end{bmatrix} $ $ \sum_{i=1}^k \ln \Gamma(\eta_i+1) - \ln \Gamma\left(\sum_{i=1}^k\Big(\eta_i+1\Big)\right) $ $ \sum_{i=1}^k \ln \Gamma(\alpha_i) - \ln \Gamma\left(\sum_{i=1}^k\alpha_i\right) $
Wishart distribution V,n $ \begin{bmatrix} -\frac12\mathbf{V}^{-1} \\[5pt] \dfrac{n-p-1}{2} \end{bmatrix} $ $ \begin{bmatrix} -\frac12{\boldsymbol\eta_1}^{-1} \\[5pt] 2\eta_2+p+1 \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \mathbf{X} \\ \ln|\mathbf{X}| \end{bmatrix} $ $ -\left(\eta_2+\frac{p+1}{2}\right)\ln|-\boldsymbol\eta_1| + \ln\Gamma_p\left(\eta_2+\frac{p+1}{2}\right) = -\frac{n}{2}\ln|-\boldsymbol\eta_1| + \ln\Gamma_p\left(\frac{n}{2}\right) = \left(\eta_2+\frac{p+1}{2}\right)(p\ln 2 + \ln|\mathbf{V}|) + \ln\Gamma_p\left(\eta_2+\frac{p+1}{2}\right) $ $ \frac{n}{2}(p\ln 2 + \ln|\mathbf{V}|) + \ln\Gamma_p\left(\frac{n}{2}\right) $

  • Three variants with different parameterizations are given, to facilitate computing moments of the sufficient statistics.
NOTE: Uses the fact that $ {\rm tr}(\mathbf{A}^{\rm T}\mathbf{B}) = \operatorname{vec}(\mathbf{A}) \cdot \operatorname{vec}(\mathbf{B}), $ i.e. the trace of the product $ \mathbf{A}^{\rm T}\mathbf{B} $ equals the dot product of the two matrices' vectorizations. The matrix parameters are assumed to be vectorized (laid out in a vector) when inserted into the exponential form. Also, V and X are symmetric, so e.g. $ \mathbf{V}^{\rm T} = \mathbf{V}. $
inverse Wishart distribution Ψ,m $ \begin{bmatrix} -\frac12\boldsymbol\Psi \\[5pt] -\dfrac{m+p+1}{2} \end{bmatrix} $ $ \begin{bmatrix} -2\boldsymbol\eta_1 \\[5pt] -(2\eta_2+p+1) \end{bmatrix} $ $ 1 $ $ \begin{bmatrix} \mathbf{X}^{-1} \\ \ln|\mathbf{X}| \end{bmatrix} $ $ \left(\eta_2 + \frac{p + 1}{2}\right)\ln|-\boldsymbol\eta_1| + \ln\Gamma_p\left(-\Big(\eta_2 + \frac{p + 1}{2}\Big)\right) = -\frac{m}{2}\ln|-\boldsymbol\eta_1| + \ln\Gamma_p\left(\frac{m}{2}\right) = -\left(\eta_2 + \frac{p + 1}{2}\right)(p\ln 2 - \ln|\boldsymbol\Psi|) + \ln\Gamma_p\left(-\Big(\eta_2 + \frac{p + 1}{2}\Big)\right) $ $ \frac{m}{2}(p\ln 2 - \ln|\boldsymbol\Psi|) + \ln\Gamma_p\left(\frac{m}{2}\right) $
normal-gamma distribution α,β,μ,λ $ \begin{bmatrix} \alpha-\frac12 \\ -\beta-\dfrac{\lambda\mu^2}{2} \\ \lambda\mu \\ -\dfrac{\lambda}{2}\end{bmatrix} $ $ \begin{bmatrix} \eta_1+\frac12 \\ -\eta_2 + \dfrac{\eta_3^2}{4\eta_4} \\ -\dfrac{\eta_3}{2\eta_4} \\ -2\eta_4 \end{bmatrix} $ $ \dfrac{1}{\sqrt{2\pi}} $ $ \begin{bmatrix} \ln \tau \\ \tau \\ \tau x \\ \tau x^2 \end{bmatrix} $ $ \ln \Gamma\left(\eta_1+\frac12\right) - \frac12\ln\left(-2\eta_4\right) - \left(\eta_1+\frac12\right)\ln\left(-\eta_2 + \dfrac{\eta_3^2}{4\eta_4}\right) $ $ \ln \Gamma\left(\alpha\right)-\alpha\ln\beta-\frac12\ln\lambda $
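
The rows of the table can be spot-checked numerically. The sketch below is only an illustration, not part of the original table: it copies four single-parameter rows (Bernoulli, Poisson, exponential, Laplace) and tests three things for each. The inverse mapping should undo the natural-parameter mapping, the two log-partition columns should agree, and the derivative of $ A(\eta) $ should equal the mean of the sufficient statistic, which is the standard exponential-family identity $ \partial A / \partial\eta = \operatorname{E}[T(x)] $. The names `ROWS` and `check_row` are hypothetical helpers introduced here, and the derivative is approximated by a central finite difference.

```python
import math

# Hypothetical helper table: each entry copies one single-parameter row.
ROWS = {
    "Bernoulli": {
        "to_eta": lambda p: math.log(p / (1 - p)),
        "to_theta": lambda eta: 1 / (1 + math.exp(-eta)),
        "A_eta": lambda eta: math.log1p(math.exp(eta)),
        "A_theta": lambda p: -math.log(1 - p),
        "mean_T": lambda p: p,            # E[x] = p
    },
    "Poisson": {
        "to_eta": math.log,
        "to_theta": math.exp,
        "A_eta": math.exp,
        "A_theta": lambda lam: lam,
        "mean_T": lambda lam: lam,        # E[x] = lambda
    },
    "exponential": {
        "to_eta": lambda lam: -lam,
        "to_theta": lambda eta: -eta,
        "A_eta": lambda eta: -math.log(-eta),
        "A_theta": lambda lam: -math.log(lam),
        "mean_T": lambda lam: 1 / lam,    # E[x] = 1/lambda
    },
    "Laplace": {
        "to_eta": lambda b: -1 / b,
        "to_theta": lambda eta: -1 / eta,
        "A_eta": lambda eta: math.log(-2 / eta),
        "A_theta": lambda b: math.log(2 * b),
        "mean_T": lambda b: b,            # E|x - mu| = b
    },
}


def check_row(row, theta, step=1e-6):
    """Return the absolute discrepancies for one row at parameter value theta."""
    eta = row["to_eta"](theta)
    # Central finite difference approximating dA/d(eta).
    slope = (row["A_eta"](eta + step) - row["A_eta"](eta - step)) / (2 * step)
    return {
        "round_trip_error": abs(row["to_theta"](eta) - theta),
        "log_partition_gap": abs(row["A_eta"](eta) - row["A_theta"](theta)),
        "mean_gap": abs(slope - row["mean_T"](theta)),
    }


if __name__ == "__main__":
    samples = [("Bernoulli", 0.3), ("Poisson", 3.0),
               ("exponential", 2.0), ("Laplace", 1.5)]
    for name, theta in samples:
        print(name, check_row(ROWS[name], theta))
```

All three discrepancies should be close to zero for a correctly transcribed row, so a sign or exponent slip in a cell shows up as a large gap. The multi-parameter rows could be checked the same way by replacing the scalar derivative with a gradient.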