### The Dirichlet (also Beta) distribution

Here's a category of distributions we may often want: a distribution on the simplex.

That is, a multivariate distribution on $n$ nonnegative numbers that add up to 1. Such distributions are natural priors for parameters that can be interpreted as "probabilities" of something: the parameters of the Categorical and Multinomial distributions, for instance, and, in the case of two numbers (effectively a univariate distribution, since it lives on a line segment), those of the Bernoulli and Binomial distributions.

Here's one such distribution family that may come to mind: for $(x_1,\dots,x_n)$ on the simplex, i.e. $x_i\ge 0$ and $\sum_i x_i = 1$,

$$f(x_1,\dots,x_n\mid\theta_1,\dots,\theta_n)\propto \prod_{i}{x_i}^{\theta_i}$$

By adjusting the values of the $\theta_i$, one can get priors that represent our beliefs. This is known as the Dirichlet distribution, and its univariate case $f(x\mid\theta_1,\theta_2)\propto x^{\theta_1}(1-x)^{\theta_2}$ is known as the Beta distribution.
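To get a feel for how the exponents shape the prior, here is a minimal sketch that evaluates the unnormalized Beta density $x^{\theta_1}(1-x)^{\theta_2}$ on a grid and reports where it peaks. The function names `unnormalized_beta` and `grid_mode` are made up for this illustration; for positive exponents the peak should sit near $\theta_1/(\theta_1+\theta_2)$.

```python
def unnormalized_beta(x, theta1, theta2):
    """Unnormalized Beta density x^theta1 * (1 - x)^theta2 on [0, 1]."""
    return x ** theta1 * (1.0 - x) ** theta2


def grid_mode(theta1, theta2, steps=1000):
    """Grid point in [0, 1] where the unnormalized density is largest.

    This is a crude grid search, only meant to illustrate the shape.
    """
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: unnormalized_beta(x, theta1, theta2))


if __name__ == "__main__":
    # Equal exponents give a symmetric belief; unequal ones shift the peak.
    for theta1, theta2 in [(1, 1), (2, 6), (6, 2)]:
        print((theta1, theta2), grid_mode(theta1, theta2))
```

Larger exponents with the same ratio keep the peak in place but make the density more concentrated around it, which is one way to encode a more confident belief.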

In fact, the parameters of these distributions are usually given a bit differently, with $\alpha_i-1=\theta_i$.
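Written out in this convention, the densities above become the following (a direct substitution of $\theta_i=\alpha_i-1$; the condition $\alpha_i>0$ is added here because it is what makes the density integrable):

$$f(x_1,\dots,x_n\mid\alpha_1,\dots,\alpha_n)\propto \prod_{i}{x_i}^{\alpha_i-1},\qquad \alpha_i>0$$

$$f(x\mid\alpha_1,\alpha_2)\propto x^{\alpha_1-1}(1-x)^{\alpha_2-1}$$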

Exercise: Prove that:
• The normalization constant is given by $\frac{\Gamma\left(\sum_i\alpha_i\right)}{\prod_i \Gamma(\alpha_i)}$
• The mean is given by $E(X_i)=\frac{\alpha_i}{\sum_i\alpha_i}$
• The Dirichlet distribution is the conjugate prior to the categorical/multinomial distribution. This is the key fact that makes the Beta/Dirichlet distribution important.
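Before attempting the proofs, it can help to check the three claims numerically. The sketch below uses only the Python standard library; all function names are invented for this illustration. It assumes the standard construction of a Dirichlet sample by normalizing independent Gamma draws, and it states the conjugate update in its usual form (add the observed counts to the $\alpha_i$), which is what the third bullet asks you to prove.

```python
import math
import random


def dirichlet_log_norm_const(alpha):
    """log of Gamma(sum_i alpha_i) / prod_i Gamma(alpha_i)."""
    return math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)


def beta_norm_const_numeric(a, b, steps=20000):
    """1 / integral_0^1 x^(a-1) (1-x)^(b-1) dx, via the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (a - 1) * (1.0 - x) ** (b - 1)
    return 1.0 / (total * h)


def dirichlet_mean(alpha):
    """Claimed mean: alpha_i / sum_j alpha_j."""
    total = sum(alpha)
    return [a / total for a in alpha]


def sample_dirichlet(alpha, rng):
    """One Dirichlet(alpha) draw: normalize independent Gamma(alpha_i, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [v / s for v in g]


def empirical_mean(alpha, n_samples, rng):
    """Monte Carlo estimate of E(X_i) from n_samples draws."""
    sums = [0.0] * len(alpha)
    for _ in range(n_samples):
        for i, v in enumerate(sample_dirichlet(alpha, rng)):
            sums[i] += v
    return [s / n_samples for s in sums]


def posterior_alpha(alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior plus category counts."""
    return [a + c for a, c in zip(alpha, counts)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = [2.0, 3.0, 5.0]
    print("closed-form constant (Beta(2,3)):", math.exp(dirichlet_log_norm_const([2.0, 3.0])))
    print("numerical constant   (Beta(2,3)):", beta_norm_const_numeric(2.0, 3.0))
    print("claimed mean  :", dirichlet_mean(alpha))
    print("empirical mean:", empirical_mean(alpha, 20000, rng))
    print("posterior alpha:", posterior_alpha([1.0, 1.0, 1.0], [3, 0, 2]))
```

Agreement between the closed-form and numerical values is of course not a proof, but a mismatch would be a quick signal that a formula was copied down wrong.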