### Order statistics

The idea behind sampling is the duality between a tuple of IID random variables and a certain multivariate random variable -- since a sample of a distribution is just some tuple $\mathbf{X}=(X_1,\ldots X_n)$, one can consider this to be a random variable taking values in $\mathbb{X}^n$ where each $X_i$ is a random variable taking values in $\mathbb{X}$.

In particular, one can define measurable functions $\mathbb{X}^n\to\mathbb{X}$, such as the sample moments (the sample mean, etc.). Another family of sample statistics one may define (when $\mathbb{X}$ is a totally ordered set, such as $\mathbb{R}$ but notably not something like $\mathbb{R}^m$) is the order statistics, which we will denote by $\Omega_i$, where $\Omega_i(x_1,\ldots x_n)$ gives the $i$th value in the list sorted in ascending order -- in particular, $\Omega_1$ is the $\min$ function and $\Omega_n$ is the $\max$ function.
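
As a minimal sketch of the definition above, the order statistic of a finite sample can be computed by sorting. The function name `order_statistic` and the example values are illustrative assumptions, not part of the original text.

```python
def order_statistic(sample, i):
    """Return the i-th order statistic (1-indexed) of a finite sample."""
    if not 1 <= i <= len(sample):
        raise ValueError("i must satisfy 1 <= i <= n")
    # Sort in ascending order and pick the i-th entry (1-indexed).
    return sorted(sample)[i - 1]


# Hypothetical sample of n = 4 real values.
sample = [3.2, -1.0, 7.5, 0.4]
smallest = order_statistic(sample, 1)            # same as min(sample)
largest = order_statistic(sample, len(sample))   # same as max(sample)
```

Sorting costs $O(n\log n)$, which is fine for illustration; a selection algorithm would do better if only one order statistic were needed.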

Well, so $\Omega_i(\mathbf{X})$ is a random variable -- we can ask about how it's distributed.

For example, to calculate the cumulative distribution function (CDF) of $\Omega_n=\max$, note that $\max(\mathbf{X})\le w \iff \forall i, X_i\le w$. Since the $X_i$s are IID, the probabilities multiply and the CDF is just:

$$F_{\Omega_n(\mathbf{X})}(w)=F_{X}(w)^n$$
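
The formula can be sanity-checked by simulation. The sketch below assumes $X$ is uniform on $[0,1]$, so that $F_X(w)=w$; the function name, trial count and seed are arbitrary choices made for illustration.

```python
import random


def empirical_cdf_of_max(n, w, trials=20000, seed=0):
    """Estimate P(max(X_1..X_n) <= w) for IID uniform(0, 1) samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if max(rng.random() for _ in range(n)) <= w:
            hits += 1
    return hits / trials


n, w = 5, 0.8
estimate = empirical_cdf_of_max(n, w)
exact = w ** n  # F_X(w)^n with F_X(w) = w for the uniform distribution
print(f"estimate={estimate:.4f}, exact={exact:.4f}")
```

The two numbers should agree up to Monte Carlo noise, which shrinks like $1/\sqrt{\text{trials}}$.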

As $n\to\infty$, this function converges pointwise (but not uniformly) either to a Heaviside step function or to zero (or really the "Heaviside step function at $+\infty$"). This makes sense -- if your distribution has a finite upper bound, then the sample maximum gets arbitrarily close to that bound, and the maximum of an infinite sample (i.e. of the distribution) is that bound, so the step sits there; but if it doesn't, then you're eventually bound to exceed any given value, so the maximum of an infinite sample is infinity.

Illustration of $F(x)^n$ for different distributions as $n\to\infty$
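
The limiting behaviour can also be seen numerically. This sketch assumes two example distributions, uniform on $[0,1]$ (bounded above) and exponential with rate 1 (unbounded); the helper names are hypothetical.

```python
import math


def uniform_cdf(w):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(w, 0.0), 1.0)


def exponential_cdf(w):
    """CDF of the exponential distribution with rate 1."""
    return 1.0 - math.exp(-w) if w > 0 else 0.0


def max_cdf(cdf, w, n):
    """CDF of the maximum of n IID draws, evaluated at w."""
    return cdf(w) ** n


for n in (1, 10, 100, 1000):
    print(
        n,
        max_cdf(uniform_cdf, 0.9, n),      # below the bound: tends to 0
        max_cdf(uniform_cdf, 1.0, n),      # at the bound: stays at 1
        max_cdf(exponential_cdf, 5.0, n),  # unbounded: tends to 0 at any fixed w
    )
```

For the bounded case the values split into 0 below the bound and 1 at or above it, while for the unbounded case every fixed $w$ is eventually driven to 0.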

Similarly, $\min(\mathbf{X})\le w\iff \lnot \forall i, \lnot (X_i\le w)$. So the CDF is:

$$F_{\Omega_1(\mathbf{X})}(w)=1-(1-F_X(w))^n$$
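
One quick check of this formula, assuming $X$ is exponential with rate 1: then $1-F_X(w)=e^{-w}$, so the minimum of $n$ draws should again be exponential, with rate $n$. The helper names below are illustrative.

```python
import math


def exponential_cdf(w, rate=1.0):
    """CDF of the exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * w) if w > 0 else 0.0


def min_cdf(cdf, w, n):
    """CDF of the minimum of n IID draws: 1 - (1 - F(w))^n."""
    return 1.0 - (1.0 - cdf(w)) ** n


n = 4
for w in (0.1, 0.5, 2.0):
    via_formula = min_cdf(exponential_cdf, w, n)
    via_rate_n = exponential_cdf(w, rate=n)  # closed form 1 - exp(-n w)
    print(w, via_formula, via_rate_n)
```
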

OK, teaser over. Now consider $F_{\Omega_i(\mathbf{X})}(w)$, which is the probability that at least $i$ of the data points are $\le w$. The probability that some specific set of $r$ data points is exactly the set of points $\le w$ is $F_X(w)^r(1-F_X(w))^{n-r}$, and there are $\binom{n}{r}$ such sets. So:

$$F_{\Omega_i(\mathbf{X})}(w)=\sum_{r=i}^{n}{\binom{n}{r}}F_X(w)^r(1-F_X(w))^{n-r}$$
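
The CDF of the $i$th order statistic is a binomial tail: the sum over $r=i,\ldots,n$ of the probability that exactly $r$ of the $n$ points are $\le w$. The sketch below evaluates that sum and compares it with a simulation, assuming uniform $[0,1]$ data so that $F_X(w)=w$; function names, trial count and seed are illustrative assumptions.

```python
import math
import random


def order_stat_cdf(F_w, n, i):
    """P(at least i of n IID points are <= w), given F_w = F_X(w)."""
    return sum(
        math.comb(n, r) * F_w**r * (1.0 - F_w) ** (n - r)
        for r in range(i, n + 1)
    )


def empirical_order_stat_cdf(n, i, w, trials=20000, seed=1):
    """Monte Carlo estimate of the same probability for uniform(0, 1) data."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ordered = sorted(rng.random() for _ in range(n))
        if ordered[i - 1] <= w:
            hits += 1
    return hits / trials


n, i, w = 5, 3, 0.4
exact = order_stat_cdf(w, n, i)
estimate = empirical_order_stat_cdf(n, i, w)
print(f"exact={exact:.5f}, estimate={estimate:.5f}")
```

Setting $i=n$ leaves only the $r=n$ term, $F_X(w)^n$, and setting $i=1$ gives the full binomial sum minus the $r=0$ term, $1-(1-F_X(w))^n$, matching the $\max$ and $\min$ cases.
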

Interestingly, the joint PDF of the order statistics (which are by no means independent) actually has a much simpler form -- the density of $(\Omega_{1}(\mathbf{X}),\ldots \Omega_{n}(\mathbf{X}))$ at $(w_1,\ldots w_n)$ is zero if the latter is not in ascending order. And if it is, the value can result from $(X_1,\ldots X_n)$ taking a value that is any permutation of $(w_1,\ldots w_n)$ -- and there are $n!$ such permutations, each contributing density $\prod_i f_X(w_i)$. So the joint PDF is:

$$f_{\mathbf{\Omega}(\mathbf{X})}(\mathbf{w})=n!\prod_i f_X(w_i)\quad\text{for }w_1\le\cdots\le w_n\text{, and zero otherwise}$$

So the formulae above are just special cases of marginalizing this joint density, i.e. integrating out the other order statistics.
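
As a sketch of how this works for the maximum (assuming $X$ has a density $f_X$), fix $w_n=w$ in the joint density and integrate over the remaining coordinates on the ordered region:

$$f_{\Omega_n(\mathbf{X})}(w)=\int_{w_1\le\cdots\le w_{n-1}\le w} n!\,f_X(w)\prod_{k=1}^{n-1}f_X(w_k)\,dw_1\cdots dw_{n-1}=n!\,f_X(w)\,\frac{F_X(w)^{n-1}}{(n-1)!}=n\,f_X(w)\,F_X(w)^{n-1}$$

The factor $1/(n-1)!$ appears because the ordered region is one of $(n-1)!$ equally weighted orderings of $(-\infty,w]^{n-1}$, whose total weight is $F_X(w)^{n-1}$. The result is the derivative of $F_X(w)^n$, in agreement with the CDF of the maximum found earlier.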