
Order statistics

The idea behind sampling is the duality between a tuple of IID random variables and a certain multivariate random variable -- since a sample of a distribution is just some tuple $X = (X_1, \dots, X_n)$, one can consider this to be a random variable taking values in $\mathcal{X}^n$, where each $X_i$ is a random variable taking values in $\mathcal{X}$.

In particular, one can define measurable functions $\mathcal{X}^n \to \mathcal{X}$ such as, e.g., sample moments (the sample mean, etc.). Another set of sample statistics one may define (when $\mathcal{X}$ is a totally ordered set, such as $\mathbb{R}$ but notably not something like $\mathbb{R}^m$) are the order statistics, which we will denote as $\Omega_i$, where $\Omega_i(x_1, \dots, x_n)$ gives the $i$th value in the list sorted in ascending order -- in particular, $\Omega_1$ is the min function and $\Omega_n$ is the max function.
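As a concrete picture of the definition, here is a minimal sketch in Python. The helper name `order_statistic` is made up for this post, and it simply sorts the sample, which is the most direct reading of the definition rather than the most efficient way to get a single order statistic.

```python
def order_statistic(sample, i):
    """Return the i-th smallest value of `sample` (1-indexed, as in the text)."""
    if not 1 <= i <= len(sample):
        raise ValueError("i must satisfy 1 <= i <= n")
    return sorted(sample)[i - 1]


sample = [3.2, -1.0, 7.5, 0.4]
n = len(sample)

print(order_statistic(sample, 1))  # same value as min(sample)
print(order_statistic(sample, n))  # same value as max(sample)
```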

Well, so $\Omega_i(X)$ is a random variable -- we can ask about how it's distributed.

For example, to calculate the CDF of $\Omega_n = \max$, note that $\max(X) \le w \iff \forall i,\ X_i \le w$. Since the $X_i$s are IID, the CDF is just:

$$F_{\Omega_n(X)}(w) = F_X(w)^n$$
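A quick Monte Carlo sanity check of this, as a sketch only: it assumes the $X_i$ are Uniform(0, 1), so that $F_X(w) = w$, and the function name, the seed and the trial count are arbitrary choices made for illustration.

```python
import random


def empirical_max_cdf(n, w, trials=20000, seed=0):
    """Estimate P(max(X_1..X_n) <= w) for IID Uniform(0, 1) draws."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.random() for _ in range(n)) <= w
        for _ in range(trials)
    )
    return hits / trials


n, w = 5, 0.8
estimate = empirical_max_cdf(n, w)
exact = w ** n  # F_X(w)^n with F_X(w) = w for Uniform(0, 1)
print(estimate, exact)
```

The two printed numbers should agree up to sampling noise.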

It's clear that as $n \to \infty$, this function approaches (non-uniformly) either a Heaviside step function located at the upper bound of the distribution, or zero (or really the "Heaviside step function at $+\infty$"). This makes sense -- if your distribution has a finite upper bound, then you'll eventually get arbitrarily close to that bound, and the maximum of an infinite sample (i.e. of the distribution) will be that bound; but if it doesn't, then you're eventually bound to exceed any given value, so the maximum of an infinite sample is infinity.


Illustration of $F(x)^n$ for different distributions as $n \to \infty$
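The same behaviour can be seen numerically. The sketch below picks two distributions of my own choosing: Uniform(0, 1), which has a finite upper bound, and Exponential(1), which does not. It evaluates $F(w)^n$ at a few fixed points as $n$ grows.

```python
import math


def uniform_cdf(w):
    """CDF of Uniform(0, 1): bounded above by 1."""
    return min(max(w, 0.0), 1.0)


def exponential_cdf(w):
    """CDF of Exponential(1): no finite upper bound."""
    return 1.0 - math.exp(-w) if w > 0 else 0.0


for n in (1, 10, 100, 1000):
    print(
        n,
        uniform_cdf(0.99) ** n,     # just below the bound: tends to 0
        uniform_cdf(1.0) ** n,      # at the bound: stays at 1
        exponential_cdf(5.0) ** n,  # any fixed point: tends to 0
    )
```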

Similarly, $\min(X) \le w \iff \neg\,\forall i,\ \neg(X_i \le w)$. So the CDF is:

$$F_{\Omega_1(X)}(w) = 1 - (1 - F_X(w))^n$$
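The minimum can be checked by simulation in the same way. This sketch assumes the $X_i$ are Exponential(1), so $F_X(w) = 1 - e^{-w}$; the function name and the parameters are illustrative only.

```python
import math
import random


def empirical_min_cdf(n, w, trials=20000, seed=0):
    """Estimate P(min(X_1..X_n) <= w) for IID Exponential(1) draws."""
    rng = random.Random(seed)
    hits = sum(
        min(rng.expovariate(1.0) for _ in range(n)) <= w
        for _ in range(trials)
    )
    return hits / trials


n, w = 4, 0.2
f = 1.0 - math.exp(-w)          # F_X(w) for Exponential(1)
exact = 1.0 - (1.0 - f) ** n    # 1 - (1 - F_X(w))^n
estimate = empirical_min_cdf(n, w)
print(estimate, exact)
```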
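OK, teaser over. Now consider $F_{\Omega_i(X)}(w)$, which is the probability that at least $i$ of the data points are $\le w$. The probability that some specific selection of $r$ data points is exactly the set of points that are $\le w$ is $F_X(w)^r (1 - F_X(w))^{n-r}$, and there are $\binom{n}{r}$ such selections. So: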

$$F_{\Omega_i(X)}(w) = \sum_{r=i}^{n} \binom{n}{r} F_X(w)^r (1 - F_X(w))^{n-r}$$
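Here is a small sketch of that sum, with the names `order_stat_cdf` and `empirical_order_stat_cdf` invented for this post. It checks that $i = n$ and $i = 1$ recover the max and min formulae, and compares a middle case against a simulation that assumes Uniform(0, 1) data, where $F_X(w) = w$.

```python
import math
import random


def order_stat_cdf(i, n, p):
    """P(at least i of n IID points are <= w), where p = F_X(w)."""
    return sum(
        math.comb(n, r) * p**r * (1 - p) ** (n - r)
        for r in range(i, n + 1)
    )


def empirical_order_stat_cdf(i, n, w, trials=20000, seed=0):
    """Estimate P(Omega_i(X) <= w) for IID Uniform(0, 1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ordered = sorted(rng.random() for _ in range(n))
        hits += ordered[i - 1] <= w
    return hits / trials


n, p = 5, 0.4
print(order_stat_cdf(n, n, p), p ** n)            # max: F^n
print(order_stat_cdf(1, n, p), 1 - (1 - p) ** n)  # min: 1 - (1 - F)^n
print(order_stat_cdf(3, n, p), empirical_order_stat_cdf(3, n, p))
```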
Interestingly, the joint PDF of the order statistics (which are certainly not independent of one another) actually has a much simpler form -- the density of $(\Omega_1(X), \dots, \Omega_n(X))$ at $(x_1, \dots, x_n)$ is zero if the latter is not in ascending order. And if it is, the value can result from $(X_1, \dots, X_n)$ taking a value that is any permutation of $(x_1, \dots, x_n)$ -- and there are $n!$ such permutations. So the joint PDF is:

$$f_{\Omega(X)}(w) = n! \prod_{i} f_X(w_i) \quad \text{for } w_1 \le \dots \le w_n \text{ (and zero otherwise)}$$
So the formulae above are just some fancy special cases of integrating the joint PDF above over the appropriate region.
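To see the link between the joint PDF and the earlier CDFs in the simplest case, the sketch below assumes $n = 2$ and Uniform(0, 1) data, and integrates the joint density over $\{w_1 \le w,\ w_2 \le w\}$ with a crude midpoint rule. The result should be close to $F_X(w)^2 = w^2$, the CDF of the max. The function names and the grid size are arbitrary choices for illustration.

```python
import math


def joint_order_pdf(ws, pdf):
    """n! * prod f_X(w_i) on ascending tuples, zero elsewhere."""
    if any(a > b for a, b in zip(ws, ws[1:])):
        return 0.0
    return math.factorial(len(ws)) * math.prod(pdf(w) for w in ws)


def uniform_pdf(w):
    """Density of Uniform(0, 1)."""
    return 1.0 if 0.0 <= w <= 1.0 else 0.0


def max_cdf_from_joint(w, steps=400):
    """Midpoint-rule integral of the n = 2 joint density over [0, w]^2."""
    h = w / steps
    total = 0.0
    for a in range(steps):
        for b in range(steps):
            point = ((a + 0.5) * h, (b + 0.5) * h)
            total += joint_order_pdf(point, uniform_pdf)
    return total * h * h


w = 0.7
print(max_cdf_from_joint(w), w ** 2)  # approximately equal
```

The small discrepancy comes from the grid cells that straddle the diagonal $w_1 = w_2$, and shrinks as `steps` grows.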
