The Winding Number: Frequentism as a bureaucratic restriction on speech

Although it is traditional to claim that Bayesian and frequentist statistics are two "schools" of probability and statistics, there are actually two distinct ideas that are labeled "frequentist" (and analogously "Bayesian"):

Frequentist probability: An interpretation of probability, an alternative to the subjective interpretation of probability normally associated with Bayesianism. Under the frequentist interpretation, probability is understood as having to do with frequencies of observations, e.g. a probability of 1/2 means that you can expect 50 coin tosses out of 100 to come up heads. In reality, this merely shifts the problem to interpreting expectation instead of probability.
Frequentist statistics: A formulation of statistical inference, in which we don't really discuss probabilities of theories at all, but restrict ourselves to tautological statements about the probabilities of observations given theories. The supposed advantage of this is that we don't need to commit to unproven beliefs about priors.

In the Bayesian view, the frequentist interpretation of probability appears as follows: given a random variable $X$, we sample a sequence of IID variables $X_i$ -- the "limiting" distribution of these $X_i$s is then seen as the probability distribution of $X$.
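To make the "limiting distribution" concrete, here is a minimal sketch of the frequentist picture, assuming a coin with a known probability p of heads. The function name `empirical_frequency` and the fixed seed are illustrative choices, not anything from a particular library.

```python
import random

def empirical_frequency(p, n, seed=0):
    """Fraction of heads in n IID Bernoulli(p) coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n

# The observed frequency settles near p as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, empirical_frequency(0.5, n))
```

Note that the simulation has to be handed p up front: the frequency of heads recovers the probability, but only because the flips were assumed to be IID with that probability in the first place.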

(From a "positivist"/"scientific" viewpoint, it is rather absurd to define the probability of heads on a coin-flip in terms of a large number of other coin flips. After all, it is only an approximate assumption that all these coin flips are distributed identically. The purest way to talk about IID variables would be to consider an infinite number of universes, in each of which a coin flip occurs. Of course, if the universes were identical in every way, the coin flips would be the same in every universe and the probability of heads would be 0 or 1. Probability arises from the fact that we have limited information about the state of the universe -- thus the average is not being taken over all identical universes, but over all universes which match the limited information we have. This "purification" of frequentism is simply Bayesian probability, as the "set of universes" is just the sample space, and the distribution of the unknown information is a Bayesian prior.)

The reason that frequentist statistics is typically associated with frequentist probability is that frequentist probabilists see it as odd to talk about the frequency of a theory being true -- to them, the same theory is true in each experiment conducted, since frequentist probability does not consider the situations in "alternate universes". Of course, this distinction between theory and observation is completely arbitrary -- on a logical level, theories and observations are both just logical statements, and analogously parameters and observed quantities are both just random variables.

To "forbid" talking about the probability of a theory is simply a bureaucratic restriction on what one can say, not something that affects your decisions in any way. When it comes to making decisions, one has to compute expected utilities, which requires a Bayesian approach (because frequentism does not consider "multiple universes"). Even if you don't say the probability of a theory is 80%, and only say that the probability of some observation would be 95% given that theory, the fact that your discipline "accepts" a p-value of 0.05 itself implicitly means that you believe the theory to be true with sufficient certainty.
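As a rough illustration of why a decision needs a probability for the theory itself, here is a small sketch with two competing theories and two actions. All names and numbers (`H0`, `H1`, the likelihoods, the utilities) are hypothetical and chosen only to make the arithmetic visible.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of theories.

    prior[t] is P(t); likelihood[t] is P(observation | t).
    """
    joint = {t: prior[t] * likelihood[t] for t in prior}
    evidence = sum(joint.values())
    return {t: joint[t] / evidence for t in joint}

def expected_utility(belief, utility, action):
    """Average the utility of an action over our belief about the theories."""
    return sum(belief[t] * utility[action][t] for t in belief)

def best_action(prior, likelihood, utility):
    belief = posterior(prior, likelihood)
    return max(utility, key=lambda a: expected_utility(belief, utility, a))

# Hypothetical numbers: the observation is unlikely under H0 (0.05)
# and fairly likely under H1 (0.60).
likelihood = {"H0": 0.05, "H1": 0.60}
utility = {
    "act_on_H1": {"H0": -10.0, "H1": 5.0},
    "do_nothing": {"H0": 0.0, "H1": 0.0},
}

print(best_action({"H0": 0.5, "H1": 0.5}, likelihood, utility))
print(best_action({"H0": 0.99, "H1": 0.01}, likelihood, utility))
```

The likelihoods are identical in both calls, yet the preferred action changes with the prior. The statement "this observation has probability 0.05 under H0" does not by itself pick an action; some belief about the theories has to enter, whether or not one is allowed to say it out loud.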
