### Bayes, bias, p-hacking and the Monty Hall problem

TL;DR: Bias is a fundamental concept in the science of agents; bias of a source is fundamentally related to cognitive bias.

You open the news in the morning, and you see the following headline from the Pan-Pan Times: "80% of Chimpanzee Party lawmakers have criminal cases filed against them."

"Outrageous!" You cry-- "The Chimpanzee Party has no respect for the law! I certainly shall not be voting for them!"

But upon subsequent investigation, you find that while this statement is true, in fact, 80% of Bonobo Party lawmakers also have criminal cases filed against them. Or perhaps you find that they have criminal cases filed against them -- in an unrecognized country created by some crackpot on Reddit. And the person responsible for generating these headlines was aware of this fact.

You feel cheated, even though everything you heard was the truth. Even though the Pan-Pan Times only gave you information, and you acted as a rational agent when exposed to this information (well, actually you didn't, but let's ignore that for now), you feel like you've been "exploited" somehow, "tricked" even.

Is this truly possible? Can a Bayes-rational agent truly be fooled by cleverly selecting information to provide?

Let's think about the problem more carefully.

Our ultimate decision might be governed by the following rule: vote for whichever party you expect to have fewer criminal cases filed against its lawmakers. We may have some prior distribution on the fraction of criminal-accused lawmakers of each party, say $\mathrm{B}(2,3)$ and $\mathrm{B}(3,2)$ for the Chimpanzee and Bonobo parties respectively. Under the prior, the expected fractions of criminal-accused lawmakers are 0.4 and 0.6 respectively, and one would vote for the Chimpanzee Party; under the posterior -- taking the headline's 80% figure at face value -- the expected fractions are roughly 0.8 and 0.6 respectively, and one would vote for the Bonobo Party.
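As a quick sketch of this update in Python -- assuming, purely for illustration, that the headline reflects a sample of 100 Chimpanzee lawmakers of whom 80 have cases (the sample size is made up here) -- the Beta-Binomial conjugate update gives:

```python
from fractions import Fraction

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution: a / (a + b)."""
    return Fraction(a, a + b)

def beta_binomial_update(a, b, k, n):
    """Posterior Beta parameters after observing k successes in n trials."""
    return a + k, b + (n - k)

# Priors on the fraction of criminal-accused lawmakers.
chimp_prior = (2, 3)   # B(2,3), mean 0.4
bonobo_prior = (3, 2)  # B(3,2), mean 0.6

# Hypothetical sample behind the headline: 80 of 100 Chimpanzee
# lawmakers have cases (these counts are assumptions).
chimp_post = beta_binomial_update(*chimp_prior, k=80, n=100)

print(float(beta_mean(*chimp_prior)))   # 0.4
print(float(beta_mean(*chimp_post)))    # ≈ 0.78, roughly the headline's 0.8
print(float(beta_mean(*bonobo_prior)))  # 0.6 -- no news about Bonobos yet
```

With any reasonably large sample, the posterior mean is dominated by the observed fraction, which is why the text rounds it to 0.8.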

Or.

Our ultimate decision might be governed by the following rule: vote for whichever party minimizes the expected sum $X_1+\dots+X_{100}$, where say: $X_1$ is the regulatory burden the party will place upon coming to power, $X_2$ is the tax rate the party will implement upon coming to power, $X_3$ is the number of criminal cases against the party's lawmakers, $X_4$ is the number of bad words the party's candidates use on TV, $X_5$ is the number of lies the party's candidates say on TV, $X_6$ is the number of dissidents the party will throw in prison upon coming to power, etc.

Let's say, for simplicity, that these are all Bernoulli-distributed variables -- further, that each $X_i$ (Chimpanzee Party) and $Y_i$ (Bonobo Party) is distributed as $\mathrm{Bernoulli}(0.5)$. Then under this prior, we would be uncertain as to whom to vote for, as both parties have the same expected sum: $\mathrm{E}[\sum X_i]=\mathrm{E}[\sum Y_i]=50$.

And suppose the Pan-Pan Times tells us: "We looked at $X_{3}$ and it turned out to be 1 for the Chimpanzee Party!" Now, $\mathrm{E}[\sum X_i]=50.5$, while $\mathrm{E}[\sum Y_i]=50$, so we vote for the Bonobo Party.
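The naive update here is just linearity of expectation -- the reported variable's mean jumps to 1 while the other 99 keep their mean of 0.5. A minimal sketch:

```python
n, p = 100, 0.5

# Prior: every X_i and Y_i is Bernoulli(0.5), so both sums have mean 50.
e_chimp_prior = n * p
e_bonobo = n * p

# Naive update on "X_3 = 1": X_3 now contributes 1 to the expected sum,
# and the remaining 99 variables each still contribute 0.5.
e_chimp_naive = 1 + (n - 1) * p

print(e_chimp_prior, e_chimp_naive, e_bonobo)  # 50.0 50.5 50.0
```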

But here's the thing: the precise information you receive isn't "$X_3$ is equal to 1" -- it is "the Pan-Pan Times's report is '$X_3$ is equal to 1'". And that's what you should be conditioning on.

If you were to condition on "the Pan-Pan Times's report is '$X_3$ is equal to 1'" (call this variable $\Pi$), what would your inference look like? You could apply Bayes's theorem in full, but simply put -- suppose the Pan-Pan Times's report is generated by a process that looks at all the $X_i$ and reports one that is equal to 1 (i.e. tosses a hundred coins and reports a heads). It can do so in all but a $2^{-100}$ fraction of outcomes, so the only information the report gives us is that the all-zero outcome (where every $X_i$ is 0) is not the case -- that the Chimpanzee Party isn't literally perfect -- and thus $\mathrm{E}[\sum X_i] = 50/(1-2^{-100})$.

(OK, in this case the decision is the same -- but suppose, for example, that our decision rule were instead "donate a sum of money proportional to the difference $\mathrm{E}[\sum X_i]-\mathrm{E}[\sum Y_i]$"; then the two ways of conditioning would lead to different decisions.)
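The two updates can be computed exactly with rational arithmetic; since $\mathrm{E}[S] = \mathrm{E}[S \mid S>0]\,\mathrm{P}(S>0)$ (the $S=0$ outcome contributes nothing to the mean), conditioning on the report existing just divides the prior mean by $\mathrm{P}(S>0)$. A sketch, with a small-$n$ simulation as a sanity check (the sample count is arbitrary):

```python
import random
from fractions import Fraction

n = 100

# Correct update: conditioning on "the Times found some X_i equal to 1"
# only rules out the all-zero outcome, so
# E[sum X_i | not all zero] = E[sum X_i] / P(not all zero).
e_correct = Fraction(n, 2) / (1 - Fraction(1, 2**n))

# Naive update: conditioning on "X_3 = 1" alone.
e_naive = 1 + Fraction(n - 1, 2)

print(float(e_correct))  # 50.0 to float precision (exceeds 50 by ~4e-29)
print(float(e_naive))    # 50.5

# Sanity check by simulation, with n small enough that the all-zero
# outcome actually occurs at a noticeable rate.
m = 5
random.seed(0)
sums = [sum(random.randint(0, 1) for _ in range(m)) for _ in range(200_000)]
reported = [s for s in sums if s > 0]  # trials where the Times can report a 1
est = sum(reported) / len(reported)
exact = float(Fraction(m, 2) / (1 - Fraction(1, 2**m)))  # 80/31 ≈ 2.5806
```

For $n=100$ the correction is astronomically small, which is exactly the point: the report carries almost no information, even though the naive reading of it moves the expectation by a full half unit.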

To instead condition on just "$X_3=1$" -- rather than on the full information provided -- is a cognitive bias. Since actually implementing Bayes's theorem everywhere is expensive, the mind processes information using heuristics -- one such heuristic is that only some of the information is selected to be conditioned on, leading to selection bias. Indeed, all such "source biases" are fundamentally manifested as some form of cognitive bias -- the use of negative terms to describe the Chimpanzee Party, for example, exploits some form of the association fallacy; repeating the words "Bonobo Party" for hours of screen time exploits the availability heuristic; etc.

A perfectly Bayes-rational agent -- one that takes into account all the information it is exposed to -- is immune to being tricked in this way. But a real agent, which uses heuristics, can be exploited. The idea is that if such biases are systematic, then they can be predicted and avoided cheaply.