A **statistical model** in the most general sense is a proposed distribution for some random variable. In the simplest case, one can have a univariate distribution, e.g. $X\sim N(\mu, \sigma^2)$ and experiment to get posterior distributions for the values of $\mu$, $\sigma$.
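As a rough illustration of "experimenting to get a posterior", here is a minimal sketch for the univariate case. It makes simplifying assumptions that are not in the text above: $\sigma$ is treated as known, and $\mu$ gets a normal prior, so the posterior has a closed form. The function name `posterior_mu` and all numerical values are hypothetical.

```python
import numpy as np

def posterior_mu(x, sigma, prior_mean, prior_sd):
    """Posterior mean and sd of mu for X ~ N(mu, sigma^2), sigma known,
    with an assumed N(prior_mean, prior_sd^2) prior on mu (conjugate update)."""
    x = np.asarray(x, dtype=float)
    prior_prec = 1.0 / prior_sd**2      # precision contributed by the prior
    data_prec = x.size / sigma**2       # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * x.mean()) / post_prec
    return post_mean, post_prec**-0.5

# Hypothetical experiment: 500 draws from N(2, 1), vague prior on mu.
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.0, size=500)
post_mean, post_sd = posterior_mu(sample, sigma=1.0, prior_mean=0.0, prior_sd=10.0)
```

With this much data the posterior concentrates near the sample mean; handling an unknown $\sigma$ would need a joint prior on $(\mu, \sigma)$, which this sketch does not attempt.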

More generally, the random variable in question may be correlated with some other random variable. The most general way to express such a model is via the joint distribution of the variables, but the full joint distribution might be too expensive to reasonably test -- a classic example is that of a causal relationship.

However, there are various special cases of this general description that naturally capture the assumptions that one would make while attempting to model these variables:

**Hierarchical/Causal model:** We arrange the random variables in a tree-like format, with each parent's distribution taking the values of its child random variables as explanatory variables -- the distribution of each node is marginalized against the distribution of its parents so that it only depends on its own children. E.g. $Y\sim N(\mu_{0Y} + \mu_{1Y} X, 1)$, $X\sim N(\mu_X, 1)$.

**Exogenous variables:** We don't bother about the distribution of the explanatory variables, treating them as "exogenous variables". In the context of decision theory, these should be seen as variables which we can manipulate at will -- e.g. if we're discussing the effects of taxation on GDP growth, then the tax rate should be seen as an exogenous variable (even though it might be true that the tax rate is *actually* influenced by voter priorities and can be predicted in such a way -- that's just not the question we're interested in addressing). Essentially, we are only interested in modeling the conditional distribution.
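To make the two cases concrete, here is a small sketch that samples from the hierarchical example $Y\sim N(\mu_{0Y} + \mu_{1Y} X, 1)$, $X\sim N(\mu_X, 1)$ and then fits only the conditional distribution of $Y$ given $X$, which is all that is needed if $X$ is treated as exogenous. The parameter values and sample size are made up for illustration, and ordinary least squares is used as a stand-in for a full posterior computation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
mu_x, mu_0y, mu_1y = 1.0, 0.5, 2.0   # hypothetical parameter values

# Hierarchical sampling: draw X first, then Y given X.
x = rng.normal(mu_x, 1.0, size=n)
y = rng.normal(mu_0y + mu_1y * x, 1.0)

# Conditional model only: regress Y on X, ignoring how X was generated.
design = np.column_stack([np.ones(n), x])
(intercept, slope), *_ = np.linalg.lstsq(design, y, rcond=None)
```

The fit uses no information about $\mu_X$; replacing the distribution of `x` with any other one would still estimate the same conditional relationship.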

**Causation**

*partial correlation*.

*apart* from tax rates, e.g. regulations were fewer in the 1960s, the kinds of industries and their accounting in GDP were different in the 1960s.

*want* to answer is "what will the effect of increasing taxes be on economic growth?" Since it is unlikely that increasing taxes will reduce regulations or change accounting methods, we want to remove the contributions of these correlations in the explanatory variables from the correlation between tax rates and economic growth.

**partial correlations**. To study the *causal link* between a dependent variable $Y$ and an independent variable $X_i$ means to study the **conditional distribution** of $Y\mid X_{j\ne i}$.

*completely depends on the random variables we have chosen to control*. For example, if we had also controlled for some economic variables that taxation affects GDP growth "through", the effect of changing taxation would perhaps be smaller. The choice of these variables depends on our purpose -- i.e. based on what we're actually able to control.

Apply this reasoning to the context of a classic example: since wind speed correlates with the rotation of a windmill, does this mean the windmill rotation affects wind speed? What are we really asking here? What are our variables (you should have at least 3 of them)?

**partial correlation** of $Y\mid X_{j\ne i}$ as the value of the correlation computed from the conditional distribution of $Y\mid X_{j\ne i}$, and may geometrically be interpreted as the $\cos\theta$ of the projections of $Y$ and $X_i$ (as vectors) onto the orthogonal complement of the span of the controlled variables (therefore eliminating the correlations $Y$ and $X_i$ have with them).
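The geometric description above translates fairly directly into code. The following is a minimal sketch, not a reference implementation: it projects $Y$ and $X_i$ onto the orthogonal complement of the span of the controlled variables (plus a constant column, which is an added assumption so that means are removed) by taking least-squares residuals, then returns the cosine of the angle between the residual vectors. The helper name `partial_corr` and the simulated data are hypothetical, and identifying this quantity with the correlation of the conditional distribution assumes the variables are jointly Gaussian.

```python
import numpy as np

def partial_corr(y, x, controls):
    """cos(theta) between y and x after projecting both onto the
    orthogonal complement of span(1, controls)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.column_stack([np.ones(len(y)), controls])

    def residual(v):
        # v minus its least-squares projection onto the columns of z
        coef, *_ = np.linalg.lstsq(z, v, rcond=None)
        return v - z @ coef

    ry, rx = residual(y), residual(x)
    return float(ry @ rx / (np.linalg.norm(ry) * np.linalg.norm(rx)))

# Hypothetical data: a common driver z makes x and y correlated.
rng = np.random.default_rng(2)
n = 20_000
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

raw = float(np.corrcoef(x, y)[0, 1])    # clearly positive
controlled = partial_corr(y, x, z)      # close to zero once z is controlled
```

Here the raw correlation comes entirely from the shared dependence on `z`, so controlling for `z` removes it.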

**Causal models**

*underlying explanation* -- it is represented in the form of *causal networks*, which are directed acyclic graphs that are *models* of the underlying behavior of the system. It's much like how quarks are an underlying model of the observed particle zoo -- the quarks themselves cannot be observed, but they form a framework on which predictions can be made.

*equivalent* in the sense that they lead to the same joint distribution -- saying that something causes another thing is not an absolute truth, but simply *a* model that is consistent with the truth. If two models lead to an identical joint distribution, they are "equivalent" for all physical purposes.
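A small worked case may help show what "same joint distribution" means. The sketch below takes two hypothetical zero-mean linear-Gaussian models, one in which $Y$ is generated from $X$ and one in which $X$ is generated from $Y$, and checks that they imply the same covariance matrix (and hence the same joint Gaussian). The coefficient `b` and the helper `joint_cov` are illustrative assumptions, not something taken from the text.

```python
import numpy as np

def joint_cov(explanatory_var, slope, noise_var):
    """Covariance of (explanatory, dependent) when
    dependent = slope * explanatory + independent noise."""
    cross = slope * explanatory_var
    dependent_var = slope**2 * explanatory_var + noise_var
    return np.array([[explanatory_var, cross], [cross, dependent_var]])

b = 2.0  # hypothetical coefficient

# Model A: X ~ N(0, 1), Y | X ~ N(b X, 1); matrix ordered as (X, Y).
cov_a = joint_cov(1.0, b, 1.0)

# Model B: Y ~ N(0, b^2 + 1), X | Y ~ N(b Y / (b^2 + 1), 1 / (b^2 + 1)).
var_y = b**2 + 1.0
cov_b = joint_cov(var_y, b / var_y, 1.0 / var_y)[::-1, ::-1]  # reordered to (X, Y)

same_joint = np.allclose(cov_a, cov_b)
```

Since both models are zero-mean Gaussians with equal covariance, no amount of purely observational data on $(X, Y)$ can tell them apart.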

*not* equivalent, and can thus be distinguished by experiment.

*about* conditional independence), and so the equivalence of two trees means that they have the same pattern of conditional independence. Reading off patterns of conditional independence is known as **D-separation**.
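As a sketch of how two structures can differ in their pattern of conditional independence, the simulation below compares a chain ($X \to Z \to Y$) with a collider ($X \to Z \leftarrow Y$). It assumes arrows point from cause to effect, uses hypothetical linear-Gaussian mechanisms with unit noise, and uses partial correlation as a proxy for conditional independence, which is only justified in the Gaussian case. The helper `partial_corr` is illustrative.

```python
import numpy as np

def partial_corr(a, b, control):
    """Correlation of a and b after regressing out a constant and `control`."""
    z = np.column_stack([np.ones(len(a)), control])

    def residual(v):
        coef, *_ = np.linalg.lstsq(z, v, rcond=None)
        return v - z @ coef

    ra, rb = residual(a), residual(b)
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb)))

rng = np.random.default_rng(3)
n = 20_000

# Chain X -> Z -> Y: X and Y are dependent, but independent given Z.
x1 = rng.normal(size=n)
z1 = x1 + rng.normal(size=n)
y1 = z1 + rng.normal(size=n)
chain_marginal = float(np.corrcoef(x1, y1)[0, 1])
chain_given_z = partial_corr(x1, y1, z1)

# Collider X -> Z <- Y: X and Y are independent, but dependent given Z.
x2 = rng.normal(size=n)
y2 = rng.normal(size=n)
z2 = x2 + y2 + rng.normal(size=n)
collider_marginal = float(np.corrcoef(x2, y2)[0, 1])
collider_given_z = partial_corr(x2, y2, z2)
```

The two structures show opposite patterns: conditioning on $Z$ removes the dependence in the chain and creates one in the collider, so they are not equivalent and data can distinguish them.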
