See also the Probability Puzzles section of the blog, and the other probability courses.

**Bayes's theorem**

Statistics can be understood as the study of the applications of Bayes's theorem with different priors.

- ★★★☆☆ Introduction to Bayesian inference
- ★★★☆☆ Uniform priors: Maximum Likelihood Estimation
- ★★★☆☆ Classical priors: Confidence Regions and Hypothesis tests
- Objective priors: Maximum entropy, Jeffreys, etc.
- ★★★☆☆ Bounded rationality: You cannot hack Bayes's theorem
- ★★★☆☆ Bounded rationality: I don't believe p-hacking is a problem
- ★★★☆☆ Bounded rationality: Bayes, bias, p-hacking and the Monty-Hall problem
- ★★★☆☆ Frequentism as a bureaucratic restriction on speech
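As a small illustration of the "different priors" framing, here is a minimal sketch (not taken from any of the articles above) of the uniform-prior case. It assumes a coin with unknown bias p, a grid approximation of the posterior, and hypothetical helper names `posterior_on_grid` and `map_estimate`. Under a uniform prior the posterior is proportional to the likelihood, so the posterior mode should coincide with the maximum likelihood estimate.

```python
# Grid approximation of a Bernoulli posterior under a uniform prior.
# Function names and the grid size are illustrative assumptions.

def posterior_on_grid(heads, tosses, grid_size=1001):
    """Return (grid, posterior) for the coin bias p on an evenly spaced grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # uniform prior over p in [0, 1]
    likelihood = [p**heads * (1 - p)**(tosses - heads) for p in grid]
    unnormalised = [pr * li for pr, li in zip(prior, likelihood)]
    total = sum(unnormalised)
    return grid, [u / total for u in unnormalised]

def map_estimate(grid, posterior):
    """Return the grid point with the highest posterior mass."""
    best = max(range(len(grid)), key=lambda i: posterior[i])
    return grid[best]

heads, tosses = 7, 10
grid, post = posterior_on_grid(heads, tosses)
p_map = map_estimate(grid, post)  # posterior mode under the uniform prior
p_mle = heads / tosses            # maximum likelihood estimate
print(p_map, p_mle)
```

Replacing the `prior` list with any other non-negative weights gives the posterior under a different prior, and the mode then generally moves away from the MLE.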

**Decision theory**

**General statistics**

Informational properties of estimators/statistics.

- ★★★☆☆ Sufficient statistics and the Rao-Blackwell theorem
- ★★★☆☆ The Three Theorems of complete statistics
- Fisher information [0] [1] [2]

**Moment statistics**

- ★★★★☆ Random variables as vectors
- ★★★★☆ Covariance matrix and Mahalanobis distance
- ★★★★☆ Moments as tensors; tensor notation for moments
- Moment generating functions, aka Why/when moments suffice
- ★★☆☆☆ Probabilistic convergence
- ★★★☆☆ Probabilistic inequalities

**Robust statistics**

Median, absolute error, etc.

**Sample statistics**

- ★★☆☆☆ Sample statistics: Central limit theorem
- ★★★☆☆ Sample statistics: Order statistics
- Sample statistics: Extreme value theory

**Regression**

Linear models serve as a "toy" prerequisite to machine learning -- while machine learning is more general, dealing with linear problems first allows us to more clearly understand the objectives and capabilities of machine learning.

- ★★★☆☆ Introduction: Statistical models; causal models
- ★★★☆☆ Introduction: Normal linear models
- Introduction: Hierarchical models
- Introduction: Piranha theorem [1]
- Non-linearity: Kernels, GLMs
- Non-linearity: Hierarchical clustering, bootstrap [1], kNN
- Non-linearity: ACE, backfitting, etc.
- Non-linearity: Game dynamics, MCMC, etc.
- Neural networks: Universal approximation, Turing completeness
- Neural networks: Gradient descent and modifications [1][2]
- ★★☆☆☆ Neural networks: Backpropagation and the chain rule
- ★★☆☆☆ Neural networks: Overfitting and hyperparameter optimization
- ★☆☆☆☆ Neural networks: Overview of basic neural network architectures [1]
- Neural networks: Bayesian prior of neural networks [1] [2]
- Neural networks: Gaussian processes and infinite-width networks [1] [2]

**Processes**

Time series and random processes; ARIMA/differential equations, characteristic functions intuition, ambit stochastics; actuarial science, Markov chains, random walks, martingales, stochastic calculus.

**Reference: special distributions**

This is not to be read "in order" or after the above sections -- it's just some reference material on special cases.

- Normal (inflection point, rotational invariance, etc.)
- ★★★☆☆ Bernoulli and Poisson processes
- ★☆☆☆☆ The Dirichlet (also Beta) distribution
- Distribution of means with non-vanishing variance: Cauchy
- Heavy tails, perhaps some generalisations
