### Statistics

See also the Probability Puzzles section of the blog, and the other probability courses.

Bayes's theorem
Statistics can be understood as the study of the applications of Bayes's theorem with different priors.
1. ★★★☆☆ Introduction to Bayesian inference
2. ★★★☆☆ Uniform priors: Maximum Likelihood Estimation
3. ★★★☆☆ Classical priors: Confidence Regions and Hypothesis tests
4. Objective priors: Maximum entropy, Jeffreys, etc.
5. ★★★☆☆ Bounded rationality: You cannot hack Bayes's theorem
6. ★★★☆☆ Bounded rationality: I don't believe p-hacking is a problem
7. ★★★☆☆ Bounded rationality: Bayes, bias, p-hacking and the Monty-Hall problem
8. ★★★☆☆ Frequentism as a bureaucratic restriction on speech
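As a rough illustration of the "different priors" framing above, here is a minimal sketch (not taken from any of the listed articles) that applies Bayes's theorem on a grid for a coin's bias. The data (7 heads, 3 tails), the grid size, the helper name `grid_posterior`, and the Beta(2, 2)-shaped prior are all assumptions chosen for the example. With a uniform prior the posterior mode lands on the maximum likelihood estimate; a non-uniform prior pulls it elsewhere.

```python
import numpy as np

def grid_posterior(heads, tails, prior):
    """Posterior over a grid of Bernoulli parameters via Bayes's theorem.

    `prior` holds unnormalised prior weights on an evenly spaced grid over [0, 1].
    """
    theta = np.linspace(0.0, 1.0, len(prior))
    likelihood = theta**heads * (1.0 - theta)**tails
    unnormalised = likelihood * prior
    return theta, unnormalised / unnormalised.sum()

# Hypothetical data and grid resolution, chosen only for illustration.
heads, tails = 7, 3
n_grid = 1001

# Uniform prior: the posterior is proportional to the likelihood.
uniform_prior = np.ones(n_grid)
theta, post_uniform = grid_posterior(heads, tails, uniform_prior)
map_uniform = theta[np.argmax(post_uniform)]

# A Beta(2, 2)-shaped prior (an assumption) that favours theta near 0.5.
informative_prior = theta * (1.0 - theta)
_, post_informative = grid_posterior(heads, tails, informative_prior)
map_informative = theta[np.argmax(post_informative)]

# Maximum likelihood estimate for a Bernoulli parameter.
mle = heads / (heads + tails)

print(f"MLE:                    {mle:.3f}")
print(f"MAP, uniform prior:     {map_uniform:.3f}")
print(f"MAP, informative prior: {map_informative:.3f}")
```

The grid is only a convenience for avoiding integrals; the same comparison can be done analytically with a Beta prior.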
Decision theory
General statistics
Informational properties of estimators/statistics.
Moment statistics
1. ★★★★☆ Random variables as vectors
2. ★★★★☆ Covariance matrix and Mahalanobis distance
3. ★★★★☆ Moments as tensors; tensor notation for moments
4. Moment generating functions, aka Why/when moments suffice
5. ★★☆☆☆ Probabilistic convergence
6. ★★★☆☆ Probabilistic inequalities
Robust statistics
Median, absolute error, etc.

Sample statistics
1. ★★☆☆☆ Sample statistics: Central limit theorem
2. ★★★☆☆ Sample statistics: Order statistics
3. Sample statistics: Extreme value theory
Regression
Linear models serve as a "toy" prerequisite to machine learning -- machine learning is more general, but working through linear problems first makes its objectives and capabilities easier to understand.
1. ★★★☆☆ Introduction: Statistical models; causal models
2. ★★★☆☆ Introduction: Normal linear models
3. Introduction: Hierarchical models
4. Introduction: Piranha theorem [1]
5. Non-linearity: Kernels, GLMs
6. Non-linearity: Hierarchical clustering, bootstrap [1], kNN
7. Non-linearity: ACE, backfitting, etc.
8. Non-linearity: Game dynamics, MCMC, etc.
9. Neural networks: Universal approximation, Turing completeness
10. Neural networks: Gradient descent and modifications [1][2]
11. ★★☆☆☆ Neural networks: Backpropagation and the chain rule
12. ★★☆☆☆ Neural networks: Overfitting and hyperparameter optimization
13. ★☆☆☆☆ Neural networks: Overview of basic neural network architectures [1]
14. Neural networks: Bayesian prior of neural networks [1] [2]
15. Neural networks: Gaussian processes and infinite-width networks [1] [2]
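To make the "toy prerequisite" idea concrete, here is a small sketch, under assumptions of my own rather than anything from the listed articles, of a linear model fitted two ways: by the closed-form least-squares solution and by plain gradient descent on the mean squared error. The synthetic data, the coefficients in `true_w`, the learning rate, and the iteration count are all hypothetical. In the linear case both routes reach the same answer, which is what makes it a useful sanity check before moving to models with no closed form.

```python
import numpy as np

# Synthetic data; all constants here are assumptions for illustration.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5])            # hypothetical coefficients
y = X @ true_w + 0.1 * rng.normal(size=n)      # Gaussian noise

# Closed-form least squares (the MLE under a normal linear model).
w_closed, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the mean squared error, the generic route that
# still works when no closed form is available.
w_gd = np.zeros(d)
learning_rate = 0.1
for _ in range(500):
    residual = X @ w_gd - y
    grad = (2.0 / n) * (X.T @ residual)
    w_gd -= learning_rate * grad

print("closed form:     ", np.round(w_closed, 4))
print("gradient descent:", np.round(w_gd, 4))
```

The objective is identical in both cases; only the optimisation method changes.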
Processes
Time-series and random processes; ARIMA/differential equations, characteristic function intuition, ambit stochastics; actuarial science, Markov chains, random walks, martingales, stochastic calculus

Reference: special distributions
This is not to be read "in order" or after the above section or whatever -- it's just some reference material on special cases.
1. Normal (inflection point, rotational invariance, etc.)
2. ★★★☆☆ Bernoulli and Poisson processes
3. ★☆☆☆☆ The Dirichlet (also Beta) distribution
4. Distribution of means with non-vanishing variance: Cauchy
5. Heavy tails, perhaps some generalisations