### Polynomial interpolation and Vandermonde

Suppose you want to find the minimum-degree polynomial $p(x) = \sum {a_jx^j}$ passing through some points $(x_i, y_i)$. This amounts to solving the system:

$$\sum {a_j{x_i}^j}=y_i$$
A first bit of intuition: it seems completely reasonable that any set of $n+1$ points with $x_i$s distinct (so you're actually describing a function), can be interpolated with a polynomial of $n$ degree. What this means is that the matrix $X_{ij}={x_i}^j$, called the Vandermonde matrix, should be square for it to be invertible. Indeed, it's easy to show by considering the degree of the determinant polynomial of the matrix that:

$$\det X = \prod_{1\le i < j \le n} {(x_i-x_j)}$$

So the question is of course if there's a simple general expression for the inverse of the Vandermonde matrix.

Here's an idea: if all the $y_i$s were zero, then an $n+1$-degree (NOT $n$-degree! this is a different category of problem, which is not uniquely determined) polynomial would be a constant times $(x-x_0)\dots(x-x_n)$. If we didn't want it to be zero at some particular $x_{i_0}$, we could exclude $x-x_{i_0}$ from the product. Then we can play with the constants so it takes the value we really want ($y_{i_0}$).

In other words, the polynomial

$$L_n^{i_0}(x) = \frac{1}{\prod_{i\ne {i_0}}(x_{i_0}-x_i)}\prod_{i\ne i_0}(x-x_i)$$
Called the Lagrange polynomial gives zeroes at all $x_{i}$ except $x_{i_0}$, where it gives 1. Therefore the polynomial:

$$\sum_i{y_i L_n^i(x)}$$
Which is conveniently of $n$-degree, is the true interpolating polynomial.

Conveniently, this approach also tells you what to do when you get a new data point: just add a polynomial that is zero at the existing points and the right adjustment at the added data point. I.e. given $P_n$ is the $n$-degree interpolating polynomial for $(x_0,y_0)\dots(x_n, y_n)$, we want to add $p_{n+1}$, the $n+1$-degree polynomial that is zero at all these points but $y_{n+1}-P_n(x_{n+1})$ at $x_{n+1}$. I.e.

$$P_{n+1}(x)=P_n(x)+\left(y_{n+1}-P_n(x_{n+1})\right)L_{n+1}^{n+1}$$
Alternatively we can also see the added term as a polynomial proportional to $\prod_{i<n+1}{(x-x_i)}$, with the coefficient given by $(y_{n+1}-P_n(x_{n+1})/\prod_{i<n+1} {(x_{n+1}-x_i)}$.

It is easy to show that this coefficient, which we will write as $f[x_0,\dots x_{n+1}]$, can be written recursively as:

$$f[x_0,\dots x_n]=\frac{f[x_1,\dots x_n] - f[x_0,\dots x_{n-1}] }{x_n-x_0}$$
This is known as Newton's divided difference, and is the coefficient on $x^n$ in the interpolating polynomial -- one may observe that this is a discrete analog of the higher-order derivative (with some attention given to the denominators). It should be perfectly natural that this occurs of course.