Curvature is just the Hessian

If you recall some basic calculus, the gradient of a scalar function $f(x_1,\dots x_n)$ is just the generalization of the derivative: 

$$f'(x_1,\dots x_n) =\left[\begin{array}{c}\frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{array} \right] $$

And the Hessian of a scalar function $f(x_1,\dots x_n)$ is just the generalization of the second derivative:

$$f''(x_1,\dots x_n) =\left[\begin{array}{ccc}\frac{\partial^2 f}{\partial x_1^2} & \dots & \frac{\partial^2 f}{\partial x_1\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n\partial x_1} & \dots & \frac{\partial^2 f}{\partial x_n^2} \end{array} \right] $$
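For instance (the particular $f$ here is just an arbitrary illustration), for $f(x,y) = x^2y + \sin y$ we get

$$f'(x,y) =\left[\begin{array}{c}2xy \\ x^2+\cos y \end{array} \right], \qquad f''(x,y) =\left[\begin{array}{cc}2y & 2x \\ 2x & -\sin y \end{array} \right] $$

Note that the Hessian is symmetric, because mixed partials commute.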

Why is this interesting? Consider the case where $f$ is quadratic -- then, just as in one dimension a quadratic $f$ can be written purely in terms of its value, derivative, and second derivative at 0, here $f$ can be written purely in terms of its value, gradient, and Hessian at 0.

$$  \begin{align} f(x,y) &= c + (c_1x+c_2y) + (c_{11}x^2 + c_{12}xy + c_{22}y^2) \\ &= f(0) + \left(f_x(0)x+f_y(0)y\right) + \frac12 \left(f_{xx}(0)x^2+2f_{xy}(0)xy+ f_{yy}(0)y^2\right) \\ f(\mathbf{x}) &= f(0) + f'(0)\cdot \mathbf{x} + \frac12\,\mathbf{x}\cdot f''(0)\, \mathbf{x} \end{align}$$
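To check this on a concrete case (the particular quadratic is an arbitrary choice): take $f(x,y) = 3 + 2x - y + x^2 + 4xy + 5y^2$. Then $f(0)=3$, $f'(0)=\left[\begin{array}{c}2 \\ -1 \end{array}\right]$, $f''(0)=\left[\begin{array}{cc}2 & 4 \\ 4 & 10 \end{array}\right]$, and indeed

$$f(0) + f'(0)\cdot \mathbf{x} + \frac12\,\mathbf{x}\cdot f''(0)\,\mathbf{x} = 3 + (2x - y) + \frac12\left(2x^2 + 8xy + 10y^2\right) = f(x,y)$$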

What this tells us is:

  • The gradient is naturally thought of as a linear form.
  • The Hessian is naturally thought of as a quadratic form.

A what and a what?

There are two ways of thinking of a thing like $\left[\begin{array}{c}a \\ b \end{array} \right]$ -- as a vector $a\mathbf{e}_1+b\mathbf{e}_2$, or as a linear expression $ax_1+bx_2$, i.e. a function of $x_1,x_2$. The former is an object in the space $\mathbb{R}^n$, while the latter is a function $\mathbb{R}^n\to\mathbb{R}$ (do you see why?).

Similarly, there are two ways of thinking of a matrix $\left[\begin{array}{cc}a_{11} & a_{12} \\ a_{21} & a_{22} \end{array} \right]$ -- as a linear transformation $\mathbb{R}^n\to\mathbb{R}^n$, or as a quadratic expression $a_{11}x^2+(a_{12}+a_{21})xy+a_{22}y^2$, a function of $x,y$ -- or, read bilinearly, a function $\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ (do you see why?).
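To see where that quadratic expression comes from, just multiply out $\mathbf{x}\cdot A\mathbf{x}$:

$$\left[\begin{array}{cc}x & y\end{array}\right]\left[\begin{array}{cc}a_{11} & a_{12} \\ a_{21} & a_{22}\end{array}\right]\left[\begin{array}{c}x \\ y\end{array}\right] = a_{11}x^2 + (a_{12}+a_{21})xy + a_{22}y^2 $$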

This is what duality is in linear algebra. In tensor notation, vectors are $v^i$ while linear forms are $v_i$; linear transformations are $A^i_j$, quadratic forms are $A_{ij}$. Don't bother with this if you don't want to.

So e.g. the gradient should naturally be thought of as a function that, given some vector as input, gives you the directional derivative in the direction of that vector.

(Make sure you understand this very clearly.)
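For example (the choice of $f$ and of the point is arbitrary): take $f(x,y)=x^2+3xy$ at the point $(1,2)$, so $f'(1,2)=\left[\begin{array}{c}8 \\ 3 \end{array}\right]$. Fed a vector $\mathbf{v}=(v_1,v_2)$, it returns

$$f'(1,2)\cdot\mathbf{v} = 8v_1 + 3v_2 = \left.\frac{d}{dt}\right|_{t=0} f(1+tv_1,\, 2+tv_2)$$

which is exactly the directional derivative of $f$ at $(1,2)$ in the direction $\mathbf{v}$.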

Similarly, the Hessian should be thought of as a function that, given two vectors as input, gives you the second derivative in those directions -- e.g. fed the coordinate directions $x$ and $y$, it gives $f_{xy}$.

(Make sure you understand this VERY clearly.)
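Continuing the same example: $f''(1,2)=\left[\begin{array}{cc}2 & 3 \\ 3 & 0 \end{array}\right]$, and fed two vectors $\mathbf{X}$ and $\mathbf{Y}$ it returns

$$\mathbf{X}\cdot f''(1,2)\,\mathbf{Y} = 2X_1Y_1 + 3X_1Y_2 + 3X_2Y_1 = \left.\frac{\partial^2}{\partial s\,\partial t}\right|_{s=t=0} f\big((1,2)+s\mathbf{X}+t\mathbf{Y}\big)$$

the second derivative of $f$ in those two directions; in particular, $\mathbf{v}\cdot f''(1,2)\,\mathbf{v}$ is the second derivative of $f$ along the straight line through $(1,2)$ in the direction $\mathbf{v}$.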

Now suppose we wanted to talk about the curvature of a surface.

We know that the curvature of a (unit-speed) curve $\phi(t)$ at the point $t=0$ is $\phi''(0)$. Naturally, we'd like the "curvature of a surface" to be some sort of function that gives you the curvature in each direction -- that gives you the second derivative in each direction. So naturally, you'd want something like the Hessian.

I'm not sure if the cross-derivative $f_{xy}$, i.e. $A(X, Y)$, has any natural geometric interpretation. Does this have anything to do with torsion? Is $A(X, Y)$ ever actually of use?

So we'd like to define some quadratic form $A$ such that $\phi'(0)\cdot A \phi'(0)$ is the curvature $\phi''(0)$. Actually, it should just be the normal curvature, the component of $\phi''(0)$ normal to the surface, the sort of curvature that can be attributed entirely to the surface, rather than to the curve wiggling around on the surface.
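Here's the simplest instance of this (a sketch for a surface given as a graph, which is all we need locally): suppose the surface is $z=f(x,y)$ with $f(0)=0$ and $f'(0)=0$, so the tangent plane at the origin is the $xy$-plane and the normal direction is the $z$-axis. For a curve $\phi(t)=\big(x(t),\,y(t),\,f(x(t),y(t))\big)$ on the surface with $\phi(0)=0$, the chain rule gives

$$\frac{d^2}{dt^2}\bigg|_{t=0} f(x(t),y(t)) = f_{xx}\,x'^2 + 2f_{xy}\,x'y' + f_{yy}\,y'^2 + f_x\,x'' + f_y\,y'' = \mathbf{u}\cdot f''(0)\,\mathbf{u}$$

where $\mathbf{u}=(x'(0),y'(0))$ and the last two terms vanish because $f_x(0)=f_y(0)=0$. So the normal ($z$-)component of $\phi''(0)$ is exactly the Hessian quadratic form evaluated on the curve's tangent vector: at such a point, $A$ is just the Hessian of $f$.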

[For whomsoever it may concern, Theorem 10.4 in your notes is what computes this quadratic form $A$ as the differential of the Gauss map, and is what motivates the Gauss map in the first place. This is why you should start with the last chapter and read backwards.]

2 comments:

  1. Hello, great article but could you please give some examples to illustrate this? Cannot really make sense of
    "So e.g. the gradient should naturally be thought of as a function that, given some vector as input, gives you the directional derivative in the direction of that vector.

    (Make sure you understand this very clearly.)

    Similarly, the Hessian should be thought of as a function that, given two vectors as input, gives the second derivative in their directions (Make sure you understand this VERY clearly.)"

  2. Blackfriars_this_friday -- May 7, 2022 at 8:41 PM

    Hi, regarding your comment on whether A(X,Y) ever comes up: I would like to point out that if they belong to the orthonormal basis of TpS then they would naturally be 0, so what significance do they hold?
