### Intuition behind some basic ideas of calculus

Fundamental theorem

The fundamental theorem of calculus links differential calculus to integral calculus. The best way to understand it is to realise that the rate of change of some volume (or area) is proportional to the surface through which it expands -- for example, the rate of change of $\pi r^2$ as a circle is expanded along its circumference, i.e. as the radius is increased, is $2\pi r \frac{dr}{dt}$. The simplest and most relevant example is the area under a given curve $f(x)$ between some points $x=a$ and $x=b$. As the curve is extended by a unit $dx$ beyond $x=b$, the area increases from $A$ to $A+f(x)\,dx$, hence $dA=f(x)\,dx$ and $dA/dx=f(x)$. As the point $a$ might be anywhere along the $x$-axis, the value of $A(x)$ is indefinite unless $a$ is also known -- only then do you have the area under the curve between two given points.
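This can be checked numerically. The sketch below (with a hypothetical choice of $f(x)=x^2$ on $[0,2]$) accumulates the area $A(x)$ with the trapezoid rule and confirms that the last increment divided by $dx$ recovers $f(b)$:

```python
import numpy as np

# Numerical sketch of dA/dx = f(x), with the hypothetical choice f(x) = x**2.
f = lambda t: t**2
a, b = 0.0, 2.0

x = np.linspace(a, b, 200_001)
dx = x[1] - x[0]

# A(x): accumulated area under f from a up to each x (trapezoid rule).
A = np.concatenate(([0.0], np.cumsum((f(x[1:]) + f(x[:-1])) / 2 * dx)))

# The last increment dA divided by dx recovers f at x = b, i.e. f(2) = 4.
dA_dx = (A[-1] - A[-2]) / dx
print(dA_dx)  # close to 4.0
```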

Full derivative of a multivariable function

In multivariable calculus, such as in a special case of the multivariable chain rule, you often hear of the full derivative with respect to $x$ of a multivariable function of the form $f(x,y(x))$. This doesn't make a lot of sense at first, and the key to understanding it is to realise that $f(x,y(x))$ is no longer a multivariable function, but rather a single-variable function "inside" of a multivariable function.

In other words, if the general $f(x,y)$ represents a surface in three dimensions, then $f(x,y(x))$ represents a curve on that surface. Of course, this is the case with any function of the sort $f(x(t),y(t))$, except here we set the parameterisation as $x=t$, $y(t)=y(x(t))$. While $\partial f/\partial x$ is simply the partial derivative of the surface, $df/dx$ is the slope of the curve along that surface.
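A small numerical sketch may help, using the hypothetical surface $f(x,y)=xy^2$ with the curve $y(x)=\sin x$: the full derivative $df/dx$ (via the chain rule) matches the numerical slope of the single-variable function $f(x,y(x))$, while the partial $\partial f/\partial x$ alone does not:

```python
import math

# Hypothetical surface f(x, y) = x * y**2 restricted to the curve y(x) = sin(x).
f = lambda x, y: x * y**2
y = math.sin
dy_dx = math.cos

x0 = 1.3
# Partial derivative: y held fixed, only the explicit x-dependence varies.
df_dx_partial = y(x0)**2
# Full derivative via the chain rule: df/dx = ∂f/∂x + (∂f/∂y)(dy/dx).
df_dx_full = y(x0)**2 + 2 * x0 * y(x0) * dy_dx(x0)

# Numerical check: differentiate the single-variable curve g(x) = f(x, y(x)).
h = 1e-6
g = lambda x: f(x, y(x))
numeric = (g(x0 + h) - g(x0 - h)) / (2 * h)
print(df_dx_full, numeric)  # these two agree; the partial alone does not
```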

Gradient

We're told that the vector of steepest ascent (whose direction is the direction of steepest ascent, and whose magnitude is the rate of ascent per unit distance moved in the $x$-$y$ plane) is the vector $\left(\partial z/\partial x_1, \partial z/\partial x_2, \ldots, \partial z/\partial x_n\right)$, but this might not immediately be intuitive.

The following description is explained for functions of two variables, but it can be extended to any number of variables.

Take some point $(x,y)$ at which the multivariable function is differentiable. Suppose that $\partial z/\partial x$ were zero. Then going in any direction besides straight along the $y$-direction would mean going a bit through the $x$-direction, where there is no vertical gain. Therefore, the best direction to go (to make the greatest vertical ascent) is the $y$-direction.

Likewise if $\partial z/\partial y$ were zero -- the best direction then becomes the $x$-direction. And what would the ascent be? Well, we're looking at infinitesimal movements, so it's only meaningful to talk about the instantaneous rate of ascent, i.e. the derivative in the direction of steepest ascent.

(Note: extending this intuition to more dimensions would require setting all but one of the partial derivatives to zero.)

But now let's think about the general case where neither partial derivative is zero. There is still some direction of steepest ascent at the point, but we don't yet know how to calculate it. Suppose we make $\partial z/\partial x$ slightly bigger at the point, holding $\partial z/\partial y$ constant. Going more in the $x$-direction now yields a greater ascent than before, so the $x$-component of our gradient vector should increase.

The central point is this: for a well-behaved (continuous, differentiable) function, all the derivatives at a point (directional derivatives, to be precise) can be written as a linear combination of any two of them taken in linearly independent directions. The answer to "which derivative is greatest?", therefore, can be determined simply from the knowledge of two directional derivatives in non-parallel directions. Specifically, the above argument should give you an idea as to why the $(\partial z/\partial x,\partial z/\partial y)$ expression for the gradient in Cartesian co-ordinates makes sense.
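To see this numerically, one can sample the directional derivative $\nabla z\cdot\hat u$ over many unit directions $\hat u$ and check that the winning direction points along the normalised gradient. The surface $z=x^2+3y$ and the point $(1,2)$ below are hypothetical choices:

```python
import math

# Hypothetical surface z(x, y) = x**2 + 3*y at the point (1, 2).
grad = (2 * 1.0, 3.0)  # (∂z/∂x, ∂z/∂y) = (2x, 3) evaluated at (1, 2)

# The directional derivative in the unit direction (cos θ, sin θ) is ∇z · u.
# Sample many directions and keep the one with the greatest ascent.
best_theta = max(
    (2 * math.pi * k / 3600 for k in range(3600)),
    key=lambda t: grad[0] * math.cos(t) + grad[1] * math.sin(t),
)
best_dir = (math.cos(best_theta), math.sin(best_theta))

# The winning direction matches the normalised gradient.
norm = math.hypot(grad[0], grad[1])
print(best_dir, (grad[0] / norm, grad[1] / norm))
```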

Radius of curvature

It is often claimed in elementary textbooks that the radius of curvature is the radius of a circle inscribed within the curves of a function.

Clearly, this explanation makes little sense: you cannot inscribe a circle within most curves. One might wonder if there is a better interpretation for this circle than the incorrect "it's inscribed within a curve" explanation -- perhaps the function can be locally approximated as a circle, much as a function can be locally approximated as a line segment (by matching the first derivative), a parabola (by matching the second derivative), a cubic (the third derivative), and so on.

However, this clearly only works with polynomials, not with circles: however many derivatives one may set equal to those of the function, the approximation will remain a polynomial, and will not appear circular unless the original function is itself circular.

Instead, one appeals to the equation $a=v^2/r$ from basic mechanics, pretending that there is a particle traversing the path at some speed $v$. We then look at the (centripetal) acceleration of this particle, let this be $a$, and solve for $r$, which we call the radius of curvature. The circle of curvature is the circle tangent to the curve at that point with radius $r$, i.e. the circle the particle seems to be traversing as it moves around that point (or precisely, around an infinitesimally small curvy region).

Then the acceleration $a$ is essentially the magnitude of $d\vec{v}/dt=v\frac{d\hat{T}}{dt}$ for constant speed $v$, where $\hat{T}$ is the unit tangent vector to the curve (and therefore to the particle's motion) at the point. Letting $1/r=a/v^2$, this means $1/r=\frac1v \left|\frac{d\hat{T}}{dt}\right|=\left|\frac{d\hat{T}}{ds}\right|$, where $s$ is the distance traversed. This is the curvature, and its reciprocal is the radius of curvature.
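As a sanity check, $1/r=\left|d\hat{T}/ds\right|$ should return $1/R$ for a circle of radius $R$. The sketch below traces a hypothetical circle of radius 5 and computes $\left|d\hat{T}/ds\right|$ numerically:

```python
import numpy as np

# A circle of radius R traced at constant speed; its curvature should be 1/R.
R = 5.0
t = np.linspace(0, 2 * np.pi, 100_001)
x, y = R * np.cos(t), R * np.sin(t)

# Velocity and unit tangent vector T along the path.
dt = t[1] - t[0]
vx, vy = np.gradient(x, dt), np.gradient(y, dt)
speed = np.hypot(vx, vy)
Tx, Ty = vx / speed, vy / speed

# Curvature = |dT/ds| = |dT/dt| / (ds/dt).
dTx, dTy = np.gradient(Tx, dt), np.gradient(Ty, dt)
kappa = np.hypot(dTx, dTy) / speed

print(kappa[len(t) // 2])  # ≈ 1/R = 0.2
```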

The ratio tests of convergence

This is a useful exercise to demonstrate how intuition is formalised into a proof.

The ratio test states that if the ratio of two consecutive terms in a series approaches a number less than 1 as the index goes to infinity, the series converges. Intuitively, this makes sense -- beyond some really high index, if the common ratio is less than one, the remaining terms are bounded by a convergent infinite geometric series, and everything to the left of that index is a finite sum.

Can we find such a "really high number"? Well, yes -- that's how a limit (specifically a limit at infinity) is defined. So we formalise our intuition in the following sense: if the limit of the common ratio is 0.7, then for $\varepsilon=0.1$, say, we can always find a term sufficiently "far out" beyond which every common ratio is within $0.7\pm0.1$, and all the numbers in this range are less than 1. We replace our example numbers with general placeholders, and we have our proof.
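The idea can be watched in action. The series $\sum n/2^n$ below is a hypothetical example whose consecutive-term ratios tend to $1/2<1$; the ratios settle near the limit, and the partial sums converge (to 2, in this case):

```python
# Hypothetical series: sum of n / 2**n. The consecutive-term ratio
# (n+1)/(2n) tends to 1/2 < 1, so the ratio test predicts convergence.
terms = [n / 2**n for n in range(1, 200)]
ratios = [terms[i + 1] / terms[i] for i in range(len(terms) - 1)]

print(ratios[-1])   # ratios settle near 0.5
partial = sum(terms)
print(partial)      # partial sums converge; the exact sum is 2
```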

Similar deal with the limit comparison test -- this is left as an exercise to the reader.

A geometric proof of integration by parts