### What's with e^(-1/x)? On smooth non-analytic functions: part I

When you first learned about the Taylor series, your intuition probably went something like this: you have $f(x)$; the derivative at this point tells you how $f$ changes from $x$ to $x+dx$ (which tells you $f(x+dx)$); the second derivative tells you how $f'$ changes from $x$ to $x+dx$, which recursively tells you $f(x+2\ dx)$; the third derivative tells you $f(x+3\ dx)$; and so on. So if you have an infinite number of derivatives, you know how each derivative changes, and you should be able to predict the full global behaviour of the function, assuming it is infinitely differentiable (smooth) throughout.

Everything is nice and dandy in this picture. But then you come across two disastrous, life-changing facts that make you cry for those good old days:
1. Taylor series have radii of convergence -- If I can predict the behaviour of a function up until a certain point, why can't I predict it a bit afterwards? It makes sense if the function becomes rough at that point, like if it jumps to infinity, but even functions like $1/(1+x^2)$ have this problem. Sure, we've heard the explanation involving complex numbers, but why should we care about the complex singularities? (Here's a question: do we care about quaternion singularities?) In the extreme case, a Taylor series may have a zero radius of convergence. Points around which a Taylor series has a zero radius of convergence are called Pringsheim points.
2. Weird crap -- Like $e^{-1/x}$. Here, the Taylor series at 0 does converge, but it converges to the wrong thing -- in this case, to zero. Points at which the Taylor series converges but doesn't equal the function on any neighbourhood are called Cauchy points.

In this article, we'll address the weird crap. $e^{-1/x}$ (or "$e^{-1/x}$ for $x>0$, 0 for $x\le 0$" if you want to be annoyingly formal about it) will be the example we'll use throughout, so if you haven't already seen this, go plot it on Desmos and get a feel for how it looks near the origin.

Terminology: We'll refer to smooth non-analytic functions as defective functions.

The thing to realise about $e^{-1/x}$ is that the Taylor series -- $0 + 0x + 0x^2 + ...$ -- isn't wrong. The truncated Taylor series of degree $n$ is the best polynomial approximation for the function near zero, and none of the logic here fails for $e^{-1/x}$. There is honestly no other polynomial that better approximates the shape of the function as $x\to 0$.

If you think about it this way, it isn't too surprising that such a function exists -- what we have is a function that goes to zero as $x\to 0$ faster than any polynomial does. I.e. a function $g(x)$ such that
$$\forall n, \lim\limits_{x\to0}\frac{g(x)}{x^n}=0$$
This is not fundamentally any weirder than a function that escapes to infinity faster than all polynomials. In fact, such functions are quite directly connected. Given a function $f(x)$ satisfying:
$$\forall n, \lim\limits_{x\to\infty} \frac{x^n}{f(x)} = 0$$
We can make the substitution $x\leftrightarrow 1/x$ to get
$$\forall n, \lim\limits_{x\to0} \frac{1}{x^n f(1/x)} = 0$$
So $\frac1{f(1/x)}$ is a valid $g(x)$. Indeed, we can generate plenty of the standard smooth non-analytic functions this way: $f(x)=e^x$ gives $g(x)=e^{-1/x}$, $f(x)=x^x$ gives $g(x)=x^{1/x}$, $f(x)=x!$ gives $g(x)=\frac1{(1/x)!}$ etc.
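If you'd rather see numbers than take the limit on faith, here is a minimal Python sketch that evaluates $g(x)/x^n$ for the three examples above as $x$ shrinks. The function names (`g_exp`, `g_pow`, `g_fact`, `flatness_ratios`) are made up for this illustration, and $(1/x)!$ is read as $\Gamma(1/x+1)$, which is an assumption about what the factorial means off the integers.

```python
from math import exp, lgamma

def g_exp(x):
    """g built from f(x) = e^x: g(x) = 1/f(1/x) = e^{-1/x}, for x > 0."""
    return exp(-1.0 / x)

def g_pow(x):
    """g built from f(x) = x^x: g(x) = x^{1/x}, for x > 0."""
    return x ** (1.0 / x)

def g_fact(x):
    """g built from f(x) = x!: g(x) = 1/(1/x)!, reading z! as Gamma(z + 1) (assumption)."""
    return exp(-lgamma(1.0 / x + 1.0))

def flatness_ratios(g, n, xs):
    """Return g(x) / x^n for each x in xs."""
    return [g(x) / x ** n for x in xs]

if __name__ == "__main__":
    xs = [0.1, 0.05, 0.02]
    for g in (g_exp, g_pow, g_fact):
        print(g.__name__, flatness_ratios(g, 5, xs))
```

For a fixed $n$ the ratios should shrink as $x$ decreases. Try a larger $n$ and you'll need to go to smaller $x$ before the decay takes over.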

To better study what exactly is going on here, consider Taylor expanding $e^{-1/x}$ around some point other than 0, or equivalently, expanding $e^{-1/(x+\varepsilon)}$ around 0. One can see that:
$$\begin{array}{*{20}{c}}{f(0) = {e^{ - 1/\varepsilon }}}\\{f'(0) = \frac{1}{{{\varepsilon ^2}}}{e^{ - 1/\varepsilon }}}\\{f''(0) = \frac{{ - 2\varepsilon + 1}}{{{\varepsilon ^4}}}{e^{ - 1/\varepsilon }}}\\{f'''(0) = \frac{{6{\varepsilon ^2} - 6\varepsilon + 1}}{{{\varepsilon ^6}}}{e^{ - 1/\varepsilon }}}\\ \vdots \end{array}$$
Or, keeping only the dominant term for small $\varepsilon$,
$$f^{(N)}(0)\approx(1/\varepsilon)^{2N}e^{-1/\varepsilon}$$
Each derivative $\frac{e^{-1/\varepsilon}}{\varepsilon^{2N}}\to0$ as $\varepsilon\to0$, but each one approaches zero more slowly than the previous derivative. Somehow that is enough to give the sequence of derivatives the "kick" they need in the domino effect that follows -- from somewhere at $N=\infty$ (putting it non-rigorously) -- to make the function grow as $x$ leaves zero, even though all the derivatives are zero at $x=0$.
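To check the pattern in those derivatives without differentiating by hand, here is a small Python sketch. It assumes the form $f^{(n)}(0) = P_n(\varepsilon)\,\varepsilon^{-2n}e^{-1/\varepsilon}$ and builds the polynomials $P_n$ from the recurrence $P_{n+1}(u) = u^2P_n'(u) - 2nuP_n(u) + P_n(u)$, which comes from differentiating that form once. The recurrence and the names `next_poly`, `derivative_polys` and `derivative_at_zero` belong to this sketch, not to the argument above.

```python
from math import exp

def next_poly(p, n):
    """Given P_n as ascending coefficients in u, return P_{n+1}.

    Uses P_{n+1}(u) = u^2 P_n'(u) - 2 n u P_n(u) + P_n(u).
    """
    out = [0] * (len(p) + 1)
    for k, c in enumerate(p):
        out[k] += c                    # the P_n(u) term
        out[k + 1] += (k - 2 * n) * c  # u^2 P_n' gives k*c, -2nu P_n gives -2n*c
    while len(out) > 1 and out[-1] == 0:
        out.pop()                      # drop trailing zero coefficients
    return out

def derivative_polys(n_max):
    """Return [P_0, P_1, ..., P_{n_max}], starting from P_0 = 1."""
    polys = [[1]]
    for n in range(n_max):
        polys.append(next_poly(polys[-1], n))
    return polys

def derivative_at_zero(n, eps):
    """n-th derivative of e^{-1/(x + eps)} at x = 0, via P_n(eps) / eps^(2n) * e^{-1/eps}."""
    p = derivative_polys(n)[n]
    value = sum(c * eps ** k for k, c in enumerate(p))
    return value / eps ** (2 * n) * exp(-1.0 / eps)

if __name__ == "__main__":
    for n, p in enumerate(derivative_polys(4)):
        print(n, p, derivative_at_zero(n, 0.1))
```

The constant term of every $P_n$ comes out as 1 (the recurrence just copies it forward), which is what makes $\varepsilon^{-2N}e^{-1/\varepsilon}$ the dominant piece for small $\varepsilon$.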

But we can still make the "local determines global" idea work -- by letting $N$, the upper limit of the summation, approach $\infty$ first, before $\varepsilon\to 0$. In other words, instead of directly computing the derivatives $f^{(n)}(0)$, we consider the terms
$$\begin{array}{*{20}{c}}{f_\varepsilon^{(0)}(0) = f(0)}\\{{{f}_\varepsilon^{(1)} }(0) = \frac{{f(\varepsilon ) - f(0)}}{\varepsilon }}\\{{{f}_\varepsilon^{(2)} }(0) = \frac{{f(2\varepsilon ) - 2f(\varepsilon ) + f(0)}}{{{\varepsilon ^2}}}}\\{{{f}_\varepsilon^{(3)} }(0) = \frac{{f(3\varepsilon ) - 3f(2\varepsilon ) + 3f(\varepsilon ) - f(0)}}{{{\varepsilon ^3}}}}\\ \vdots \end{array}$$
And write the generalised Hille-Taylor series as:
$$f(x) = \mathop {\lim }\limits_{\varepsilon \to 0} \sum\limits_{n = 0}^\infty {\frac{{{x^n}}}{{n!}}f_\varepsilon ^{(n)}(0)}$$
Here the sum over $n$ is taken all the way to infinity before $\varepsilon\to0$, so you "reach" $N\to\infty$ first (or rather, you get large $n$th derivatives for increasing $n$) before $\varepsilon$ gets to 0.
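Here is a rough numerical sketch of that series in Python, under some loudly stated assumptions: $\varepsilon$ is fixed at $x/\text{steps}$ rather than sent to 0, the infinite sum is cut off at `n_terms`, and the samples of $f$ are held as exact `Fraction`s because the finite differences cancel badly in floating point. The names `flat` and `hille_taylor` are made up for this illustration.

```python
from fractions import Fraction
from math import exp

def flat(x):
    """e^{-1/x} for x > 0, extended by 0 for x <= 0."""
    return exp(-1.0 / x) if x > 0 else 0.0

def hille_taylor(f, x, steps, n_terms):
    """Truncated sum of x^n / n! * (n-th forward difference of f at 0) / eps^n.

    Assumptions of this sketch: eps = x / steps with integer steps, so that
    (x / eps)^n / n! = steps^n / n! exactly; the sum stops at n = n_terms,
    which should be several times larger than steps.
    """
    eps = x / steps
    # Samples f(0), f(eps), ..., f(n_terms * eps) as exact rationals.
    diffs = [Fraction(f(k * eps)) for k in range(n_terms + 1)]
    total = Fraction(0)
    coeff = Fraction(1)  # steps^n / n!, starting at n = 0
    for n in range(n_terms + 1):
        total += coeff * diffs[0]  # diffs[0] is the n-th forward difference at 0
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        coeff = coeff * steps / (n + 1)
    return float(total)

if __name__ == "__main__":
    approx = hille_taylor(flat, 0.5, steps=10, n_terms=100)
    print(approx, exp(-2.0))
```

With $x=0.5$ the ordinary Taylor series at 0 predicts 0, while this sum should land near $e^{-2}\approx0.135$. Increasing `steps` (and `n_terms` along with it) is expected to tighten the match.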

Another way of thinking about it is that the "local determines global" stuff makes sense for predicting the value of the function at $N\varepsilon$, countable $N$, but it's a stretch to talk about uncountably many $\varepsilon$s away, which is what a finite neighbourhood is. But with these difference operators in the Hille-Taylor series, one ensures that, for any fixed $\varepsilon$, each point being sampled is a finite multiple of $\varepsilon$ away, so the differences determine $f$.

Very simple (but fun to plot on Desmos) exercise: use $e^{-1/x}$ or another defective function to construct a "bump function", i.e. a smooth function that is 0 outside $(0, 1)$, but takes non-zero values everywhere in that range.

Similarly, construct a "transition function", i.e. a smooth function that is 0 for $x\le0$, 1 for $x\ge1$. (hint: think of a transition as going from a state with "none of the fraction" to "all of the fraction")

If you're done, play around with this (but no peeking): desmos.com/calculator/ccf2goi9bj