### From polarisation to quantum mechanics: states, observables, Born's law

Like most texts on the theory, I will motivate the mathematics of quantum mechanics from the example of polarisation -- mostly because it's a very accessible example of stuff being wavelike. From this example, we will be able to motivate: the state vector (generalising the polarisation), state vector collapse (the event of polarisation), observables and their eigenvalues (stuff like energy, number of photons, etc.), eigenstates and their orthogonality (polarisation basis), noncommuting operators and uncertainty (the noncommuting of lenses).

The key feature of quantum mechanics -- the fundamentally probabilistic nature -- comes from the following two facts, confirmed by experiments (the famous experiments here are the double-slit experiment and photoelectric effect respectively):

• Everything is a wave -- objects behave as waves, following the superposition principle and the waves represent densities of observations at large scales.
• Everything is a particle -- which manifests itself in the form of some stuff, like energy and momentum, coming in little quanta.

This is the principle of wave-particle duality. You may realise how this implies a probabilistic description, but the following example should make it quite clear: consider a wave of light, with energy $hf$ (so it's a single photon) polarised at angle $\theta$ to the horizontal -- and it passes through a horizontal polarising filter. Well, then the wave that passes through would be a horizontally polarised wave with energy $hf\cos^2\theta$, right?

But this is impossible, since energy levels in quantum mechanics are quantised -- you can't have $\cos^2\theta$ of a photon, you can only have integer multiples of a photon. But the fact that energy drops as $E\cos^2\theta$ is something that you can verify at your home, using sunglasses -- what the heck?

The key point is that the empirical verification of the $\cos^2\theta$ business that you can do at home is on a macroscopic level, when you have a large number of photons $E=Nhf$. So something occurs with the photons on a microscopic level such that when you try it with a large number of photons, $\cos^2\theta$ of the photons pass through.

Well, this is essentially the "definition" of probability! A single photon passes through the filter with a probability of $\cos^2\theta$ so that for a large number of photons, a fraction $\cos^2\theta$ of the photons pass through. This is a non-trivial result -- wave-particle duality makes no mention of probability as such, it just tells us that stuff is both a particle and a wave, but this simple condition in itself implies a probabilistic, non-deterministic reality.
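To see how a per-photon probability reproduces the macroscopic $\cos^2\theta$ law, here is a minimal Monte Carlo sketch. The function name `transmitted_fraction`, the photon counts and the seed are illustrative choices, and the only physics put in is the assumption that each photon passes independently with probability $\cos^2\theta$.

```python
import math
import random


def transmitted_fraction(theta, n_photons, seed=0):
    """Fraction of n_photons that pass a filter at angle theta to their polarisation.

    Assumption: each photon passes independently with probability cos^2(theta).
    """
    rng = random.Random(seed)
    p_pass = math.cos(theta) ** 2
    passed = sum(1 for _ in range(n_photons) if rng.random() < p_pass)
    return passed / n_photons


if __name__ == "__main__":
    theta = math.radians(30)
    # A single photon either passes or doesn't; the cos^2 law only emerges for large N.
    for n in (1, 10, 1000, 100000):
        print(n, transmitted_fraction(theta, n), math.cos(theta) ** 2)
```

For a single photon the fraction can only be 0 or 1; as the number of photons grows, it settles towards $\cos^2\theta$, which is the sunglasses experiment.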

Similar thought experiments can illustrate the probabilistic nature of other things (the "things" in question here will soon be called "eigenstates"): position is easy -- consider a standing wave photon in a box (this can easily be constructed). This is uniformly distributed throughout the box -- so how much of the energy is in some chunk of the box?

Momentum is trickier, but shouldn't be too hard if you're familiar with Fourier transforms -- what's the analog of a "box" in momentum-space? Well, consider a concentrated pulse of light -- this can be written, via a Fourier transform, as the sum of several light waves of different momenta (i.e. frequencies), each wave with some lower energy. Taking "some chunk" of this "box" amounts to filtering some specific frequencies of the light. This can be done easily, e.g. with a colour filter -- so how much of the energy is contained in the waves with these specific momenta?

In both cases, the key point is that you can't have a fraction of the energy of the photon at these positions/momenta, so you must have a probability of measuring the photon to be in a specific range of positions or a specific range of momenta -- to be in a specific region of position-space or of momentum-space.
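The momentum-space version of this question ("how much of the energy sits in these frequencies?") can be sketched numerically. Everything specific below is an illustrative assumption: the pulse shape (a Gaussian envelope on a carrier at frequency bin 16), the sample count `N`, and the helper names `dft` and `band_fraction`. The "colour filter" is modelled simply as keeping a range of frequency bins.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(N^2)); fine for a small illustration."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def band_fraction(samples, k_lo, k_hi):
    """Fraction of the total spectral energy in frequency bins k_lo..k_hi (inclusive)."""
    spectrum = [abs(c) ** 2 for c in dft(samples)]
    return sum(spectrum[k_lo:k_hi + 1]) / sum(spectrum)


# Hypothetical pulse: Gaussian envelope (width 6 samples) times a carrier at bin 16.
N = 128
pulse = [
    math.exp(-((j - N / 2) ** 2) / (2 * 6.0 ** 2)) * cmath.exp(2j * math.pi * 16 * j / N)
    for j in range(N)
]

print(band_fraction(pulse, 12, 20))     # energy share passed by the "colour filter"
print(band_fraction(pulse, 0, N - 1))   # the whole spectrum: 1 up to rounding
```

For a classical pulse, `band_fraction` is the share of the energy that gets through the filter. For a single photon the same number has to be read as the probability that the photon is found in that band of momenta.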

The fundamental point here can be made for any quantity $X$: if you can filter out the "part" of a collection of particles that has $X$ in a certain subset of its range, then on a microscopic level, $X$ is probabilistic. The act of "filtering out the parts with a certain $X$", applied to a single particle, is just the act of checking if a particle is in a certain $X$-interval, and is called measurement. Any quantity that you can measure is called an observable.

Something like polarisation is really a form of measurement -- you're finding out whether or not the photon is in a certain polarisation $|\phi_{\parallel}\rangle$. You may have another observable, corresponding to a different polarisation -- even one that is orthogonal to the first polarisation -- $|\phi_{\perp}\rangle$ and still get that the photon is in $|\phi_\perp\rangle$. There is nothing wrong with this, as all we know beforehand is that the photon will be found in either $|\phi_\parallel\rangle$ or $|\phi_\perp\rangle$. If you perform the polarisation with $|\phi_\perp\rangle$ after the polarisation with $|\phi_\parallel\rangle$, you will find that the photon doesn't pass through, as you know for sure that the photon cannot be in both $|\phi_\parallel\rangle$ and $|\phi_{\perp}\rangle$.

Now, you may have certain psychological issues with this, as have many in history -- however, you might want to note that the aim of quantum mechanics is not to fix your psychological problems but to explain nature. You need to accept logical positivism and learn to shut up and calculate to be comfortable with quantum mechanics.

So whatever calculus we invent to describe these probabilistic phenomena, it is going to apply to all observables.

In our first example, the polarisation of the photon can be represented by a unit vector which we will denote as $|\psi\rangle$. The polarising filter has two special axes, represented by unit vectors $|\phi_{\parallel}\rangle$ and $|\phi_\perp\rangle$ -- these are special in the sense that an incoming photon polarised as $|\phi_{\parallel}\rangle$ or $|\phi_\perp\rangle$ will simply be scaled, by factors of 1 and 0 respectively -- so these form an eigenbasis for a certain operator.

Well, we said that the photon passes through (with polarisation $|\phi_{\parallel}\rangle$) with probability $\cos^2\theta$ -- this arises simply from considering the amplitude of $|\psi\rangle$ in the direction of $|\phi_\parallel\rangle$. So we can write the probability that the photon ends up in a state $|\phi\rangle$ as $|\langle\psi|\phi\rangle|^2$ where $\langle\psi|\phi\rangle$ is called the corresponding "probability amplitude".

This expression, $P(x=\lambda)=|\langle\psi|\phi_\lambda\rangle|^2$, where $|\phi_\lambda\rangle$ is the state corresponding to the measured value $\lambda$, is called Born's rule.
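For the polarisation example, Born's rule is a two-line computation. The sketch below represents states as plain lists of complex components; the helper names `inner` and `born_probability`, and the choice $\theta = 30^\circ$, are illustrative assumptions rather than anything fixed by the text.

```python
import math


def inner(a, b):
    """Inner product <a|b>, conjugating the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def born_probability(psi, phi):
    """Born's rule: probability that state psi is found in state phi (both unit vectors)."""
    return abs(inner(psi, phi)) ** 2


theta = math.radians(30)
psi = [complex(math.cos(theta)), complex(math.sin(theta))]  # polarised at angle theta
phi_par = [1 + 0j, 0j]    # filter axis (horizontal)
phi_perp = [0j, 1 + 0j]   # blocked axis (vertical)

p_pass = born_probability(psi, phi_par)     # cos^2(theta)
p_block = born_probability(psi, phi_perp)   # sin^2(theta)
print(p_pass, p_block, p_pass + p_block)
```

The two probabilities are $\cos^2\theta$ and $\sin^2\theta$, so they sum to 1 for any $\theta$.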

Let's get back to the eigenbasis -- what exactly is this an eigenbasis of? We said that the corresponding eigenvalues are 1 and 0, so this gives us a complete description of the operator. Note that this operator depends only on the observable (namely "number of photons in the $|\phi_{\parallel}\rangle$ direction"), not on the state or any other feature of the observation. So we decide to call this operator/matrix the "observable", and its eigenvalues are the values of the observable that can be measured.
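Written out explicitly, and using the label $N_\parallel$ for this operator (a name introduced here purely for illustration), the eigenvalue statements fix the operator completely. Assuming the two filter axes are orthonormal, as they are for a physical polariser, in the $\{|\phi_\parallel\rangle, |\phi_\perp\rangle\}$ basis we get:

$$N_\parallel = 1\cdot|\phi_\parallel\rangle\langle\phi_\parallel| + 0\cdot|\phi_\perp\rangle\langle\phi_\perp| = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$$

So in this particular case the observable is just the projector onto the filter axis.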

To find properties of these observables, the natural way is to note that the only feature we've really required of them is Born's rule, i.e. the probabilistic interpretation -- so we can apply the axioms of probability and see what they imply in the context of these observables.

• $P(E)\ge 0$ -- implies that the observables are over either the reals or the complexes, so that $|\langle\psi|\phi\rangle|^2\in \mathbb{R}$ in the first place. The nonnegativity then follows.
• $P(\Omega)=1$ and $P\left(\bigcup_i E_i\right) = \sum_i P(E_i)$ for disjoint $E_i$ -- together, these two axioms imply that $\sum |\langle\phi|\psi\rangle|^2 = 1=|\langle\psi|\psi\rangle|^2$ where the sum is taken over all eigenstates $|\phi\rangle$ of the operator. As this must be true for all states $|\psi\rangle$, the thing on the left must be a Pythagorean sum, so the $|\phi\rangle$s must form an orthogonal basis. This implies that all observables are normal operators, since by the spectral theorem the normal operators are precisely those with an orthonormal eigenbasis.

The latter fact is very important, and can also be seen in the following way -- if a system is in one eigenstate, it cannot possibly collapse onto another eigenstate with a different eigenvalue (the probabilistic interpretation is: if you know for sure the value of the observable is a thing, it's that thing) -- so we must have $|\langle \phi_1|\phi_2\rangle|^2=0$ for all eigenstates $|\phi_1\rangle$ and $|\phi_2\rangle$ with distinct eigenvalues.
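As a small sanity check of the orthogonality requirement, here is a two-dimensional worked example with real unit vectors. The angles $\theta$ and $\alpha$ are illustrative parameters: take $|\psi\rangle=(\cos\theta,\sin\theta)$ and candidate eigenstates $|\phi_1\rangle=(1,0)$, $|\phi_2\rangle=(\cos\alpha,\sin\alpha)$. Then the total probability is

$$\sum_{i}|\langle\phi_i|\psi\rangle|^2=\cos^2\theta+\cos^2(\theta-\alpha)=1+\cos\alpha\,\cos(2\theta-\alpha)$$

This equals 1 for every $\theta$ only if $\cos\alpha=0$, i.e. only if $|\phi_1\rangle$ and $|\phi_2\rangle$ are orthogonal. For any other $\alpha$, some states $|\psi\rangle$ would have "probabilities" summing to more or less than 1.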

Another restriction we add is that the observables be not only normal, but Hermitian operators in particular, so they have real eigenvalues. This may seem an odd choice, but it makes sense, as any normal operator may be uniquely written as $X_H+iX_{AH}$ where $X_H$ and $X_{AH}$ are Hermitian, and $X_H$ and $X_{AH}$ commute, so any complex observation can be done unambiguously as two real observations. So we stick to real eigenvalues.
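The decomposition can be written down explicitly. A short sketch, using only the definition of the adjoint $X^\dagger$:

$$X_H=\frac{X+X^\dagger}{2},\qquad X_{AH}=\frac{X-X^\dagger}{2i},\qquad X=X_H+iX_{AH}$$

$$[X_H,X_{AH}]=\frac{1}{4i}\left[X+X^\dagger,\,X-X^\dagger\right]=\frac{i}{2}\left[X,X^\dagger\right]$$

Both parts are Hermitian by construction, and they commute exactly when $[X,X^\dagger]=0$, i.e. when $X$ is normal.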

This also makes it essential that we allow complex operators rather than just real ones (the two choices were given to us from the first probability axiom), so that this decomposition is possible. Later, we will see concrete examples of this with commutators $[X,Y]$, which must be multiplied by $i$ to turn Hermitian. We will also see more fundamental reasons to choose complex numbers in QM.

Exercise: Show that the expected value of an observable $X$ given a state $\psi$ can be given as $\langle \psi|X|\psi \rangle$ (i.e. $\psi^*X\psi$ in conventional notation).

Exercise: Explain Born's rule with other observables, like position and momentum. Explain why it holds in general.