Approximation in Physics

Physicists are not always concerned with exact solutions to problems. Oftentimes, an approximate solution to a problem or an approximate mathematical expression is good enough for practical purposes. Using approximations can also greatly simplify the appearance of a mathematical expression, so that the expression can be interpreted more easily. Furthermore, in many situations, it is impossible to write down an exact solution in closed form. In this article, I introduce two of the most popular techniques that we use when making approximations. Examples are also provided.

Series Expansion and Truncation

The most common approximation technique relies on the fact that most functions of interest can be written in terms of an infinite series. Typically, a function is written as a Taylor series or a Laurent series and then the series is truncated so that only the first few terms remain. For example, the Taylor series expansion of the function

\[\displaystyle f(x) = \frac{3}{\sqrt{1+x}} = 3(1+x)^{-1/2}\]

about \(x = 0\) is

\[\displaystyle f(x) = 3\left(1 - \frac{1}{2}x + \frac{3}{8}x^2 - \frac{5}{16}x^3 + \cdots\right)\]

If \(x \ll 1.0\) (where the symbol \(\ll\) means “much less than”), the function can be approximated quite well by the first few terms of the series expansion. This is because the condition, \(x \ll 1.0\), implies that \(x^2\) and higher powers of \(x\) are even smaller. Oftentimes, \(x^2\) is so small that it is essentially negligible; only constant terms and linear terms are included in the approximation. In this case, the approximation for \(f(x)\) is simply

\[\displaystyle f(x) \approx 3\left(1 - \frac{1}{2}x \right) = 3 - \frac{3}{2}x.\]

In some cases, it may be better to keep the quadratic term or even the cubic term until near the end of the derivation. Typically, these situations arise when the approximated function will later be differentiated.  

Of course, most situations are not so simple; you usually have to manipulate the expression algebraically before you can apply the approximation. More often, the expression looks something like this:

\[\displaystyle \frac{1}{\left(R^2+r^2\right)^{3/2}}.\]

If \(r^2\ll R^2\), we can use the same method by first re-writing the expression as

\[\displaystyle \frac{1}{R^3\left(1+r^2/R^2\right)^{3/2}}.\]

Defining \(\beta\equiv r^2/R^2\), this becomes

\[\displaystyle \frac{1}{R^3\left(1+\beta\right)^{3/2}} = \frac{1}{R^3}(1+\beta)^{-3/2}.\]

When \(r^2\ll R^2\), the quantity \(\beta \ll 1.0\) and this can be approximated as

\[\displaystyle \frac{1}{R^3}\left(1 - \frac{3}{2}\beta\right),\]

or

\[\displaystyle \frac{1}{R^3} - \frac{3r^2}{2R^5}.\]

The plot below compares the exact and approximate expressions when \(R=2\). The approximation is very good for \(r\lesssim0.5\), which corresponds to \(\beta\lesssim0.063\).

This sort of approximation is typically first introduced in physics classes when the motion of a pendulum is being studied, but the general method is not explained in introductory courses. Instead, the textbook or instructor usually introduces something called the “small angle approximation for sine,” which is just a particular case of the general technique described above. The derivation goes something like this:

Suppose a pendulum bob of mass \(m\) hangs from a string of length \(\ell\). Choose the zero-point of potential energy as the lowest point in the pendulum bob’s swing. The gravitational potential energy of the pendulum bob is then \(U = mg\ell(1 - \cos \theta)\), where \(\theta\) is the angle that the string makes with the vertical. The net force on the pendulum bob is the tension force in the string plus the weight of the bob. This net force points tangential to the arc of the pendulum and acts as restoring force, pulling the pendulum bob toward the equilibrium position (\(\theta = 0\)). The net force can be written as the negative derivative of the potential energy with respect to the path length, \(s\):

\[\displaystyle F_{\rm t}(s) = -\frac{{{\rm d}}U}{{{\rm d}}s}\]

where \(s = \ell\theta\), which means

\[\displaystyle F_{\rm t}(\theta) = -\frac{1}{\ell}\frac{{{\rm d}}}{{{\rm d}}\theta}mg\ell(1 - \cos \theta) =-mg\frac{{{\rm d}}}{{{\rm d}}\theta}(1 - \cos \theta)\]

or simply

\[\displaystyle F_{\rm t}(\theta) = -mg\sin\theta\]

This can be written as a differential equation, using the fact that \(F_{\rm t}(\theta) = ma\), where \(a = {{\rm d}}^2s/{{\rm d}}t^2\),

\[\displaystyle m\frac{{{\rm d}}^2s}{{{\rm d}}t^2} = -mg\sin\theta.\]

Cancelling \(m\) and re-expressing \(s\) gives us

\[\displaystyle \frac{{{\rm d}}^2\theta}{{{\rm d}}t^2} = -\frac{g}{\ell}\sin\theta.\]

Solving this is problematic. However, the right side of this equation can be expanded as a Taylor series. Notably,

\[\displaystyle \sin\theta = \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} + \cdots\]

For small values of \(\theta\), the cubic term and higher order terms are negligible. Thus, \(\sin\theta \approx \theta.\) The differential equation can then be approximated by

\[\displaystyle \frac{{{\rm d}}^2\theta}{{{\rm d}}t^2} = -\frac{g}{\ell}\theta.\]

This is a well-known differential equation; the solution is a sinusoid or a linear combination of sinusoids with two arbitrary constants. For instance, it can be written as

\[\displaystyle \theta(t) = A \cos(\omega t + \theta_0),\]

where

\[\displaystyle\omega = \sqrt{\frac{g}{\ell}},\]

The constant, \(A\), is the maximum value of \(\theta\) (the amplitude of the pendulum swing), and \(\theta_0\) is a phase angle, which tells us the position of the pendulum at the time origin, \(t=0\).

Since this derivation relied upon the assumption that \(\theta \ll 1.0\), the final expression is only a good approximation when \(A \ll 1.0\).

Perturbative solution with \(\epsilon\)

The other popular method of approximation is called perturbation theory. It can be applied in situations for which the problem that we are trying to solve is very similar to a simpler problem, except for a minor complicating factor, called a “perturbation.” For instance, using Newtonian gravity, it is easy to compute the orbit (position as a function of time) of a planet orbiting a star if the stellar system consists of only one star and one planet in isolation. On the other hand, introducing a third body complicates the situation; the presence of a moon, another planet, or a companion star causes the orbits to no longer be strictly elliptical; the orbit becomes perturbed. Perturbation theory can be used to estimate the effect of the complicating factor (the perturbation) as long as this factor only changes the physical situation in a small way.

The general assumption is as follows: If a physical situation, \(S\), leads to a mathematical result, \(R\), then a similar situation,

\[S' = S + s,\]

gives rise to a similar result,

\[R' = R + r,\]

where \(s\) is a small modification (perturbation) to the situation \(S\), and \(r\) is a small modification to the result, \(R\). The physical situation could consist of the positions, velocities, and masses of a star and a planet, along with a force law. In this case, the result might be the position of the planet as a function of time. In other cases, the physical situation might involve the density, temperature, and pressure of a fluid or the positions and velocities of particles in a particular electromagnetic field configuration. The situation could be anything that leads to a computable result. There are several varieties of perturbation theory, but they all share this general assumption.

Suppose the abstract physical situation \(S\) is represented mathematically by the equation,

\(\Phi(x) = 0\tag{1}\)

such that all of the relevant physics (the laws and the description of the physical system) are included in this equation. The result, \(R\), is then the solution to this equation; \(R\) is the value of \(x\) that satisfies Equation (1). In other words, \(x\) represents all of the unknown quanties that you wish to compute. Equation (1) is oftentimes a differential equation, but it could also be an algebraic equation. Thus, \(x\) is generally a function, a single number, or a collection of numbers (like a vector or matrix).

The more complicated situation \(S'\) is represented mathematically by

\[\Phi(x) + \epsilon\phi(x) = 0\tag{2}\]

where \(\epsilon\) is a parameter between 0 and 1. The function, \(\phi(x),\) describes the perturbation being made to the exactly solvable situation. Clearly, setting \(\epsilon=0\) yields the unperturbed case, Eq. (1). The solution to the perturbed equation is expressed as,

\(x=x_0+\epsilon x_1+\epsilon^2x_2+\epsilon^3x_3+\cdots\tag{3}\)

where \(x_0\) is the solution to the exactly solvable situation in Eq. (1).

We proceed by substituting Eq. (3) into Eq. (2). We then collect the terms multiplying the same order of \(\epsilon\) and treat terms multiplying each power of \(\epsilon\) as a separate equation. For example, the result of substituting Eq. (3) into Eq. (2) might be

\[x_0^2 - 3 + \epsilon x_1^3 - 8\epsilon + (x_1+x_2 - 6)\epsilon^2 = 0\]

This would become the set of equations:

\[\begin{array}\ {\mathcal{O}}(\epsilon^0): & x_0^2 - 3 & = & 0\\ {\mathcal{O}}(\epsilon^1): & x_1 - 2 & = & 0\\ {\mathcal{O}}(\epsilon^2): & x_1+x_2 - 6 & = & 0 \end{array}\]

From which, we can deduce,

\[\begin{array}\ x_0 & = & \pm\sqrt{3}\\ x_1 & = & 2\\ x_2 & = & 4 \end{array}\]

Thus, from Eq. (3), the result is

\[x=\pm\sqrt{3} + 2\epsilon + 4\epsilon^2 + {\mathcal{O}}(\epsilon^3)\]

Typically, \(\epsilon\ll1.0\), so the only term that is not negligible is the term that is linear in \(\epsilon\).

Algebraic Equation Example:

Suppose you need to solve the following equation for \(x\):

\[\displaystyle x + 2 = 0.05e^{x}.\]

This can be written as

\[\displaystyle x + 2 - \epsilon e^{x} = 0.\]

Where \(\epsilon = 0.05.\) The exponential term can be thought of as a perturbation to the trivial equation \(x+2=0\). This perturbation will cause the solution to differ slightly from the unperturbed solution (\(x=-2\)). Perturbation theory can be used to approximate the solution near \(x=-2\); any additional solutions will need to be determined using other methods.

First, recall that the solution is written as a series:

\[\displaystyle x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots\]

This is substituted into the equation, yielding

\[\displaystyle x_0 + \epsilon x_1 + \epsilon^2 x_2 + {\mathcal{O}}{\epsilon^3} + 2 - \epsilon e^{x_0 + \epsilon x_1 + \epsilon^2 x_2 + {\mathcal{O}}{\epsilon^3}} = 0.\]

From the \({\mathcal{O}}(\epsilon^0)\) part of of the equation,

\[\displaystyle x_0 + 2 = 0 \quad\Rightarrow\quad x_0 = -2.\]

So, the equation becomes

\[\displaystyle \epsilon x_1 + \epsilon^2 x_2 + {\mathcal{O}}(\epsilon^3) - \epsilon e^{-2}e^{\epsilon x_1 + \epsilon^2 x_2 + {\mathcal{O}}(\epsilon^3)} = 0.\]

Now, note that \(e^{z} = 1 + z + z^2/2 + {\mathcal{O}}(z^3)\) and use this to re-express the exponential term:

\[\displaystyle \epsilon x_1 + \epsilon^2 x_2 + {\mathcal{O}}(\epsilon^3) - \epsilon e^{-2}\left[1 + \epsilon x_1 + \epsilon^2 x_2 + \epsilon^2\frac{x_1^2}{2} + {\mathcal{O}}(\epsilon^3) \right] = 0\]

or simply

\[\displaystyle \epsilon x_1 + \epsilon^2 x_2 - \epsilon e^{-2} - \epsilon^2 e^{-2} x_1 + {\mathcal{O}}(\epsilon^3) = 0.\]

The \({\mathcal{O}}(\epsilon)\) equation is

\[\displaystyle x_1 - e^{-2} = 0\quad\Rightarrow\quad x_1 = e^{-2}.\]

This leaves us with the \({\mathcal{O}}(\epsilon^2)\) equation:

\[\displaystyle x_2 - e^{-2} x_1 = 0\] \[\displaystyle x_2 - e^{-2} e^{-2} = 0\] \[\displaystyle x_2 = e^{-4}.\]

Thus, we have the result:

\[\displaystyle x = -2 + \epsilon e^{-2} + \epsilon^2 e^{-4} + {\mathcal{O}}(\epsilon^3).\]

Substituting \(\epsilon = 0.05,\) we get the approximation \(x\approx-1.9931874.\) The exact solution, up to 9 decimal places, is -1.993186976. If we neglect the \({\mathcal{O}}(\epsilon^2)\) term, we still get a decent approximation: \(x\approx-1.9932332.\)

Perturbation theory is useful only if you are looking for the root of the perturbed equation that falls near the root of the unperturbed equation (\(x=-2\)). There is another solution at \(x\approx4.93\), which would require the use of complex analysis or a root-finding scheme.

Differential Equation Example: Sound Waves in a Fluid

Consider a fluid of constant density, \(\rho_0\), temperature \(T_0\), and pressure, \(P_0\). Note that these three conditions imply that there is no gradient in the gravitational potential (the potential is uniform). Additionally, assume that the fluid is barotropic (pressure is uniquely determined by the density) and the velocity field within the fluid is \(\mathbf{v}(\mathbf{x},t)=0\); there is no bulk flow anywhere within the volume of interest. The fluid trivially obeys the following relations…

Continuity / conservation of mass:

\[\displaystyle \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})=0\tag{4}\]

Euler’s equation in terms of the specific enthalpy, \(h\), for a barotropic / isentropic fluid:

\[\displaystyle \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} + \nabla(h+\Phi)=0\tag{5}\]

where \(\Phi\) is the gravitational potential and

\[\displaystyle h = h(\rho) = \int_0^\rho \frac{{{\rm d}}P(\rho')}{{{\rm d}}\rho'}\frac{{{\rm d}}\rho'}{\rho'}\tag{6}\]

Now, let’s examine the effect of a very small perturbation in the fluid properties. This could be due to a small, sudden movement of an object in the fluid; perhaps an object begins vibrating and moving the fluid in its vicinity. The perturbations in the fluid properties can be expressed as

\[\begin{array}\ \rho(\mathbf{x},t) = \rho_0 + \epsilon \rho_1(\mathbf{x},t)\\ h(\mathbf{x},t) = h_0 + \epsilon h_1(\mathbf{x},t)\tag{7}\\ \mathbf{v}(\mathbf{x},t) = \mathbf{v}_0 + \epsilon \mathbf{v}_1(\mathbf{x},t) \end{array}\]

Substituting Eqns (7) into Eq. (4), Eq. (5), and Eq. (6) and neglecting terms multiplying \(\epsilon^2\) yields1

\[\begin{array}\ \frac{\partial\rho_1}{\partial t} + \rho_0\nabla\cdot\mathbf{v}_1=0\tag{8}\\ \frac{\partial\mathbf{v}_1}{\partial t} = -\nabla h_1\\ h_1 = \left(\frac{\partial P}{\partial \rho}\right)_{\!\rho_0}\frac{\rho_1}{\rho_0}\qquad({\rm see\,footnote}) \end{array}\]

Differentiating the first equation in Eq. (8) with respect to time and using the other two equations to eliminate \(\mathbf{v}_1\) and \(h_1\), we get

\[\displaystyle\frac{\partial^2\rho_1}{\partial t^2} = \left(\frac{\partial P}{\partial \rho}\right)_{\!\rho_0} \nabla^2\rho_1 \]

This is a wave equation for the density, \(\rho_1\). The solution will consist of waves that travel with speed

\[\displaystyle c_s = \left.\sqrt{\frac{\partial P}{\partial \rho}}\right|_{\rho_0}\]

The general solution to the wave equation is

\[\displaystyle \rho_1(\mathbf{x},t) = F(\mathbf{x}-c_s{\mathbf{\hat x}}t) + G(\mathbf{x}+c_s{\mathbf{\hat x}}t)\]

where \(F\) is a wave traveling in the positive \({\mathbf{\hat x}}\) direction and \(G\) is a wave traveling in the negative \({\mathbf{\hat x}}\) direction. In this particular situation, the waves are called sound waves; the speed with which they travel is the speed of sound. Note that, since the fluid was assumed to be barotropic \((P=P(\rho))\), the existence of density waves implies the existence of corresponding pressure waves. One special solution to the equation is the plane wave,

\[\rho_1(\mathbf{x},t) = A\cos(\mathbf{k}\cdot\mathbf{x} - \omega t + \phi)\]

where the wave vector \(\mathbf{k}\) points in the direction of wave propagation and has magnitude, \(|\mathbf{k}|=2\pi/\lambda\) (where \(\lambda\) is the wavelength of the wave). The angular frequency, \(\omega = 2\pi f\), where \(f\) is the ordinary frequency. The constants \(A\) and \(\phi\) determine the amplitude and phase of the wave, respectively.

Returning to the general solution of the density in our perturbed fluid, from Eqns. (7), we have

\[\displaystyle \rho(\mathbf{x},t) = \rho_0 + \epsilon \left[F(\mathbf{x}-c_s{\mathbf{\hat x}}t) + G(\mathbf{x}+c_s{\mathbf{\hat x}}t)\right].\]

This says that sound waves represent a small perturbation (of order \(\epsilon\)) on the density (and pressure) of the fluid, which is what we generally expect from experience. If the value of \(\epsilon\) was not very small, then this would be a poor approximation. In particular, the term

\[\displaystyle(\mathbf{v}\cdot\nabla)\mathbf{v},\]

which is of order \(\epsilon^2,\) would not be negligible. From this, one might expect waves caused by violent explosions (or, generally, by very fast-moving objects) to behave in more complicated ways than the typical sounds that we encounter in everyday life. To study the problem in more detail, a quadratic term could be added to Eqns (7). The modified equations would then be substituted into Eq. (4), Eq. (5), and Eq. (6). Then, the \({\mathcal{O}}(\epsilon^2)\) terms would be collected and we would attempt to solve for \(\rho_2(\mathbf{x},t)\). Upon attempting to do this, we would find that additional assumptions (or data) about the disturbance are needed in order to learn something new about the behavior of the system.

This example is based on a derivation found in the appendix of Galactic Dynamics by Binney and Tremaine.


  1. Regarding the third line of Eq. (8), we begin with

    \[\displaystyle \begin{array}\ h(\rho_0 + \epsilon\rho_1) & = & \int_{0}^{\rho_0 + \epsilon\rho_1} \frac{{{\rm d}}P}{{{\rm d}}\rho}\frac{{{\rm d}}\rho}{\rho}\\ & = & \int_0^{\rho_0} \frac{{{\rm d}}P}{{{\rm d}}\rho}\frac{{{\rm d}}\rho}{\rho} + \int_{\rho_0}^{\rho_0 + \epsilon\rho_1} \frac{{{\rm d}}P}{{{\rm d}}\rho}\frac{{{\rm d}}\rho}{\rho}\\ & = & h(\rho_0) + \int_{\rho_0}^{\rho_0 + \epsilon\rho_1} \frac{{{\rm d}}P}{{{\rm d}}\rho}\frac{{{\rm d}}\rho}{\rho} \end{array}\]

    Note that \(h(\rho_0)\) is simply \(h_0\). Since the remaining integral is taken over such a small density interval, the term \(dP/d\rho\) is essentially a constant. The value of the constant is the derivative, evaluated at \(\rho_0\). This constant can be removed from the integral:

    \[\displaystyle h = h_0 + \left(\frac{\partial P}{\partial\rho}\right)_{\!\rho_0}\int_{\rho_0}^{\rho_0 + \epsilon\rho_1}\frac{{{\rm d}}\rho}{\rho}.\]

    Evaluating the remaining integral yields

    \[\displaystyle h = h_0 + \left(\frac{\partial P}{\partial\rho}\right)_{\!\rho_0}\ln\left(\frac{\rho_0 + \epsilon\rho_1}{\rho_0}\right).\]

    Now, we use the first type of approximation method, discussed above. We note that

    \[\displaystyle \ln\left(\frac{\rho_0 + \epsilon\rho_1}{\rho_0}\right) = \ln\left(1 + \epsilon\frac{\rho_1}{\rho_0}\right).\]

    The Taylor expansion of \(\ln(1+x)\) about \(x = 0\) is

    \[\displaystyle \ln(1+x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \cdots\]

    So,

    \[\displaystyle \ln\left(1 + \epsilon\frac{\rho_1}{\rho_0}\right) = \epsilon\frac{\rho_1}{\rho_0} - \frac{1}{2}\left(\epsilon\frac{\rho_1}{\rho_0}\right)^2 + \cdots\]

    The term that is linear in \(\epsilon\) is the only term of interest:

    \[\displaystyle h = h_0 + \epsilon\left(\frac{\partial P}{\partial\rho}\right)_{\!\rho_0}\frac{\rho_1}{\rho_0}.\]

    Thus, we arrive at the result for the \({\mathcal{O}}(\epsilon)\) term:

    \[\displaystyle h_1 = \left(\frac{\partial P}{\partial \rho}\right)_{\!\rho_0}\frac{\rho_1}{\rho_0}\]