Perturbation Theory

In DESC, we employ perturbation techniques to find neighboring solutions. The basic ideas and an outline of the relevant equations are given below; for full details and examples, see our paper.

Suppose we have a vector-valued function \(f(x,c)\), where we can interpret \(f\) as a measure of equilibrium, \(x\) as the optimization vector, and \(c\) as some additional parameter.

Further suppose that we have \(f(x,c) = 0\), so that we are at an equilibrium. We can then ask “if we change the parameter \(c\), how does \(x\) need to change to maintain equilibrium?”

We can expand \(f\) in a Taylor series:

$ \begin{eqnarray} f(x+\Delta x, c+\Delta c) &= f(x,c) + \frac{df}{dx}\Delta x + \frac{df}{dc}\Delta c \\ &+ \frac{1}{2}\frac{d^2f}{dx^2}\Delta x \Delta x + \frac{1}{2}\frac{d^2f}{dc^2}\Delta c \Delta c + \frac{d^2f}{dxdc} \Delta x \Delta c \\ &+ \frac{1}{6}\frac{d^3f}{dx^3}\Delta x \Delta x \Delta x + \frac{1}{6}\frac{d^3f}{dc^3}\Delta c \Delta c \Delta c + \frac{1}{2}\frac{d^3f}{dx^2 dc}\Delta x \Delta x \Delta c + \frac{1}{2} \frac{d^3f}{dxdc^2}\Delta x \Delta c \Delta c + ... \end{eqnarray} $

Here we assume that we know \(\Delta c\), the change in the parameters, and we want to find \(\Delta x\), the corresponding change in the optimization variables.

In general, this is a tensor polynomial equation for \(\Delta x\) which is difficult or impossible to solve directly (e.g., just storing \(\frac{d^2f}{dx^2}\) and \(\frac{d^3f}{dx^3}\), whose sizes scale like \(n^3\) and \(n^4\) for \(n\) optimization variables, could take hundreds or thousands of GB of memory). To get around this, we can further expand \(\Delta x\) and \(\Delta c\) in a perturbation series in powers of \(\epsilon\):

\(\Delta x = \epsilon x_1 + \epsilon^2 x_2 + \epsilon^3 x_3\)

\(\Delta c = \epsilon c_1\)

Plugging this expansion in we get:

$ \begin{eqnarray} f(x+\Delta x, c+\Delta c) &= f(x,c) + \frac{df}{dx}\epsilon x_1 + \frac{df}{dx}\epsilon^2 x_2 + \frac{df}{dx}\epsilon^3 x_3 + \frac{df}{dc}\epsilon c_1 \\ &+ \frac{1}{2}\frac{d^2f}{dx^2}(\epsilon^2 x_1^2 + 2 \epsilon^3 x_1 x_2 + 2 \epsilon^4 x_1 x_3) + \frac{1}{2}\frac{d^2f}{dc^2} (\epsilon^2 c_1^2) + \frac{d^2f}{dxdc} (\epsilon^2 x_1 c_1 + \epsilon^3 x_2 c_1 + \epsilon^4 x_3 c_1) \\ &+ \frac{1}{6}\frac{d^3f}{dx^3}(\epsilon^3 x_1^3 + 3 \epsilon^4 x_1^2 x_2 + 3 \epsilon^5 x_1^2 x_3) + \frac{1}{6}\frac{d^3f}{dc^3}\epsilon^3 c_1^3 + \frac{1}{2}\frac{d^3f}{dx^2 dc}(\epsilon^3 x_1^2 c_1 + 2 \epsilon^4 x_1 x_2 c_1 + 2 \epsilon^5 x_1 x_3 c_1) + \frac{1}{2}\frac{d^3f}{dxdc^2}(\epsilon^3 x_1 c_1^2 + \epsilon^4 x_2 c_1^2 + \epsilon^5 x_3 c_1^2) \end{eqnarray} $

Collecting terms of order \(\epsilon\) and setting them equal to zero (recalling that \(f(x,c) = 0\)) gives the first order term \(x_1\):

\(0 = f(x,c) + \frac{df}{dx}\epsilon x_1 + \frac{df}{dc}\epsilon c_1\)

\(x_1 = - \left(\frac{df}{dx}\right)^{-1} \left(\frac{df}{dc}c_1 \right)\)

Where the inverse is meant in the least squares sense.
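As a concrete illustration, here is a minimal sketch of the first order step written with JAX (which DESC is built on). The helper name and signature are assumptions for illustration, not DESC's actual API; in DESC the Jacobian \(\frac{df}{dx}\) would already be available from the optimizer.

```python
import jax
import jax.numpy as jnp

def first_order_step(f, x, c, c1):
    """Hypothetical helper: solve df/dx @ x1 = -df/dc @ c1 in the least squares sense."""
    # df/dx, the regular Jacobian of the objective (recomputed here for clarity;
    # in practice it is already available from the optimizer)
    J = jax.jacfwd(f, argnums=0)(x, c)
    # df/dc @ c1 as a Jacobian-vector product, without forming df/dc explicitly
    _, dfdc_c1 = jax.jvp(lambda cc: f(x, cc), (c,), (c1,))
    # "inverse" of df/dx in the least squares sense
    x1, *_ = jnp.linalg.lstsq(J, -dfdc_c1, rcond=None)
    return x1, J
```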

Similarly, collecting terms of order \(\epsilon^2\) gives the second order term \(x_2\):

\(0 = \frac{df}{dx}\epsilon^2 x_2 + \frac{1}{2}\frac{d^2f}{dx^2}\epsilon^2 x_1^2 + \frac{1}{2}\frac{d^2f}{dc^2}\epsilon^2 c_1^2 + \frac{d^2f}{dxdc}\epsilon^2 x_1 c_1\)

\(x_2 = -\left( \frac{df}{dx} \right)^{-1} \left(\frac{1}{2}\frac{d^2f}{dx^2} x_1^2 + \frac{1}{2}\frac{d^2f}{dc^2} c_1^2 + \frac{d^2f}{dxdc} x_1 c_1 \right)\)
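The second order term can be assembled the same way. The sketch below (with the same caveats as above, reusing the hypothetical `first_order_step`) builds each second derivative term as a nested Jacobian-vector product, so the full derivative tensors never need to be formed:

```python
def second_order_step(f, x, c, c1, x1, J):
    """Hypothetical helper: solve for x2 given the first order solution x1."""
    # d2f/dx2 @ x1 @ x1 as a nested JVP (a directional second derivative)
    d2f_xx = jax.jvp(
        lambda xx: jax.jvp(lambda y: f(y, c), (xx,), (x1,))[1], (x,), (x1,)
    )[1]
    # d2f/dc2 @ c1 @ c1
    d2f_cc = jax.jvp(
        lambda cc: jax.jvp(lambda y: f(x, y), (cc,), (c1,))[1], (c,), (c1,)
    )[1]
    # d2f/dxdc @ x1 @ c1 (mixed term: differentiate df/dx @ x1 along c1)
    d2f_xc = jax.jvp(
        lambda cc: jax.jvp(lambda y: f(y, cc), (x,), (x1,))[1], (c,), (c1,)
    )[1]
    rhs = 0.5 * d2f_xx + 0.5 * d2f_cc + d2f_xc
    x2, *_ = jnp.linalg.lstsq(J, -rhs, rcond=None)
    return x2
```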

Collecting terms of order \(\epsilon^3\) gives the third order term \(x_3\):

\(0 = \frac{df}{dx}\epsilon^3 x_3 + \frac{d^2f}{dx^2} \epsilon^3 x_1 x_2 + \frac{d^2f}{dxdc}\epsilon^3 x_2 c_1 + \frac{1}{6}\frac{d^3f}{dx^3}\epsilon^3 x_1^3 + \frac{1}{6}\frac{d^3f}{dc^3}\epsilon^3 c_1^3 + \frac{1}{2}\frac{d^3f}{dx^2 dc}\epsilon^3 x_1^2 c_1 + \frac{1}{2}\frac{d^3f}{dx dc^2}\epsilon^3 x_1 c_1^2\)

\(x_3 = - \left( \frac{df}{dx} \right) ^{-1} \left( \frac{d^2f}{dx^2} x_1 x_2 + \frac{d^2f}{dxdc}x_2 c_1 + \frac{1}{6}\frac{d^3f}{dx^3}x_1^3 + \frac{1}{6}\frac{d^3f}{dc^3} c_1^3 + \frac{1}{2}\frac{d^3f}{dx^2 dc} x_1^2 c_1 + \frac{1}{2}\frac{d^3f}{dx dc^2} x_1 c_1^2 \right)\)

The advantage of this approach is that the higher order derivatives only ever appear multiplied by known quantities, so those terms can be computed efficiently as Jacobian-vector products (effectively directional derivatives) without needing to calculate the full derivative tensors. Additionally, the only operator that needs to be inverted is \(\frac{df}{dx}\), which is the regular Jacobian of the objective and is already computed and inverted during optimization.
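For example, a toy usage of the hypothetical helpers sketched above (the residual `f` and the numbers here are made up purely for illustration) might look like:

```python
# toy "equilibrium" residual: f(x, c) = 0 at the starting point chosen below
f = lambda x, c: jnp.array([x[0] ** 2 + c[0] * x[1] - 1.0,
                            x[0] * x[1] - c[1]])
x = jnp.array([1.0, 0.5])
c = jnp.array([0.0, 0.5])   # current parameters, with f(x, c) = 0
c1 = jnp.array([0.1, 0.0])  # desired change in the parameters

x1, J = first_order_step(f, x, c, c1)
x2 = second_order_step(f, x, c, c1, x1, J)
x_new = x + x1 + x2         # perturbed solution, taking epsilon = 1 so Delta c = c1
```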