Purpose
Break a large computation into pieces and store values only at the
interfaces between the pieces (this is much easier to do using checkpoint).
In actual applications, there may be many functions, but
for this example there are only two.
The functions
@(@
F : \B{R}^2 \rightarrow \B{R}^2
@)@
and
@(@
G : \B{R}^2 \rightarrow \B{R}^2
@)@
are defined by
@[@
F(x) = \left( \begin{array}{c} x_0 x_1 \\ x_1 - x_0 \end{array} \right)
\; , \;
G(y) = \left( \begin{array}{c} y_0 - y_1 \\ y_1 y_0 \end{array} \right)
@]@
Processing Steps
We apply reverse mode to compute the derivative of
@(@
H : \B{R}^2 \rightarrow \B{R}
@)@
defined by
@[@
\begin{array}{rcl}
H(x)
& = & G_0 [ F(x) ] + G_1 [ F(x) ]
\\
& = & x_0 x_1 - ( x_1 - x_0 ) + x_0 x_1 ( x_1 - x_0 )
\\
& = & x_0 x_1 ( 1 - x_0 + x_1 ) - x_1 + x_0
\end{array}
@]@
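As a reference for checking the reverse mode results,
expanding the last line above gives
@(@
H(x) = x_0 x_1 - x_0^2 x_1 + x_0 x_1^2 - x_1 + x_0
@)@,
and differentiating term by term yields
@[@
\begin{array}{rcl}
\partial H / \partial x_0 & = & x_1 - 2 x_0 x_1 + x_1^2 + 1
\\
\partial H / \partial x_1 & = & x_0 - x_0^2 + 2 x_0 x_1 - 1
\end{array}
@]@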
Given the zero and first order Taylor coefficients
@(@
x^{(0)}
@)@ and @(@
x^{(1)}
@)@,
we use @(@
X(t)
@)@, @(@
Y(t)
@)@ and @(@
Z(t)
@)@
for the corresponding functions; i.e.,
@[@
\begin{array}{rcl}
X(t) & = & x^{(0)} + x^{(1)} t
\\
Y(t) & = & F[X(t)] = y^{(0)} + y^{(1)} t + O(t^2)
\\
Z(t) & = & G \{ F [ X(t) ] \} = z^{(0)} + z^{(1)} t + O(t^2)
\\
h^{(0)} & = & z^{(0)}_0 + z^{(0)}_1
\\
h^{(1)} & = & z^{(1)}_0 + z^{(1)}_1
\end{array}
@]@
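As a concrete illustration, the following Python sketch evaluates these
coefficients by hand. It does not use any AD library; the function names
forward_F and forward_G, the variable names, and the sample values
@(@
x^{(0)} = (2, 3)
@)@ and @(@
x^{(1)} = (1, 0)
@)@
are assumptions made only for this illustration.

```python
def forward_F(x_zero, x_one):
    """Zero and first order Taylor coefficients of Y(t) = F[X(t)]."""
    y_zero = [x_zero[0] * x_zero[1], x_zero[1] - x_zero[0]]
    # product rule for the first component, linearity for the second
    y_one = [x_one[0] * x_zero[1] + x_zero[0] * x_one[1],
             x_one[1] - x_one[0]]
    return y_zero, y_one


def forward_G(y_zero, y_one):
    """Zero and first order Taylor coefficients of Z(t) = G[Y(t)]."""
    z_zero = [y_zero[0] - y_zero[1], y_zero[1] * y_zero[0]]
    z_one = [y_one[0] - y_one[1],
             y_one[1] * y_zero[0] + y_zero[1] * y_one[0]]
    return z_zero, z_one


x_zero = [2.0, 3.0]  # x^{(0)}, sample point (assumed)
x_one = [1.0, 0.0]   # x^{(1)}, sample direction (assumed)

y_zero, y_one = forward_F(x_zero, x_one)
z_zero, z_one = forward_G(y_zero, y_one)

h_zero = z_zero[0] + z_zero[1]  # h^{(0)} = H(x^{(0)})
h_one = z_one[0] + z_one[1]     # h^{(1)} = directional derivative of H
print(h_zero, h_one)            # prints 11.0 1.0
```

With this direction, h_one is the partial of
@(@
H
@)@ with respect to @(@
x_0
@)@ at the sample point, and h_zero agrees with the closed form for
@(@
H(x)
@)@.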
Here are the processing steps:
Use forward mode on @(@
F(x)
@)@ to compute
@(@
y^{(0)}
@)@ and @(@
y^{(1)}
@)@.
Free some, or all, of the memory corresponding to @(@
F(x)
@)@.
Use forward mode on @(@
G(y)
@)@ to compute
@(@
z^{(0)}
@)@ and @(@
z^{(1)}
@)@.
Use reverse mode on @(@
G(y)
@)@ to compute the derivative of
@(@
h^{(1)}
@)@ with respect to
@(@
y^{(0)}
@)@ and @(@
y^{(1)}
@)@.
Free all the memory corresponding to @(@
G(y)
@)@.
Use reverse mode on @(@
F(x)
@)@ to compute the derivative of
@(@
h^{(1)}
@)@ with respect to
@(@
x^{(0)}
@)@ and @(@
x^{(1)}
@)@.
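The steps above can be sketched in Python with hand-written forward and
reverse sweeps. This is a minimal sketch under stated assumptions, not the
library's implementation: it uses no AD library, all function names
(forward_F, forward_G, reverse_G, reverse_F, staged_reverse) are
hypothetical, and freeing memory is only imitated with del.
The only values passed between the two pieces are
@(@
y^{(0)}
@)@ and @(@
y^{(1)}
@)@
going forward and the corresponding partials going backward.

```python
def forward_F(x_zero, x_one):
    """Taylor coefficients y^{(0)}, y^{(1)} of F[X(t)]."""
    y_zero = [x_zero[0] * x_zero[1], x_zero[1] - x_zero[0]]
    y_one = [x_one[0] * x_zero[1] + x_zero[0] * x_one[1],
             x_one[1] - x_one[0]]
    return y_zero, y_one


def forward_G(y_zero, y_one):
    """Taylor coefficients z^{(0)}, z^{(1)} of G[Y(t)]."""
    z_zero = [y_zero[0] - y_zero[1], y_zero[1] * y_zero[0]]
    z_one = [y_one[0] - y_one[1],
             y_one[1] * y_zero[0] + y_zero[1] * y_one[0]]
    return z_zero, z_one


def reverse_G(y_zero, y_one, w):
    """Partials of w[0]*z^{(1)}_0 + w[1]*z^{(1)}_1 w.r.t. y^{(0)}, y^{(1)}."""
    dy_zero = [w[1] * y_one[1], w[1] * y_one[0]]
    dy_one = [w[0] + w[1] * y_zero[1], -w[0] + w[1] * y_zero[0]]
    return dy_zero, dy_one


def reverse_F(x_zero, x_one, a, b):
    """Partials of a.y^{(0)} + b.y^{(1)} w.r.t. x^{(0)}, x^{(1)}."""
    dx_zero = [a[0] * x_zero[1] - a[1] + b[0] * x_one[1],
               a[0] * x_zero[0] + a[1] + b[0] * x_one[0]]
    dx_one = [b[0] * x_zero[1] - b[1], b[0] * x_zero[0] + b[1]]
    return dx_zero, dx_one


def staged_reverse(x_zero, x_one):
    # Step 1: forward mode on F.
    y_zero, y_one = forward_F(x_zero, x_one)
    # Step 2: nothing from F is kept except the interface values y.
    # Step 3: forward mode on G.
    z_zero, z_one = forward_G(y_zero, y_one)
    # Step 4: reverse mode on G; h^{(1)} = z^{(1)}_0 + z^{(1)}_1, so w = (1, 1).
    dy_zero, dy_one = reverse_G(y_zero, y_one, [1.0, 1.0])
    # Step 5: drop everything belonging to G.
    del z_zero, z_one
    # Step 6: reverse mode on F, seeded with the partials from G.
    return reverse_F(x_zero, x_one, dy_zero, dy_one)


dx_zero, dx_one = staged_reverse([2.0, 3.0], [1.0, 0.0])
print(dx_zero, dx_one)  # prints [-6.0, 3.0] [1.0, 9.0]
```

Because
@(@
h^{(1)} = H'( x^{(0)} ) x^{(1)}
@)@,
the partials with respect to
@(@
x^{(1)}
@)@
are the gradient of @(@
H
@)@ at @(@
x^{(0)}
@)@, and the partials with respect to
@(@
x^{(0)}
@)@
are the Hessian of @(@
H
@)@ times @(@
x^{(1)}
@)@.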