Base
This syntax is used by
f.Reverse
where
f
has prototype
ADFun<Base> f
and the atomic function
afun
is used in
f
;
see Base
.
AD<Base>
This syntax is used by
af.Reverse
where
af
has prototype
ADFun< AD<Base> , Base > af
and the atomic function
afun
is used in
af
; see base2ad
.
Implementation
This function must be defined if
afun
is
used during the recording of an ADFun
object
f
,
and reverse mode derivatives are computed for
f
.
It can return
ok == false
(and not compute anything) for values
of
order_up
that are greater than those used by your
reverse
mode calculations.
select_x
This argument has size equal to the number of arguments to this
atomic function; i.e. the size of ax
.
It specifies the components of
x
for which the corresponding
partial derivatives in
partial_x
must be computed.
order_up
This argument is the highest order Taylor coefficient that
we are computing the derivative of.
q
We use the notation
q = order_up + 1
below.
This is the number of Taylor coefficients for each component
of
x
and
y
.
taylor_x
The size of
taylor_x
is
q*n
.
For @(@
j = 0 , \ldots , n-1
@)@ and @(@
k = 0 , \ldots , q-1
@)@,
we use the Taylor coefficient notation
@[@
\begin{array}{rcl}
x_j^k & = & \R{taylor\_x} [ j * q + k ]
\\
X_j (t) & = & x_j^0 + x_j^1 t^1 + \cdots + x_j^{q-1} t^{q-1}
\end{array}
@]@
Note that superscripts represent an index for @(@
x_j^k
@)@
and an exponent for @(@
t^k
@)@.
Also note that the Taylor coefficients for @(@
X(t)
@)@ correspond
to the derivatives of @(@
X(t)
@)@ at @(@
t = 0
@)@ in the following way:
@[@
x_j^k = \frac{1}{ k ! } X_j^{(k)} (0)
@]@
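The storage layout above can be illustrated with a short stand-alone sketch. It assumes plain double coefficients, and the name eval_taylor is hypothetical (it is not part of CppAD); it only shows how the index j * q + k selects the coefficient of order k for component j.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CppAD): evaluate
//   X_j(t) = x_j^0 + x_j^1 t + ... + x_j^{q-1} t^{q-1}
// where the coefficient x_j^k is stored in taylor_x[j * q + k].
double eval_taylor(
    const std::vector<double>& taylor_x,
    std::size_t                q,
    std::size_t                j,
    double                     t)
{
    double sum = 0.0;
    // Horner's rule, starting with the highest order coefficient
    for (std::size_t k = q; k-- > 0; )
        sum = sum * t + taylor_x[j * q + k];
    return sum;
}
```

For example, with n = 2, q = 3, and taylor_x = {1, 2, 3, 4, 5, 6}, component 0 has coefficients (1, 2, 3) and component 1 has coefficients (4, 5, 6).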
parameters
If the j-th component of
x
is a parameter,
type_x[j] < CppAD::variable_enum
In this case, for
k > 0
,
taylor_x[ j * q + k ] == 0
ataylor_x
The specification for
ataylor_x
is the same as for
taylor_x
(only the type of
ataylor_x
is different).
taylor_y
The size of
taylor_y
is
q*m
.
For @(@
i = 0 , \ldots , m-1
@)@ and @(@
k = 0 , \ldots , q-1
@)@,
we use the Taylor coefficient notation
@[@
\begin{array}{rcl}
Y_i (t) & = & g_i [ X(t) ]
\\
Y_i (t) & = &
y_i^0 + y_i^1 t^1 + \cdots + y_i^{q-1} t^{q-1} + o ( t^{q-1} )
\\
y_i^k & = & \R{taylor\_y} [ i * q + k ]
\end{array}
@]@
where @(@
o( t^{q-1} ) / t^{q-1} \rightarrow 0
@)@ as @(@
t \rightarrow 0
@)@.
Note that superscripts represent an index for @(@
y_i^k
@)@
and an exponent for @(@
t^k
@)@.
Also note that the Taylor coefficients for @(@
Y(t)
@)@ correspond
to the derivatives of @(@
Y(t)
@)@ at @(@
t = 0
@)@ in the following way:
@[@
y_i^k = \frac{1}{ k ! } Y_i^{(k)} (0)
@]@
ataylor_y
The specification for
ataylor_y
is the same as for
taylor_y
(only the type of
ataylor_y
is different).
F
We use the notation @(@
\{ x_j^k \} \in \B{R}^{n \times q}
@)@ for
@[@
\{ x_j^k \W{:} j = 0 , \ldots , n-1, k = 0 , \ldots , q-1 \}
@]@
We use the notation @(@
\{ y_i^k \} \in \B{R}^{m \times q}
@)@ for
@[@
\{ y_i^k \W{:} i = 0 , \ldots , m-1, k = 0 , \ldots , q-1 \}
@]@
We use
@(@
F : \B{R}^{n \times q} \rightarrow \B{R}^{m \times q}
@)@
to denote the function corresponding to the forward mode calculations
@[@
y_i^k = F_i^k [ \{ x_j^k \} ]
@]@
Note that
@[@
F_i^0 ( \{ x_j^k \} ) = g_i ( X(0) ) = g_i ( x^0 )
@]@
We also note that
@(@
F_i^\ell ( \{ x_j^k \} )
@)@ is a function of
@(@
x^0 , \ldots , x^\ell
@)@; i.e.,
it is determined by the derivatives of @(@
g_i (x)
@)@
up to order @(@
\ell
@)@.
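As an illustration (our own example, not part of the specification), suppose
@(@
n = 2
@)@, @(@
m = 1
@)@, @(@
q = 2
@)@, and @(@
g_0 (x) = x_0 x_1
@)@.
Substituting @(@
X(t)
@)@ and collecting powers of @(@
t
@)@ gives
@[@
\begin{array}{rcl}
F_0^0 ( \{ x_j^k \} ) & = & x_0^0 x_1^0
\\
F_0^1 ( \{ x_j^k \} ) & = & x_0^0 x_1^1 + x_0^1 x_1^0
\end{array}
@]@
so @(@
F_0^0
@)@ depends only on @(@
x^0
@)@, while @(@
F_0^1
@)@ depends on @(@
x^0
@)@ and @(@
x^1
@)@.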
G, H
We use @(@
G : \B{R}^{m \times q} \rightarrow \B{R}
@)@
to denote an arbitrary scalar valued function of @(@
\{ y_i^k \}
@)@.
We use @(@
H : \B{R}^{n \times q} \rightarrow \B{R}
@)@
defined by
@[@
H ( \{ x_j^k \} ) = G[ F( \{ x_j^k \} ) ]
@]@
partial_y
The size of
partial_y
is
q*m
.
For @(@
i = 0 , \ldots , m-1
@)@, @(@
k = 0 , \ldots , q-1
@)@,
@[@
\R{partial\_y} [ i * q + k ] = \partial G / \partial y_i^k
@]@
apartial_y
The specification for
apartial_y
is the same as for
partial_y
(only the type of
apartial_y
is different).
partial_x
The size of
partial_x
is
q*n
.
The input values of the elements of
partial_x
are not specified (the results must not depend on them).
Upon return,
for @(@
j = 0 , \ldots , n-1
@)@ and @(@
\ell = 0 , \ldots , q-1
@)@,
@[@
\begin{array}{rcl}
\R{partial\_x} [ j * q + \ell ] & = & \partial H / \partial x_j^\ell
\\
& = &
( \partial G / \partial \{ y_i^k \} ) \cdot
( \partial \{ y_i^k \} / \partial x_j^\ell )
\\
& = &
\sum_{k=0}^{q-1}
\sum_{i=0}^{m-1}
( \partial G / \partial y_i^k ) ( \partial y_i^k / \partial x_j^\ell )
\\
& = &
\sum_{k=\ell}^{q-1}
\sum_{i=0}^{m-1}
\R{partial\_y}[ i * q + k ] ( \partial F_i^k / \partial x_j^\ell )
\end{array}
@]@
Note that we have used the fact that for @(@
k < \ell
@)@,
@(@
\partial F_i^k / \partial x_j^\ell = 0
@)@.
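The sum above can be sketched in code for one simple case. This is a minimal stand-alone sketch, assuming the function g(x) = x_0 * x_1 (so n = 2 and m = 1) and plain double values. The name product_reverse and its argument list are hypothetical and are not the CppAD atomic function interface. For this choice of g, the partial of F_0^k with respect to x_j^ell is the order k - ell coefficient of the other argument when k >= ell, and zero otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone routine (it does not use the CppAD API):
// reverse mode for g(x) = x_0 * x_1, with n = 2 and m = 1.
// Returns false, without computing anything, if the sizes are inconsistent.
bool product_reverse(
    const std::vector<bool>&   select_x,
    std::size_t                order_up,
    const std::vector<double>& taylor_x,
    const std::vector<double>& partial_y,
    std::vector<double>&       partial_x)
{
    const std::size_t n = 2, m = 1;
    const std::size_t q = order_up + 1; // number of Taylor coefficients
    if (select_x.size() != n || taylor_x.size() != q * n)
        return false;
    if (partial_y.size() != q * m || partial_x.size() != q * n)
        return false;

    for (std::size_t j = 0; j < n; ++j)
    {   // only compute the partials that were requested
        if (!select_x[j])
            continue;
        const std::size_t other = 1 - j; // index of the other factor
        for (std::size_t ell = 0; ell < q; ++ell)
        {   double sum = 0.0;
            // terms with k < ell vanish, so the sum starts at k = ell
            for (std::size_t k = ell; k < q; ++k)
                sum += partial_y[k] * taylor_x[other * q + (k - ell)];
            partial_x[j * q + ell] = sum;
        }
    }
    return true;
}
```

With order_up = 1, taylor_x = {2, 3, 5, 7}, and partial_y = {1, 10}, this sketch sets partial_x to {75, 50, 32, 20}. Components of x that are not selected are left unchanged.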
azmul
An optimized
function will use zero
for values in
taylor_x
and
taylor_y
that are
not necessary in the current context.
If you divide by these values when computing
@(@
( \partial F_i^k / \partial x_j^\ell )
@)@ you could get a nan
if the corresponding value in
partial_y
is zero.
To be careful, if you do divide by
taylor_x
or
taylor_y
, use azmul
to avoid zero over zero calculations.
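The hazard described above can be reproduced with plain doubles. The following is a minimal sketch, assuming IEEE floating point arithmetic. The function azmul_sketch is a local stand-in written for this illustration (it is not the CppAD azmul routine); it is assumed to mirror the absolute zero behavior of returning zero whenever its first argument is zero.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Local stand-in for an absolute zero multiply (an assumption for this
// illustration, not the CppAD implementation): if x is zero the result
// is zero, even when y is infinite or nan.
double azmul_sketch(double x, double y)
{
    if (x == 0.0)
        return 0.0;
    return x * y;
}
```

If a partial_y value is zero and the factor it multiplies is infinite or nan (for example, the result of dividing by a zero Taylor coefficient), the ordinary product is nan, while azmul_sketch returns zero.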
apartial_x
The specification for
apartial_x
is the same as for
partial_x
(only the type of
apartial_x
is different).
ok
If this calculation succeeded,
ok
is true.
Otherwise it is false.