Atomic Function Reverse Mode

Loading [MathJax]/extensions/tex2jax.js

atomic_four_reverse

@(@\newcommand{\W}[1]{ \; #1 \; } \newcommand{\R}[1]{ {\rm #1} } \newcommand{\B}[1]{ {\bf #1} } \newcommand{\D}[2]{ \frac{\partial #1}{\partial #2} } \newcommand{\DD}[3]{ \frac{\partial^2 #1}{\partial #2 \partial #3} } \newcommand{\Dpow}[2]{ \frac{\partial^{#1}}{\partial {#2}^{#1}} } \newcommand{\dpow}[2]{ \frac{ {\rm d}^{#1}}{{\rm d}\, {#2}^{#1}} }@)@This is cppad-20221105 documentation. Here is a link to its current documentation . Atomic Function Reverse Mode
Syntax

Base

ok = afun.reverse(

    call_id, select_x,

    order_up, taylor_x, taylor_y, partial_x, partial_y

)

AD<Base>

ok = afun.reverse(

    call_id, select_x,

    order_up, ataylor_x, ataylor_y, apartial_x, apartial_y

)

Prototype

Base


template <class Base>
bool atomic_four<Base>::reverse(
    size_t                      call_id     ,
    const vector<bool>&         select_x    ,
    size_t                      order_up    ,
    const vector<Base>&         taylor_x    ,
    const vector<Base>&         taylor_y    ,
    vector<Base>&               partial_x   ,
    const vector<Base>&         partial_y   )

AD<Base>


template <class Base>
bool atomic_four<Base>::reverse(
    size_t                          call_id      ,
    const vector<bool>&             select_x     ,
    size_t                          order_up     ,
    const vector< AD<Base> >&       ataylor_x    ,
    const vector< AD<Base> >&       ataylor_y    ,
    vector< AD<Base> >&             apartial_x   ,
    const vector< AD<Base> >&       apartial_y   )

Base
see Base .

vector
is the CppAD_vector template class.

Usage

Base
This syntax is used by f.Reverse where f has prototype



    ADFun<Base> f

and atomic function afun is used in f ; see Base .

AD<Base>
This syntax is used by af.Reverse where af has prototype



    ADFun< AD<Base> , Base > af

and the atomic function afun is used in af ; see base2ad .

Implementation
This function must be defined if afun is used during the recording of an ADFun object f , and reverse mode derivatives are computed for f . It can return ok == false (and not compute anything) for values of order_up that are greater than those used by your reverse mode calculations.

call_id
See call_id .

select_x
This argument has size equal to the number of arguments to this atomic function; i.e. the size of ax . It specifies which components of x the corresponding partial derivatives partial_x must be computed.

order_up
This argument is one greater than highest order Taylor coefficient that computing the derivative of.

q
We use the notation q = order_up + 1 below. This is one less than the number of Taylor coefficients for each component of x and y .

taylor_x
The size of taylor_x is q*n . For @(@ j = 0 , \ldots , n-1 @)@ and @(@ k = 0 , \ldots , q-1 @)@, we use the Taylor coefficient notation @[@ \begin{array}{rcl} x_j^k & = & \R{taylor\_x} [ j * q + k ] \\ X_j (t) & = & x_j^0 + x_j^1 t^1 + \cdots + x_j^{q-1} t^{q-1} \end{array} @]@ Note that superscripts represent an index for @(@ x_j^k @)@ and an exponent for @(@ t^k @)@. Also note that the Taylor coefficients for @(@ X(t) @)@ correspond to the derivatives of @(@ X(t) @)@ at @(@ t = 0 @)@ in the following way: @[@ x_j^k = \frac{1}{ k ! } X_j^{(k)} (0) @]@

parameters
If the j-th component of x is a parameter,



    type_x[j] < CppAD::variable_enum

In this case, for k > 0 ,



    taylor_x[ j * q + k ] == 0

ataylor_x
The specifications for ataylor_x is the same as for taylor_x (only the type of ataylor_x is different).

taylor_y
The size of taylor_y is q*m . For @(@ i = 0 , \ldots , m-1 @)@ and @(@ k = 0 , \ldots , q-1 @)@, we use the Taylor coefficient notation @[@ \begin{array}{rcl} Y_i (t) & = & g_i [ X(t) ] \\ Y_i (t) & = & y_i^0 + y_i^1 t^1 + \cdots + y_i^{q-1} t^{q-1} + o ( t^{q-1} ) \\ y_i^k & = & \R{taylor\_y} [ i * q + k ] \end{array} @]@ where @(@ o( t^{q-1} ) / t^{q-1} \rightarrow 0 @)@ as @(@ t \rightarrow 0 @)@. Note that superscripts represent an index for @(@ y_j^k @)@ and an exponent for @(@ t^k @)@. Also note that the Taylor coefficients for @(@ Y(t) @)@ correspond to the derivatives of @(@ Y(t) @)@ at @(@ t = 0 @)@ in the following way: @[@ y_j^k = \frac{1}{ k ! } Y_j^{(k)} (0) @]@

ataylor_y
The specifications for ataylor_y is the same as for taylor_y (only the type of ataylor_y is different).

F
We use the notation @(@ \{ x_j^k \} \in \B{R}^{n \times q} @)@ for @[@ \{ x_j^k \W{:} j = 0 , \ldots , n-1, k = 0 , \ldots , q-1 \} @]@ We use the notation @(@ \{ y_i^k \} \in \B{R}^{m \times q} @)@ for @[@ \{ y_i^k \W{:} i = 0 , \ldots , m-1, k = 0 , \ldots , q-1 \} @]@ We use @(@ F : \B{R}^{n \times q} \rightarrow \B{R}^{m \times q} @)@ by to denote the function corresponding to the forward mode calculations @[@ y_i^k = F_i^k [ \{ x_j^k \} ] @]@ Note that @[@ F_i^0 ( \{ x_j^k \} ) = g_i ( X(0) ) = g_i ( x^0 ) @]@ We also note that @(@ F_i^\ell ( \{ x_j^k \} ) @)@ is a function of @(@ x^0 , \ldots , x^\ell @)@; i.e., it is determined by the derivatives of @(@ g_i (x) @)@ up to order @(@ \ell @)@.

G, H
We use @(@ G : \B{R}^{m \times q} \rightarrow \B{R} @)@ to denote an arbitrary scalar valued function of @(@ \{ y_i^k \} @)@. We use @(@ H : \B{R}^{n \times q} \rightarrow \B{R} @)@ defined by @[@ H ( \{ x_j^k \} ) = G[ F( \{ x_j^k \} ) ] @]@

partial_y
The size of partial_y is q*m . For @(@ i = 0 , \ldots , m-1 @)@, @(@ k = 0 , \ldots , q-1 @)@, @[@ \R{partial\_y} [ i * q + k ] = \partial G / \partial y_i^k @]@

apartial_y
The specifications for apartial_y is the same as for partial_y (only the type of apartial_y is different).

partial_x
The size of partial_x is q*n . The input values of the elements of partial_x are not specified (must not matter). Upon return, for @(@ j = 0 , \ldots , n-1 @)@ and @(@ \ell = 0 , \ldots , q-1 @)@, @[@ \begin{array}{rcl} \R{partial\_x} [ j * q + \ell ] & = & \partial H / \partial x_j^\ell \\ & = & ( \partial G / \partial \{ y_i^k \} ) \cdot ( \partial \{ y_i^k \} / \partial x_j^\ell ) \\ & = & \sum_{k=0}^{q-1} \sum_{i=0}^{m-1} ( \partial G / \partial y_i^k ) ( \partial y_i^k / \partial x_j^\ell ) \\ & = & \sum_{k=\ell}^{q-1} \sum_{i=0}^{m-1} \R{partial\_y}[ i * q + k ] ( \partial F_i^k / \partial x_j^\ell ) \end{array} @]@ Note that we have used the fact that for @(@ k < \ell @)@, @(@ \partial F_i^k / \partial x_j^\ell = 0 @)@.

azmul
An optimized function will use zero for values in taylor_x and taylor_y that are not necessary in the current context. If you divide by these values when computing @(@ ( \partial F_i^k / \partial x_j^\ell ) @)@ you could get an nan if the corresponding value in partial_y is zero. To be careful, if you do divide by taylor_x or taylor_y , use azmul for to avoid zero over zero calculations.

apartial_x
The specifications for apartial_x is the same as for partial_x (only the type of apartial_x is different).

ok
If this calculation succeeded, ok is true. Otherwise it is false.

Example
The following is an example reverse definition taken from atomic_four_norm_sq.cpp :

        bool reverse(
            size_t                              call_id     ,
            const CppAD::vector<bool>&          select_x    ,
            size_t                              order_up    ,
            const CppAD::vector<double>&        tx          ,
            const CppAD::vector<double>&        ty          ,
            CppAD::vector<double>&              px          ,
            const CppAD::vector<double>&        py          ) override
        {
            size_t q = order_up + 1;
            size_t n = tx.size() / q;
    # ifndef NDEBUG
            size_t m = ty.size() / q;
            assert( call_id == 0 );
            assert( m == 1 );
            assert( px.size() == tx.size() );
            assert( py.size() == ty.size() );
            assert( n == select_x.size() );
    # endif
            // ok
            bool ok = order_up == 0;
            if ( ! ok )
                return ok;

            // first order reverse mode
            for(size_t j = 0; j < n; ++j)
            {   // x_0^0
                double xj0 = tx[ j * q + 0];
                //
                // H = G( F( { x_j^k } ) )
                double dF = 2.0 * xj0; // partial F w.r.t x_j^0
                double dG = py[0];     // partial of G w.r.t. y[0]
                double dH = dG * dF;   // partial of H w.r.t. x_j^0

                // px[j]
                px[j] = dH;
            }
            return ok;
        }

Examples
The file atomic_four_norm_sq.cpp contains an example that defines this routine.

Input File: include/cppad/core/atomic/four/reverse.hpp