6 Automatic Differentiation

Automatic differentiation functions are described in terms of the following types, which are described in Overview.
  • dual? - Duals

  • link? - Links included in a dual. Defined as the type
    (-> dual? tensor? gradient-state? gradient-state?)

  • gradient-state? - A hashtable from dual? to tensor?

  • differentiable? - Either a dual?, or a (listof differentiable?). In the learner representation, a (vectorof differentiable?) is also considered differentiable?, but not in other representations.

procedure

(dual ρ κ)  dual?

  ρ : tensor?
  κ : link?
Constructs a dual with ρ as its real part and κ as its link.

procedure

(dual? x)  boolean?

  x : any
Returns #t if x is a dual.
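
For example, a minimal sketch (assuming the learner representation, where a scalar tensor is just a number; the examples below assume the same (require malt)):
(require malt)
(define d (dual 3.0 end-of-chain))  ; a dual whose real part is 3.0
(dual? d)                           ; #t
(dual? 3.0)                         ; #f, a plain number is not a dual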

procedure

(ρ d)  tensor?

  d : (or tensor? dual?)
If d is a tensor?, returns d. Otherwise, returns the real part of the dual.
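
For example, a brief sketch of ρ on both kinds of argument:
(ρ (dual 3.0 end-of-chain))  ; 3.0, the real part of the dual
(ρ 5.0)                      ; 5.0, a tensor is returned unchanged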

procedure

(κ d)  link?

  d : (or tensor? dual?)
If d is a tensor?, returns the function end-of-chain. Otherwise, returns the link of the dual.
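
For example, a brief sketch of κ on both kinds of argument:
(κ (dual 3.0 end-of-chain))  ; the dual's link, here end-of-chain
(eq? (κ 5.0) end-of-chain)   ; #t, a tensor gets the default link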

procedure

(scalar? x)  boolean?

  x : any
Returns #t if x is a number?, or if x is a dual? and (ρ x) is a number?.
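
For example, a brief sketch (the vector below is a non-scalar tensor in the learner representation):
(scalar? 3.0)                      ; #t
(scalar? (dual 3.0 end-of-chain))  ; #t, its real part is a number
(scalar? (vector 1.0 2.0))         ; #f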

procedure

(end-of-chain d z σ)  gradient-state?

  d : dual?
  z : tensor?
  σ : gradient-state?
The default link, which terminates the gradient computation for any dual. It returns a new gradient-state? that maps d to the sum of z and the mapping of d in σ (or 0.0 if d has no mapping). If z is not a number?, +-ρ is used for the addition.

In the learner representation, z can only be a scalar?.
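
A sketch of the accumulation described above, assuming an empty immutable hash table is an acceptable initial gradient-state?:
(define d (dual 3.0 end-of-chain))
(define σ0 (end-of-chain d 1.0 (hash)))  ; d has no mapping, so it maps to 0.0 + 1.0
(hash-ref σ0 d)                          ; 1.0
(define σ1 (end-of-chain d 2.0 σ0))      ; the existing 1.0 is added to 2.0
(hash-ref σ1 d)                          ; 3.0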

procedure

(∇¹ f t0 ... tn)  (listof tensor?)

  f : (-> differentiable? ... differentiable?)
  t0 : differentiable?
  tn : differentiable?
Returns a list of gradients (list g0 ... gn) where gi is the gradient of (f t0 ... tn) with respect to ti. If (f t0 ... tn) is not a scalar?, then gi is the sum of the gradients of each scalar in (f t0 ... tn) with respect to ti.
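
For example, a sketch with two scalar arguments (assuming * and + here are Malt's extended, differentiable operations):
(∇¹ * 2.0 3.0)  ; (list 3.0 2.0), since ∂(a·b)/∂a = b and ∂(a·b)/∂b = a
(∇¹ + 2.0 3.0)  ; (list 1.0 1.0)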

procedure

(∇ f θ)  (listof tensor?)

  f : (-> (listof tensor?) tensor?)
  θ : (listof tensor?)
This is equivalent to
(ref (∇¹ f θ) 0)
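
For example, a sketch with a one-element θ (assuming * and ref are Malt's extended multiplication and list indexing):
(define (objective θ)
  (* (ref θ 0) (ref θ 0)))
(∇ objective (list 3.0))  ; (list 6.0), since d(θ₀²)/dθ₀ = 2·θ₀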

procedure

(gradient-of f θ)  (listof tensor?)

  f : (-> (listof tensor?) tensor?)
  θ : (listof tensor?)
This is also equivalent to
(ref (∇¹ f θ) 0)