# (CSE 599W) Reverse Mode Autodiff

## Background

- Symbolic Differentiation
- Numerical Differentiation
- Automatic Differentiation (auto diff)

The two mainstream flavors of auto diff are forward-mode and reverse-mode. Forward-mode needs one pass per input, so its cost scales as O(n), where n is the number of input parameters; reverse-mode needs one pass per output, so its cost scales as O(m), where m is the number of output nodes. In DNNs, n is usually huge and far exceeds m (often a single scalar loss), so this note focuses on reverse-mode auto diff.
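To make the "one pass per input" cost of forward-mode concrete, here is a minimal dual-number sketch. The `Dual` class and `f` are hypothetical helpers for illustration, not part of the assignment code:

```python
class Dual:
    """Carries a value together with its derivative w.r.t. one chosen input."""
    def __init__(self, val, dot):
        self.val = val  # primal value
        self.dot = dot  # derivative w.r.t. the chosen input

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def f(x1, x2):
    return x1 * x2 + x1

# One forward pass yields df/dx1 only; df/dx2 needs a second pass with the
# seed moved to x2 -- hence O(n) passes for n inputs.
df_dx1 = f(Dual(2.0, 1.0), Dual(3.0, 0.0)).dot  # x2 + 1 = 4.0
df_dx2 = f(Dual(2.0, 0.0), Dual(3.0, 1.0)).dot  # x1     = 2.0
```

Reverse-mode, by contrast, computes both partials in a single backward pass, which is why it wins when n is large.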

### The difference between backprop and reverse-mode auto diff

Backpropagation refers to the whole process of training an artificial neural network using multiple backpropagation steps, each of which computes gradients and uses them to perform a Gradient Descent step. In contrast, reverse-mode auto diff is simply a technique used to compute gradients efficiently, and it happens to be used by backpropagation.

### Chain rule

This is the chain rule from calculus, or more precisely the multi-variable chain rule. In short: multiply the local derivatives along a single path, and add the contributions of different paths.
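Written out for the example used later, with $x_2 = x_1 + 3$, $x_3 = x_1 + 5$, $y = x_2 x_3$ (two paths from $x_1$ to $y$):

```latex
% Multiply along each path, add across paths.
\frac{\partial y}{\partial x_1}
  = \frac{\partial y}{\partial x_2}\frac{\partial x_2}{\partial x_1}
  + \frac{\partial y}{\partial x_3}\frac{\partial x_3}{\partial x_1}
  = x_3 \cdot 1 + x_2 \cdot 1
  = 2x_1 + 8
```

At $x_1 = 1$ this gives $10$, matching the test output further down.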

### Implementation

``````python
def run(self, feed_dict):
    """Computes values of nodes in eval_node_list given computation graph.

    Parameters
    ----------
    feed_dict: dict mapping variable nodes to values supplied by the user.

    Returns
    -------
    A list of values for nodes in eval_node_list.
    """
    node_to_val_map = dict(feed_dict)
    print("self.eval_node_list={}".format(self.eval_node_list))
    # Traverse graph in topological sort order and compute values for all
    # nodes. Topological order guarantees that by the time we compute a
    # node, the values of everything it depends on are already available.
    topo_order = find_topo_sort(self.eval_node_list)
    for node in topo_order:
        if isinstance(node.op, PlaceholderOp):
            # Placeholder values come from feed_dict; nothing to compute.
            continue
        # Replace each input node with its concrete value before computing.
        input_vals = [node_to_val_map[x] for x in node.inputs]
        node_to_val_map[node] = node.op.compute(node, input_vals)

    # Collect node values. Computation happened in topo order, so re-collect
    # them in the order of eval_node_list before returning.
    return [node_to_val_map[node] for node in self.eval_node_list]
``````
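`run()` relies on `find_topo_sort`, which is not shown above. A minimal sketch consistent with the assignment skeleton (a post-order DFS over node inputs; the exact signature is an assumption) could look like:

```python
def find_topo_sort(node_list):
    """Return nodes reachable from node_list in topological order."""
    visited = set()
    topo_order = []
    for node in node_list:
        topo_sort_dfs(node, visited, topo_order)
    return topo_order

def topo_sort_dfs(node, visited, topo_order):
    """Post-order DFS: a node is appended only after all of its inputs."""
    if node in visited:
        return
    visited.add(node)
    for n in node.inputs:
        topo_sort_dfs(n, visited, topo_order)
    topo_order.append(node)
```

Because each node is appended after its inputs, iterating `topo_order` front to back guarantees dependencies are computed first, which is exactly what `run()` needs.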

``````python
def test_multi_var_chain_rule():
    x1 = ad.Variable(name="x1")
    x2 = x1 + 3
    x3 = x1 + 5
    y = x2 * x3

    grad_x1, = ad.gradients(y, [x1])

    executor = ad.Executor([y, grad_x1])
    x1_val = 1 * np.ones(3)
    y_val, grad_x1_val = executor.run(feed_dict={x1: x1_val})

    # y = (x1+3)(x1+5) = 24 and dy/dx1 = 2*x1 + 8 = 10 at x1 = 1.
    assert np.array_equal(y_val, 24 * np.ones(3))
    assert np.array_equal(grad_x1_val, 10 * np.ones(3))
``````
``````
$ nosetests -s -v autodiff_test.py
autodiff_test.custom_test ... output_node=(x1+((x2*x3)*x1))
node.name=(x1+((x2*x3)*x1))
node.name=((x2*x3)*x1)
node.name=(x2*x3)
node.name=x3
node.name=x2
node.name=x1
length of node_to_output_grads_list = {(x1+((x2*x3)*x1)): [Oneslike((x1+((x2*x3)*x1)))], x1: [Oneslike((x1+((x2*x3)*x1))), (Oneslike((x1+((x2*x3)*x1)))*(x2*x3))], ((x2*x3)*x1): [Oneslike((x1+((x2*x3)*x1)))], (x2*x3): [(Oneslike((x1+((x2*x3)*x1)))*x1)], x2: [((Oneslike((x1+((x2*x3)*x1)))*x1)*x3)], x3: [((Oneslike((x1+((x2*x3)*x1)))*x1)*x2)]}
self.eval_node_list=[(x1+((x2*x3)*x1)), (Oneslike((x1+((x2*x3)*x1)))+(Oneslike((x1+((x2*x3)*x1)))*(x2*x3))), ((Oneslike((x1+((x2*x3)*x1)))*x1)*x3), ((Oneslike((x1+((x2*x3)*x1)))*x1)*x2)]
``````
``````python
for node in reverse_topo_order:
    # print("node.name={} op={}".format(node.name, type(node.op)))
``````
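The loop over `reverse_topo_order` is the heart of the gradients routine: walk the graph backwards, sum each node's adjoint contributions across paths, and push local gradients to its inputs. To make that mechanism concrete, here is a self-contained toy version (scalars only, just `+` and `*`; the class and helper names are assumptions for illustration, not the assignment's exact code):

```python
class Node:
    """Toy graph node: op is None for variables, "add" or "mul" otherwise."""
    def __init__(self, op, inputs, name):
        self.op, self.inputs, self.name = op, inputs, name
    def __add__(self, other):
        return Node("add", [self, other], "({}+{})".format(self.name, other.name))
    def __mul__(self, other):
        return Node("mul", [self, other], "({}*{})".format(self.name, other.name))

def variable(name):
    return Node(None, [], name)

def find_topo_sort(node):
    visited, order = set(), []
    def dfs(n):
        if n in visited:
            return
        visited.add(n)
        for i in n.inputs:
            dfs(i)
        order.append(n)  # post-order: a node follows all of its inputs
    dfs(node)
    return order

def gradients(output_node, node_list, feed):
    topo_order = find_topo_sort(output_node)
    # Forward pass first: mul's local gradient needs the input values.
    vals = dict(feed)
    for n in topo_order:
        if n.op == "add":
            vals[n] = vals[n.inputs[0]] + vals[n.inputs[1]]
        elif n.op == "mul":
            vals[n] = vals[n.inputs[0]] * vals[n.inputs[1]]

    # node -> list of adjoint contributions, one per path to the output.
    node_to_output_grads_list = {output_node: [1.0]}
    node_to_output_grad = {}
    for node in reversed(topo_order):
        grad = sum(node_to_output_grads_list[node])  # add across paths
        node_to_output_grad[node] = grad
        if node.op == "add":
            contribs = [grad, grad]
        elif node.op == "mul":
            a, b = node.inputs
            contribs = [grad * vals[b], grad * vals[a]]  # multiply along path
        else:
            continue  # variable: no inputs to propagate to
        for inp, g in zip(node.inputs, contribs):
            node_to_output_grads_list.setdefault(inp, []).append(g)
    return [node_to_output_grad[n] for n in node_list]

# Same example as the test above: y = (x1+3)(x1+5), dy/dx1 = 2*x1 + 8.
x1, c3, c5 = variable("x1"), variable("3"), variable("5")
y = (x1 + c3) * (x1 + c5)
```

For example, `gradients(y, [x1], {x1: 1.0, c3: 3.0, c5: 5.0})` returns `[10.0]`: `x1` feeds the output through two paths, each contributing one entry to its `node_to_output_grads_list`, and the entries are summed, exactly as in the debug output shown earlier.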