MindSpore (昇思) 25-Day Learning Camp, Day 07 | Functional Automatic Differentiation


Functional Automatic Differentiation

Environment Setup

# The experiment environment comes with mindspore==2.2.14 pre-installed; to switch versions, change the version number in the pip command below
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
import numpy as np
import mindspore
from mindspore import nn
from mindspore import ops
from mindspore import Tensor, Parameter
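
To confirm that the intended version is active after the reinstall, a quick optional check (a sketch, not part of the original) is:

# Optional: verify the installed MindSpore version (expected: 2.2.14)
print(mindspore.__version__)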

Functions and Computational Graphs

  • $wx + b = z$
    -> $\text{Activation-Function}(z)$
    -> $y_{pred}$
    -> $\text{Cross-Entropy}(y, y_{pred})$

  • $w$ and $b$ are the parameters to be optimized

    x = ops.ones(5, mindspore.float32) # input tensor
    y = ops.zeros(3, mindspore.float32) # expected output
    w = Parameter(Tensor(np.random.randn(5, 3), mindspore.float32), name='w')
    b = Parameter(Tensor(np.random.randn(3,), mindspore.float32), name='b') # bias
    
    def function(x, y, w, b):
        z = ops.matmul(x, w) + b
        loss = ops.binary_cross_entropy_with_logits(z, y, ops.ones_like(z), ops.ones_like(z))
        return loss
        
    loss = function(x, y, w, b)
    print(loss)
    # Output: Tensor(shape=[], dtype=Float32, value= 0.914285)
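
For reference, with unit weight and pos_weight and the default mean reduction, binary_cross_entropy_with_logits computes

$loss = -\frac{1}{N}\sum_{i=1}^{N}\bigl[y_i \log \sigma(z_i) + (1 - y_i)\log(1 - \sigma(z_i))\bigr]$

A minimal NumPy cross-check of the printed value (a sketch under these assumptions; it recomputes the sigmoid and the mean by hand):

# Recompute the same loss by hand: z = xw + b, p = sigmoid(z),
# then the mean binary cross-entropy over the three outputs.
z_np = x.asnumpy() @ w.asnumpy() + b.asnumpy()
p = 1.0 / (1.0 + np.exp(-z_np))
manual_loss = -np.mean(y.asnumpy() * np.log(p) + (1 - y.asnumpy()) * np.log(1 - p))
print(manual_loss)  # should closely match the loss above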
    

Differentiation Functions and Gradient Computation

  • To optimize the model, we need the derivatives of the loss with respect to the parameters, $\frac{\partial loss}{\partial w}$ and $\frac{\partial loss}{\partial b}$.
  • Call mindspore.grad to obtain the differentiation function of function.
  • fn: the function to differentiate.
  • grad_position: the index (or indices) of the inputs to differentiate with respect to.
  • Obtaining a differentiation function with grad is a functional transformation: the input is a function, and the output is also a function.
grad_fn = mindspore.grad(function, (2, 3))
grads = grad_fn(x, y, w, b)
print(grads)
# Output:
# (Tensor(shape=[5, 3], dtype=Float32, value=
#  [[ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01],
#   [ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01],
#   [ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01],
#   [ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01],
#   [ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01]]),
#  Tensor(shape=[3], dtype=Float32, value=
#  [ 6.56869709e-02,  5.37334494e-02,  3.01467031e-01]))
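
As a quick illustrative check (a sketch; the step size and the probed entry are arbitrary choices), the analytic gradient of b can be compared against a central finite difference of function:

# Perturb the first entry of b by ±eps and compare the numeric slope
# with the first entry of the analytic gradient grads[1].
eps = 1e-3
delta = Tensor(np.array([eps, 0.0, 0.0], np.float32))
numeric = (function(x, y, w, b + delta) - function(x, y, w, b - delta)) / (2 * eps)
print(numeric)      # numeric estimate of d(loss)/d(b[0])
print(grads[1][0])  # analytic value from grad_fn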

Stop Gradient

  • Truncates the gradient at a given output, i.e. removes a Tensor's influence on the gradient
def function_with_logits(x, y, w, b):
    z = ops.matmul(x, w) + b
    loss = ops.binary_cross_entropy_with_logits(z, y, ops.ones_like(z), ops.ones_like(z))
    return loss, z
grad_fn = mindspore.grad(function_with_logits, (2, 3))
grads = grad_fn(x, y, w, b)
# To block the influence of z on the gradient, use the ops.stop_gradient interface to truncate the gradient at this point

def function_stop_gradient(x, y, w, b):
    z = ops.matmul(x, w) + b
    loss = ops.binary_cross_entropy_with_logits(z, y, ops.ones_like(z), ops.ones_like(z))
    return loss, ops.stop_gradient(z)

grad_fn = mindspore.grad(function_stop_gradient, (2, 3))
grads = grad_fn(x, y, w, b)
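
Printing the two variants side by side makes the effect visible (an illustrative sketch; only the bias gradients are compared): without stop_gradient the extra output z contributes its own gradient, while with stop_gradient the result matches the loss-only gradients.

# Recompute both variants and compare the gradient of b.
grads_with_z = mindspore.grad(function_with_logits, (2, 3))(x, y, w, b)
grads_stopped = mindspore.grad(function_stop_gradient, (2, 3))(x, y, w, b)
print(grads_with_z[1])   # includes the contribution of z
print(grads_stopped[1])  # matches the original loss-only gradient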

Auxiliary data

  • Auxiliary data refers to a function's extra outputs, i.e. everything it returns besides its first output.
  • grad and value_and_grad provide the has_aux parameter; when it is set to True, the manual stop_gradient step shown above is applied automatically, so the auxiliary outputs are returned without affecting the gradient.
grad_fn = mindspore.grad(function_with_logits, (2, 3), has_aux=True)
grads, (z,) = grad_fn(x, y, w, b)
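
The same can be done with value_and_grad, which additionally returns the forward outputs alongside the gradients (a minimal sketch; the prints just show the returned structures):

# value_and_grad returns (forward outputs, gradients); with has_aux=True
# only the first output (the loss) contributes to the gradients.
value_and_grad_fn = mindspore.value_and_grad(function_with_logits, (2, 3), has_aux=True)
outputs, grads = value_and_grad_fn(x, y, w, b)
print(outputs)  # the loss together with the auxiliary output z
print(grads)    # gradients w.r.t. w and b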

Gradient Computation for Neural Networks

# Define the model
class Network(nn.Cell):
    def __init__(self):
        super().__init__()
        self.w = w
        self.b = b

    def construct(self, x):
        z = ops.matmul(x, self.w) + self.b
        return z
# Instantiate the model
model = Network()
# Instantiate the loss function
loss_fn = nn.BCEWithLogitsLoss()
# Define the forward pass
def forward_fn(x, y):
    z = model(x)
    loss = loss_fn(z, y)
    return loss
grad_fn = mindspore.value_and_grad(forward_fn, None, weights=model.trainable_params())
loss, grads = grad_fn(x, y)
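
As a follow-on sketch (the optimizer choice and learning rate are assumptions, not part of the original), the returned gradients line up with model.trainable_params() and can be applied with an optimizer to complete one training step:

# Apply the computed gradients to the model's parameters (w and b).
optimizer = nn.SGD(model.trainable_params(), learning_rate=1e-2)
optimizer(grads)  # one in-place parameter update
print(loss)       # loss value returned by value_and_grad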