pytorch入门之四参数初始化

pytorch之四参数初始化

# -*- coding: utf-8 -*-
"""
Created on Fri Jul 27 20:57:51 2018

@author: dj
"""
import numpy as np
import torch
from torch import nn

使用 NumPy 来初始化因为 PyTorch 是一个非常灵活的框架，理论上能够对所有的 Tensor 进行操作，所以我们能够通过定义新的 Tensor 来初始化，直接看下面的例子

定义一个 Sequential 模型

net1 = nn.Sequential(
    nn.Linear(30, 40),
    nn.ReLU(),
    nn.Linear(40, 50),
    nn.ReLU(),
    nn.Linear(50, 10)
)

访问第一层的参数

w1 = net1[0].weight
b1 = net1[0].bias
print(w1)
print(b1)

pytorch入门之四参数初始化
注意，这是一个 Parameter，也就是一个特殊的 Variable，我们可以访问其 .data属性得到其中的数据，然后直接定义一个新的 Tensor 对其进行替换，我们可以使用 PyTorch 中的一些随机数据生成的方式，比如 torch.randn，如果要使用更多 PyTorch 中没有的随机化方式，可以使用 numpy

# 定义一个 Tensor 直接对其进行替换
net1[0].weight.data = torch.from_numpy(np.random.uniform(3, 5, size=(40, 30)))
print(net1[0].weight)

pytorch入门之四参数初始化
可以看到这个参数的值已经被改变了，也就是说已经被定义成了我们需要的初始化方式，如果模型中某一层需要我们手动去修改，那么我们可以直接用这种方式去访问，但是更多的时候是模型中相同类型的层都需要初始化成相同的方式，这个时候一种更高效的方式是使用循环去访问，比如

for layer in net1:
    if isinstance(layer, nn.Linear): # 判断是否是线性层
        param_shape = layer.weight.shape
        layer.weight.data = torch.from_numpy(np.random.normal(0, 0.5, size=param_shape)) 
        # 定义为均值为 0，方差为 0.5 的正态分布
"""

小练习：一种非常流行的初始化方式叫 Xavier，方法来源于 2010 年的一篇论文 Understanding the difficulty of training deep feedforward neural networks，其通过数学的推到，证明了这种初始化方式可以使得每一层的输出方差是尽可能相等的
，给出这种初始化的公式 $w\ \sim \ Uniform[- \frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}}, \frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}}]$
其中 $n_j$ 和 $n_{j+1}$ 表示该层的输入和输出数目，所以请尝试实现以下这种初始化方式
对于 Module 的参数初始化，其实也非常简单，如果想对其中的某层进行初始化，可以直接像 Sequential 一样对其 Tensor 进行重新定义，其唯一不同的地方在于，如果要用循环的方式访问，需要介绍两个属性，children 和 modules，下面我们举例来说明

class sim_net(nn.Module):
    def __init__(self):
        super(sim_net, self).__init__()
        self.l1 = nn.Sequential(
            nn.Linear(30, 40),
            nn.ReLU()
        )
        
        self.l1[0].weight.data = torch.randn(40, 30) # 直接对某一层初始化
        
        self.l2 = nn.Sequential(
            nn.Linear(40, 50),
            nn.ReLU()
        )
        
        self.l3 = nn.Sequential(
            nn.Linear(50, 10),
            nn.ReLU()
        )
    
    def forward(self, x):
        x = self.l1(x)
        x =self.l2(x)
        x = self.l3(x)
        return x
net2 = sim_net()

访问 children

for i in net2.children():
    print(i)
# 访问 modules
for i in net2.modules():
    print(i)

children 只会访问到模型定义中的第一层，因为上面的模型中定义了三个Sequential，所以只会访问到三个 Sequential，而 modules 会访问到最后的结构比如上面的例子，modules 不仅访问到了 Sequential，也访问到了 Sequential 里面，这就对我们做初始化非常方便，比如

for layer in net2.modules():
    if isinstance(layer, nn.Linear):
        param_shape = layer.weight.shape
        layer.weight.data = torch.from_numpy(np.random.normal(0, 0.5, size=param_shape))

from torch.nn import init
print(net1[0].weight)
init.xavier_uniform(net1[0].weight) # 这就是上面我们讲过的 Xavier 初始化方法，PyTorch 直接内置了其实现
print(net1[0].weight)

注：转载大神的GitHub ，https://github.com/L1aoXingyu/code-of-learn-deep-learning-with-pytorch

pytorch入门之四参数初始化

相关推荐