Hands-On Model Training with the PyTorch AI Framework

Training a model comes down to a few ingredients: the network structure, the data, a loss function, an optimizer, and so on.

The concrete steps are as follows:

Step 1: After the images are wrapped into tensors, propagate the input forward through the network to obtain the output.

Step 2: Feed the output, together with the target labels, into the loss function to compute the loss value (a scalar); the loss quantifies how far the predictions are from the targets, and it is what drives the weight update.

Step 3: Backpropagate the gradient of the loss to every parameter (the optimizer's job); the key quantities are the learning rate η and the gradient vector g.

Step 4: Update the weights with the following formula:

new weight w = old weight w − learning rate η × gradient vector g
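These four steps map one-to-one onto a few lines of PyTorch. As a minimal sketch, assuming a model, a criterion, an optimizer, and one batch of inputs and labels already exist (all placeholder names, not part of the script below):

optimizer.zero_grad()              # clear gradients left over from the previous step
outputs = model(inputs)            # step 1: forward pass
loss = criterion(outputs, labels)  # step 2: scalar loss value
loss.backward()                    # step 3: backpropagate gradients to every parameter
optimizer.step()                   # step 4: w <- w - lr * g (plus momentum, if used)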

Once the data is packaged, it can serve as input to the model, so the model has to be imported first. PyTorch ships with a number of common network architectures, such as VGG, ResNet, and DenseNet for classification, all available through the torchvision.models module. For example, torchvision.models.resnet18(pretrained=True) imports a ResNet-18 and specifies that its pretrained weights should be loaded as well. Since pretraining is generally done on the 1000-class ImageNet dataset, transferring the network to your own 2-class dataset requires replacing the final fully connected layer with one that produces the outputs you need. The three lines below do exactly that: import resnet18 from the models module, read the number of input features of its fully connected layer, and use that number together with the target class count (here, 2) to build a new fully connected layer that replaces the original. With that, the network structure is ready.

model = models.resnet18(pretrained=True)  # load an ImageNet-pretrained ResNet-18
num_ftrs = model.fc.in_features           # input feature count of the original fc layer
model.fc = nn.Linear(num_ftrs, 2)         # replace it with a new 2-class output layer
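To sanity-check the replacement, you can push a dummy batch through the network; this is just an illustrative sketch, with the 1 × 3 × 224 × 224 shape matching the crop size used later in the data transforms:

import torch

model.eval()                         # use running stats in BatchNorm for a single sample
dummy = torch.randn(1, 3, 224, 224)  # one fake RGB image at the expected input size
print(model(dummy).shape)            # torch.Size([1, 2]): one score per class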

Network structure and data alone are not enough to make the code run; you also need to define a loss function. In PyTorch, the torch.nn module provides all the network layers, such as convolutions, downsampling, and loss layers. Here we use the cross-entropy loss, defined like this:

criterion = nn.CrossEntropyLoss()
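Note that nn.CrossEntropyLoss combines LogSoftmax and NLLLoss internally, so it expects raw, unnormalized scores (logits) and integer class indices rather than one-hot vectors. A small sketch with made-up numbers:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # raw scores: 2 samples, 2 classes
targets = torch.tensor([0, 1])                    # integer class labels
print(criterion(logits, targets))                 # a single scalar tensor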

Next you need to define the optimization method. The most common choice is stochastic gradient descent (SGD), implemented in PyTorch's torch.optim module. Note that even though a momentum value is passed in here, this is still SGD with momentum, not Adam: the momentum term simply keeps an exponentially decaying average of past gradients to smooth the updates. The constructor takes the parameters to optimize, model.parameters(), the learning rate, and the momentum coefficient; many optimizers in torch.optim follow this same calling convention.

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
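To see what momentum actually does, here is a toy sketch (all values arbitrary) that tracks the velocity buffer by hand next to optim.SGD; PyTorch's default update is v ← momentum × v + g, followed by w ← w − lr × v:

import torch
from torch import optim

w = torch.tensor([1.0], requires_grad=True)
opt = optim.SGD([w], lr=0.1, momentum=0.9)

v = 0.0  # manually tracked momentum buffer
for step in range(3):
    opt.zero_grad()
    loss = (w ** 2).sum()           # d(loss)/dw = 2w
    loss.backward()
    v = 0.9 * v + w.grad.item()     # v <- momentum * v + g
    expected = w.item() - 0.1 * v   # w <- w - lr * v
    opt.step()
    print(w.item(), expected)       # the two values should match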

It is also common to define a learning-rate schedule. Here we use the StepLR class from the torch.optim.lr_scheduler module, which multiplies the learning rate by gamma every step_size epochs.
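For instance, with step_size=7 and gamma=0.1, the learning rate stays at 0.001 for the first seven epochs, then drops to 0.0001, and so on. A short sketch with a throwaway parameter shows the schedule:

import torch
from torch import nn, optim
from torch.optim import lr_scheduler

param = nn.Parameter(torch.zeros(1))  # throwaway parameter, just to build an optimizer
optimizer = optim.SGD([param], lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

for epoch in range(15):
    optimizer.step()                       # normally a full training epoch runs here
    scheduler.step()                       # decay the lr every step_size epochs
    print(epoch, scheduler.get_last_lr())  # lr that the next epoch will use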


The full implementation is as follows:

from __future__ import print_function, division

1. Import the required packages

import torch

import torch.nn as nn

import torch.optim as optim

from torch.optim import lr_scheduler

from torch.autograd import Variable

import torchvision

from torchvision import datasets, models, transforms

import time

import os

import copy

2. Train the model

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):

    since = time.time()

 

    best_model_wts = copy.deepcopy(model.state_dict())  # snapshot the weights, not a live reference

    best_acc = 0.0

 

    for epoch in range(num_epochs):

        print('Epoch {}/{}'.format(epoch, num_epochs - 1))

        print('-' * 10)

 

        # Each epoch has a training and validation phase

        for phase in ['train', 'val']:

            if phase == 'train':

                model.train(True)  # Set model to training mode

            else:

                model.train(False)  # Set model to evaluate mode

 

            running_loss = 0.0

            running_corrects = 0.0

 

            # Iterate over data.

            for data in dataloaders[phase]:

                # get the inputs

                inputs, labels = data

 

                # wrap them in Variable (legacy API; plain Tensors work in modern PyTorch)

                if use_gpu:

                    inputs = Variable(inputs.cuda())

                    labels = Variable(labels.cuda())

                else:

                    inputs, labels = Variable(inputs), Variable(labels)

 

                # zero the parameter gradients

                optimizer.zero_grad()

 

                # forward

                outputs = model(inputs)

                _, preds = torch.max(outputs.data, 1)

                loss = criterion(outputs, labels)

 

                # backward + optimize only if in training phase

                if phase == 'train':

                    loss.backward()

                    optimizer.step()

 

                # statistics

                running_loss += loss.item() * inputs.size(0)  # .item() replaces the deprecated loss.data[0]; weight by batch size

                running_corrects += torch.sum(preds == labels.data).item()

 

            epoch_loss = running_loss / dataset_sizes[phase]

            epoch_acc = running_corrects / dataset_sizes[phase]

 

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(

                phase, epoch_loss, epoch_acc))

 

            # deep copy the model

            if phase == 'val' and epoch_acc > best_acc:

                best_acc = epoch_acc

                best_model_wts = copy.deepcopy(model.state_dict())

        # step the LR scheduler once per epoch, after the optimizer has updated

        scheduler.step()

 

    time_elapsed = time.time() - since

    print('Training complete in {:.0f}m {:.0f}s'.format(

        time_elapsed // 60, time_elapsed % 60))

    print('Best val Acc: {:.4f}'.format(best_acc))

 

    # load best model weights

    model.load_state_dict(best_model_wts)

    return model

 

if __name__ == '__main__':

 

    # data_transform, pay attention that the input of Normalize() is Tensor and the input of RandomResizedCrop() or RandomHorizontalFlip() is PIL Image

    data_transforms = {

        'train': transforms.Compose([

            transforms.RandomResizedCrop(224),

            transforms.RandomHorizontalFlip(),

            transforms.ToTensor(),

            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

        ]),

        'val': transforms.Compose([

            transforms.Resize(256),

            transforms.CenterCrop(224),

            transforms.ToTensor(),

            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

        ]),

    }

Build the image_datasets dictionary (an ImageFolder dataset for each of 'train' and 'val'):

    # your image data file

    data_dir = '/data'

    image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),

                                              data_transforms[x]) for x in ['train', 'val']}

    # wrap your data and label into Tensor


    dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x],

                                                 batch_size=4,

                                                 shuffle=True,

                                                 num_workers=4) for x in ['train', 'val']}

 

    dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}

 

    # use gpu or not

    use_gpu = torch.cuda.is_available()

 

    # get model and replace the original fc layer with your fc layer

    model_ft = models.resnet18(pretrained=True)

    num_ftrs = model_ft.fc.in_features

    model_ft.fc = nn.Linear(num_ftrs, 2)

 

    if use_gpu:

        model_ft = model_ft.cuda()

 

    # define loss function

    criterion = nn.CrossEntropyLoss()

 

    # Observe that all parameters are being optimized

    optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

 

    # Decay LR by a factor of 0.1 every 7 epochs

    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

 

    model_ft = train_model(model=model_ft,

                           criterion=criterion,

                           optimizer=optimizer_ft,

                           scheduler=exp_lr_scheduler,

                           num_epochs=25)
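Once train_model returns, model_ft holds the best validation weights. A typical follow-up, if you want to reuse them later for inference (the file name here is arbitrary), is to append one more line to the main block:

    torch.save(model_ft.state_dict(), 'resnet18_best.pth')  # serialize the best weights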