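PyTorch Official Tutorial, Translated

2021/6/3 15:30:09

This post translates the official PyTorch tutorial. Where my translation might be inaccurate I quote the original wording, and corrections are welcome.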

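1. Learn the Basics

Most machine learning workflows involve working with data, creating models, optimizing model parameters, and saving the trained models. This tutorial introduces you to a complete ML workflow implemented in PyTorch, with links to learn more about each of these concepts.

We will use the FashionMNIST dataset to train a neural network that predicts which of the following classes an input image belongs to: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, or Ankle boot.

This tutorial assumes a basic familiarity with Python and deep learning concepts.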

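1.1 Running the tutorial code

You can run this tutorial in either of two ways:

  • In the cloud: this is the easiest way to get started! Each section has a "Run in Microsoft Learn" link at the top, which opens an integrated notebook in Microsoft Learn with the code in a fully hosted environment.
  • Locally: this option requires you to set up PyTorch and TorchVision on your local machine first. Download the notebook or copy the code into your favorite IDE.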

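1.2 How to use this guide

If you are already familiar with other deep learning frameworks, check out the Quickstart first to get to know PyTorch's API.
If you are new to deep learning frameworks, head straight to the first section of our step-by-step guide: Tensors.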

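2. Quickstart

This section runs through the API for common tasks in machine learning. Refer to the links in each section to dive deeper.

2.1 Working with data

PyTorch has two primitives for working with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset.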

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt

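PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. In this tutorial, we will use a TorchVision dataset.
The torchvision.datasets module contains Dataset objects for many datasets such as CIFAR and COCO (the original tutorial links to the full list). In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset takes two arguments, transform and target_transform, which modify the samples and the labels respectively.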

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,  # True means the dataset is downloaded
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Output:

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Processing...
Done!
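
The transform and target_transform arguments mentioned above can be illustrated without downloading anything. The following is a minimal sketch, not TorchVision's implementation: ToyDataset, scale_to_unit, and one_hot are hypothetical names introduced here, and the pixel data is random. It only shows that one callable is applied to each sample and another to each label.

```python
import torch
from torch.utils.data import Dataset


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset that mimics transform/target_transform."""

    def __init__(self, samples, labels, transform=None, target_transform=None):
        self.samples = samples
        self.labels = labels
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        sample, label = self.samples[idx], self.labels[idx]
        if self.transform is not None:
            sample = self.transform(sample)        # touches the sample only
        if self.target_transform is not None:
            label = self.target_transform(label)   # touches the label only
        return sample, label


def scale_to_unit(img):
    # Scale uint8 pixel values in [0, 255] to floats in [0, 1].
    return img.to(torch.float32) / 255.0


def one_hot(label, num_classes=10):
    # Turn an integer class index into a one-hot float vector.
    vec = torch.zeros(num_classes)
    vec[label] = 1.0
    return vec


# Random stand-in images (4 images of 28x28 pixels) and made-up labels.
samples = torch.randint(0, 256, (4, 28, 28), dtype=torch.uint8)
labels = [9, 2, 1, 1]

ds = ToyDataset(samples, labels, transform=scale_to_unit, target_transform=one_hot)
x0, y0 = ds[0]
print(x0.dtype, y0)
```

In the tutorial code, ToTensor() plays the role of the sample transform; no target_transform is passed, so the labels stay as integer class indices.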

We pass the Dataset as an argument to DataLoader. This wraps an iterable over our dataset and supports automatic batching, sampling, shuffling, and multiprocess data loading. Here we define a batch size of 64, i.e. each element in the dataloader iterable will return a batch of 64 features and labels.

batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break

Output:

Shape of X [N, C, H, W]:  torch.Size([64, 1, 28, 28])
Shape of y:  torch.Size([64]) torch.int64
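
To see batching and shuffling in isolation, here is a small sketch that assumes random stand-in tensors instead of FashionMNIST. The dataset size of 1000 and the names toy_data and loader are arbitrary choices for illustration. It shows that the number of batches is the dataset size divided by the batch size, rounded up, and that the final batch can be smaller than the rest.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in data with the same per-sample shape as FashionMNIST (1x28x28).
num_samples = 1000  # arbitrary size chosen for illustration
features = torch.rand(num_samples, 1, 28, 28)
labels = torch.randint(0, 10, (num_samples,))
toy_data = TensorDataset(features, labels)

# shuffle=True draws the samples in a new random order on every pass.
loader = DataLoader(toy_data, batch_size=64, shuffle=True)

num_batches = len(loader)                    # ceil(1000 / 64)
batch_sizes = [len(y) for _, y in loader]    # size of every batch in one pass
X, y = next(iter(loader))
print(num_batches, batch_sizes[-1], X.shape, y.dtype)
```

The tutorial's loaders leave shuffle at its default of False, so batches arrive in dataset order; passing shuffle=True for the training loader is a common alternative.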

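2.2 Creating models

To define a neural network in PyTorch, we create a class that inherits from nn.Module. We define the layers of the network in the __init__ function and specify how data passes through the network in the forward function. To accelerate operations in the neural network, we move it to the GPU if one is available.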

# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

Output:

Using cuda device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
    (5): ReLU()
  )
)
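
As a rough illustration of how shapes change inside forward, the sketch below pushes a fake batch through nn.Flatten and a single nn.Linear layer with the same sizes as the first layer printed above. The batch of 3 random images is an assumption made for the example.

```python
import torch
from torch import nn

batch = torch.rand(3, 1, 28, 28)      # 3 fake grayscale images, [N, C, H, W]

flat = nn.Flatten()(batch)            # keeps dim 0, flattens the rest to 784
layer = nn.Linear(28 * 28, 512)
hidden = nn.ReLU()(layer(flat))       # ReLU clamps negative values to zero

# A Linear layer holds a weight matrix plus one bias per output feature.
num_params = sum(p.numel() for p in layer.parameters())
print(flat.shape, hidden.shape, num_params)
```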

2.3 Optimizing the model parameters

Training a model requires a loss function and an optimizer.

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
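
For intuition about what nn.CrossEntropyLoss computes, here is a small sketch with made-up logits and targets. It compares the built-in loss with a manual computation (the mean negative log-softmax value at the target class) and evaluates the loss for a model that scores all 10 classes equally.

```python
import math

import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()

# Made-up raw scores for 2 samples and 3 classes, with their true classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

builtin = loss_fn(logits, targets)

# Manual version: log-softmax, pick the target entry per row, negate, average.
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), targets].mean()

# Equal scores for 10 classes give a loss of ln(10), whatever the targets are.
uniform = loss_fn(torch.zeros(4, 10), torch.tensor([0, 3, 5, 9]))
print(builtin.item(), manual.item(), uniform.item(), math.log(10))
```

Since ln(10) is about 2.3026, a loss near 2.3 at the start of training on a 10-class problem is what an untrained model would be expected to produce.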

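In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.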

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
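
The three backpropagation calls in the loop can be checked on a toy problem. This sketch assumes a tiny nn.Linear model and random data, with names such as toy_model invented for the example. It verifies that, for plain SGD without momentum or weight decay, one step moves each weight by the learning rate times its gradient.

```python
import torch
from torch import nn

torch.manual_seed(0)

toy_model = nn.Linear(4, 2)              # hypothetical stand-in model
lr = 1e-3
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=lr)
toy_loss_fn = nn.CrossEntropyLoss()

X = torch.rand(8, 4)                     # 8 random samples with 4 features
y = torch.randint(0, 2, (8,))            # random labels in {0, 1}

before = toy_model.weight.detach().clone()

toy_optimizer.zero_grad()                # clear gradients from earlier steps
loss = toy_loss_fn(toy_model(X), y)
loss.backward()                          # fill .grad for every parameter
grad = toy_model.weight.grad.detach().clone()
toy_optimizer.step()                     # apply the update to the parameters

after = toy_model.weight.detach().clone()
expected = before - lr * grad            # the plain SGD update rule
print(torch.allclose(after, expected))
```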

We also check the model's performance against the test dataset to ensure it is learning.

def test(dataloader, model):
    size = len(dataloader.dataset)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= size
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
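
The accuracy line in test() can be traced on a hand-made batch. The sketch below uses invented scores for 4 samples and 3 classes. It also includes a rough rescaling of a reported "Avg loss" value: test() adds up per-batch mean losses but divides by the number of samples, so multiplying by samples per batch count gives a figure closer to a per-sample loss. The example value 0.035572 is taken from a run of this tutorial, and the test set size of 10000 is the known size of the FashionMNIST test split.

```python
import math

import torch

# Invented model scores for 4 samples and 3 classes, with the true labels.
pred = torch.tensor([[0.1, 2.0, 0.3],
                     [1.5, 0.2, 0.1],
                     [0.0, 0.1, 0.9],
                     [0.3, 0.8, 0.2]])
y = torch.tensor([1, 0, 1, 1])

predicted_classes = pred.argmax(1)       # index of the highest score per row
correct = (predicted_classes == y).type(torch.float).sum().item()
accuracy = correct / len(y)
print(predicted_classes.tolist(), correct, accuracy)

# Rescale a reported value: sum of batch means / samples -> mean of batch means.
reported = 0.035572
test_size, batch_size = 10000, 64
num_batches = math.ceil(test_size / batch_size)
mean_of_batch_means = reported * test_size / num_batches
print(num_batches, round(mean_of_batch_means, 4))
```

Dividing test_loss by the number of batches instead of the number of samples would report this rescaled figure directly; that is a possible variation, not what the code above does.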

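The training process is conducted over several iterations (epochs). During each epoch, the model learns parameters to make better predictions. We print the model's accuracy and loss at each epoch; we would like to see the accuracy increase and the loss decrease with every epoch.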

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model)
print("Done!")

Output:

Epoch 1
-------------------------------
loss: 2.311972  [    0/60000]
loss: 2.306603  [ 6400/60000]
loss: 2.300294  [12800/60000]
loss: 2.290011  [19200/60000]
loss: 2.288779  [25600/60000]
loss: 2.297002  [32000/60000]
loss: 2.282910  [38400/60000]
loss: 2.284932  [44800/60000]
loss: 2.279393  [51200/60000]
loss: 2.281883  [57600/60000]
Test Error:
 Accuracy: 25.9%, Avg loss: 0.035572

Epoch 2
-------------------------------
loss: 2.277968  [    0/60000]
loss: 2.266700  [ 6400/60000]
loss: 2.247214  [12800/60000]
loss: 2.234270  [19200/60000]
loss: 2.246031  [25600/60000]
loss: 2.267033  [32000/60000]
loss: 2.243770  [38400/60000]
loss: 2.245201  [44800/60000]
loss: 2.230118  [51200/60000]
loss: 2.250285  [57600/60000]
Test Error:
 Accuracy: 25.9%, Avg loss: 0.034725

Epoch 3
-------------------------------
loss: 2.235750  [    0/60000]
loss: 2.215401  [ 6400/60000]
loss: 2.173603  [12800/60000]
loss: 2.154524  [19200/60000]
loss: 2.192134  [25600/60000]
loss: 2.224188  [32000/60000]
loss: 2.186744  [38400/60000]
loss: 2.188137  [44800/60000]
loss: 2.159041  [51200/60000]
loss: 2.202392  [57600/60000]
Test Error:
 Accuracy: 25.8%, Avg loss: 0.033527

Epoch 4
-------------------------------
loss: 2.176824  [    0/60000]
loss: 2.146703  [ 6400/60000]
loss: 2.073140  [12800/60000]
loss: 2.048597  [19200/60000]
loss: 2.128556  [25600/60000]
loss: 2.165518  [32000/60000]
loss: 2.116198  [38400/60000]
loss: 2.121913  [44800/60000]
loss: 2.079846  [51200/60000]
loss: 2.146714  [57600/60000]
Test Error:
 Accuracy: 27.7%, Avg loss: 0.032243

Epoch 5
-------------------------------
loss: 2.110255  [    0/60000]
loss: 2.075479  [ 6400/60000]
loss: 1.971338  [12800/60000]
loss: 1.951628  [19200/60000]
loss: 2.072443  [25600/60000]
loss: 2.103646  [32000/60000]
loss: 2.053480  [38400/60000]
loss: 2.064291  [44800/60000]
loss: 2.009853  [51200/60000]
loss: 2.099065  [57600/60000]
Test Error:
 Accuracy: 31.8%, Avg loss: 0.031166

Done!
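
The progress counters in the log above follow from the dataset size and the batch size. This short sketch reproduces them with plain arithmetic, assuming the 60000 training samples shown in the log, a batch size of 64, and a progress line every 100 batches.

```python
import math

dataset_size = 60000
batch_size = 64
log_every = 100

# Number of batches per epoch; the last one holds the leftover samples.
num_batches = math.ceil(dataset_size / batch_size)
last_batch_size = dataset_size - (num_batches - 1) * batch_size

# The "current" counter printed by train() is batch index times batch size.
logged = [b * batch_size for b in range(num_batches) if b % log_every == 0]
print(num_batches, last_batch_size, logged)
```

This gives ten progress lines per epoch, ending at 57600, which matches the log.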

2.4 Saving models

A common way to save a model is to serialize the internal state dictionary (containing the model parameters).

torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Output:

Saved PyTorch Model State to model.pth

2.5 Loading models

The process for loading a model includes re-creating the model structure and loading the state dictionary into it.

model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))
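
The save and load steps can be exercised together without touching the disk. This is a minimal sketch under a few assumptions: make_model is a hypothetical helper building a small network, an in-memory io.BytesIO buffer stands in for the model.pth file, and map_location="cpu" is passed so that a checkpoint saved from a GPU would still load on a CPU-only machine.

```python
import io

import torch
from torch import nn


def make_model():
    # Hypothetical small network; any nn.Module is saved the same way.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


original = make_model()

# Serialize only the parameters (the state dict), not the Python class.
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)
buffer.seek(0)

# Re-create the structure first, then load the saved parameters into it.
restored = make_model()
state = torch.load(buffer, map_location="cpu")
restored.load_state_dict(state)
restored.eval()

x = torch.rand(5, 4)
with torch.no_grad():
    same = torch.equal(original(x), restored(x))
print(list(state.keys()), same)
```

Because only parameters are stored, the code that defines the model class has to be available wherever the checkpoint is loaded.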

This model can now be used to make predictions.

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Output:

Predicted: "Bag", Actual: "Ankle boot"
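
The prediction step picks the class with the highest score. If a rough confidence value is also wanted, the scores can be passed through softmax, which keeps the ranking and rescales the values to sum to 1. The sketch below uses invented scores and a shortened class list for illustration.

```python
import torch

classes = ["T-shirt/top", "Trouser", "Pullover"]   # shortened list for illustration
logits = torch.tensor([[1.0, 3.0, 0.5]])           # invented scores for one sample

probs = torch.softmax(logits, dim=1)               # each row sums to 1
idx = probs[0].argmax(0).item()                    # same index as logits[0].argmax(0)
print(classes[idx], probs[0, idx].item())
```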