[PyTorch] Implementing a Simple CNN (Convolutional Neural Network)
Contents
1. Convolution layer: nn.Conv2d()
2. Pooling layers: nn.MaxPool1d() and nn.MaxPool2d()
3. Activation layer: nn.ReLU()
4. A simple CNN implementation
1. Convolution layer: nn.Conv2d()
Parameters
In PyTorch's nn module, two-dimensional convolution is implemented by the nn.Conv2d class. Its parameters are as follows:
- in_channels: number of channels in the input tensor
- out_channels: number of channels in the output tensor
- kernel_size: size of the convolution kernel. For a square kernel such as 3x3, a single integer kernel_size = 3 is enough; for a rectangular kernel such as 3x5, write kernel_size = (3, 5)
- stride = 1: stride of the convolution
- padding: amount of padding (rows/columns) added on each side of the input, default 0. With padding = 1, a 32x32 input becomes 34x34 after padding
- dilation = 1: dilation factor for dilated (atrous) convolution; the default of 1 means an ordinary convolution
- groups = 1: number of groups for grouped convolution; the default of 1 means no grouping
- bias = True: whether to include a learnable bias term, default True
- padding_mode = 'zeros': padding mode, zero-padding by default
The first three parameters must be supplied explicitly; all the others have default values.
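The output spatial size of nn.Conv2d follows the standard formula out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. A minimal sketch checking it (the layer settings and input shape here are illustrative choices, not from the post):

```python
import torch
import torch.nn as nn

# Conv2d output size: floor((in + 2*p - d*(k-1) - 1) / s) + 1
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)
x = torch.randn(1, 3, 32, 32)   # (batch, channels, height, width)
y = conv(x)
# (32 + 2*1 - 1*(4-1) - 1) // 2 + 1 = 16
print(y.shape)                  # torch.Size([1, 64, 16, 16])
```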
Code example
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv2d = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        print(x.requires_grad)  # check whether gradients are being tracked
        x = self.conv2d(x)
        return x

net = Net()
# Inspect the convolution layer's weight and bias
print(net.conv2d.weight)
print(net.conv2d.bias)
2. Pooling layers: nn.MaxPool1d() and nn.MaxPool2d()
Parameters
class torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
- kernel_size: (int or tuple) size of the max-pooling window
- stride: (int or tuple, optional) stride of the window. Default: kernel_size
- padding: (int or tuple, optional) number of zero-padding layers added on each side of the input
- dilation: (int or tuple, optional) spacing between elements within the window
- return_indices: if True, also return the indices of the max values, which is useful for later upsampling
- ceil_mode: if True, use ceiling instead of the default floor when computing the output size
Code example
import torch
import torch.nn as nn

# torch.autograd.Variable is deprecated; plain tensors work directly
input = torch.randn(2, 5, 5)
m1 = nn.MaxPool1d(3, stride=2)   # pools over the last dimension only
m2 = nn.MaxPool2d(3, stride=2)   # pools over the last two dimensions
output1 = m1(input)              # shape (2, 5, 2)
output2 = m2(input)              # shape (2, 2, 2)
print(input)
print(output1)
print(output2)
# Output; compare the 1-D and 2-D pooling results
tensor([[[ 2.4444e-01, 1.0226e+00, 2.4089e-01, 4.3374e-01, 8.6254e-01],
[-1.1597e-01, -5.1438e-01, 4.9354e-01, 1.3846e+00, -1.4846e+00],
[-5.2985e-01, -9.7652e-01, -1.1763e+00, -1.0564e+00, 1.8538e+00],
[-1.5157e+00, 2.4466e-03, -1.3180e+00, -6.4395e-01, 1.6216e-01],
[ 5.0826e-01, -4.2336e-01, -1.1817e+00, -3.9826e-01, 1.1857e-01]],
[[-7.9605e-01, 2.2759e-01, 2.1400e+00, -2.2706e-01, 9.8575e-01],
[-3.0485e+00, -6.6409e-01, 2.9864e-01, 1.3190e+00, -1.5249e+00],
[ 3.1127e-01, 4.2901e-01, 1.0026e+00, 6.4803e-01, 9.4203e-01],
[-5.6758e-01, 3.2101e-01, -4.5395e-01, 1.8376e+00, -8.6135e-01],
[ 7.8916e-01, -1.3624e+00, -1.3352e+00, -2.5927e+00, -3.1461e-01]]])
tensor([[[ 1.0226, 0.8625],
[ 0.4935, 1.3846],
[-0.5298, 1.8538],
[ 0.0024, 0.1622],
[ 0.5083, 0.1186]],
[[ 2.1400, 2.1400],
[ 0.2986, 1.3190],
[ 1.0026, 1.0026],
[ 0.3210, 1.8376],
[ 0.7892, -0.3146]]])
tensor([[[1.0226, 1.8538],
[0.5083, 1.8538]],
[[2.1400, 2.1400],
[1.0026, 1.8376]]])
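As mentioned above, return_indices is useful for upsampling: the indices it returns can be fed to nn.MaxUnpool2d to place the pooled maxima back into their original positions. A small sketch (shapes chosen for illustration):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
y, indices = pool(x)    # y: (1, 1, 2, 2); indices locate each max within x
z = unpool(y, indices)  # (1, 1, 4, 4): maxima restored in place, zeros elsewhere
print(y)
print(z)
```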
3. Activation layer: nn.ReLU()
Parameters
nn.ReLU(inplace=True)
nn.ReLU() implements the ReLU activation, which introduces non-linearity. It takes an inplace parameter: if set to True, the output overwrites the input tensor directly, saving memory. This is safe for ReLU because its backward pass can be computed from the output alone. In practice, however, in-place operations are generally avoided.
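A quick sketch of what inplace=True actually does (the tensor values here are illustrative):

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.0, 0.5, -2.0, 3.0])
y = nn.ReLU(inplace=True)(x)
print(x)       # tensor([0.0000, 0.5000, 0.0000, 3.0000]): x itself was modified
print(y is x)  # True: no new tensor was allocated
```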
4. A simple CNN implementation
import os
import numpy as np
import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt

# Number of training epochs
EPOCH = 2
BATCH_SIZE = 50

# Download the MNIST training set
train_data = torchvision.datasets.MNIST(root='./mnist/', train=True,
                                        transform=torchvision.transforms.ToTensor(),
                                        download=True)
# (60000, 28, 28)
print(train_data.data.size())
# (60000)
print(train_data.targets.size())
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

# Test set: (2000, 1, 28, 28), normalized to [0, 1]
test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000] / 255.
test_y = test_data.targets[:2000]

# Build the network
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # First convolution block: (1, 28, 28) -> (32, 28, 28) -> (32, 14, 14)
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=32,
                kernel_size=5,
                stride=1,
                padding=2,
                dilation=1
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        # Second convolution block: (32, 14, 14) -> (64, 14, 14) -> (64, 7, 7)
        self.conv2 = nn.Sequential(
            nn.Conv2d(
                in_channels=32,
                out_channels=64,
                kernel_size=3,
                stride=1,
                padding=1,
                dilation=1
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        # Fully connected + dropout + fully connected
        self.ful1 = nn.Linear(64 * 7 * 7, 512)
        self.drop = nn.Dropout(0.5)
        # Note: nn.CrossEntropyLoss already applies log-softmax internally,
        # so this extra Softmax is redundant; normally the model would
        # output raw logits here.
        self.ful2 = nn.Sequential(nn.Linear(512, 10), nn.Softmax(dim=1))

    # Forward pass
    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 64*7*7)
        x = self.ful1(x)
        x = self.drop(x)
        output = self.ful2(x)
        return output

cnn = CNN()
# Optimizer
optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-3)
# Loss function
loss_func = nn.CrossEntropyLoss()

for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):
        # Compute the loss and update the weights
        output = cnn(b_x)
        loss = loss_func(output, b_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Evaluate on the test set every 50 steps
        # (cnn.eval() is not called here, so dropout stays active during evaluation)
        if step % 50 == 0:
            test_output = cnn(test_x)
            pred_y = torch.max(test_output, 1)[1].data.numpy()
            accuracy = float((pred_y == test_y.data.numpy()).astype(int).sum()) / float(test_y.size(0))
            print('Epoch: %2d' % epoch, ', loss: %.4f' % loss.data.numpy(), ', accuracy: %.4f' % accuracy)
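The 64 * 7 * 7 input size of the first linear layer can be verified by tracing a dummy input through the two convolution blocks; this standalone sketch repeats the same layer settings:

```python
import torch
import torch.nn as nn

conv1 = nn.Sequential(nn.Conv2d(1, 32, 5, stride=1, padding=2), nn.ReLU(), nn.MaxPool2d(2))
conv2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(2))

x = torch.randn(1, 1, 28, 28)       # one dummy MNIST-sized image
x = conv1(x)                        # (1, 32, 14, 14)
x = conv2(x)                        # (1, 64, 7, 7)
print(x.view(x.size(0), -1).shape)  # torch.Size([1, 3136]) == 64 * 7 * 7
```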
# Output
torch.Size([60000, 28, 28])
torch.Size([60000])
Epoch: 0 , loss: 2.3028 , accuracy: 0.0885
Epoch: 0 , loss: 1.7379 , accuracy: 0.7115
Epoch: 0 , loss: 1.6186 , accuracy: 0.8775
Epoch: 0 , loss: 1.5239 , accuracy: 0.9120
Epoch: 0 , loss: 1.6127 , accuracy: 0.9170
Epoch: 0 , loss: 1.5039 , accuracy: 0.9250
Epoch: 0 , loss: 1.4878 , accuracy: 0.9415
Epoch: 0 , loss: 1.5210 , accuracy: 0.9430
Epoch: 0 , loss: 1.4822 , accuracy: 0.9425
Epoch: 0 , loss: 1.5443 , accuracy: 0.9505
Epoch: 0 , loss: 1.4634 , accuracy: 0.9510
Epoch: 0 , loss: 1.5371 , accuracy: 0.9310
Epoch: 0 , loss: 1.4888 , accuracy: 0.9585
Epoch: 0 , loss: 1.4767 , accuracy: 0.9575
Epoch: 0 , loss: 1.5294 , accuracy: 0.9610
Epoch: 0 , loss: 1.4813 , accuracy: 0.9650
Epoch: 0 , loss: 1.4972 , accuracy: 0.9635
Epoch: 0 , loss: 1.5218 , accuracy: 0.9585
Epoch: 0 , loss: 1.4837 , accuracy: 0.9605
Epoch: 0 , loss: 1.4762 , accuracy: 0.9595
Epoch: 0 , loss: 1.5419 , accuracy: 0.9565
Epoch: 0 , loss: 1.4810 , accuracy: 0.9590
Epoch: 0 , loss: 1.4621 , accuracy: 0.9575
Epoch: 0 , loss: 1.5410 , accuracy: 0.9595
Epoch: 1 , loss: 1.4650 , accuracy: 0.9670
Epoch: 1 , loss: 1.4890 , accuracy: 0.9610
Epoch: 1 , loss: 1.4875 , accuracy: 0.9630
Epoch: 1 , loss: 1.4800 , accuracy: 0.9680
Epoch: 1 , loss: 1.5326 , accuracy: 0.9655
Epoch: 1 , loss: 1.4763 , accuracy: 0.9670
Epoch: 1 , loss: 1.5177 , accuracy: 0.9685
Epoch: 1 , loss: 1.4612 , accuracy: 0.9520
Epoch: 1 , loss: 1.4632 , accuracy: 0.9605
Epoch: 1 , loss: 1.5207 , accuracy: 0.9615
Epoch: 1 , loss: 1.5021 , accuracy: 0.9645
Epoch: 1 , loss: 1.5303 , accuracy: 0.9645
Epoch: 1 , loss: 1.4821 , accuracy: 0.9565
Epoch: 1 , loss: 1.4812 , accuracy: 0.9660
Epoch: 1 , loss: 1.4762 , accuracy: 0.9685
Epoch: 1 , loss: 1.4812 , accuracy: 0.9690
Epoch: 1 , loss: 1.4614 , accuracy: 0.9490
Epoch: 1 , loss: 1.4740 , accuracy: 0.9580
Epoch: 1 , loss: 1.4625 , accuracy: 0.9695
Epoch: 1 , loss: 1.5190 , accuracy: 0.9685
Epoch: 1 , loss: 1.5242 , accuracy: 0.9645
Epoch: 1 , loss: 1.4612 , accuracy: 0.9755
Epoch: 1 , loss: 1.4812 , accuracy: 0.9580
Epoch: 1 , loss: 1.4812 , accuracy: 0.9625
The complete example above is adapted from code I found online when I was first learning; I no longer remember the original source.
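One caveat worth flagging in the code above: nn.CrossEntropyLoss already applies log-softmax internally, so feeding it softmax outputs (as the model's last layer does) caps how low the loss can go. With 10 classes, even a perfectly confident prediction cannot get below -log(e/(e+9)) ≈ 1.46, which matches the plateau in the training log. A small sketch of the effect:

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
target = torch.tensor([0])

# Raw logits strongly favoring class 0: loss approaches 0
logits = torch.tensor([[10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
print(loss_fn(logits, target))   # small, near 0

# Softmax applied first (as in the model above): even a nearly one-hot
# probability vector cannot push the loss below about -log(e/(e+9)) ~= 1.46
probs = torch.softmax(logits, dim=1)
print(loss_fn(probs, target))    # ~1.46
```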