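A Simple Visualization of Convolutional Neural Networks
This tutorial walks through a simple visualization of the weights of a convolutional neural network.
In the first half, we define an extremely simple CNN model and hand-specify a few filter weights to serve as its convolution kernels.
In the second half, we use the FashionMNIST dataset, define a CNN with two convolutional layers, train it to above 85% accuracy, and then visualize its convolution kernels.
1. Visualizing a simple convolutional network
1.1 Visualizing a convolutional layer with hand-specified filters
In this exercise we manually define several filters resembling the Sobel operator and assign them to an extremely simple convolutional neural network. We then visualize the outputs (the feature maps) of the convolutional layer's 4 filters.
Load the target image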
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
img_path = 'images/udacity_sdc.png'
bgr_img = cv2.imread(img_path)
gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)
gray_img = gray_img.astype("float32")/255
plt.imshow(gray_img, cmap='gray')
plt.show()
Define the filters by hand
import numpy as np
filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])
# Derive more filters from the base one by negating and transposing it
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T
filter_4 = -filter_3
filters = np.array([filter_1, filter_2, filter_3, filter_4])
fig = plt.figure(figsize=(10, 5))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    rows, cols = filters[i].shape
    # Print each weight on top of its cell, in a color that contrasts with the cell
    for r in range(rows):
        for c in range(cols):
            ax.annotate(str(filters[i][r][c]), xy=(c, r),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if filters[i][r][c] < 0 else 'black')
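To make the role of these filters concrete, here is a minimal sketch, assuming a synthetic 4x4 patch (the `patch` array and the `response` helper are made up for illustration and do not come from the tutorial's image). It computes what a convolutional layer would output at a single position: the sum of the element-wise product of the patch and the filter.

```python
import numpy as np

# Same base filter as above: negative on the left, positive on the right
filter_vals = np.array([[-1, -1, 1, 1]] * 4)
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T

# Hypothetical image patch with a vertical edge: dark left half, bright right half
patch = np.array([[0, 0, 1, 1]] * 4)

def response(f, p):
    """Sum of the element-wise product, i.e. one output pixel of the layer."""
    return int(np.sum(f * p))

r1 = response(filter_1, patch)  # positive: the dark-to-bright edge matches the filter
r2 = response(filter_2, patch)  # negative: a later ReLU would set this to zero
r3 = response(filter_3, patch)  # zero: the transposed filter looks for horizontal edges
print(r1, r2, r3)
```

A filter and its negation therefore pick out edges of opposite polarity, which is why the four filters above produce visibly different feature maps.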
Define a simple convolutional neural network
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, weight):
        super(Net, self).__init__()
        # Kernel height and width are read from the supplied weight tensor
        k_height, k_width = weight.shape[2:]
        # 1 input channel (grayscale), 4 output channels (one per filter)
        self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)
        self.conv.weight = torch.nn.Parameter(weight)
        self.pool = nn.MaxPool2d(4, 4)

    def forward(self, x):
        # Return the output of every stage so that each one can be visualized
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        pooled_x = self.pool(activated_x)
        return conv_x, activated_x, pooled_x

# filters has shape (4, 4, 4)
# weight is expanded to (4, 1, 4, 4); the added dimension of size 1 is the single input channel
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)
print('Filters shape: ', filters.shape)
print('weights shape: ', weight.shape)
print(model)
Filters shape: (4, 4, 4)
weights shape: torch.Size([4, 1, 4, 4])
Net(
(conv): Conv2d(1, 4, kernel_size=(4, 4), stride=(1, 1), bias=False)
(pool): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
)
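The shape bookkeeping can be reproduced with NumPy alone. The following is a small sketch in which `np.expand_dims` stands in for `unsqueeze`; the arrays are zero-filled placeholders, and the 213x320 image size is taken from the shape printed later in this section.

```python
import numpy as np

# Stand-ins with the shapes used in this section (contents are placeholders)
filters = np.zeros((4, 4, 4), dtype=np.float32)    # 4 filters of 4x4
gray_img = np.zeros((213, 320), dtype=np.float32)  # height x width

# Insert the input-channel axis at position 1, like unsqueeze(1)
weight = np.expand_dims(filters, axis=1)
# Add a batch axis and then a channel axis, like unsqueeze(0).unsqueeze(1)
img_batch = np.expand_dims(np.expand_dims(gray_img, axis=0), axis=1)

print(weight.shape)     # (out_channels, in_channels, kernel_h, kernel_w)
print(img_batch.shape)  # (batch, channels, height, width)
```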
Visualize the convolution output
We define a function, viz_layer, that visualizes the output of a given layer.
def viz_layer(layer, n_filters=4):
    fig = plt.figure(figsize=(20, 20))
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i+1, xticks=[], yticks=[])
        # layer has shape (batch, channels, height, width); take sample 0, channel i
        ax.imshow(np.squeeze(layer[0, i].detach().numpy()), cmap='gray')
        ax.set_title('Output %s' % str(i+1))
# Show the original image
plt.imshow(gray_img, cmap='gray')
# Show the filters (convolution kernels) side by side
fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
# Add a batch dimension and a channel dimension to the grayscale image and convert it to a tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)
print(gray_img.shape)
print(gray_img_tensor.shape)
# Pass the input image through the model to obtain the outputs
conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)
# Visualize the convolution output
viz_layer(conv_layer)
(213, 320)
torch.Size([1, 1, 213, 320])
# Visualize the output after the convolution and the activation function
viz_layer(activated_layer)
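1.2 Visualizing the pooling layer with hand-specified filters
Next we visualize the output of the pooling layer.
# Visualize the output of the pooling layer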
viz_layer(pooled_layer)
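The sizes of the three outputs follow from the usual output-size formula for a sliding window. Here is a minimal sketch, assuming the 213x320 image shape printed above; `out_size` is a helper introduced only for this illustration.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one axis for a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

height, width = 213, 320  # image shape printed earlier

# 4x4 convolution, stride 1, no padding
conv_h, conv_w = out_size(height, 4), out_size(width, 4)
# 4x4 max pooling with stride 4 (the ReLU in between leaves the shape unchanged)
pool_h, pool_w = out_size(conv_h, 4, stride=4), out_size(conv_w, 4, stride=4)

print((conv_h, conv_w), (pool_h, pool_w))
```

The pooled feature maps are roughly a quarter of the size along each axis, which is why they look coarser than the convolution and activation outputs.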
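2. Visualizing a multi-layer convolutional network
In this exercise we define a somewhat more complex neural network, train it on the FashionMNIST dataset to above 85% accuracy, and then analyze it visually.
2.1 Loading the FashionMNIST dataset
FashionMNIST can be seen as an upgrade of the MNIST dataset. By today's standards the digit patterns in MNIST are fairly simple, arguably too simple a target for deep neural networks. FashionMNIST replaces the image content with articles of clothing while keeping the image format unchanged, so it is used almost exactly like MNIST yet is a better test of a model's ability to learn patterns in the data.
FashionMNIST class list:
0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot
Load the FashionMNIST dataset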
import torch
import torchvision
from torchvision.datasets import FashionMNIST
from torch.utils.data import DataLoader
from torchvision import transforms
data_transform = transforms.ToTensor()
# download=False assumes the dataset already exists under ./data; pass download=True to fetch it
train_data = FashionMNIST(root='./data', train=True,
                          download=False, transform=data_transform)
test_data = FashionMNIST(root='./data', train=False,
                         download=False, transform=data_transform)
# Print out some stats about the training and test data
print('Train data, number of images: ', len(train_data))
print('Test data, number of images: ', len(test_data))
Train data, number of images: 60000
Test data, number of images: 10000
Create the data loaders
batch_size = 20
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)
# specify the image classes
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
Visualize a sample of the dataset
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# Fetch one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy()
# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(batch_size):
    ax = fig.add_subplot(2, batch_size // 2, idx+1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    ax.set_title(classes[labels[idx]])
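2.2 Training the multi-layer convolutional model
Define the model
Below we define a model with two convolutional layers; the added dropout helps guard against overfitting to some extent.
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input channel -> 16 feature maps; padding=1 keeps the 28x28 size
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.pool1 = nn.MaxPool2d(2, 2)
        # 16 -> 32 feature maps; after two 2x2 poolings the maps are 7x7
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.activation_l = nn.ReLU()
        self.fc = nn.Linear(32 * 7 * 7, 24)
        self.out = nn.Linear(24, 10)
        self.dropout = nn.Dropout(p=0.5)
        # Note: nn.CrossEntropyLoss applies log-softmax itself and expects raw scores,
        # so an explicit Softmax like this one is usually left out when training with it
        self.activation_out = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.activation_l(self.conv1(x))
        x = self.pool1(x)
        x = self.activation_l(self.conv2(x))
        x = self.pool2(x)
        # Flatten the feature maps into one vector per sample
        x = x.view(x.size(0), -1)
        x = self.activation_l(self.fc(x))
        x = self.dropout(x)
        x = self.activation_out(self.out(x))
        return x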
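One detail of this model deserves a closer look: it ends in a Softmax, while PyTorch's nn.CrossEntropyLoss applies log-softmax to whatever it receives. The sketch below illustrates the effect in plain NumPy; `cross_entropy` and `softmax` are simplified stand-ins written for this example (not the PyTorch implementations), and the scores are hypothetical.

```python
import numpy as np

def cross_entropy(scores, target):
    """Cross-entropy that applies log-softmax to `scores` internally,
    mirroring how nn.CrossEntropyLoss treats its input."""
    shifted = scores - np.max(scores)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[target])

def softmax(scores):
    e = np.exp(scores - np.max(scores))
    return e / np.sum(e)

# Hypothetical raw scores for 10 classes; class 0 is strongly favored
logits = np.array([20.0] + [0.0] * 9)

loss_from_logits = cross_entropy(logits, target=0)
# Feeding probabilities instead of raw scores applies softmax twice
loss_from_probs = cross_entropy(softmax(logits), target=0)
# Lowest loss reachable when the 10 inputs are confined to the range [0, 1]
floor = np.log(np.e + 9) - 1

print(loss_from_logits, loss_from_probs, floor)
```

With raw scores the loss can approach zero, whereas with probabilities as input it cannot drop below about 1.46 for 10 classes. Training can still work, as the accuracy reported later shows, but the loss values are harder to interpret.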
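Train the model
import os
import torch.optim as optim
# Instantiate the model before creating the optimizer that updates its parameters
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters())
def train(n_epochs):
    for epoch in range(n_epochs):
        running_loss = 0.0
        for batch_i, data in enumerate(train_loader):
            inputs, labels = data
            # Reset gradients, run the forward pass, backpropagate, and update the weights
            optimizer.zero_grad()
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            # Report the average loss every 1000 batches
            if batch_i % 1000 == 999:
                print('Epoch: {}, Batch: {}, Avg. Loss: {}'.format(epoch + 1, batch_i+1, running_loss/1000))
                running_loss = 0.0
    print('Finished Training')
n_epochs = 10
train(n_epochs)
# Save the trained weights, creating the target directory if it does not exist
model_dir = 'saved_models/'
model_name = 'model_best.pt'
os.makedirs(model_dir, exist_ok=True)
torch.save(net.state_dict(), model_dir+model_name)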
Load the trained model
net = Net()
net.load_state_dict(torch.load('saved_models/model_best.pt'))
print(net)
Net(
(conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(activation_l): ReLU()
(fc): Linear(in_features=1568, out_features=24, bias=True)
(out): Linear(in_features=24, out_features=10, bias=True)
(dropout): Dropout(p=0.5)
(activation_out): Softmax()
)
Evaluate the model on the test set
test_loss = torch.zeros(1)
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
print(class_correct)
print(test_loss)
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
tensor([ 0.])
net.eval()
criterion = torch.nn.CrossEntropyLoss()
for batch_i, data in enumerate(test_loader):
    inputs, labels = data
    output = net(inputs)
    # compute the loss on this batch's predictions
    loss = criterion(output, labels)
    # update average test loss
    test_loss = test_loss + ((torch.ones(1) / (batch_i + 1)) * (loss.data - test_loss))
    # the predicted class is the index of the largest output
    _, predicted = torch.max(output.data, 1)
    correct = np.squeeze(predicted.eq(labels.data.view_as(predicted)))
    # tally per-class results; len(labels) also covers a smaller final batch
    for i in range(len(labels)):
        label = labels.data[i].item()
        class_correct[label] += correct[i].item()
        class_total[label] += 1
print('Test Loss: {:.6f}\n'.format(test_loss.numpy()[0]))
for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            classes[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))
print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))
Test Loss: 2.362950
Test Accuracy of T-shirt/top: 85% (850/1000)
Test Accuracy of Trouser: 96% (963/1000)
Test Accuracy of Pullover: 84% (842/1000)
Test Accuracy of Dress: 91% (911/1000)
Test Accuracy of Coat: 85% (856/1000)
Test Accuracy of Sandal: 98% (989/1000)
Test Accuracy of Shirt: 49% (495/1000)
Test Accuracy of Sneaker: 94% (948/1000)
Test Accuracy of Bag: 97% (978/1000)
Test Accuracy of Ankle boot: 93% (930/1000)
Test Accuracy (Overall): 87% (8762/10000)
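The test loop keeps an average loss without storing every batch loss, by nudging the current mean toward each new value. The following sketch, using hypothetical loss values, shows that this incremental update gives the same result as the ordinary mean.

```python
# Hypothetical per-batch losses, used only to illustrate the update rule
batch_losses = [2.0, 1.0, 4.0, 3.0]

running_mean = 0.0
for batch_i, loss in enumerate(batch_losses):
    # Same form as in the test loop: move the mean toward the new value
    # by a fraction 1 / (number of batches seen so far)
    running_mean = running_mean + (1.0 / (batch_i + 1)) * (loss - running_mean)

plain_mean = sum(batch_losses) / len(batch_losses)
print(running_mean, plain_mean)
```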
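2.3 Feature visualization
The trained model reaches 87% accuracy on the test data, so let us now visualize it.
The strategy is to extract the weights of each convolutional layer from the model, treat them as standalone filters, and apply them with OpenCV's filter2D function to an image sampled from the test set. We then observe what each filter does to the image and try to explain what kind of filtering it performs on the original.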
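To show what applying a kernel as a standalone filter amounts to, here is a minimal NumPy sketch. `correlate_same` is a helper written for this illustration, and the image and kernel are hypothetical. Like cv2.filter2D and nn.Conv2d, it slides the kernel without flipping it (cross-correlation). It pads the border with zeros, as conv1 does with padding=1, whereas filter2D reflects border pixels by default, so values along the image edges can differ between the two.

```python
import numpy as np

def correlate_same(image, kernel):
    """Slide `kernel` over `image` without flipping it (cross-correlation),
    zero-padding the border so the output keeps the input's shape.
    Assumes a kernel with odd height and width."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode='constant')
    out = np.zeros(image.shape, dtype=np.float32)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out

# Hypothetical 5x5 image: two dark columns followed by three bright ones
image = np.zeros((5, 5), dtype=np.float32)
image[:, 2:] = 1.0
# Hypothetical 3x3 kernel that responds to dark-to-bright vertical edges
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float32)

out = correlate_same(image, kernel)
print(out[2])  # middle row of the filtered image
```

The response is largest next to the dark-to-bright edge and zero in the flat bright interior. The negative value in the last column comes from the zero padding, which looks like a bright-to-dark edge to the kernel.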
Sample a single image from the dataset
dataiter = iter(test_loader)
images, labels = next(dataiter)
images = images.numpy()
idx = 15
img = np.squeeze(images[idx])
import cv2
plt.imshow(img, cmap='gray')
Visualize the kernels of the first convolutional layer
weights = net.conv1.weight.data
w = weights.numpy()
print(w.shape)
fig = plt.figure(figsize=(30, 10))
columns = 4 * 2
row = 4
# Each kernel is drawn next to the sample image filtered with that kernel
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        plt.imshow(w[i // 2][0], cmap='gray')
    else:
        c = cv2.filter2D(img, -1, w[(i - 1) // 2][0])
        plt.imshow(c, cmap='gray')
plt.show()
(16, 1, 3, 3)
Visualize the kernels of the second convolutional layer
weights = net.conv2.weight.data
w = weights.numpy()
print(w.shape)
fig = plt.figure(figsize=(30, 20))
columns = 4 * 2
row = 8
# Each conv2 kernel spans 16 input channels; only the slice for input channel 0 is shown
for i in range(0, columns * row):
    fig.add_subplot(row, columns, i+1)
    if ((i%2)==0):
        plt.imshow(w[i // 2][0], cmap='gray')
    else:
        c = cv2.filter2D(img, -1, w[(i - 1) // 2][0])
        plt.imshow(c, cmap='gray')
plt.show()
(32, 16, 3, 3)
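We can see that some kernels act as edge detectors, and that different kernels are sensitive to different orientations, different textures, or, put differently, different image content.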
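When reading the second-layer plots, it helps to remember that each conv2 kernel has shape (16, 3, 3), while the code above displays only its first slice, w[k][0]. The sketch below uses random stand-in values (not the trained weights) and ignores the bias term. It shows that the layer's response at one position is the sum of 16 per-channel terms, of which the displayed slice contributes just one.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: a 3x3 patch from each of 16 input feature maps,
# and one conv2 kernel, shaped like w[k] for a single output channel k
patch = rng.standard_normal((16, 3, 3))
kernel = rng.standard_normal((16, 3, 3))

# The layer's response at this position sums over all 16 input channels
full_response = np.sum(patch * kernel)
per_channel = [np.sum(patch[ch] * kernel[ch]) for ch in range(16)]

# w[k][0] alone accounts for only the first of the 16 terms
first_only = per_channel[0]
print(full_response, first_only)
```

For this reason, filtering the raw input image with a single slice of a conv2 kernel is best taken as a rough impression rather than as the feature map the network actually computes.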
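Visualizing convolutions through subjective human interpretation like this still feels somewhat thin, though it may well be what counts as a simple visualization method for neural networks. Beyond the convolution kernels, the fully connected layers can also be visualized.
As for visualizing fully connected layers, some tutorials do so by visualizing the distances between the "embedding vectors" of individual samples from similar classes. The embedding vectors produced by the fully connected layer may first need to be reduced in dimensionality with t-SNE before they can be plotted. If I come across this material later, I will add it to this post.
Afterword
This post is based on the exercises of the Udacity Computer Vision Nanodegree. Link to the official source code: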
https://github.com/udacity/CVND_Exercises/tree/master/1_5_CNN_Layers