Abstract:
Notes on classifying the CIFAR-10 dataset with the ShuffleNet network in the MindSpore AI framework: the process, steps, and methods, covering environment setup, dataset download, dataset loading and preprocessing, model construction, model training, model evaluation, and model inference.
I. Concepts
1. The ShuffleNet Network
A CNN model proposed by Megvii Technology
Designed for deployment on mobile devices
Achieves model compression and acceleration through a more efficient network design.
Goal
Reach the best possible accuracy with limited computational resources.
Core idea: two new operations
Pointwise Group Convolution
Channel Shuffle
Advantage
Greatly reduces the parameter count without sacrificing accuracy
2. Model Architecture
ShuffleNet's most distinctive feature
Shuffling channels across groups to overcome the drawback of Group Convolution
An improved ResNet Bottleneck unit
High accuracy at a small computational cost
Pointwise Group Convolution
Diagram: how Group Convolution works
In a group convolution
Each kernel has size (in_channels/g)*k*k
There are g groups
All groups together hold (in_channels/g*k*k)*out_channels parameters
1/g of a standard convolution's parameter count
Each kernel processes only a subset of the input channels
Advantage
Fewer parameters; the number of output channels still equals the number of kernels
Depthwise Convolution
Sets the number of groups g equal to the number of input channels in_channels
Convolves each input channel separately
Each kernel processes exactly one channel
Each kernel has size 1*k*k
Kernel parameter count: in_channels*k*k
The output feature maps keep the same channel count as the input
Pointwise Group Convolution
A group convolution whose kernels are 1×1
Kernel parameter count: (in_channels/g*1*1)*out_channels
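The parameter counts above can be verified with a few lines of arithmetic (a hypothetical layer with 240 input and output channels, 3×3 kernels, and g = 3; the numbers are illustrative):

```python
def conv_params(in_ch, out_ch, k, groups=1):
    # Kernel parameters of a (grouped) convolution, bias ignored:
    # each of the out_ch kernels sees only in_ch/groups input channels.
    return (in_ch // groups) * k * k * out_ch

in_ch, out_ch, k, g = 240, 240, 3, 3
standard = conv_params(in_ch, out_ch, k)                   # 518400
grouped = conv_params(in_ch, out_ch, k, groups=g)          # 172800 = standard / g
depthwise = conv_params(in_ch, in_ch, k, groups=in_ch)     # 2160 = in_ch * k * k
pointwise_group = conv_params(in_ch, out_ch, 1, groups=g)  # 19200 = (in_ch/g) * out_ch
```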
3. Channel Shuffle
Channel rearrangement
The drawback of Group Convolution
Channels in different groups cannot exchange information
which weakens the network's feature extraction ability
Channel Shuffle evenly redistributes the channels of different groups
so the next layer can process information from every group
For g groups
each with n channels
reshape the channels into a g-row, n-column matrix
transpose it into n rows and g columns
flatten to obtain the new channel order
A lightweight operation with negligible compute cost
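The reshape-transpose-flatten procedure can be sketched in a few lines of NumPy (a toy example with g = 2 groups of n = 3 channels each; the channel indices are illustrative):

```python
import numpy as np

def channel_shuffle(channel_ids, g):
    # Reshape the channel axis into (g, n), transpose to (n, g), flatten.
    n = channel_ids.shape[0] // g
    return channel_ids.reshape(g, n).T.reshape(-1)

channels = np.arange(6)                # two groups: [0 1 2 | 3 4 5]
print(channel_shuffle(channels, g=2))  # [0 3 1 4 2 5] -- groups interleaved
```

After the shuffle, every run of g consecutive channels contains one channel from each original group, so the next grouped convolution sees information from all groups.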
II. Environment Setup
%%capture captured_output
# The environment comes with mindspore==2.2.14 preinstalled; change the version number below to switch versions
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
# Check the current mindspore version
!pip show mindspore
Output:
Name: mindspore
Version: 2.2.14
Summary: MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Home-page: https://www.mindspore.cn
Author: The MindSpore Authors
Author-email: contact@mindspore.cn
License: Apache 2.0
Location: /home/nginx/miniconda/envs/jupyter/lib/python3.9/site-packages
Requires: asttokens, astunparse, numpy, packaging, pillow, protobuf, psutil, scipy
Required-by:
A grouped-convolution cell, implemented by splitting the input channels, convolving each group independently, and concatenating the results:
from mindspore import nn
import mindspore.ops as ops
from mindspore import Tensor
class GroupConv(nn.Cell):
    def __init__(self, in_channels, out_channels, kernel_size,
                 stride, pad_mode="pad", pad=0, groups=1, has_bias=False):
        super(GroupConv, self).__init__()
        self.groups = groups
        self.convs = nn.CellList()
        for _ in range(groups):
            self.convs.append(nn.Conv2d(in_channels // groups, out_channels // groups,
                                        kernel_size=kernel_size, stride=stride, has_bias=has_bias,
                                        padding=pad, pad_mode=pad_mode, group=1, weight_init='xavier_uniform'))

    def construct(self, x):
        features = ops.split(x, split_size_or_sections=int(len(x[0]) // self.groups), axis=1)
        outputs = ()
        for i in range(self.groups):
            outputs = outputs + (self.convs[i](features[i].astype("float32")),)
        out = ops.cat(outputs, axis=1)
        return out
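To make the split-convolve-concatenate pattern concrete, here is a framework-free NumPy sketch of the 1×1 case (shapes and random weights are illustrative; a 1×1 convolution over channels is just a matrix multiply per pixel):

```python
import numpy as np

def group_conv1x1(x, weights):
    # x: (N, C_in, H, W); weights: one (C_out//g, C_in//g) matrix per group.
    # Split the channels into g groups, convolve each group independently,
    # then concatenate the per-group outputs along the channel axis.
    groups = np.split(x, len(weights), axis=1)
    outs = [np.einsum('oc,nchw->nohw', w, part) for w, part in zip(weights, groups)]
    return np.concatenate(outs, axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 6, 4, 4))                      # 6 input channels
weights = [rng.standard_normal((4, 2)) for _ in range(3)]  # g=3, 12 output channels
y = group_conv1x1(x, weights)
print(y.shape)  # (2, 12, 4, 4)
```

Each group of 4 output channels depends only on its own 2 input channels, which is exactly the information-isolation problem that Channel Shuffle addresses.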
III. The ShuffleNet Block
ShuffleNet's improvements go from (a) to (b) to (c)
The ResNet Bottleneck structure (a) is modified into (b) and (c):
1. The first and third 1×1 Conv modules in (a) (channel reduction and expansion) become 1×1 GConv pointwise group convolutions;
2. A channel shuffle is inserted after the first layer's channel reduction, letting channels from different groups exchange information;
3. In the downsampling unit, the middle 3×3 DWConv uses a stride of 2,
halving the feature map's height and width (the middle layer of (c));
the shortcut in (c) uses a 3×3 average pooling with stride 2,
and the element-wise addition is replaced with channel concatenation.
class ShuffleV1Block(nn.Cell):
    def __init__(self, inp, oup, group, first_group, mid_channels, ksize, stride):
        super(ShuffleV1Block, self).__init__()
        self.stride = stride
        pad = ksize // 2
        self.group = group
        if stride == 2:
            outputs = oup - inp
        else:
            outputs = oup
        self.relu = nn.ReLU()
        branch_main_1 = [
            GroupConv(in_channels=inp, out_channels=mid_channels,
                      kernel_size=1, stride=1, pad_mode="pad", pad=0,
                      groups=1 if first_group else group),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(),
        ]
        branch_main_2 = [
            nn.Conv2d(mid_channels, mid_channels, kernel_size=ksize, stride=stride,
                      pad_mode='pad', padding=pad, group=mid_channels,
                      weight_init='xavier_uniform', has_bias=False),
            nn.BatchNorm2d(mid_channels),
            GroupConv(in_channels=mid_channels, out_channels=outputs,
                      kernel_size=1, stride=1, pad_mode="pad", pad=0,
                      groups=group),
            nn.BatchNorm2d(outputs),
        ]
        self.branch_main_1 = nn.SequentialCell(branch_main_1)
        self.branch_main_2 = nn.SequentialCell(branch_main_2)
        if stride == 2:
            self.branch_proj = nn.AvgPool2d(kernel_size=3, stride=2, pad_mode='same')

    def construct(self, old_x):
        left = old_x
        right = old_x
        out = old_x
        right = self.branch_main_1(right)
        if self.group > 1:
            right = self.channel_shuffle(right)
        right = self.branch_main_2(right)
        if self.stride == 1:
            out = self.relu(left + right)
        elif self.stride == 2:
            left = self.branch_proj(left)
            out = ops.cat((left, right), 1)
            out = self.relu(out)
        return out

    def channel_shuffle(self, x):
        batchsize, num_channels, height, width = ops.shape(x)
        group_channels = num_channels // self.group
        x = ops.reshape(x, (batchsize, group_channels, self.group, height, width))
        x = ops.transpose(x, (0, 2, 1, 3, 4))
        x = ops.reshape(x, (batchsize, num_channels, height, width))
        return x
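The channel bookkeeping of the block can be checked by hand (240 → 480 is a hypothetical stage transition):

```python
# stride == 2: the main branch emits oup - inp channels, the shortcut
# (average pooling) keeps the inp input channels, and concatenation
# restores exactly oup output channels.
inp, oup = 240, 480
outputs = oup - inp
assert inp + outputs == oup

# stride == 1: the two branches are added element-wise, so the main
# branch must emit all oup channels and the block keeps inp == oup.
```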
IV. Building the ShuffleNet Network
ShuffleNet network structure
Example: 224×224 input image, 3 groups (g = 3)
First convolution layer
24 output channels
Kernel size 3×3
Stride 2
Output feature map: 112×112
24 channels
Max pooling layer
Stride 2
Output feature map: 56×56
Channel count unchanged
Three stages of stacked ShuffleNet blocks
Stage2 repeats the block 4 times
The first block downsamples
halving the feature map's height and width
240 channels
Stage3 repeats the block 8 times
The first block downsamples
halving the feature map's height and width
480 channels
Stage4 repeats the block 4 times
The first block downsamples
halving the feature map's height and width
960 channels
Global average pooling
Output size 1×1×960
Fully connected layer
Softmax
produces the class probabilities
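The feature-map sizes listed above can be traced with simple integer arithmetic (g = 3, the "1.0x" channel configuration):

```python
def shufflenet_shapes(input_size=224, stage_channels=(240, 480, 960)):
    # Trace (spatial size, channels) through ShuffleNetV1: 3x3 conv s2,
    # 3x3 max pool s2, then three stages whose first block downsamples.
    size = input_size // 2          # first conv, stride 2 -> 112, 24 channels
    size //= 2                      # max pool, stride 2   -> 56, still 24
    shapes = [(size, 24)]
    for ch in stage_channels:
        size //= 2                  # first block of each stage halves H and W
        shapes.append((size, ch))
    return shapes                   # global average pooling then gives 1x1x960

print(shufflenet_shapes())  # [(56, 24), (28, 240), (14, 480), (7, 960)]
```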
class ShuffleNetV1(nn.Cell):
    def __init__(self, n_class=1000, model_size='2.0x', group=3):
        super(ShuffleNetV1, self).__init__()
        print('model size is ', model_size)
        self.stage_repeats = [4, 8, 4]
        self.model_size = model_size
        if group == 3:
            if model_size == '0.5x':
                self.stage_out_channels = [-1, 12, 120, 240, 480]
            elif model_size == '1.0x':
                self.stage_out_channels = [-1, 24, 240, 480, 960]
            elif model_size == '1.5x':
                self.stage_out_channels = [-1, 24, 360, 720, 1440]
            elif model_size == '2.0x':
                self.stage_out_channels = [-1, 48, 480, 960, 1920]
            else:
                raise NotImplementedError
        elif group == 8:
            if model_size == '0.5x':
                self.stage_out_channels = [-1, 16, 192, 384, 768]
            elif model_size == '1.0x':
                self.stage_out_channels = [-1, 24, 384, 768, 1536]
            elif model_size == '1.5x':
                self.stage_out_channels = [-1, 24, 576, 1152, 2304]
            elif model_size == '2.0x':
                self.stage_out_channels = [-1, 48, 768, 1536, 3072]
            else:
                raise NotImplementedError
        input_channel = self.stage_out_channels[1]
        self.first_conv = nn.SequentialCell(
            nn.Conv2d(3, input_channel, 3, 2, 'pad', 1, weight_init='xavier_uniform', has_bias=False),
            nn.BatchNorm2d(input_channel),
            nn.ReLU(),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same')
        features = []
        for idxstage in range(len(self.stage_repeats)):
            numrepeat = self.stage_repeats[idxstage]
            output_channel = self.stage_out_channels[idxstage + 2]
            for i in range(numrepeat):
                stride = 2 if i == 0 else 1
                first_group = idxstage == 0 and i == 0
                features.append(ShuffleV1Block(input_channel, output_channel,
                                               group=group, first_group=first_group,
                                               mid_channels=output_channel // 4, ksize=3, stride=stride))
                input_channel = output_channel
        self.features = nn.SequentialCell(features)
        self.globalpool = nn.AvgPool2d(7)
        self.classifier = nn.Dense(self.stage_out_channels[-1], n_class)

    def construct(self, x):
        x = self.first_conv(x)
        x = self.maxpool(x)
        x = self.features(x)
        x = self.globalpool(x)
        x = ops.reshape(x, (-1, self.stage_out_channels[-1]))
        x = self.classifier(x)
        return x
V. Model Training and Evaluation
ShuffleNet is pretrained on the CIFAR-10 dataset.
1. Preparing and Loading the Training Set
CIFAR-10
60000 color images of size 32×32
evenly divided into 10 classes
50000 images for training
10000 images for testing
Download the CIFAR-10 dataset
then load it with the mindspore.dataset.Cifar10Dataset interface
from download import download
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz"
download(url, "./dataset", kind="tar.gz", replace=True)
Output:
Creating data folder...
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/cifar-10-binary.tar.gz (162.2 MB)
file_sizes: 100%|█████████████████████████████| 170M/170M [00:00<00:00, 177MB/s]
Extracting tar.gz file...
Successfully downloaded / unzipped to ./dataset
'./dataset'
import mindspore as ms
from mindspore.dataset import Cifar10Dataset
from mindspore.dataset import vision, transforms
def get_dataset(train_dataset_path, batch_size, usage):
    image_trans = []
    if usage == "train":
        image_trans = [
            vision.RandomCrop((32, 32), (4, 4, 4, 4)),
            vision.RandomHorizontalFlip(prob=0.5),
            vision.Resize((224, 224)),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
            vision.HWC2CHW()
        ]
    elif usage == "test":
        image_trans = [
            vision.Resize((224, 224)),
            vision.Rescale(1.0 / 255.0, 0.0),
            vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
            vision.HWC2CHW()
        ]
    label_trans = transforms.TypeCast(ms.int32)
    dataset = Cifar10Dataset(train_dataset_path, usage=usage, shuffle=True)
    dataset = dataset.map(image_trans, 'image')
    dataset = dataset.map(label_trans, 'label')
    dataset = dataset.batch(batch_size, drop_remainder=True)
    return dataset
dataset = get_dataset("./dataset/cifar-10-batches-bin", 128, "train")
batches_per_epoch = dataset.get_dataset_size()
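batches_per_epoch can be sanity-checked by hand: CIFAR-10 has 50000 training images, and with batch_size=128 and drop_remainder=True the trailing partial batch is discarded:

```python
def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    # Batches yielded per epoch; a trailing partial batch is dropped
    # when drop_remainder=True (as in the batch() call above).
    full, rem = divmod(num_samples, batch_size)
    return full if drop_remainder or rem == 0 else full + 1

print(steps_per_epoch(50000, 128))  # 390 -- matches "step: 390" in the training log
```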
2. Model Training
Pretrain from randomly initialized parameters.
Define the network with ShuffleNetV1
choosing the "2.0x" model size
Loss function: cross-entropy
Learning rate: cosine annealing
after a 4-epoch warmup
Optimizer: Momentum
Wrap everything with the train.model.Model interface
and train with model.train()
passing in the callbacks
ModelCheckpoint
CheckpointConfig
TimeMonitor
LossMonitor
which print
the epoch number
the loss
and the elapsed time
and save ckpt files to the current directory
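The cosine-annealing part of the schedule can be sketched in pure Python, following the per-step formula documented for mindspore.nn.cosine_decay_lr (the warmup phase is omitted here; 390 steps per epoch and 250 decay epochs mirror the training code):

```python
import math

def cosine_decay_schedule(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    # lr_i = min_lr + 0.5 * (max_lr - min_lr) * (1 + cos(pi * epoch_i / decay_epoch))
    lrs = []
    for i in range(total_step):
        epoch = i // step_per_epoch
        lrs.append(min_lr + 0.5 * (max_lr - min_lr)
                   * (1 + math.cos(math.pi * epoch / decay_epoch)))
    return lrs

lrs = cosine_decay_schedule(0.0005, 0.05, 390 * 250, 390, 250)
# starts at base_lr and decays smoothly towards min_lr
```

Note that the train() code below then passes only lr_scheduler[-1], i.e. a single fixed value from the end of the schedule, to the optimizer.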
import time
import mindspore
import numpy as np
from mindspore import Tensor, nn
from mindspore.train import ModelCheckpoint, CheckpointConfig, TimeMonitor, LossMonitor, Model, Top1CategoricalAccuracy, Top5CategoricalAccuracy
def train():
    mindspore.set_context(mode=mindspore.PYNATIVE_MODE, device_target="Ascend")
    net = ShuffleNetV1(model_size="2.0x", n_class=10)
    loss = nn.CrossEntropyLoss(weight=None, reduction='mean', label_smoothing=0.1)
    min_lr = 0.0005
    base_lr = 0.05
    lr_scheduler = mindspore.nn.cosine_decay_lr(min_lr,
                                                base_lr,
                                                batches_per_epoch*250,
                                                batches_per_epoch,
                                                decay_epoch=250)
    lr = Tensor(lr_scheduler[-1])
    optimizer = nn.Momentum(params=net.trainable_params(), learning_rate=lr, momentum=0.9, weight_decay=0.00004, loss_scale=1024)
    loss_scale_manager = ms.amp.FixedLossScaleManager(1024, drop_overflow_update=False)
    model = Model(net, loss_fn=loss, optimizer=optimizer, amp_level="O3", loss_scale_manager=loss_scale_manager)
    callback = [TimeMonitor(), LossMonitor()]
    save_ckpt_path = "./"
    config_ckpt = CheckpointConfig(save_checkpoint_steps=batches_per_epoch, keep_checkpoint_max=5)
    ckpt_callback = ModelCheckpoint("shufflenetv1", directory=save_ckpt_path, config=config_ckpt)
    callback += [ckpt_callback]
    print("============== Starting Training ==============")
    start_time = time.time()
    # epoch = 5 to keep the run short; adjust as needed
    model.train(5, dataset, callbacks=callback)
    use_time = time.time() - start_time
    hour = str(int(use_time // 60 // 60))
    minute = str(int(use_time // 60 % 60))
    second = str(int(use_time % 60))
    print("total time:" + hour + "h " + minute + "m " + second + "s")
    print("============== Train Success ==============")

if __name__ == '__main__':
    train()
Output:
model size is 2.0x
============== Starting Training ==============
epoch: 1 step: 1, loss is 2.702430248260498
epoch: 1 step: 2, loss is 2.5544934272766113
epoch: 1 step: 3, loss is 2.3527920246124268
epoch: 1 step: 4, loss is 2.432495355606079
epoch: 1 step: 5, loss is 2.442847490310669
......
epoch: 1 step: 386, loss is 1.8315027952194214
epoch: 1 step: 387, loss is 1.9081732034683228
epoch: 1 step: 388, loss is 1.8965389728546143
epoch: 1 step: 389, loss is 1.8942060470581055
epoch: 1 step: 390, loss is 1.8646998405456543
Train epoch time: 439745.086 ms, per step time: 1127.552 ms
epoch: 2 step: 1, loss is 1.9022231101989746
epoch: 2 step: 2, loss is 1.8828961849212646
epoch: 2 step: 3, loss is 1.8220021724700928
epoch: 2 step: 4, loss is 2.003005027770996
epoch: 2 step: 5, loss is 1.8657888174057007
......
epoch: 2 step: 386, loss is 1.754606008529663
epoch: 2 step: 387, loss is 1.73811674118042
epoch: 2 step: 388, loss is 1.5935282707214355
epoch: 2 step: 389, loss is 1.7022861242294312
epoch: 2 step: 390, loss is 1.7202574014663696
Train epoch time: 121300.859 ms, per step time: 311.028 ms
epoch: 3 step: 1, loss is 1.6813828945159912
epoch: 3 step: 2, loss is 1.7341467142105103
epoch: 3 step: 3, loss is 1.8423044681549072
epoch: 3 step: 4, loss is 1.8151057958602905
epoch: 3 step: 5, loss is 1.727158784866333
......
epoch: 3 step: 386, loss is 1.6009197235107422
epoch: 3 step: 387, loss is 1.7389277219772339
epoch: 3 step: 388, loss is 1.6847612857818604
epoch: 3 step: 389, loss is 1.7618985176086426
epoch: 3 step: 390, loss is 1.719774842262268
Train epoch time: 121936.621 ms, per step time: 312.658 ms
epoch: 4 step: 1, loss is 1.6524462699890137
epoch: 4 step: 2, loss is 1.5743780136108398
epoch: 4 step: 3, loss is 1.7330453395843506
epoch: 4 step: 4, loss is 1.6160061359405518
epoch: 4 step: 5, loss is 1.6632086038589478
......
epoch: 4 step: 386, loss is 1.6585990190505981
epoch: 4 step: 387, loss is 1.6520838737487793
epoch: 4 step: 388, loss is 1.4504361152648926
epoch: 4 step: 389, loss is 1.8115458488464355
epoch: 4 step: 390, loss is 1.6291583776474
Train epoch time: 121944.082 ms, per step time: 312.677 ms
epoch: 5 step: 1, loss is 1.737457275390625
epoch: 5 step: 2, loss is 1.6314475536346436
epoch: 5 step: 3, loss is 1.6039154529571533
epoch: 5 step: 4, loss is 1.59605073928833
epoch: 5 step: 5, loss is 1.6140247583389282
......
epoch: 5 step: 386, loss is 1.599562406539917
epoch: 5 step: 387, loss is 1.486626148223877
epoch: 5 step: 388, loss is 1.6146260499954224
epoch: 5 step: 389, loss is 1.6220197677612305
epoch: 5 step: 390, loss is 1.610574722290039
Train epoch time: 121699.011 ms, per step time: 312.049 ms
total time:0h 15m 26s
============== Train Success ==============
The trained model is saved as shufflenetv1-5_390.ckpt in the current directory and is used for evaluation.
3. Model Evaluation
Evaluate the model on the CIFAR-10 test set.
Set the path of the checkpoint to evaluate
Load the dataset
Use Top-1 and Top-5 accuracy as the evaluation metrics
Evaluate with the model.eval() interface
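What Top-1 and Top-5 accuracy measure can be shown with a minimal NumPy sketch (the logits and labels below are made up for illustration):

```python
import numpy as np

def topk_accuracy(logits, labels, k):
    # A sample counts as correct when its true label is among the k
    # highest-scoring classes -- what Top1/Top5CategoricalAccuracy report.
    topk = np.argsort(logits, axis=1)[:, -k:]
    return float(np.mean([y in row for y, row in zip(labels, topk)]))

logits = np.array([[0.8, 0.1, 0.1],   # predicts class 0 -- correct
                   [0.1, 0.8, 0.1],   # predicts class 1 -- correct
                   [0.5, 0.1, 0.4]])  # predicts class 0, runner-up class 2
labels = np.array([0, 1, 2])
print(topk_accuracy(logits, labels, k=1))  # 2/3
print(topk_accuracy(logits, labels, k=2))  # 1.0
```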
from mindspore import load_checkpoint, load_param_into_net
def test():
    mindspore.set_context(mode=mindspore.GRAPH_MODE, device_target="Ascend")
    dataset = get_dataset("./dataset/cifar-10-batches-bin", 128, "test")
    net = ShuffleNetV1(model_size="2.0x", n_class=10)
    param_dict = load_checkpoint("shufflenetv1-5_390.ckpt")
    load_param_into_net(net, param_dict)
    net.set_train(False)
    loss = nn.CrossEntropyLoss(weight=None, reduction='mean', label_smoothing=0.1)
    eval_metrics = {'Loss': nn.Loss(), 'Top_1_Acc': Top1CategoricalAccuracy(),
                    'Top_5_Acc': Top5CategoricalAccuracy()}
    model = Model(net, loss_fn=loss, metrics=eval_metrics)
    start_time = time.time()
    res = model.eval(dataset, dataset_sink_mode=False)
    use_time = time.time() - start_time
    hour = str(int(use_time // 60 // 60))
    minute = str(int(use_time // 60 % 60))
    second = str(int(use_time % 60))
    log = "result:" + str(res) + ", ckpt:'" + "./shufflenetv1-5_390.ckpt" \
          + "', time: " + hour + "h " + minute + "m " + second + "s"
    print(log)
    filename = './eval_log.txt'
    with open(filename, 'a') as file_object:
        file_object.write(log + '\n')

if __name__ == '__main__':
    test()
Output:
model size is 2.0x
[ERROR] CORE(263,ffffa833e930,python):2024-07-07-15:29:24.418.000 [mindspore/core/utils/file_utils.cc:253] GetRealPath] Get realpath failed, path[/tmp/ipykernel_263/3162391481.py]
[ERROR] CORE(263,ffffa833e930,python):2024-07-07-15:29:24.418.539 [mindspore/core/utils/file_utils.cc:253] GetRealPath] Get realpath failed, path[/tmp/ipykernel_263/3162391481.py]
......
result:{'Loss': 1.6150915485162, 'Top_1_Acc': 0.4930889423076923, 'Top_5_Acc': 0.9283854166666666}, ckpt:'./shufflenetv1-5_390.ckpt', time: 0h 1m 26s
4. Model Prediction
Run the trained model on CIFAR-10 images and visualize the predictions.
import mindspore
import matplotlib.pyplot as plt
import mindspore.dataset as ds
net = ShuffleNetV1(model_size="2.0x", n_class=10)
show_lst = []
param_dict = load_checkpoint("shufflenetv1-5_390.ckpt")
load_param_into_net(net, param_dict)
model = Model(net)
dataset_predict = ds.Cifar10Dataset(dataset_dir="./dataset/cifar-10-batches-bin", shuffle=False, usage="train")
dataset_show = ds.Cifar10Dataset(dataset_dir="./dataset/cifar-10-batches-bin", shuffle=False, usage="train")
dataset_show = dataset_show.batch(16)
show_images_lst = next(dataset_show.create_dict_iterator())["image"].asnumpy()
image_trans = [
    vision.RandomCrop((32, 32), (4, 4, 4, 4)),
    vision.RandomHorizontalFlip(prob=0.5),
    vision.Resize((224, 224)),
    vision.Rescale(1.0 / 255.0, 0.0),
    vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
    vision.HWC2CHW()
]
dataset_predict = dataset_predict.map(image_trans, 'image')
dataset_predict = dataset_predict.batch(16)
class_dict = {0:"airplane", 1:"automobile", 2:"bird", 3:"cat", 4:"deer", 5:"dog", 6:"frog", 7:"horse", 8:"ship", 9:"truck"}
# Show inference results (the title above each image is the predicted class)
plt.figure(figsize=(16, 5))
predict_data = next(dataset_predict.create_dict_iterator())
output = model.predict(ms.Tensor(predict_data['image']))
pred = np.argmax(output.asnumpy(), axis=1)
index = 0
for image in show_images_lst:
    plt.subplot(2, 8, index+1)
    plt.title('{}'.format(class_dict[pred[index]]))
    index += 1
    plt.imshow(image)
    plt.axis("off")
plt.show()
Output:
model size is 2.0x
[ERROR] CORE(263,ffffa833e930,python):2024-07-07-15:30:55.337.972 [mindspore/core/utils/file_utils.cc:253] GetRealPath] Get realpath failed, path[/tmp/ipykernel_263/1681751341.py]
[ERROR] CORE(263,ffffa833e930,python):2024-07-07-15:30:55.338.097 [mindspore/core/utils/file_utils.cc:253] GetRealPath] Get realpath failed, path[/tmp/ipykernel_263/1681751341.py]
......