Chinese Handwritten Numeral Recognition

Experiment Environment

python=3.7
torch==1.13.1+cu117
torchaudio==0.13.1+cu117
torchvision==0.14.1

Dataset download: Mnist中文手写数字数据集Python资源-CSDN文库 (the Chinese MNIST handwritten numeral dataset on CSDN 文库)

The characters covered are:

零、一、二、三、四、五、六、七、八、九、十、百、千、万、亿
15 characters in total, labeled 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000, 10000, and 100000000 respectively.

Usage

import pickle, numpy

with open("./chn_mnist", "rb") as f:
    data = pickle.load(f)
images = data["images"]
targets = data["targets"]
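As a quick sanity check right after loading (my addition, not part of the dataset's own instructions), the two arrays can be inspected before any preprocessing:

print(type(images), images.shape, images.dtype)   # expect a numpy.ndarray of shape (N, 64, 64)
print(type(targets), targets.shape)               # one label value per image
print(numpy.unique(targets))                      # the raw label values before remapping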

Data Preprocessing

Loading the data

Store the data in two variables; both are of type numpy.ndarray.

import pickle
import numpy as np

# Change this to your own dataset path
with open(r"D:\zr\data\chn_mnist\chn_mnist", "rb") as f:
    dataset = pickle.load(f)
images = dataset["images"]
targets = dataset["targets"]
Unifying the label values

The labels 100, 1000, 10000, and 100000000 are remapped to 11, 12, 13, and 14 respectively.

index = np.where(targets == 100)
targets[index] = 11
index = np.where(targets == 1000)
targets[index] = 12
index = np.where(targets == 10000)
targets[index] = 13
index = np.where(targets == 100000000)
targets[index] = 14
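A one-line check (my addition) that the remapping produced the contiguous label range 0-14 expected by the 15-way classifiers below:

print(np.unique(targets))  # expected: the integers 0 through 14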

Building the Dataset

Defining the Dataset class

torch.utils.data.DataLoader turns a dataset into an iterable object for model training; before it can be used, we need to build our own Dataset class.

A custom dataset must subclass Dataset and implement three methods: __init__, __len__, and __getitem__.

__init__: runs when the Dataset object is instantiated and performs the initialization work
__len__: returns the size of the dataset
__getitem__: returns one sample (data and label) for a given index

import numpy as np
from torch.utils.data import Dataset
from PIL import Image

class MyDataset(Dataset):
    def __init__(self, data, targets, transform=None, target_transform=None):
        '''
        data: array of shape (x, 64, 64), i.e. x images of 64*64 pixels
        targets: array of shape (x,), one class value per image
        '''
        self.transform = transform
        self.target_transform = target_transform
        self.data = []
        self.targets = []
        # Cast the labels to int64: CrossEntropyLoss expects Long class indices
        targets = targets.astype(np.int64)
        # If the labels need no processing, use them as-is
        if target_transform is None:
            self.targets = targets
        # The transform raises an error on raw numpy images here, so each image is
        # first converted to a PIL Image and then transformed one by one
        for index in range(0, data.shape[0]):
            if self.transform:
                image = Image.fromarray(data[index])
                self.data.append(self.transform(image))
            else:
                self.data.append(data[index])
            if self.target_transform:
                self.targets.append(self.target_transform(targets[index]))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index], self.targets[index]

Define the transforms: ToTensor scales every pixel into [0, 1], and Normalize with mean 0.5 and std 0.5 then maps the values into [-1, 1].

transform_data = transforms.Compose([
    # Make sure every image is (64, 64); the data already has this size, so this step is optional
    torchvision.transforms.Resize((64, 64)),
    # Convert the PIL Image to a tensor, scaling pixel values into [0., 1.]
    transforms.ToTensor(),
    # Standardize the input given means (mean[1], ..., mean[n]) and standard deviations (std[1], ..., std[n]),
    # where n matches the number of input channels
    # For a three-channel image this would be mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]
    transforms.Normalize(mean=[0.5], std=[0.5])])
transform_target = None
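The fixed 0.5/0.5 is a common convention; if you would rather normalize with the dataset's own statistics, a small sketch like the following (my addition, assuming images is the array loaded earlier and is stored as 8-bit grayscale) computes them:

pixels = images.astype(np.float32) / 255.0   # same [0, 1] scaling that ToTensor applies
print("mean:", pixels.mean(), "std:", pixels.std())
# these values could then be passed as transforms.Normalize(mean=[pixels.mean()], std=[pixels.std()])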

Instantiate the Dataset class; here the first 14000 images form the training set and the last 1000 form the test set.

train_dataset = dataloader.MyDataset(images[:14000, :, :], targets[:14000], transform_data, transform_target)
test_dataset = dataloader.MyDataset(images[-1000:, :, :], targets[-1000:], transform_data, transform_target)
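The __len__ method defined above makes the split easy to verify (my addition):

print(len(train_dataset), len(test_dataset))   # expected: 14000 1000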
Loading the dataset with DataLoader

DataLoader parameters explained; usually only the first three need to be set.

Common parameters:

  • dataset (Dataset): the dataset defined above
  • batch_size (int, optional): number of samples fed to the network per batch, default 1
  • shuffle (bool, optional): whether to shuffle the data each epoch, default False; usually True for the training set and False for the test set
  • num_workers (int, optional): number of worker processes for data loading, default 0; values greater than 0 may raise errors on Windows
  • drop_last (bool, optional): whether to drop the last incomplete batch, default False

Two utilities that pair well with DataLoader (see the sketch after this list):

  • enumerate(iterable, start=0): takes an iterable and a starting index, and yields the running index together with each item
  • tqdm(iterable): progress-bar visualization
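A minimal sketch of how the two are combined with a DataLoader, in the same way the training loops below do (assuming train_loader has been created):

from tqdm import tqdm

loop = tqdm(enumerate(train_loader), total=len(train_loader))
for i, (images, labels) in loop:
    # i is the batch index, (images, labels) is one batch from the loader
    loop.set_description(f'batch {i}')   # text shown on the left of the progress bar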

Defining the hyperparameters

# Hyperparameters
# Number of images fed to the model per step
batch_size = 32
# Learning rate
learning_rate = 0.001
# Total number of epochs
num_epochs = 50

Loading

# shuffle=True shuffles the training data each epoch
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size)
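As a quick check (my addition), one batch drawn from the loader should have the expected shapes: 32 single-channel 64x64 images and 32 integer labels.

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)   # expected: torch.Size([32, 1, 64, 64]) torch.Size([32])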

Model Construction

CNN

A custom convolutional network; for a different dataset, only the number of input channels and the number of output classes need to change.

import torch
import torch.nn as nn


class SelfCnn(nn.Module):
    def __init__(self):
        super(SelfCnn, self).__init__()
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (32,32,64)
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (16,16,64)
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (8,8,64)

        )
        self.classifier = nn.Sequential(
            nn.Linear(8 * 8 * 64, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(256, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(256, 15)  # output layer: 15 classes
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # flatten the feature maps
        x = self.classifier(x)
        return x

Loading the model

model = SelfCnn()
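A dummy forward pass (my addition) confirms that the flattened feature map matches the classifier's 8*8*64 input and that the network outputs 15 logits:

dummy = torch.randn(2, 1, 64, 64)   # a fake batch of 2 single-channel 64x64 images
print(model(dummy).shape)           # expected: torch.Size([2, 15])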
VGG16

VGG16 has far too many parameters to train from scratch here, so PyTorch's pretrained weights are loaded.

# pretrained=True loads the pretrained weights
vgg16_ture = torchvision.models.vgg16(pretrained=True)

By default VGG16 expects (224, 224, 3) input and produces 1000 output classes; our input is (64, 64, 1) and the target output has 15 classes, so the model structure needs to be modified.

# Append a linear layer that maps the 1000 classes down to 15
vgg16_ture.classifier.append(nn.Linear(1000, 15))
# Adjust the first fully connected layer: the original input is 7*7*512 (224/32 = 7); for 64x64 inputs it becomes 2*2*512 (64/32 = 2)
vgg16_ture.classifier[0] = nn.Linear(2*2*512, 4096)
# Change the three-channel input to a single channel
vgg16_ture.features[0] = nn.Conv2d(1, 64, kernel_size=3, padding=1)
vgg16_ture.avgpool = nn.AdaptiveAvgPool2d((2, 2))
model = vgg16_ture
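A sketch (my addition) to double-check the surgery: with a single-channel 64x64 input the feature maps come out as (512, 2, 2), so the new classifier[0] sees 2*2*512 values and the appended layer returns 15 logits.

vgg16_ture.eval()
with torch.no_grad():
    out = vgg16_ture(torch.randn(1, 1, 64, 64))
print(out.shape)   # expected: torch.Size([1, 15])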
ResNet50

ResNet50 is likewise loaded with pretrained weights.

# pretrained=True loads the pretrained weights
resnet50 = torchvision.models.resnet50(pretrained=True)

ResNet50 also defaults to three-channel input, so the first convolution is changed to a single channel and the fully connected head is changed to output 15 classes.

# Change the three-channel input to a single channel
resnet50.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3)
# Change the classification head to 15 classes
resnet50.fc = nn.Linear(2048, 15)
model = resnet50
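To make the earlier remark about VGG16's size concrete, a small sketch (my addition; the exact counts are not given in the original write-up) compares the number of trainable parameters of the three models defined above:

def count_params(m):
    # total number of trainable parameters
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

for name, net in [("SelfCnn", SelfCnn()), ("vgg16", vgg16_ture), ("resnet50", resnet50)]:
    print(name, count_params(net))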

Model Training

Choosing the model and the training device

# Use the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Choose the model to use, e.g. model = vgg16_ture or model = SelfCnn()
# To load an already trained model: model = torch.load(r'D:\zr\projects\utils\chn_mnist_resnet50.pth')
model = resnet50
# Move the model to the device
model.to(device)

Defining the loss function and optimizer

# Cross-entropy loss for the multi-class task
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), momentum=0.9, lr=learning_rate)

Define the plotting helpers; this example draws two figures: one with all curves in a single plot and one with accuracy and loss in separate subplots.

def plt_img(plt_data):
    # Gather the recorded data points
    plt.clf()
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    # Plot the curves
    plt.plot(x, train_acc, label='train_acc')
    plt.plot(x, test_acc, label='test_acc')
    plt.plot(x, train_loss, label='train_loss')
    plt.plot(x, test_loss, label='test_loss')
    plt.legend(title='Accuracy And Loss')  # legend title
    plt.xlabel('epoch')
    # plt.ylabel('rate')
    plt.savefig(f'resnet50_{num_epochs}_{batch_size}_{learning_rate}_1.png')
    # the figure is saved to disk rather than displayed
def plt_acc_loss(plt_data):
    plt.clf()
    _, axes = plt.subplots(2, 1)
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    axes[0].plot(x, train_acc, label='train_acc')
    axes[0].plot(x, test_acc, label='test_acc')
    axes[0].legend(title='Accuracy')  # legend title
    axes[0].set_xlabel('epoch')
    # axes[0].set_ylabel('rate')
    axes[1].plot(x, train_loss, label='train_loss')
    axes[1].plot(x, test_loss, label='test_loss')
    axes[1].legend(title='Loss')
    axes[1].set_xlabel('epoch')
    # axes[1].set_ylabel('rate')
    # keep the labels from being clipped
    plt.tight_layout()
    plt.savefig(f'resnet50_{num_epochs}_{batch_size}_{learning_rate}_2.png')

Start training. Each epoch first evaluates the model on the test set, saves the model whenever the test accuracy reaches a new best, then trains for one pass over the training set; the accuracy and loss of every epoch are recorded.

max_acc = 0.0
plt_data = {
    'Epoch': [],
    'train_acc': [],
    'train_loss': [],
    'test_acc': [],
    'test_loss': [],

}
for epoch in range(num_epochs):
    plt_data.get('Epoch').append(epoch + 1)
    model.eval()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect; disable gradients explicitly for evaluation
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    # Evaluate on the test set
    loop = tqdm(enumerate(test_loader), total=len(test_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        loop.set_description(f'Epoch Test [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    if epoch == 0:
        print('Initial model performance on the test set:')
    acc = correct / total
    loss_ = loss_ / len(test_loader)
    plt_data.get('test_acc').append(acc)
    plt_data.get('test_loss').append(loss_)
    print(f"Accuracy on test images: {acc * 100}% , Loss  {loss_}")
    if acc > max_acc:
        max_acc = acc
        torch.save(model, 'chn_mnist_resnet50.pth')
        print('The model has been saved as chn_mnist_resnet50.pth')
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    time.sleep(0.1)
    # Train for one epoch
    model.train()
    torch.set_grad_enabled(True)  # switch back to training mode and re-enable gradients
    loop = tqdm(enumerate(train_loader), total=len(train_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        # Backward pass and optimization (not needed during evaluation)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loop.set_description(f'Epoch Train [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    acc = correct / total
    loss_ = loss_ / len(train_loader)
    plt_data.get('train_acc').append(acc)
    plt_data.get('train_loss').append(loss_)
    print(f"Accuracy on train images: {acc * 100}% , Loss  {loss_}")
    time.sleep(0.1)
    # Plot the curves recorded so far
    plt_img(plt_data)
    plt_acc_loss(plt_data)
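After training, the saved checkpoint can be reloaded for inference. The sketch below is my addition; the label-to-character list assumes the ordering given at the top of this article (0-10 for 零 through 十, then 11-14 for 百, 千, 万, 亿 after remapping):

label_to_char = ['零', '一', '二', '三', '四', '五', '六', '七', '八', '九', '十', '百', '千', '万', '亿']

model = torch.load('chn_mnist_resnet50.pth', map_location=device)
model.eval()
with torch.no_grad():
    image, label = test_dataset[0]                    # one already-preprocessed sample
    logits = model(image.unsqueeze(0).to(device))     # add the batch dimension
    pred = logits.argmax(dim=1).item()
print('predicted:', label_to_char[pred], 'actual:', label_to_char[int(label)])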

Results

All results below were obtained with 100 training epochs, a learning rate of 0.001, and a batch size of 32.

CNN

Test performance

Training accuracy: 99.99%, test accuracy: 88.5%; the model overfits.

VGG16

The model has essentially converged by around epoch 10.

Test performance

Training accuracy: 99.92%, test accuracy: 99.5%; the model performs well and generalizes strongly.

ResNet50

The model has essentially converged before epoch 10; compared with VGG, resnet50 converges even faster.

Test performance

Training accuracy: 100%, test accuracy: 96.8%; the model performs well but shows some overfitting.

Source Code

dataloader.py
import numpy as np
from torch.utils.data import Dataset
from PIL import Image


class MyDataset(Dataset):
    def __init__(self, data, targets, transform=None, target_transform=None):
        self.transform = transform
        self.target_transform = target_transform
        self.data = []
        self.targets = []
        # Cast the labels to int64: CrossEntropyLoss expects Long class indices
        targets = targets.astype(np.int64)
        if target_transform is None:
            self.targets = targets
        for index in range(0, data.shape[0]):
            if self.transform:
                image = Image.fromarray(data[index])
                self.data.append(self.transform(image))
            else:
                self.data.append(data[index])
            if self.target_transform:
                self.targets.append(self.target_transform(targets[index]))
    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        return self.data[index], self.targets[index]
selfnet_cnn.py
import torch
import torch.nn as nn


class SelfCnn(nn.Module):
    def __init__(self):
        super(SelfCnn, self).__init__()
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (32,32,64)
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (16,16,64)
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),  # (8,8,64)

        )
        self.classifier = nn.Sequential(
            nn.Linear(8 * 8 * 64, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(256, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(256, 15)  # output layer: 15 classes
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # flatten the feature maps
        x = self.classifier(x)
        return x
train_self_cnn.py
import pickle
import time
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from torch import nn
from torch.utils.data import DataLoader
import dataloader
import torch
import torchvision
import torchvision.transforms as transforms
from tqdm import tqdm
import os

from selfnet_cnn import SelfCnn

os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# Data transforms
transform_data = transforms.Compose([
    torchvision.transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])
transform_target = None
with open(r"D:\zr\data\chn_mnist\chn_mnist", "rb") as f:
    dataset = pickle.load(f)
images = dataset["images"]
targets = dataset["targets"]
index = np.where(targets == 100)
targets[index] = 11
index = np.where(targets == 1000)
targets[index] = 12
index = np.where(targets == 10000)
targets[index] = 13
index = np.where(targets == 100000000)
targets[index] = 14

train_dataset = dataloader.MyDataset(images[:14000, :, :], targets[:14000], transform_data, transform_target)
test_dataset = dataloader.MyDataset(images[-1000:, :, :], targets[-1000:], transform_data, transform_target)

# Hyperparameters
batch_size = 32
learning_rate = 0.001
num_epochs = 100
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model=SelfCnn()
# model = torch.load(r'D:\zr\projects\utils\chn_mnist_resnet50.pth', map_location=device)
model.to(device)
# Loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), momentum=0.9, lr=learning_rate)
def plt_img(plt_data):
    # Gather the recorded data points
    plt.clf()
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    # Plot the curves
    plt.plot(x, train_acc, label='train_acc')
    plt.plot(x, test_acc, label='test_acc')
    plt.plot(x, train_loss, label='train_loss')
    plt.plot(x, test_loss, label='test_loss')
    plt.legend(title='Accuracy And Loss')  # legend title
    plt.xlabel('epoch')
    # plt.ylabel('rate')
    plt.savefig(f'selfCnn_{num_epochs}_{batch_size}_{learning_rate}_1.png')
    # the figure is saved to disk rather than displayed
def plt_acc_loss(plt_data):
    plt.clf()
    _, axes = plt.subplots(2, 1)
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    axes[0].plot(x, train_acc, label='train_acc')
    axes[0].plot(x, test_acc, label='test_acc')
    axes[0].legend(title='Accuracy')  # legend title
    axes[0].set_xlabel('epoch')
    # axes[0].set_ylabel('rate')
    axes[1].plot(x, train_loss, label='train_loss')
    axes[1].plot(x, test_loss, label='test_loss')
    axes[1].legend(title='Loss')
    axes[1].set_xlabel('epoch')
    # axes[1].set_ylabel('rate')
    # keep the labels from being clipped
    plt.tight_layout()
    plt.savefig(f'selfCnn_{num_epochs}_{batch_size}_{learning_rate}_2.png')
# Train the model
max_acc = 0.0
plt_data = {
    'Epoch': [],
    'train_acc': [],
    'train_loss': [],
    'test_acc': [],
    'test_loss': [],

}

for epoch in range(num_epochs):
    plt_data.get('Epoch').append(epoch + 1)
    model.eval()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect; disable gradients explicitly for evaluation
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    loop = tqdm(enumerate(test_loader), total=len(test_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        loop.set_description(f'Epoch Test [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    if epoch == 0:
        print('Initial model performance on the test set:')
    acc = correct / total
    loss_ = loss_ / len(test_loader)
    plt_data.get('test_acc').append(acc)
    plt_data.get('test_loss').append(loss_)
    print(f"Accuracy on test images: {acc * 100}% , Loss:  {loss_}")
    if acc > max_acc:
        max_acc = acc
        torch.save(model, 'chn_mnist_selfCnn.pth')
        print('The model has been saved as chn_mnist_selfCnn.pth')
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    time.sleep(0.1)
    model.train()
    torch.set_grad_enabled(True)  # switch back to training mode and re-enable gradients
    loop = tqdm(enumerate(train_loader), total=len(train_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loop.set_description(f'Epoch Train [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    acc = correct / total
    loss_ = loss_ / len(train_loader)
    plt_data.get('train_acc').append(acc)
    plt_data.get('train_loss').append(loss_)
    print(f"Accuracy on train images: {acc * 100}% , Loss:  {loss_}")
    time.sleep(0.1)
    plt_img(plt_data)
    plt_acc_loss(plt_data)
train_vgg16.py
import pickle
import time
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from torch import nn
from torch.utils.data import DataLoader
import dataloader
import torch
import torchvision
import torchvision.transforms as transforms
from tqdm import tqdm
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# Data transforms
transform_data = transforms.Compose([
    torchvision.transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])
transform_target = None
with open(r"D:\zr\data\chn_mnist\chn_mnist", "rb") as f:
    dataset = pickle.load(f)
images = dataset["images"]
targets = dataset["targets"]
index = np.where(targets == 100)
targets[index] = 11
index = np.where(targets == 1000)
targets[index] = 12
index = np.where(targets == 10000)
targets[index] = 13
index = np.where(targets == 100000000)
targets[index] = 14

train_dataset = dataloader.MyDataset(images[:14000, :, :], targets[:14000], transform_data, transform_target)
test_dataset = dataloader.MyDataset(images[-1000:, :, :], targets[-1000:], transform_data, transform_target)

# Hyperparameters
batch_size = 32
learning_rate = 0.001
num_epochs = 50
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

vgg16_ture = torchvision.models.vgg16(pretrained = True)
vgg16_ture.classifier.append(nn.Linear(1000,15))
vgg16_ture.classifier[0]=nn.Linear(2*2*512,4096)
vgg16_ture.features[0]=nn.Conv2d(1, 64, kernel_size=3, padding=1)
vgg16_ture.avgpool=nn.AdaptiveAvgPool2d((2,2))
model=vgg16_ture
# model = torch.load(r'D:\zr\projects\utils\chn_mnist_resnet50.pth', map_location=device)
model.to(device)
# Loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), momentum=0.9, lr=learning_rate)
def plt_img(plt_data):
    # Gather the recorded data points
    plt.clf()
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    # Plot the curves
    plt.plot(x, train_acc, label='train_acc')
    plt.plot(x, test_acc, label='test_acc')
    plt.plot(x, train_loss, label='train_loss')
    plt.plot(x, test_loss, label='test_loss')
    plt.legend(title='Accuracy And Loss')  # legend title
    plt.xlabel('epoch')
    # plt.ylabel('rate')
    plt.savefig(f'vgg16_{num_epochs}_{batch_size}_{learning_rate}_1.png')
    # the figure is saved to disk rather than displayed
def plt_acc_loss(plt_data):
    plt.clf()
    _, axes = plt.subplots(2, 1)
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    axes[0].plot(x, train_acc, label='train_acc')
    axes[0].plot(x, test_acc, label='test_acc')
    axes[0].legend(title='Accuracy')  # legend title
    axes[0].set_xlabel('epoch')
    # axes[0].set_ylabel('rate')
    axes[1].plot(x, train_loss, label='train_loss')
    axes[1].plot(x, test_loss, label='test_loss')
    axes[1].legend(title='Loss')
    axes[1].set_xlabel('epoch')
    # axes[1].set_ylabel('rate')
    # keep the labels from being clipped
    plt.tight_layout()
    plt.savefig(f'vgg16_{num_epochs}_{batch_size}_{learning_rate}_2.png')
# Train the model
max_acc = 0.0
plt_data = {
    'Epoch': [],
    'train_acc': [],
    'train_loss': [],
    'test_acc': [],
    'test_loss': [],

}

for epoch in range(num_epochs):
    plt_data.get('Epoch').append(epoch + 1)
    model.eval()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect; disable gradients explicitly for evaluation
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    loop = tqdm(enumerate(test_loader), total=len(test_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        loop.set_description(f'Epoch Test [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    if epoch == 0:
        print('Initial model performance on the test set:')
    acc = correct / total
    loss_ = loss_ / len(test_loader)
    plt_data.get('test_acc').append(acc)
    plt_data.get('test_loss').append(loss_)
    print(f"Accuracy on test images: {acc * 100}% , Loss:  {loss_}")
    if acc > max_acc:
        max_acc = acc
        torch.save(model, 'chn_mnist_vgg16.pth')
        print('The model has been saved as chn_mnist_vgg16.pth')
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    time.sleep(0.1)
    model.train()
    torch.set_grad_enabled(True)  # switch back to training mode and re-enable gradients
    loop = tqdm(enumerate(train_loader), total=len(train_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loop.set_description(f'Epoch Train [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    acc = correct / total
    loss_ = loss_ / len(train_loader)
    plt_data.get('train_acc').append(acc)
    plt_data.get('train_loss').append(loss_)
    print(f"Accuracy on train images: {acc * 100}% , Loss:  {loss_}")
    time.sleep(0.1)
    plt_img(plt_data)
    plt_acc_loss(plt_data)
train_resnet50.py
import pickle
import time
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from torch import nn
from torch.utils.data import DataLoader
import dataloader
import torch
import torchvision
import torchvision.transforms as transforms
from tqdm import tqdm
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# Data transforms
transform_data = transforms.Compose([
    torchvision.transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])
transform_target = None
with open(r"D:\zr\data\chn_mnist\chn_mnist", "rb") as f:
    dataset = pickle.load(f)
images = dataset["images"]
targets = dataset["targets"]
index = np.where(targets == 100)
targets[index] = 11
index = np.where(targets == 1000)
targets[index] = 12
index = np.where(targets == 10000)
targets[index] = 13
index = np.where(targets == 100000000)
targets[index] = 14

train_dataset = dataloader.MyDataset(images[:14000, :, :], targets[:14000], transform_data, transform_target)
test_dataset = dataloader.MyDataset(images[-1000:, :, :], targets[-1000:], transform_data, transform_target)

# Hyperparameters
batch_size = 32
learning_rate = 0.001
num_epochs = 50
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

resnet50 = torchvision.models.resnet50(pretrained=True)
# print(resnet50)
resnet50.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3)
resnet50.fc = (nn.Linear(2048, 15))
# resnet50.add_module('add',nn.Linear(1000,15))
model=resnet50
# model = torch.load(r'D:\zr\projects\utils\chn_mnist_resnet50.pth', map_location=device)
model.to(device)
# Loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), momentum=0.9, lr=learning_rate)
def plt_img(plt_data):
    # Gather the recorded data points
    plt.clf()
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    # Plot the curves
    plt.plot(x, train_acc, label='train_acc')
    plt.plot(x, test_acc, label='test_acc')
    plt.plot(x, train_loss, label='train_loss')
    plt.plot(x, test_loss, label='test_loss')
    plt.legend(title='Accuracy And Loss')  # legend title
    plt.xlabel('epoch')
    # plt.ylabel('rate')
    plt.savefig(f'resnet50_{num_epochs}_{batch_size}_{learning_rate}_1.png')
    # the figure is saved to disk rather than displayed
def plt_acc_loss(plt_data):
    plt.clf()
    _, axes = plt.subplots(2, 1)
    x = plt_data.get('Epoch')
    train_acc = plt_data.get('train_acc')
    train_loss = plt_data.get('train_loss')
    test_acc = plt_data.get('test_acc')
    test_loss = plt_data.get('test_loss')
    axes[0].plot(x, train_acc, label='train_acc')
    axes[0].plot(x, test_acc, label='test_acc')
    axes[0].legend(title='Accuracy')  # legend title
    axes[0].set_xlabel('epoch')
    # axes[0].set_ylabel('rate')
    axes[1].plot(x, train_loss, label='train_loss')
    axes[1].plot(x, test_loss, label='test_loss')
    axes[1].legend(title='Loss')
    axes[1].set_xlabel('epoch')
    # axes[1].set_ylabel('rate')
    # keep the labels from being clipped
    plt.tight_layout()
    plt.savefig(f'resnet50_{num_epochs}_{batch_size}_{learning_rate}_2.png')
# Train the model
max_acc = 0.0
plt_data = {
    'Epoch': [],
    'train_acc': [],
    'train_loss': [],
    'test_acc': [],
    'test_loss': [],

}

for epoch in range(num_epochs):
    plt_data.get('Epoch').append(epoch + 1)
    model.eval()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect; disable gradients explicitly for evaluation
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    loop = tqdm(enumerate(test_loader), total=len(test_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        loop.set_description(f'Epoch Test [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    if epoch == 0:
        print('Initial model performance on the test set:')
    acc = correct / total
    loss_ = loss_ / len(test_loader)
    plt_data.get('test_acc').append(acc)
    plt_data.get('test_loss').append(loss_)
    print(f"Accuracy on test images: {acc * 100}% , Loss:  {loss_}")
    if acc > max_acc:
        max_acc = acc
        torch.save(model, 'chn_mnist_resnet50.pth')
        print('The model has been saved as chn_mnist_resnet50.pth')
    correct = 0.0
    total = 0.0
    loss_ = 0.0
    time.sleep(0.1)
    model.train()
    torch.set_grad_enabled(True)  # switch back to training mode and re-enable gradients
    loop = tqdm(enumerate(train_loader), total=len(train_loader))
    for i, (images, labels) in loop:
        images = images.to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_ += loss.item()
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        acc = correct / total
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loop.set_description(f'Epoch Train [{epoch + 1}/{num_epochs}]')
        loop.set_postfix(loss=loss_/(i+1), acc=acc)
    acc = correct / total
    loss_ = loss_ / len(train_loader)
    plt_data.get('train_acc').append(acc)
    plt_data.get('train_loss').append(loss_)
    print(f"Accuracy on train images: {acc * 100}% , Loss:  {loss_}")
    time.sleep(0.1)
    plt_img(plt_data)
    plt_acc_loss(plt_data)
