J6 - Implementing the ResNeXt50 Model

  • 🍨 This post is a learning-record entry for the 🔗365天深度学习训练营 (365-day deep learning camp)
  • 🍖 Original author: K同学啊 | tutoring and custom projects available

Contents

  • Environment
  • Model design
      • Construction process
      • Printing the model structure
      • Printing the parameter count
  • Training process and results
  • Summary


Environment

  • OS: Linux
  • Language: Python 3.8.10
  • Deep learning framework: PyTorch 2.0.0+cu118
  • GPU: RTX 2080 Ti

Model design

The full code appears in earlier chapters; only the model design is shown below.

Block comparison

Comparing the ResNet Block with the ResNeXt Block, the most important change is that the 3×3 bottleneck convolution becomes a grouped convolution.
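
To make the change concrete, here is a minimal sketch of my own (not from the original post) comparing the parameter counts of a dense and a grouped 3×3 convolution at the channel width used in the first stack:

import torch.nn as nn

dense = nn.Conv2d(128, 128, 3, padding=1, bias=False)               # ordinary 3x3 convolution
grouped = nn.Conv2d(128, 128, 3, padding=1, groups=32, bias=False)  # the ResNeXt variant

print(sum(p.numel() for p in dense.parameters()))    # 147456 = 128*128*3*3
print(sum(p.numel() for p in grouped.parameters()))  # 4608   = 147456 / 32

With groups=32, each output channel connects to only 128/32 = 4 input channels, so the weight tensor shrinks by a factor of 32 while the input/output widths stay the same.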

Construction process

  1. Create the Block
import torch.nn as nn


class Block(nn.Module):
    """ResNeXt bottleneck block: 1x1 conv -> 3x3 grouped conv -> 1x1 conv,
    with a projection shortcut when conv_shortcut=True."""
    def __init__(self, input_size, hidden_size, strides=1, groups=32, conv_shortcut=True):
        super().__init__()

        # Projection shortcut: matches the channel count (and stride) of the main path.
        if conv_shortcut:
            self.start = nn.Sequential(
                nn.Conv2d(input_size, hidden_size * 2, 1, stride=strides, bias=False),
                nn.BatchNorm2d(hidden_size * 2, eps=1.001e-5)
            )
        else:
            self.start = nn.Identity()

        self.conv1 = nn.Conv2d(input_size, hidden_size, 1, padding='same', bias=False)
        self.bn1 = nn.BatchNorm2d(hidden_size, eps=1.001e-5)
        self.relu1 = nn.ReLU()

        # The grouped 3x3 convolution is the defining change of ResNeXt.
        self.conv2 = nn.Conv2d(hidden_size, hidden_size, 3, padding='same', groups=groups, bias=False)
        self.bn2 = nn.BatchNorm2d(hidden_size, eps=1.001e-5)
        self.relu2 = nn.ReLU()

        # Downsampling (if any) happens in this final 1x1 convolution.
        self.conv3 = nn.Conv2d(hidden_size, hidden_size * 2, 1, stride=strides, bias=False)
        self.bn3 = nn.BatchNorm2d(hidden_size * 2, eps=1.001e-5)
        self.relu3 = nn.ReLU()

    def forward(self, inputs):
        short = self.start(inputs)

        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.relu1(x)

        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu2(x)

        x = self.conv3(x)
        x = self.bn3(x)
        x = self.relu3(x)

        # Residual connection (note: the ReLU comes before the add here,
        # unlike the standard ResNet ordering).
        x = x + short
        return x
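
A quick shape check of my own (assuming the class above is defined): with a projection shortcut and strides=1, the Block doubles the channel count and preserves the spatial size.

import torch

block = Block(64, 128, strides=1, groups=32, conv_shortcut=True)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 256, 56, 56])
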
  2. Create the Stack
class Stack(nn.Module):
    """One stage of the network: a projection-shortcut Block followed by
    `blocks` identity-shortcut Blocks (blocks + 1 Blocks in total)."""
    def __init__(self, input_size, hidden_size, blocks, strides, groups=32):
        super().__init__()

        self.layers = nn.Sequential()
        self.layers.add_module('first', Block(input_size, hidden_size, strides=strides, groups=groups))
        for i in range(blocks):
            self.layers.add_module('layer%d' % (i + 1), Block(hidden_size * 2, hidden_size, groups=groups, conv_shortcut=False))

    def forward(self, inputs):
        x = self.layers(inputs)
        return x
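
A Stack therefore holds blocks + 1 Blocks in total. A quick check of my own (run after the classes above are defined):

import torch

stack = Stack(64, 128, blocks=2, strides=1)
print(len(stack.layers))                        # 3, matching stage 1 of ResNeXt50 (3, 4, 6, 3)
print(stack(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 256, 56, 56])
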
  3. Create the model
class ResNeXt50(nn.Module):
    def __init__(self, num_classes):
        super().__init__()

        # Stem: 7x7/2 convolution followed by 3x3/2 max pooling.
        self.pre = nn.Sequential(
            nn.ZeroPad2d(3),
            nn.Conv2d(3, 64, 7, stride=2),
            nn.BatchNorm2d(64, eps=1.001e-5),
            nn.ReLU(),
            nn.ZeroPad2d(1),
            nn.MaxPool2d(3, stride=2),
        )

        # Four stages with 3, 4, 6, 3 Blocks respectively (blocks + 1 each).
        self.stack1 = Stack(64, 128, blocks=2, strides=1)
        self.stack2 = Stack(256, 256, blocks=3, strides=2)
        self.stack3 = Stack(512, 512, blocks=5, strides=2)
        self.stack4 = Stack(1024, 1024, blocks=2, strides=2)

        # 5x5 adaptive pooling (rather than the usual global 1x1 pooling)
        # feeds a 5*5*2048-wide classifier.
        self.avg = nn.AdaptiveAvgPool2d(5)
        self.classifier = nn.Linear(5 * 5 * 2048, num_classes)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, inputs):
        x = self.pre(inputs)
        x = self.stack1(x)
        x = self.stack2(x)
        x = self.stack3(x)
        x = self.stack4(x)
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        x = self.softmax(x)
        return x
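
A forward-pass smoke test of my own (device handling omitted; assumes the classes above are defined):

import torch

model = ResNeXt50(num_classes=2)
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 2])
print(out.sum())  # ≈ 1.0, since the forward pass ends in Softmax

One caveat, flagged as an assumption since the training code lives in earlier chapters: if that code uses nn.CrossEntropyLoss, which applies log-softmax internally, the final nn.Softmax here should be removed to avoid applying softmax twice.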

Printing the model structure

model = ResNeXt50(2).to(device)
model
ResNeXt50(
  (pre): Sequential(
    (0): ZeroPad2d((3, 3, 3, 3))
    (1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2))
    (2): BatchNorm2d(64, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): ReLU()
    (4): ZeroPad2d((1, 1, 1, 1))
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (stack1): Stack(
    (layers): Sequential(
      (first): Block(
        (start): Sequential(
          (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (1): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer1): Block(
        (start): Identity()
        (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer2): Block(
        (start): Identity()
        (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(128, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
    )
  )
  (stack2): Stack(
    (layers): Sequential(
      (first): Block(
        (start): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (conv1): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn3): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer1): Block(
        (start): Identity()
        (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer2): Block(
        (start): Identity()
        (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer3): Block(
        (start): Identity()
        (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(256, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
    )
  )
  (stack3): Stack(
    (layers): Sequential(
      (first): Block(
        (start): Sequential(
          (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (conv1): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer1): Block(
        (start): Identity()
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer2): Block(
        (start): Identity()
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer3): Block(
        (start): Identity()
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer4): Block(
        (start): Identity()
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer5): Block(
        (start): Identity()
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(512, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
    )
  )
  (stack4): Stack(
    (layers): Sequential(
      (first): Block(
        (start): Sequential(
          (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(2048, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (conv1): Conv2d(1024, 1024, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn3): BatchNorm2d(2048, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer1): Block(
        (start): Identity()
        (conv1): Conv2d(2048, 1024, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(2048, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
      (layer2): Block(
        (start): Identity()
        (conv1): Conv2d(2048, 1024, kernel_size=(1, 1), stride=(1, 1), padding=same, bias=False)
        (bn1): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU()
        (conv2): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=same, groups=32, bias=False)
        (bn2): BatchNorm2d(1024, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU()
        (conv3): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(2048, eps=1.001e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu3): ReLU()
      )
    )
  )
  (avg): AdaptiveAvgPool2d(output_size=5)
  (classifier): Linear(in_features=51200, out_features=2, bias=True)
  (softmax): Softmax(dim=1)
)

Printing the parameter count

from torchinfo import summary  # assumed: the table format below matches torchinfo's output

summary(model, input_size=(32, 3, 224, 224))
===============================================================================================
Layer (type:depth-idx)                        Output Shape              Param #
===============================================================================================
ResNeXt50                                     [32, 2]                   --
├─Sequential: 1-1                             [32, 64, 56, 56]          --
│    └─ZeroPad2d: 2-1                         [32, 3, 230, 230]         --
│    └─Conv2d: 2-2                            [32, 64, 112, 112]        9,472
│    └─BatchNorm2d: 2-3                       [32, 64, 112, 112]        128
│    └─ReLU: 2-4                              [32, 64, 112, 112]        --
│    └─ZeroPad2d: 2-5                         [32, 64, 114, 114]        --
│    └─MaxPool2d: 2-6                         [32, 64, 56, 56]          --
├─Stack: 1-2                                  [32, 256, 56, 56]         --
│    └─Sequential: 2-7                        [32, 256, 56, 56]         --
│    │    └─Block: 3-1                        [32, 256, 56, 56]         63,488
│    │    └─Block: 3-2                        [32, 256, 56, 56]         71,168
│    │    └─Block: 3-3                        [32, 256, 56, 56]         71,168
├─Stack: 1-3                                  [32, 512, 28, 28]         --
│    └─Sequential: 2-8                        [32, 512, 28, 28]         --
│    │    └─Block: 3-4                        [32, 512, 28, 28]         349,184
│    │    └─Block: 3-5                        [32, 512, 28, 28]         282,624
│    │    └─Block: 3-6                        [32, 512, 28, 28]         282,624
│    │    └─Block: 3-7                        [32, 512, 28, 28]         282,624
├─Stack: 1-4                                  [32, 1024, 14, 14]        --
│    └─Sequential: 2-9                        [32, 1024, 14, 14]        --
│    │    └─Block: 3-8                        [32, 1024, 14, 14]        1,390,592
│    │    └─Block: 3-9                        [32, 1024, 14, 14]        1,126,400
│    │    └─Block: 3-10                       [32, 1024, 14, 14]        1,126,400
│    │    └─Block: 3-11                       [32, 1024, 14, 14]        1,126,400
│    │    └─Block: 3-12                       [32, 1024, 14, 14]        1,126,400
│    │    └─Block: 3-13                       [32, 1024, 14, 14]        1,126,400
├─Stack: 1-5                                  [32, 2048, 7, 7]          --
│    └─Sequential: 2-10                       [32, 2048, 7, 7]          --
│    │    └─Block: 3-14                       [32, 2048, 7, 7]          5,550,080
│    │    └─Block: 3-15                       [32, 2048, 7, 7]          4,497,408
│    │    └─Block: 3-16                       [32, 2048, 7, 7]          4,497,408
├─AdaptiveAvgPool2d: 1-6                      [32, 2048, 5, 5]          --
├─Linear: 1-7                                 [32, 2]                   102,402
├─Softmax: 1-8                                [32, 2]                   --
===============================================================================================
Total params: 23,082,370
Trainable params: 23,082,370
Non-trainable params: 0
Total mult-adds (G): 139.50
===============================================================================================
Input size (MB): 19.27
Forward/backward pass size (MB): 7912.56
Params size (MB): 92.33
Estimated Total Size (MB): 8024.15
===============================================================================================
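
The torchinfo total can be cross-checked directly (a sanity check of my own, not from the original post):

print(sum(p.numel() for p in model.parameters()))  # 23082370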

Training process and results

Epoch: 1, TrainLoss: 0.653, TrainAcc: 63.2, TestLoss: 0.611, TestAcc: 66.7, Lr: 1.00e-05
Epoch: 2, TrainLoss: 0.573, TrainAcc: 73.4, TestLoss: 0.578, TestAcc: 74.1, Lr: 1.00e-05
Epoch: 3, TrainLoss: 0.519, TrainAcc: 81.7, TestLoss: 0.557, TestAcc: 75.8, Lr: 1.00e-05
Epoch: 4, TrainLoss: 0.483, TrainAcc: 86.4, TestLoss: 0.553, TestAcc: 78.6, Lr: 1.00e-05
Epoch: 5, TrainLoss: 0.453, TrainAcc: 89.5, TestLoss: 0.536, TestAcc: 78.8, Lr: 1.00e-05
Epoch: 6, TrainLoss: 0.428, TrainAcc: 93.2, TestLoss: 0.538, TestAcc: 79.3, Lr: 1.00e-05
Epoch: 7, TrainLoss: 0.413, TrainAcc: 94.4, TestLoss: 0.516, TestAcc: 78.8, Lr: 1.00e-05
Epoch: 8, TrainLoss: 0.402, TrainAcc: 95.0, TestLoss: 0.506, TestAcc: 81.8, Lr: 1.00e-05
Epoch: 9, TrainLoss: 0.389, TrainAcc: 96.0, TestLoss: 0.506, TestAcc: 81.8, Lr: 1.00e-05
Epoch: 10, TrainLoss: 0.375, TrainAcc: 96.9, TestLoss: 0.493, TestAcc: 82.5, Lr: 1.00e-05
Epoch: 11, TrainLoss: 0.364, TrainAcc: 98.0, TestLoss: 0.488, TestAcc: 83.2, Lr: 1.00e-05
Epoch: 12, TrainLoss: 0.358, TrainAcc: 98.2, TestLoss: 0.485, TestAcc: 84.1, Lr: 1.00e-05
Epoch: 13, TrainLoss: 0.356, TrainAcc: 98.1, TestLoss: 0.477, TestAcc: 83.7, Lr: 1.00e-05
Epoch: 14, TrainLoss: 0.351, TrainAcc: 98.7, TestLoss: 0.473, TestAcc: 85.8, Lr: 1.00e-05
Epoch: 15, TrainLoss: 0.346, TrainAcc: 98.9, TestLoss: 0.471, TestAcc: 84.4, Lr: 1.00e-05
Epoch: 16, TrainLoss: 0.342, TrainAcc: 99.1, TestLoss: 0.470, TestAcc: 86.0, Lr: 1.00e-05
Epoch: 17, TrainLoss: 0.337, TrainAcc: 99.2, TestLoss: 0.465, TestAcc: 86.2, Lr: 1.00e-05
Epoch: 18, TrainLoss: 0.335, TrainAcc: 99.4, TestLoss: 0.459, TestAcc: 86.7, Lr: 1.00e-05
Epoch: 19, TrainLoss: 0.334, TrainAcc: 99.2, TestLoss: 0.465, TestAcc: 84.4, Lr: 1.00e-05
Epoch: 20, TrainLoss: 0.329, TrainAcc: 99.5, TestLoss: 0.470, TestAcc: 84.4, Lr: 1.00e-05
Epoch: 21, TrainLoss: 0.331, TrainAcc: 99.2, TestLoss: 0.458, TestAcc: 86.9, Lr: 1.00e-05
Epoch: 22, TrainLoss: 0.326, TrainAcc: 99.6, TestLoss: 0.458, TestAcc: 86.5, Lr: 1.00e-05
Epoch: 23, TrainLoss: 0.325, TrainAcc: 99.6, TestLoss: 0.462, TestAcc: 84.8, Lr: 1.00e-05
Epoch: 24, TrainLoss: 0.325, TrainAcc: 99.5, TestLoss: 0.461, TestAcc: 85.8, Lr: 1.00e-05
Epoch: 25, TrainLoss: 0.323, TrainAcc: 99.8, TestLoss: 0.465, TestAcc: 84.6, Lr: 1.00e-05
Epoch: 26, TrainLoss: 0.324, TrainAcc: 99.7, TestLoss: 0.458, TestAcc: 86.2, Lr: 1.00e-05
Epoch: 27, TrainLoss: 0.321, TrainAcc: 99.9, TestLoss: 0.468, TestAcc: 83.4, Lr: 1.00e-05
Epoch: 28, TrainLoss: 0.319, TrainAcc: 99.9, TestLoss: 0.453, TestAcc: 86.2, Lr: 1.00e-05
Epoch: 29, TrainLoss: 0.320, TrainAcc: 99.8, TestLoss: 0.459, TestAcc: 85.3, Lr: 1.00e-05
Epoch: 30, TrainLoss: 0.318, TrainAcc: 99.8, TestLoss: 0.459, TestAcc: 85.5, Lr: 1.00e-05
Epoch: 31, TrainLoss: 0.318, TrainAcc: 99.9, TestLoss: 0.460, TestAcc: 85.3, Lr: 1.00e-05
Epoch: 32, TrainLoss: 0.318, TrainAcc: 100.0, TestLoss: 0.459, TestAcc: 83.9, Lr: 1.00e-05
Epoch: 33, TrainLoss: 0.318, TrainAcc: 99.9, TestLoss: 0.448, TestAcc: 88.3, Lr: 1.00e-05
Epoch: 34, TrainLoss: 0.318, TrainAcc: 99.9, TestLoss: 0.454, TestAcc: 85.5, Lr: 1.00e-05
Epoch: 35, TrainLoss: 0.317, TrainAcc: 99.9, TestLoss: 0.451, TestAcc: 86.5, Lr: 1.00e-05
Epoch: 36, TrainLoss: 0.317, TrainAcc: 99.9, TestLoss: 0.448, TestAcc: 86.7, Lr: 1.00e-05
Epoch: 37, TrainLoss: 0.318, TrainAcc: 99.8, TestLoss: 0.449, TestAcc: 86.7, Lr: 1.00e-05
Epoch: 38, TrainLoss: 0.316, TrainAcc: 100.0, TestLoss: 0.441, TestAcc: 87.2, Lr: 1.00e-05
Epoch: 39, TrainLoss: 0.316, TrainAcc: 99.9, TestLoss: 0.452, TestAcc: 86.0, Lr: 1.00e-05
Epoch: 40, TrainLoss: 0.317, TrainAcc: 99.9, TestLoss: 0.454, TestAcc: 85.8, Lr: 1.00e-05
done, best acc: 88.3

Training curves

Summary

Judging from the results, the model overfits severely. With this many parameters the training accuracy climbs essentially to 100%, but that is likely just the small dataset being memorized; the test accuracy peaks at only 88.3%, which is not outstanding.
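
Two standard mitigations would be worth trying: stronger data augmentation and weight decay. A minimal sketch of my own (not from the original training code; the transform would replace whatever preprocessing the earlier chapters use, and lr matches the 1e-5 shown in the logs):

import torch
from torchvision import transforms

# Random crops and flips effectively enlarge the small training set.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Weight decay penalizes large weights, another brake on memorization.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=1e-4)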
