昇思MindSpore学习笔记6-01LLM原理和实践--FCN图像语义分割

摘要:

        记录MindSpore AI框架使用FCN全卷积网络理解图像进行图像语议分割的过程、步骤和方法。包括环境准备、下载数据集、数据集加载和预处理、构建网络、训练准备、模型训练、模型评估、模型推理等。

一、

1.语义分割

图像语义分割

semantic segmentation

        图像处理

        机器视觉

                图像理解

        AI领域重要分支

        应用

                人脸识别

                物体检测

                医学影像

                卫星图像分析

                自动驾驶感知

        目的

                图像每个像素点分类

                输出与输入大小相同的图像

                输出图像的每个像素对应了输入图像每个像素的类别

        图像领域语义

                图像的内容

                对图片意思的理解

实例

2.FCN全卷积网络

Fully Convolutional Networks

图像语义分割框架

        2015年UC Berkeley提出

        端到端(end to end)像素级(pixel level)预测全卷积网络

全卷积神经网络主要使用三种技术:

1.卷积化Convolutional

VGG-16

        FCN的backbone

        输入224*224RGB图像

                固定大小的输入

                丢弃了空间坐标

                产生非空间输出

        输出1000个预测值

卷积层

        输出二维矩阵

        生成输入图片映射的heatmap

2.上采样Upsample

卷积过程

        卷积操作

        池化操作

特征图尺寸变小

上采样操作

        得到原图大小的稠密图像预测

双线性插值参数

初始化上采样逆卷积参数

反向传播学习非线性上采样

3.跳跃结构Skip Layer

将深层的全局信息与浅层的局部信息相结合

                             底层stride 32的预测FCN-32s    2倍上采样

融合(相加)  pool4层stride 16的预测FCN-16s    2倍上采样

融合(相加)  pool3层stride 8的预测FCN-8s

特点:

(1)不含全连接层(fc)的全卷积(fully conv)网络,可适应任意尺寸输入。

(2)增大数据尺寸的反卷积(deconv)层,能够输出精细的结果。

(3)结合不同深度层结果的跳级(skip)结构,同时确保鲁棒性和精确性。

二、环境准备

%%capture captured_output
# 实验环境已经预装了mindspore==2.2.14,如需更换mindspore版本,可更改下面mindspore的版本号
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
# 查看当前 mindspore 版本
!pip show mindspore

输出:

Name: mindspore
Version: 2.2.14
Summary: MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Home-page: https://www.mindspore.cn
Author: The MindSpore Authors
Author-email: contact@mindspore.cn
License: Apache 2.0
Location: /home/nginx/miniconda/envs/jupyter/lib/python3.9/site-packages
Requires: asttokens, astunparse, numpy, packaging, pillow, protobuf, psutil, scipy
Required-by: 

三、数据处理

1.下载数据集

from download import download
​
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/dataset_fcn8s.tar"
​
download(url, "./dataset", kind="tar", replace=True)

输出:

Creating data folder...
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/dataset_fcn8s.tar (537.2 MB)

file_sizes: 100%|█████████████████████████████| 563M/563M [00:03<00:00, 160MB/s]
Extracting tar file...
Successfully downloaded / unzipped to ./dataset
'./dataset'

2.数据预处理

PASCAL VOC 2012数据集图像分辨率不一致

        标准化处理

3.数据加载

混合PASCAL VOC 2012数据集SDB数据集

import numpy as np
import cv2
import mindspore.dataset as ds
​
class SegDataset:
    def __init__(self,
                 image_mean,
                 image_std,
                 data_file='',
                 batch_size=32,
                 crop_size=512,
                 max_scale=2.0,
                 min_scale=0.5,
                 ignore_label=255,
                 num_classes=21,
                 num_readers=2,
                 num_parallel_calls=4):
​
        self.data_file = data_file
        self.batch_size = batch_size
        self.crop_size = crop_size
        self.image_mean = np.array(image_mean, dtype=np.float32)
        self.image_std = np.array(image_std, dtype=np.float32)
        self.max_scale = max_scale
        self.min_scale = min_scale
        self.ignore_label = ignore_label
        self.num_classes = num_classes
        self.num_readers = num_readers
        self.num_parallel_calls = num_parallel_calls
        max_scale > min_scale
​
    def preprocess_dataset(self, image, label):
        image_out = cv2.imdecode(np.frombuffer(image, dtype=np.uint8), cv2.IMREAD_COLOR)
        label_out = cv2.imdecode(np.frombuffer(label, dtype=np.uint8), cv2.IMREAD_GRAYSCALE)
        sc = np.random.uniform(self.min_scale, self.max_scale)
        new_h, new_w = int(sc * image_out.shape[0]), int(sc * image_out.shape[1])
        image_out = cv2.resize(image_out, (new_w, new_h), interpolation=cv2.INTER_CUBIC)
        label_out = cv2.resize(label_out, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
​
        image_out = (image_out - self.image_mean) / self.image_std
        out_h, out_w = max(new_h, self.crop_size), max(new_w, self.crop_size)
        pad_h, pad_w = out_h - new_h, out_w - new_w
        if pad_h > 0 or pad_w > 0:
            image_out = cv2.copyMakeBorder(image_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=0)
            label_out = cv2.copyMakeBorder(label_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=self.ignore_label)
        offset_h = np.random.randint(0, out_h - self.crop_size + 1)
        offset_w = np.random.randint(0, out_w - self.crop_size + 1)
        image_out = image_out[offset_h: offset_h + self.crop_size, offset_w: offset_w + self.crop_size, :]
        label_out = label_out[offset_h: offset_h + self.crop_size, offset_w: offset_w+self.crop_size]
        if np.random.uniform(0.0, 1.0) > 0.5:
            image_out = image_out[:, ::-1, :]
            label_out = label_out[:, ::-1]
        image_out = image_out.transpose((2, 0, 1))
        image_out = image_out.copy()
        label_out = label_out.copy()
        label_out = label_out.astype("int32")
        return image_out, label_out
​
    def get_dataset(self):
        ds.config.set_numa_enable(True)
        dataset = ds.MindDataset(self.data_file, columns_list=["data", "label"],
                                 shuffle=True, num_parallel_workers=self.num_readers)
        transforms_list = self.preprocess_dataset
        dataset = dataset.map(operations=transforms_list, input_columns=["data", "label"],
                              output_columns=["data", "label"],
                              num_parallel_workers=self.num_parallel_calls)
        dataset = dataset.shuffle(buffer_size=self.batch_size * 10)
        dataset = dataset.batch(self.batch_size, drop_remainder=True)
        return dataset
​
​
# 定义创建数据集的参数
IMAGE_MEAN = [103.53, 116.28, 123.675]
IMAGE_STD = [57.375, 57.120, 58.395]
DATA_FILE = "dataset/dataset_fcn8s/mindname.mindrecord"
​
# 定义模型训练参数
train_batch_size = 4
crop_size = 512
min_scale = 0.5
max_scale = 2.0
ignore_label = 255
num_classes = 21
​
# 实例化Dataset
dataset = SegDataset(image_mean=IMAGE_MEAN,
                     image_std=IMAGE_STD,
                     data_file=DATA_FILE,
                     batch_size=train_batch_size,
                     crop_size=crop_size,
                     max_scale=max_scale,
                     min_scale=min_scale,
                     ignore_label=ignore_label,
                     num_classes=num_classes,
                     num_readers=2,
                     num_parallel_calls=4)
​
dataset = dataset.get_dataset()

4.训练集可视化

import numpy as np
import matplotlib.pyplot as plt
​
plt.figure(figsize=(16, 8))
​
# 对训练集中的数据进行展示
for i in range(1, 9):
    plt.subplot(2, 4, i)
    show_data = next(dataset.create_dict_iterator())
    show_images = show_data["data"].asnumpy()
    show_images = np.clip(show_images, 0, 1)
# 将图片转换HWC格式后进行展示
    plt.imshow(show_images[0].transpose(1, 2, 0))
    plt.axis("off")
    plt.subplots_adjust(wspace=0.05, hspace=0)
plt.show()

输出:

四、网络构建

FCN网络流程

        输入图像image

        pool1池化

                尺寸变为原始尺寸的1/2

        pool2池化

                尺寸变为原始尺寸的1/4

        pool3池化

                尺寸变为原始尺寸的1/8

        pool4池化

                尺寸变为原始尺寸的1/16

        pool5池化

                尺寸变为原始尺寸的1/32

        conv6-7卷积

                输出尺寸原图的1/32

        FCN-32s

                反卷积扩大到原始尺寸

        FCN-16s

                融合

                        conv7反卷积尺寸扩大两倍至原图的1/16

                        pool4特征图

                反卷积扩大到原始尺寸

        FCN-8s

                融合

                        conv7反卷积尺寸扩大4倍

                        pool4特征图反卷积扩大2倍

                        pool3特征图

                反卷积扩大到原始尺寸

构建FCN-8s网络代码:

import mindspore.nn as nn
​
class FCN8s(nn.Cell):
    def __init__(self, n_class):
        super().__init__()
        self.n_class = n_class
        self.conv1 = nn.SequentialCell(
            nn.Conv2d(in_channels=3, out_channels=64,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(in_channels=64, out_channels=64,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(64),
            nn.ReLU()
        )
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.SequentialCell(
            nn.Conv2d(in_channels=64, out_channels=128,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.Conv2d(in_channels=128, out_channels=128,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(128),
            nn.ReLU()
        )
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv3 = nn.SequentialCell(
            nn.Conv2d(in_channels=128, out_channels=256,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(in_channels=256, out_channels=256,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(in_channels=256, out_channels=256,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(256),
            nn.ReLU()
        )
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv4 = nn.SequentialCell(
            nn.Conv2d(in_channels=256, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU()
        )
        self.pool4 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv5 = nn.SequentialCell(
            nn.Conv2d(in_channels=512, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512, out_channels=512,
                      kernel_size=3, weight_init='xavier_uniform'),
            nn.BatchNorm2d(512),
            nn.ReLU()
        )
        self.pool5 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv6 = nn.SequentialCell(
            nn.Conv2d(in_channels=512, out_channels=4096,
                      kernel_size=7, weight_init='xavier_uniform'),
            nn.BatchNorm2d(4096),
            nn.ReLU(),
        )
        self.conv7 = nn.SequentialCell(
            nn.Conv2d(in_channels=4096, out_channels=4096,
                      kernel_size=1, weight_init='xavier_uniform'),
            nn.BatchNorm2d(4096),
            nn.ReLU(),
        )
        self.score_fr = nn.Conv2d(in_channels=4096, out_channels=self.n_class,
                                  kernel_size=1, weight_init='xavier_uniform')
        self.upscore2 = nn.Conv2dTranspose(in_channels=self.n_class, out_channels=self.n_class,
                                           kernel_size=4, stride=2, weight_init='xavier_uniform')
        self.score_pool4 = nn.Conv2d(in_channels=512, out_channels=self.n_class,
                                     kernel_size=1, weight_init='xavier_uniform')
        self.upscore_pool4 = nn.Conv2dTranspose(in_channels=self.n_class, out_channels=self.n_class,
                                                kernel_size=4, stride=2, weight_init='xavier_uniform')
        self.score_pool3 = nn.Conv2d(in_channels=256, out_channels=self.n_class,
                                     kernel_size=1, weight_init='xavier_uniform')
        self.upscore8 = nn.Conv2dTranspose(in_channels=self.n_class, out_channels=self.n_class,
                                           kernel_size=16, stride=8, weight_init='xavier_uniform')
​
    def construct(self, x):
        x1 = self.conv1(x)
        p1 = self.pool1(x1)
        x2 = self.conv2(p1)
        p2 = self.pool2(x2)
        x3 = self.conv3(p2)
        p3 = self.pool3(x3)
        x4 = self.conv4(p3)
        p4 = self.pool4(x4)
        x5 = self.conv5(p4)
        p5 = self.pool5(x5)
        x6 = self.conv6(p5)
        x7 = self.conv7(x6)
        sf = self.score_fr(x7)
        u2 = self.upscore2(sf)
        s4 = self.score_pool4(p4)
        f4 = s4 + u2
        u4 = self.upscore_pool4(f4)
        s3 = self.score_pool3(p3)
        f3 = s3 + u4
        out = self.upscore8(f3)
        return out

五、训练准备

1.导入VGG-16部分预训练权重

from download import download
from mindspore import load_checkpoint, load_param_into_net
​
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/fcn8s_vgg16_pretrain.ckpt"
download(url, "fcn8s_vgg16_pretrain.ckpt", replace=True)
def load_vgg16():
    ckpt_vgg16 = "fcn8s_vgg16_pretrain.ckpt"
    param_vgg = load_checkpoint(ckpt_vgg16)
    load_param_into_net(net, param_vgg)

输出:

Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/fcn8s_vgg16_pretrain.ckpt (513.2 MB)

file_sizes: 100%|█████████████████████████████| 538M/538M [00:03<00:00, 179MB/s]
Successfully downloaded file to fcn8s_vgg16_pretrain.ckpt

2.损失函数

交叉熵损失函数

mindspore.nn.CrossEntropyLoss()

计算FCN网络输出与mask之间的交叉熵损失

3.自定义评价指标 Metrics

用于评估模型效果

设共有 K+1个类

        从L_0 到L_{ki}

        其中包含一个空类或背景

P_{ij}表示本属于i类但被预测为j类的像素数量

P_{ii}表示真正的数量

P_{ij}P_{ji}则分别被解释为假正和假负

Pixel Accuracy

PA像素精度

        标记正确的像素占总像素的比例。

PA=\frac{\sum_{i=0}^{k}P_{ii}}{\sum_{i=0}^{k}\sum_{j=0}^{k}P_{ij}}

Mean Pixel Accuracy

MPA均像素精度

计算每个类内正确分类像素数的比例

求所有类的平均

MPA=\frac{1}{K+1}\sum \sum_{i=0}^{k}\frac{P_{ii}}{\sum_{j=0}^{k}P_{ij}}

Mean Intersection over Union

MloU均交并比

        语义分割的标准度量

                计算两个集合的交集和并集之比

                        交集为真实值(ground truth)

                        并集为预测值(predicted segmentation)

                两者之比:正真数 (intersection) /(真正+假负+假正(并集))

                在每个类上计算loU

                平均

MIoU=\frac{1}{K+1} \sum_{i=0}^{k}\frac{p_{ii}}{\sum_{j=0}^{k}p_{ij}+{\sum_{j=0}^{k}p_{ji}}-p_{ii}}

Frequency Weighted Intersection over Union

FWIoU频权交井比

根据每个类出现的频率设置权重

FWIoU=\frac{1}{\sum_{i=0}^{k}\sum_{j=0}^{k}p_{ij}} \sum_{i=0}^{k}\frac{p_{ii}}{\sum_{j=0}^{k}p_{ij}+{\sum_{j=0}^{k}p_{ji}}-p_{ii}}

import numpy as np
import mindspore as ms
import mindspore.nn as nn
import mindspore.train as train
​
class PixelAccuracy(train.Metric):
    def __init__(self, num_class=21):
        super(PixelAccuracy, self).__init__()
        self.num_class = num_class
​
    def _generate_matrix(self, gt_image, pre_image):
        mask = (gt_image >= 0) & (gt_image < self.num_class)
        label = self.num_class * gt_image[mask].astype('int') + pre_image[mask]
        count = np.bincount(label, minlength=self.num_class**2)
        confusion_matrix = count.reshape(self.num_class, self.num_class)
        return confusion_matrix
​
    def clear(self):
        self.confusion_matrix = np.zeros((self.num_class,) * 2)
​
    def update(self, *inputs):
        y_pred = inputs[0].asnumpy().argmax(axis=1)
        y = inputs[1].asnumpy().reshape(4, 512, 512)
        self.confusion_matrix += self._generate_matrix(y, y_pred)
​
    def eval(self):
        pixel_accuracy = np.diag(self.confusion_matrix).sum() / self.confusion_matrix.sum()
        return pixel_accuracy
​
​
class PixelAccuracyClass(train.Metric):
    def __init__(self, num_class=21):
        super(PixelAccuracyClass, self).__init__()
        self.num_class = num_class
​
    def _generate_matrix(self, gt_image, pre_image):
        mask = (gt_image >= 0) & (gt_image < self.num_class)
        label = self.num_class * gt_image[mask].astype('int') + pre_image[mask]
        count = np.bincount(label, minlength=self.num_class**2)
        confusion_matrix = count.reshape(self.num_class, self.num_class)
        return confusion_matrix
​
    def update(self, *inputs):
        y_pred = inputs[0].asnumpy().argmax(axis=1)
        y = inputs[1].asnumpy().reshape(4, 512, 512)
        self.confusion_matrix += self._generate_matrix(y, y_pred)
​
    def clear(self):
        self.confusion_matrix = np.zeros((self.num_class,) * 2)
​
    def eval(self):
        mean_pixel_accuracy = np.diag(self.confusion_matrix) / self.confusion_matrix.sum(axis=1)
        mean_pixel_accuracy = np.nanmean(mean_pixel_accuracy)
        return mean_pixel_accuracy
​
​
class MeanIntersectionOverUnion(train.Metric):
    def __init__(self, num_class=21):
        super(MeanIntersectionOverUnion, self).__init__()
        self.num_class = num_class
​
    def _generate_matrix(self, gt_image, pre_image):
        mask = (gt_image >= 0) & (gt_image < self.num_class)
        label = self.num_class * gt_image[mask].astype('int') + pre_image[mask]
        count = np.bincount(label, minlength=self.num_class**2)
        confusion_matrix = count.reshape(self.num_class, self.num_class)
        return confusion_matrix
​
    def update(self, *inputs):
        y_pred = inputs[0].asnumpy().argmax(axis=1)
        y = inputs[1].asnumpy().reshape(4, 512, 512)
        self.confusion_matrix += self._generate_matrix(y, y_pred)
​
    def clear(self):
        self.confusion_matrix = np.zeros((self.num_class,) * 2)
​
    def eval(self):
        mean_iou = np.diag(self.confusion_matrix) / (
            np.sum(self.confusion_matrix, axis=1) + np.sum(self.confusion_matrix, axis=0) -
            np.diag(self.confusion_matrix))
        mean_iou = np.nanmean(mean_iou)
        return mean_iou
​
​
class FrequencyWeightedIntersectionOverUnion(train.Metric):
    def __init__(self, num_class=21):
        super(FrequencyWeightedIntersectionOverUnion, self).__init__()
        self.num_class = num_class
​
    def _generate_matrix(self, gt_image, pre_image):
        mask = (gt_image >= 0) & (gt_image < self.num_class)
        label = self.num_class * gt_image[mask].astype('int') + pre_image[mask]
        count = np.bincount(label, minlength=self.num_class**2)
        confusion_matrix = count.reshape(self.num_class, self.num_class)
        return confusion_matrix
​
    def update(self, *inputs):
        y_pred = inputs[0].asnumpy().argmax(axis=1)
        y = inputs[1].asnumpy().reshape(4, 512, 512)
        self.confusion_matrix += self._generate_matrix(y, y_pred)
​
    def clear(self):
        self.confusion_matrix = np.zeros((self.num_class,) * 2)
​
    def eval(self):
        freq = np.sum(self.confusion_matrix, axis=1) / np.sum(self.confusion_matrix)
        iu = np.diag(self.confusion_matrix) / (
            np.sum(self.confusion_matrix, axis=1) + np.sum(self.confusion_matrix, axis=0) -
            np.diag(self.confusion_matrix))
​
        frequency_weighted_iou = (freq[freq > 0] * iu[freq > 0]).sum()
        return frequency_weighted_iou

六、模型训练

导入VGG-16预训练参数

实例化损失函数、优化器

Model接口编译网络

训练FCN-8s网络

import mindspore
from mindspore import Tensor
import mindspore.nn as nn
from mindspore.train import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor, Model
​
device_target = "Ascend"
mindspore.set_context(mode=mindspore.PYNATIVE_MODE, device_target=device_target)
​
train_batch_size = 4
num_classes = 21
# 初始化模型结构
net = FCN8s(n_class=21)
# 导入vgg16预训练参数
load_vgg16()
# 计算学习率
min_lr = 0.0005
base_lr = 0.05
train_epochs = 1
iters_per_epoch = dataset.get_dataset_size()
total_step = iters_per_epoch * train_epochs
​
lr_scheduler = mindspore.nn.cosine_decay_lr(min_lr,
                                            base_lr,
                                            total_step,
                                            iters_per_epoch,
                                            decay_epoch=2)
lr = Tensor(lr_scheduler[-1])
​
# 定义损失函数
loss = nn.CrossEntropyLoss(ignore_index=255)
# 定义优化器
optimizer = nn.Momentum(params=net.trainable_params(), learning_rate=lr, momentum=0.9, weight_decay=0.0001)
# 定义loss_scale
scale_factor = 4
scale_window = 3000
loss_scale_manager = ms.amp.DynamicLossScaleManager(scale_factor, scale_window)
# 初始化模型
if device_target == "Ascend":
    model = Model(net, loss_fn=loss, optimizer=optimizer, loss_scale_manager=loss_scale_manager, metrics={"pixel accuracy": PixelAccuracy(), "mean pixel accuracy": PixelAccuracyClass(), "mean IoU": MeanIntersectionOverUnion(), "frequency weighted IoU": FrequencyWeightedIntersectionOverUnion()})
else:
    model = Model(net, loss_fn=loss, optimizer=optimizer, metrics={"pixel accuracy": PixelAccuracy(), "mean pixel accuracy": PixelAccuracyClass(), "mean IoU": MeanIntersectionOverUnion(), "frequency weighted IoU": FrequencyWeightedIntersectionOverUnion()})
​
# 设置ckpt文件保存的参数
time_callback = TimeMonitor(data_size=iters_per_epoch)
loss_callback = LossMonitor()
callbacks = [time_callback, loss_callback]
save_steps = 330
keep_checkpoint_max = 5
config_ckpt = CheckpointConfig(save_checkpoint_steps=10,
                               keep_checkpoint_max=keep_checkpoint_max)
ckpt_callback = ModelCheckpoint(prefix="FCN8s",
                                directory="./ckpt",
                                config=config_ckpt)
callbacks.append(ckpt_callback)
model.train(train_epochs, dataset, callbacks=callbacks)

输出:

epoch: 1 step: 1, loss is 3.0504844
epoch: 1 step: 2, loss is 3.017057
epoch: 1 step: 3, loss is 2.9523003
epoch: 1 step: 4, loss is 2.9488814
epoch: 1 step: 5, loss is 2.666231
epoch: 1 step: 6, loss is 2.7145326
epoch: 1 step: 7, loss is 1.796408
epoch: 1 step: 8, loss is 1.5167583
epoch: 1 step: 9, loss is 1.6862022
epoch: 1 step: 10, loss is 2.4622822
......
epoch: 1 step: 1141, loss is 1.70966
epoch: 1 step: 1142, loss is 1.434751
epoch: 1 step: 1143, loss is 2.406475
Train epoch time: 762889.258 ms, per step time: 667.445 ms

七、模型评估

IMAGE_MEAN = [103.53, 116.28, 123.675]
IMAGE_STD = [57.375, 57.120, 58.395]
DATA_FILE = "dataset/dataset_fcn8s/mindname.mindrecord"
​
# 下载已训练好的权重文件
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/FCN8s.ckpt"
download(url, "FCN8s.ckpt", replace=True)
net = FCN8s(n_class=num_classes)
​
ckpt_file = "FCN8s.ckpt"
param_dict = load_checkpoint(ckpt_file)
load_param_into_net(net, param_dict)
​
if device_target == "Ascend":
    model = Model(net, loss_fn=loss, optimizer=optimizer, loss_scale_manager=loss_scale_manager, metrics={"pixel accuracy": PixelAccuracy(), "mean pixel accuracy": PixelAccuracyClass(), "mean IoU": MeanIntersectionOverUnion(), "frequency weighted IoU": FrequencyWeightedIntersectionOverUnion()})
else:
    model = Model(net, loss_fn=loss, optimizer=optimizer, metrics={"pixel accuracy": PixelAccuracy(), "mean pixel accuracy": PixelAccuracyClass(), "mean IoU": MeanIntersectionOverUnion(), "frequency weighted IoU": FrequencyWeightedIntersectionOverUnion()})
​
# 实例化Dataset
dataset = SegDataset(image_mean=IMAGE_MEAN,
                     image_std=IMAGE_STD,
                     data_file=DATA_FILE,
                     batch_size=train_batch_size,
                     crop_size=crop_size,
                     max_scale=max_scale,
                     min_scale=min_scale,
                     ignore_label=ignore_label,
                     num_classes=num_classes,
                     num_readers=2,
                     num_parallel_calls=4)
dataset_eval = dataset.get_dataset()
model.eval(dataset_eval)

输出:

Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/FCN8s.ckpt (1.00 GB)

file_sizes: 100%|██████████████████████████| 1.08G/1.08G [00:10<00:00, 99.7MB/s]
Successfully downloaded file to FCN8s.ckpt
/
{'pixel accuracy': 0.9734831394168291,
 'mean pixel accuracy': 0.9423324801371116,
 'mean IoU': 0.8961453779807752,
 'frequency weighted IoU': 0.9488883312345654}

八、模型推理

使用训练的网络对模型推理结果进行展示。

import cv2
import matplotlib.pyplot as plt
​
net = FCN8s(n_class=num_classes)
# 设置超参
ckpt_file = "FCN8s.ckpt"
param_dict = load_checkpoint(ckpt_file)
load_param_into_net(net, param_dict)
eval_batch_size = 4
img_lst = []
mask_lst = []
res_lst = []
# 推理效果展示(上方为输入图片,下方为推理效果图片)
plt.figure(figsize=(8, 5))
show_data = next(dataset_eval.create_dict_iterator())
show_images = show_data["data"].asnumpy()
mask_images = show_data["label"].reshape([4, 512, 512])
show_images = np.clip(show_images, 0, 1)
for i in range(eval_batch_size):
    img_lst.append(show_images[i])
    mask_lst.append(mask_images[i])
res = net(show_data["data"]).asnumpy().argmax(axis=1)
for i in range(eval_batch_size):
    plt.subplot(2, 4, i + 1)
    plt.imshow(img_lst[i].transpose(1, 2, 0))
    plt.axis("off")
    plt.subplots_adjust(wspace=0.05, hspace=0.02)
    plt.subplot(2, 4, i + 5)
    plt.imshow(res[i])
    plt.axis("off")
    plt.subplots_adjust(wspace=0.05, hspace=0.02)
plt.show()

输出:

九、总结

FCN

        使用全卷积层

        通过学习让图片实现端到端分割。

        优点:

                输入接受任意大小的图像

                高效,避免了由于使用像素块而带来的重复存储和计算卷积的问题。

        待改进之处:

                结果不够精细。比较模糊和平滑,边界处细节不敏感。

                像素分类,没有考虑像素之间的关系(如不连续性和相似性)

                忽略空间规整(spatial regularization)步骤,缺乏空间一致性。

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:/a/784505.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

【易捷海购-注册安全分析报告】

前言 由于网站注册入口容易被黑客攻击&#xff0c;存在如下安全问题&#xff1a; 暴力破解密码&#xff0c;造成用户信息泄露短信盗刷的安全问题&#xff0c;影响业务及导致用户投诉带来经济损失&#xff0c;尤其是后付费客户&#xff0c;风险巨大&#xff0c;造成亏损无底洞…

【2024华为HCIP831 | 高级网络工程师之路】刷题日记(BGP)

个人名片:🪪 🐼作者简介:一名大三在校生,喜欢AI编程🎋 🐻‍❄️个人主页🥇:落798. 🐼个人WeChat:hmmwx53 🕊️系列专栏:🖼️ 零基础学Java——小白入门必备🔥重识C语言——复习回顾🔥计算机网络体系———深度详讲HCIP数通工程师-刷题与实战🔥🔥

windows obdc配置

进入控制面板&#xff1a; 进入管理工具&#xff1a;

java —— JSP 技术

一、JSP &#xff08;一&#xff09;前言 1、.jsp 与 .html 一样属于前端内容&#xff0c;创建在 WebContent 之下&#xff1b; 2、嵌套的 java 语句放置在<% %>里面&#xff1b; 3、嵌套 java 语句的三种语法&#xff1a; ① 脚本&#xff1a;<% java 代码 %>…

顶会FAST24最佳论文|阿里云块存储架构演进的得与失-4.EBS不同架构性能提升思路

3.1 平均延迟与长尾延迟 虚拟磁盘&#xff08;VD&#xff09;的延迟是由其底层架构决定的&#xff0c;具体而言&#xff0c;取决于请求所经历的路径。以EBS2为例&#xff0c;VD的延迟受制于两跳网络&#xff08;从BlockClient到BlockServer&#xff0c;再至ChunkServer&#x…

机器学习统计学基础 - 最大似然估计

最大似然估计&#xff08;Maximum Likelihood Estimation, MLE&#xff09;是一种常用的参数估计方法&#xff0c;其基本原理是通过最大化观测数据出现的概率来寻找最优的参数估计值。具体来说&#xff0c;最大似然估计的核心思想是利用已知的样本结果&#xff0c;反推最有可能…

零知识证明技术:隐私保护的利器

在当今信息时代&#xff0c;数据安全和隐私保护的重要性日益凸显。随着技术的发展&#xff0c;密码学在保障信息安全方面发挥着越来越重要的作用。其中&#xff0c;零知识证明技术作为一种新兴的密码学方法&#xff0c;为隐私保护提供了强有力的支持。本文将简要介绍零知识证明…

一.4 处理器读并解释储存在内存中的指令

此刻&#xff0c;hello.c源程序已经被编译系统翻译成了可执行目标文件hello&#xff0c;并被存放在硬盘上。要想在Unix系统上运行该可执行文件&#xff0c;我们将它的文件名输入到称为shell的应用程序中&#xff1a; linux>./hello hello, world linux> shell是一个命令…

[Flink]二、Flink1.13

7. 处理函数 之前所介绍的流处理 API,无论是基本的转换、聚合,还是更为复杂的窗口操作,其实都是基于 DataStream 进行转换的;所以可以统称为 DataStream API ,这也是 Flink 编程的核心。而我们知道,为了让代码有更强大的表现力和易用性, Flink 本身提供了多…

【面试题】串联探针和旁挂探针有什么区别?

在网络安全领域中&#xff0c;串联探针和旁挂探针&#xff08;通常也被称为旁路探针&#xff09;是两种不同部署方式的监控设备&#xff0c;它们各自具有独特的特性和应用场景。以下是它们之间的主要区别&#xff1a; 部署方式 串联探针&#xff1a;串联探针一般通过网关或者…

@react-google-maps/api实现谷歌地图嵌入React项目中,并且做到点击地图任意一处,获得它的经纬度

1.第一步要加入项目package.json中或者直接yarn install它都可以 "react-google-maps/api": "^2.19.3",2.加入项目中 import AMapLoader from amap/amap-jsapi-loader;import React, { PureComponent } from react; import { GoogleMap, LoadScript, Mar…

【刷题笔记(编程题)05】另类加法、走方格的方案数、井字棋、密码强度等级

1. 另类加法 给定两个int A和B。编写一个函数返回AB的值&#xff0c;但不得使用或其他算数运算符。 测试样例&#xff1a; 1,2 返回&#xff1a;3 示例 1 输入 输出 思路1: 二进制0101和1101的相加 0 1 0 1 1 1 0 1 其实就是 不带进位的结果1000 和进位产生的1010相加 无进位加…

虚拟地址空间划分

记住&#xff1a;任何编程语言编译之后产生汇编指令数据 每一个进程的用户空间是私有的&#xff0c;内核空间是共享的(管道通信的原理) X86 32为linux环境下,虚拟地址空间结构 只读区&#xff1a; .text段:指令段&#xff0c;存放汇编指令 .rodata段:常量段&#xff0c;存放常…

Linux环境部署Python Web服务

“姑娘&#xff0c;再见面就要靠运气了&#xff0c;可别装作不认识&#xff0c;那句“好久不见”可干万别打颤…” 将使用 Python 编写的后端 API 部署到 Linux 环境中&#xff0c;可以按照以下详细步骤操作。本文将涵盖环境准备、API 编写、使用 Gunicorn 作为 WSGI 服务器、配…

C++编译链接原理

从底层剖析程序从编译到运行的整个过程 三个阶段 一、编译阶段二、链接阶段三、运行阶段 为了方便解释&#xff0c;给出两端示例代码&#xff0c;下面围绕代码进行实验&#xff1a; //sum.cpp int gdata 10; int sum(int a,int b) {return ab; }//main.cpp extern int gdata…

49.实现调试器HOOK机制

免责声明&#xff1a;内容仅供学习参考&#xff0c;请合法利用知识&#xff0c;禁止进行违法犯罪活动&#xff01; 上一个内容&#xff1a;47.HOOK引擎优化支持CALL与JMP位置做HOOK 以 47.HOOK引擎优化支持CALL与JMP位置做HOOK 它的代码为基础进行修改 效果图&#xff1a;游…

Mysql8.0.36 Centos8环境安装

下载安装包 官网地址&#xff1a;MySQL :: Download MySQL Community Server (Archived Versions) 可以直接下载后再传到服务器&#xff0c;也可以在服务器采用wget下载。如下&#xff1a; wget https://downloads.mysql.com/archives/get/p/23/file/mysql-8.0.36-linux-glib…

mp4视频太大怎么压缩不影响画质,mp4文件太大怎么变小且清晰度高

在数字化时代&#xff0c;我们常常面临视频文件过大的问题。尤其是mp4格式的视频&#xff0c;文件大小往往令人望而却步。那么&#xff0c;如何在不影响画质的前提下&#xff0c;有效地压缩mp4视频呢&#xff1f;本文将为您揭秘几种简单实用的压缩技巧。 在分享和存储视频时&am…

ELK+Filebeat+Kafka+Zookeeper

本实验基于ELFK已经搭好的情况下 ELK日志分析 架构解析 第一层、数据采集层 数据采集层位于最左边的业务服务器集群上&#xff0c;在每个业务服务器上面安装了filebeat做日志收集&#xff0c;然后把采集到的原始日志发送到Kafkazookeeper集群上。第二层、消息队列层 原始日志发…

运维锅总详解系统设计原则

本文对CAP、BASE、ACID、SOLID 原则、12-Factor 应用方法论等12种系统设计原则进行分析举例&#xff0c;希望对您在进行系统设计、理解系统运行背后遵循的原理有所帮助&#xff01; 一、CAP、BASE、ACID简介 以下是 ACID、CAP 和 BASE 系统设计原则的详细说明及其应用举例&am…