MindSpore 25-Day Learning Camp, Day 5 | Building a Network

I. Introduction:

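A neural network model is built from neural network layers and Tensor operations, and mindspore.nn provides implementations of the common layers. In MindSpore, the Cell class is the base class for every network (it plays the same role as nn.Module in PyTorch) and is also the basic building block of a network. A model is represented as a Cell that is itself composed of sub-Cells. This nested structure lets you build and manage a network with ordinary object-oriented programming.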

II. Environment setup:

import mindspore
import time
from mindspore import nn, ops

If you have not installed MindSpore yet, go back to my earlier post, "MindSpore 25-Day Learning Camp, Day 1 | Quick Start", and install it before continuing.

III. Building the neural network:

1. Define the model class:

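First we subclass nn.Cell. Sub-Cells are instantiated and managed in the __init__ method, and the forward computation is implemented in the construct method (the counterpart of forward in PyTorch):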

class Network(nn.Cell):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.dense_relu_sequential = nn.SequentialCell(
            nn.Dense(28*28, 512, weight_init="normal", bias_init="zeros"),
            nn.ReLU(),
            nn.Dense(512, 512, weight_init="normal", bias_init="zeros"),
            nn.ReLU(),
            nn.Dense(512, 10, weight_init="normal", bias_init="zeros")
        )

    def construct(self, x):
        x = self.flatten(x)
        logits = self.dense_relu_sequential(x)
        return logits

# Instantiate the model and print its structure
model = Network()
print(model)
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")

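① self.flatten = nn.Flatten(): creates a Flatten layer and stores it as an attribute of the class. Flatten keeps axis 0 (the batch axis) and collapses all remaining axes into one, so an input of shape (N, 28, 28) becomes (N, 784).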

② self.dense_relu_sequential = nn.SequentialCell(...): creates a SequentialCell, a container Cell that runs the layers it holds in order. This one holds three fully connected (Dense) layers; each is followed by a ReLU activation except the last:

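The first nn.Dense(28*28, 512, weight_init="normal", bias_init="zeros") is a fully connected layer that takes 28*28 = 784 inputs and produces 512 outputs. Its weights (weight_init) are initialized from a normal distribution and its biases (bias_init) are initialized to zero.
nn.ReLU() is the ReLU activation function, f(x) = max(0, x): negative values become zero and positive values pass through unchanged.
The second nn.Dense takes 512 inputs and produces 512 outputs, and is again followed by nn.ReLU. The third nn.Dense takes 512 inputs and produces 10 outputs with no activation after it; these 10 values are the logits, presumably one per class in a 10-class problem.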

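Next we construct an input, pass it through the model to get the logits, and use softmax to turn them into predicted probabilities: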

X = ops.ones((1, 28, 28), mindspore.float32)
logits = model(X)
# print logits
print(logits)

pred_probab = nn.Softmax(axis=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")

2. The model layers in detail:

(1) nn.Flatten:

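nn.Flatten "flattens" the input so that later layers can process it. Here a batch of three 28x28 images with shape (3, 28, 28) becomes a tensor of shape (3, 784):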

input_image = ops.ones((3, 28, 28), mindspore.float32)
print(input_image.shape)

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.shape)

print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
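
To make the shape change concrete, here is a small plain-Python sketch of the shape arithmetic only. It is not MindSpore's implementation; flatten_shape and its start_dim argument are names made up for this illustration, and the sketch assumes that everything after axis 0 is collapsed into one axis.

```python
from math import prod

def flatten_shape(shape, start_dim=1):
    """Return the shape after collapsing every axis from start_dim onward into one.

    Hypothetical helper for illustration; it only does shape arithmetic.
    """
    kept = tuple(shape[:start_dim])           # axes left untouched (the batch axis)
    return kept + (prod(shape[start_dim:]),)  # product of the collapsed axes

print(flatten_shape((3, 28, 28)))  # (3, 784)
```

The batch size (3) survives, and the two 28-element axes merge into a single axis of 28*28 = 784 elements.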

(2) nn.Dense:

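nn.Dense is a fully connected layer: it applies a linear transformation to its input using its weight and bias parameters. Here it maps each 784-element row to 20 outputs: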

layer1 = nn.Dense(in_channels=28*28, out_channels=20)
hidden1 = layer1(flat_image)
print(hidden1.shape)

print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
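
The linear transformation itself can be sketched in plain Python on a tiny example. This is only an illustration of y = x * W^T + b with nested lists, assuming the weight is laid out as (out_features, in_features); the dense function and the numbers below are made up for the example and say nothing about how MindSpore computes it internally.

```python
def dense(x, weight, bias):
    """Compute y = x @ weight.T + bias for a batch of row vectors, using plain lists."""
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b
         for w_row, b in zip(weight, bias)]
        for row in x
    ]

x = [[1.0, 2.0, 3.0]]         # batch of 1 sample with 3 input features
weight = [[1.0, 0.0, 0.0],    # assumed layout: (out_features=2, in_features=3)
          [0.5, 0.5, 0.5]]
bias = [0.0, 1.0]

print(dense(x, weight, bias))  # [[1.0, 4.0]]
```

Each output value is a weighted sum of all inputs plus one bias term, which is why the layer is called fully connected.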

(3) nn.ReLU:

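nn.ReLU is the activation function used in this exercise. It is applied to the outputs of a layer, not to its weights, and adds the non-linearity that lets the network learn more than a purely linear mapping. Besides ReLU, other common activation functions include Sigmoid and Tanh: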

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")

print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
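
The effect is easy to check by hand. A minimal plain-Python sketch of f(x) = max(0, x) applied element by element (the relu helper here is just for illustration, not the MindSpore operator):

```python
def relu(values):
    """Apply f(x) = max(0, x) to every element of a list."""
    return [max(0.0, v) for v in values]

print(relu([-1.5, 0.0, 2.0]))  # [0.0, 0.0, 2.0]
```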

(4) nn.SequentialCell:

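nn.SequentialCell plays the same role as nn.Sequential in PyTorch: it is an ordered container that holds a combination of Cells, such as Dense layers and activation layers, and passes the input through them in the order they were added, which keeps the forward computation simple: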

seq_modules = nn.SequentialCell(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Dense(20, 10)
)

logits = seq_modules(input_image)
print(logits.shape)

print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
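
The idea behind a sequential container can be sketched in a few lines of plain Python. The Sequential class below is a hypothetical toy, not nn.SequentialCell: it only shows the "feed each layer's output into the next layer" loop, using two made-up functions in place of real layers.

```python
class Sequential:
    """Toy ordered container: calls each stored callable on the previous result."""

    def __init__(self, *layers):
        self.layers = list(layers)

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)  # output of one layer becomes input of the next
        return x

def double(v):
    return v * 2

def add_one(v):
    return v + 1

pipeline = Sequential(double, add_one)
print(pipeline(3))  # 7
```

Order matters: swapping the two stand-in layers would give (3 + 1) * 2 = 8 instead.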

(5) nn.Softmax:

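nn.Softmax scales the logits returned by the last fully connected layer into the range [0, 1], so that they can be read as the predicted probability of each class. The values along the dimension given by axis sum to 1.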

softmax = nn.Softmax(axis=1)
pred_probab = softmax(logits)
print(pred_probab)
# argmax returns the index of the largest value along the given dimension
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
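
For a single row of logits, the same two steps (softmax, then argmax) can be written out in plain Python. This is a sketch for intuition only: the softmax helper and the sample logits are made up, and subtracting the maximum before exponentiating is a common numerical-stability trick that I am adding here, not something taken from the MindSpore code above.

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)                           # shift by the max to avoid overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax over the row

print(probs)
print(f"Predicted class: {predicted}")
```

Softmax preserves the ordering of the logits, so the predicted class is the same whether argmax is taken before or after it.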

3. Model parameters:

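The layers inside the network (nn.Dense, for example) have weight and bias parameters, which are optimized continuously during training. Call model.parameters_and_names() to get each parameter's name together with the parameter itself.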

print(f"Model structure: {model}\n\n")

for name, param in model.parameters_and_names():
    print(f"Layer: {name}\nSize: {param.shape}\nValues : {param[:2]} \n")
    
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time())), "VertexGeek")
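
As a quick sanity check on what the loop above prints, the number of trainable values in this network can be worked out by hand. The sketch below is plain arithmetic, assuming each Dense layer holds one weight per input-output pair plus one bias per output; layer_sizes simply restates the three Dense layers defined in Network.

```python
# (in_features, out_features) for the three Dense layers in Network
layer_sizes = [(28 * 28, 512), (512, 512), (512, 10)]

total = 0
for in_features, out_features in layer_sizes:
    weight_count = in_features * out_features  # one weight per input-output pair
    bias_count = out_features                  # one bias per output
    total += weight_count + bias_count

print(total)  # 669706
```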