This case introduces how to train a federated learning job with an aggregation algorithm named MistNet in an object detection scenario. The data is distributed across different places (such as edge nodes, cameras, etc.) and cannot be aggregated on the server because of data privacy and bandwidth constraints, so not all of the data can be used for training. In some scenarios, edge nodes have limited computing resources and may even lack training capability, so the edge cannot obtain updated weights from a local training process. Hence, traditional algorithms (e.g., federated averaging), which aggregate the updated weights trained by different edge clients, cannot work in this scenario. MistNet is proposed to address this problem.
MistNet partitions a DNN model into two parts: a lightweight feature extractor on the edge side, which generates meaningful features from the raw data, and a classifier containing most of the model layers in the cloud, which is iteratively trained for the specific task. MistNet achieves acceptable model utility while greatly reducing the privacy leakage caused by the released intermediate features.
Object Detection Experiment
Assume that there are two edge nodes and one cloud node. Due to privacy concerns, the data on the edge nodes cannot be migrated to the cloud. Based on this scenario, we will demonstrate this object detection example.
Install Sedna
Follow the Sedna installation documentation to install Sedna on your cluster.
Prepare Dataset
Create a data interface for $EDGE1_NODE.
mkdir -p /data/1
cd /data/1
wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
unzip coco128.zip -d COCO
Create a data interface for $EDGE2_NODE.
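A minimal sketch of the corresponding commands, mirroring the ones above and assuming the same coco128 archive is placed under /data/2, the path referenced by coco-dataset-2 below:

mkdir -p /data/2
cd /data/2
wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip
unzip coco128.zip -d COCO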
Prepare Images
This example uses the following images:
Aggregation worker: kubeedge/sedna-example-federated-learning-mistnet-yolo-aggregator:v0.4.0
Training worker: kubeedge/sedna-example-federated-learning-mistnet-yolo-client:v0.4.0
These images are generated by the script build_images.sh.
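If the cluster nodes cannot pull the images automatically, you can pre-pull them on the corresponding nodes, for example (assuming Docker is the container runtime on those nodes):

docker pull kubeedge/sedna-example-federated-learning-mistnet-yolo-aggregator:v0.4.0
docker pull kubeedge/sedna-example-federated-learning-mistnet-yolo-client:v0.4.0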
Create Federated Learning Job
Create the datasets for $EDGE1_NODE and $EDGE2_NODE.
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
name: "coco-dataset-1"
spec:
url: "/data/1/COCO"
format: "dir"
nodeName: kubeedge1
EOF
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
name: "coco-dataset-2"
spec:
url: "/data/2/COCO"
format: "dir"
nodeName: kubeedge2
EOF
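To confirm that both Dataset resources were registered, you can list them (the exact output columns may vary with the Sedna version):

kubectl get datasets.sedna.io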
Create Model
Create the directories /model and /pretrained on $EDGE1_NODE and $EDGE2_NODE.
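# on the edge side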
mkdir -p /model
mkdir -p /pretrained
Create the directories /model and /pretrained on the $CLOUD_NODE host (download link below).
# on the cloud side
mkdir -p /model
mkdir -p /pretrained
cd /pretrained
wget https://kubeedge.obs.cn-north-1.myhuaweicloud.com/examples/yolov5_coco128_mistnet/yolov5.pth
Create the models.
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
name: "yolo-v5-model"
spec:
url: "/model/yolov5.pth"
format: "pth"
EOF
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
name: "yolo-v5-pretrained-model"
spec:
url: "/pretrained/yolov5.pth"
format: "pth"
EOF
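Likewise, you can check that the two Model resources exist (output columns may vary):

kubectl get models.sedna.io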
Create a secret with your S3 user credentials (optional).
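The job below references this secret as credentialName: mysecret. For reference, here is a minimal sketch of such a secret, assuming a MinIO-style endpoint; the endpoint, key values, and the s3-endpoint / s3-usehttps annotation names are placeholders to adapt to your own S3 service:

kubectl create -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
  annotations:
    s3-endpoint: play.min.io   # assumed annotation key for the S3 endpoint
    s3-usehttps: "1"           # assumed annotation key; "1" enables HTTPS
stringData:
  ACCESS_KEY_ID: <your-access-key-id>
  SECRET_ACCESS_KEY: <your-secret-access-key>
EOF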
Start Federated Learning
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: FederatedLearningJob
metadata:
name: yolo-v5
spec:
pretrainedModel: # option
name: "yolo-v5-pretrained-model"
transmitter: # option
ws: { } # option, by default
s3: # optional, but at least one
aggDataPath: "s3://sedna/fl/aggregation_data"
credentialName: mysecret
aggregationWorker:
model:
name: "yolo-v5-model"
template:
spec:
nodeName: $CLOUD_NODE
containers:
- image: kubeedge/sedna-example-federated-learning-mistnet-yolo-aggregator:v0.4.0
name: agg-worker
imagePullPolicy: IfNotPresent
env: # user defined environments
- name: "cut_layer"
value: "4"
- name: "epsilon"
value: "100"
- name: "aggregation_algorithm"
value: "mistnet"
- name: "batch_size"
value: "32"
- name: "epochs"
value: "100"
resources: # user defined resources
limits:
memory: 8Gi
trainingWorkers:
- dataset:
name: "coco-dataset-1"
template:
spec:
nodeName: $EDGE1_NODE
containers:
- image: kubeedge/sedna-example-federated-learning-mistnet-yolo-client:v0.4.0
name: train-worker
imagePullPolicy: IfNotPresent
args: [ "-i", "1" ]
env: # user defined environments
- name: "cut_layer"
value: "4"
- name: "epsilon"
value: "100"
- name: "aggregation_algorithm"
value: "mistnet"
- name: "batch_size"
value: "32"
- name: "learning_rate"
value: "0.001"
- name: "epochs"
value: "1"
resources: # user defined resources
limits:
memory: 2Gi
- dataset:
name: "coco-dataset-2"
template:
spec:
nodeName: $EDGE2_NODE
containers:
- image: kubeedge/sedna-example-federated-learning-mistnet-yolo-client:v0.4.0
name: train-worker
imagePullPolicy: IfNotPresent
args: [ "-i", "2" ]
env: # user defined environments
- name: "cut_layer"
value: "4"
- name: "epsilon"
value: "100"
- name: "aggregation_algorithm"
value: "mistnet"
- name: "batch_size"
value: "32"
- name: "learning_rate"
value: "0.001"
- name: "epochs"
value: "1"
resources: # user defined resources
limits:
memory: 2Gi
EOF
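After the job is created, you can watch its progress; a sketch of typical checks (the reported status fields may differ across Sedna versions, and worker pod names normally start with the job name yolo-v5):

kubectl get federatedlearningjobs.sedna.io yolo-v5
kubectl get pods -o wide | grep yolo-v5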