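Background: a power loss followed by a reboot often causes disk I/O errors, and sometimes bad blocks.
In that case xfs_repair can be used to repair the filesystem, but the repair may lose some data (the -L option used below zeroes the metadata log, so any changes not yet replayed from it are discarded).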
xfs_repair -f -L /dev/sdc
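Since -L is destructive, it may be worth trying less drastic steps first. The following is only a sketch of one possible order of operations, not a definitive procedure: the helper names is_mounted and repair_plan are made up for illustration, and the plan is only printed, never executed. It relies on xfs_repair needing the filesystem to be unmounted, and on -n being the no-modify (check only) mode.

```shell
# Hypothetical helpers (not part of xfsprogs); POSIX sh.

# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE appears in the first column of the mounts table.
is_mounted() {
    awk -v d="$1" '$1 == d { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# repair_plan DEVICE [MOUNTS_FILE]
# Prints a suggested command sequence, most conservative step first.
repair_plan() {
    plan_dev="$1"
    plan_mounts="${2:-/proc/mounts}"
    if is_mounted "$plan_dev" "$plan_mounts"; then
        # xfs_repair needs the filesystem unmounted
        echo "umount $plan_dev"
    fi
    # 1. dry run: report problems without modifying anything
    echo "xfs_repair -n $plan_dev"
    # 2. normal repair; may ask for the log to be replayed by mounting first
    echo "xfs_repair $plan_dev"
    # 3. last resort: zero the log, accepting possible data loss
    echo "xfs_repair -L $plan_dev"
}
```

Calling repair_plan /dev/sdc prints the commands so they can be reviewed and run by hand one at a time.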
Problem 1: docker fails to start, with a bbolt panic in the log
Apr 15 19:27:15 Centos7.6 systemd[1]: Unit docker.service entered failed state.
Apr 15 19:27:15 Centos7.6 systemd[1]: docker.service failed.
Apr 15 19:27:20 Centos7.6 systemd[1]: docker.service holdoff time over, scheduling restart.
Apr 15 19:27:20 Centos7.6 systemd[1]: Stopped Docker Application Container Engine.
-- Subject: Unit docker.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has finished shutting down.
Apr 15 19:27:20 Centos7.6 systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has begun starting up.
Apr 15 19:27:21 Centos7.6 systemd[1]: Started Docker Application Container Engine.
-- Subject: Unit docker.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has finished starting up.
--
-- The start-up result is done.
Apr 15 19:27:21 Centos7.6 dockerd[1594]: time="2024-04-15T19:27:21.152308434+08:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapp
Apr 15 19:27:21 Centos7.6 dockerd[1594]: time="2024-04-15T19:27:21.152552054+08:00" level=warning msg="could not use snapshotter devmapper in metadata plugin" er
Apr 15 19:27:21 Centos7.6 dockerd[1594]: panic: freepages: failed to get all reachable pages (the first key[0]=(hex)637265617465646174 on branch page(1164) needs
Apr 15 19:27:21 Centos7.6 dockerd[1594]: goroutine 87 [running]:
Apr 15 19:27:21 Centos7.6 dockerd[1594]: github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*DB).freepages.func2()
Apr 15 19:27:21 Centos7.6 dockerd[1594]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:1178 +0x99
Apr 15 19:27:21 Centos7.6 dockerd[1594]: created by github.com/containerd/containerd/vendor/go.etcd.io/bbolt.(*DB).freepages
Apr 15 19:27:21 Centos7.6 dockerd[1594]: /go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:1176 +0x1ea
Apr 15 19:27:21 Centos7.6 dockerd[1594]: time="2024-04-15T19:27:21.158698774+08:00" level=error msg="containerd did not exit successfully" error="exit status 2"
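The telling line in the log above is the bbolt "panic: freepages" message, which suggests a damaged bolt database file. A small sketch for confirming that symptom before touching any files; has_bbolt_panic is a hypothetical helper name, and it only matches the message text seen in this log.

```shell
# Hypothetical helper; POSIX sh. Reads log text on stdin and succeeds
# if it contains the bbolt freepages panic shown in the log above.
has_bbolt_panic() {
    grep -q 'panic: freepages: failed to get all reachable pages'
}

# Example usage (commented out so that sourcing this file has no side effects):
#   journalctl -u docker --no-pager | has_bbolt_panic \
#       && echo "bbolt database looks corrupted"
```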
Locate the meta.db file
[root@Centos7 data2]# find / -name meta.db
/data2/kube/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db
Procedure: move the damaged database files out of the way
mv /data2/kube/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db /tmp
mv /data/kube/docker/network/files/local-kv.db /tmp/
Restart the docker service
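Moving the files straight into /tmp overwrites any earlier copy with the same name, and /tmp may be cleaned on reboot. A minimal sketch of a slightly more careful variant is below, assuming a backup directory of your choice; set_aside and /tmp/docker-db-backup are hypothetical names used only for illustration.

```shell
# Hypothetical helper; POSIX sh.
# set_aside FILE DEST_DIR
# Moves FILE into DEST_DIR with a timestamp suffix so repeated runs do not
# overwrite earlier copies. A missing FILE is skipped, not treated as an error.
set_aside() {
    sa_src="$1"
    sa_dest="$2"
    if [ ! -e "$sa_src" ]; then
        echo "skip: $sa_src not found" >&2
        return 0
    fi
    mkdir -p "$sa_dest" || return 1
    sa_stamp=$(date +%Y%m%d-%H%M%S)
    mv -- "$sa_src" "$sa_dest/$(basename "$sa_src").$sa_stamp"
}

# Example usage with the paths from these notes (commented out on purpose):
#   systemctl stop docker
#   set_aside /data2/kube/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db /tmp/docker-db-backup
#   set_aside /data/kube/docker/network/files/local-kv.db /tmp/docker-db-backup
#   systemctl start docker
```

Stopping docker before moving the files avoids the daemon holding them open while they are renamed.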
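Problem 2: permission error on startup
Cause: the XC operating system has a security suite installed; its status can be checked with getstatus
Fix: grant execution permission to the docker binaries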
kysec_set exectl -v kysoft -f /data/kube/bin/docker
kysec_set exectl -v kysoft -f /data/kube/bin/dockerd