Ubuntu K8s

https://serious-lose.notion.site/Ubuntu-K8s-d8d6a978ad784c1baa2fc8c531fbce68?pvs=74

2 cores, 2 GB RAM, Ubuntu 20.04, IP 172.24.53.10

	kubeadm	kubelet	kubectl
Version	1.23.0	1.23.0	1.23.0

kubeadm, kubelet, and kubectl are three key components of the Kubernetes ecosystem.

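kubeadm:

Mainly used to install and manage Kubernetes clusters. It offers a simple, fast way to create a cluster: initializing the control-plane node, adding worker nodes, and managing the cluster lifecycle.
kubeadm init initializes a new cluster on the control-plane (master) node, and kubeadm join adds a worker node to an existing cluster.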

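kubelet:

One of the core Kubernetes components. It runs on every node and manages the lifecycle of the Pods scheduled there.
It fetches Pod specs from the API server and makes sure the containers they describe are running on the node. Concretely, the kubelet watches container state and keeps it in line with the desired state.
The kubelet also runs health checks and reports node and container status.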

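kubectl:

The Kubernetes command-line tool, used to interact with a cluster.
Use kubectl to deploy applications, manage cluster resources, and troubleshoot. For example, kubectl get pods lists the Pods in the current namespace, and kubectl apply -f <file> creates or updates resources from a configuration file.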

Kubernetes (K8s) supports several container runtimes. The runtime manages the container lifecycle: pulling images and creating, running, and stopping containers.

Some common container runtimes:

1. Docker

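Description: Docker is the most widely known container runtime. It provides a simple way to package, distribute, and run applications so they behave the same across environments.
Status: Docker is still widely used, but Kubernetes deprecated dockershim, its built-in Docker integration, in v1.20 and removed it in v1.24. Version 1.23, used in these notes, still ships dockershim. Images built with Docker keep working on containerd and CRI-O because they follow the OCI image format.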

2. containerd

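Description: containerd is a high-performance container runtime that handles the core container lifecycle. It started as a component of Docker and was later split out so it can be used on its own.
Features: it supports the OCI (Open Container Initiative) standards, so it works well with Kubernetes.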

3. CRI-O

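Description: CRI-O is a lightweight container runtime built specifically for the Kubernetes CRI (Container Runtime Interface). It lets Kubernetes run any OCI-compliant container image and integrates closely with K8s.
Features: CRI-O aims for a minimal environment with no unnecessary components, which keeps the runtime overhead on Kubernetes nodes low.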

4. Podman

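Description: Podman is a daemonless container tool that can run containers without root privileges. Its CLI closely mirrors Docker's, so it is easy to pick up.
Features: Podman manages pods as well as single containers and can generate Kubernetes YAML, but it does not implement the CRI, so the kubelet cannot use it as a runtime.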

5. rkt

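Description: rkt (pronounced "rocket") is a container runtime developed by CoreOS with a focus on security and flexibility. It was once integrated with Kubernetes, but containerd and CRI-O have since displaced it.
Status: community use of rkt declined, and the project has been archived and is no longer maintained.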

Set the hostname

hostnamectl set-hostname master

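Disable swap

# Disable temporarily (until the next reboot)
swapoff -a
# Disable permanently: comment out the swap entry in /etc/fstab so swap stays off after a reboot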
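The permanent step is not spelled out in these notes. A minimal sketch, assuming swap is configured through an fstab entry (the Ubuntu default): the helper below comments out every active swap line and keeps a .bak copy. The name disable_swap_in_fstab is hypothetical, not a system command.

```shell
# Comment out every active swap entry in an fstab-style file so swap stays
# off after a reboot. A backup is written next to the file with a .bak suffix.
# disable_swap_in_fstab is a hypothetical helper name.
disable_swap_in_fstab() {
  fstab="${1:-/etc/fstab}"
  # Match lines that are not already comments and contain "swap" as a
  # whitespace-delimited field, then prefix them with '#'.
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Run it as root against /etc/fstab after swapoff -a. Afterwards swapon --show should print nothing.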

Disable the firewall

ufw disable

Check the firewall status

ufw status

Status: inactive

Set the bridge network parameters

cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
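Writing the file does not apply the settings by itself, and the net.bridge.* keys only exist once the br_netfilter kernel module is loaded. A minimal sketch of the follow-up steps; both function names are hypothetical helpers, and the default path assumes the file written above.

```shell
# Load the br_netfilter module and reload every sysctl drop-in (run as root).
apply_k8s_sysctl() {
  modprobe br_netfilter && sysctl --system
}

# Return 0 only if the given drop-in sets all three keys to 1.
check_k8s_sysctl_conf() {
  conf="${1:-/etc/sysctl.d/k8s.conf}"
  for key in net.bridge.bridge-nf-call-ip6tables \
             net.bridge.bridge-nf-call-iptables \
             net.ipv4.ip_forward; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*\$" "$conf"; then
      echo "not set to 1: ${key}" >&2
      return 1
    fi
  done
}
```

To load the module on every boot as well, a file such as /etc/modules-load.d/k8s.conf containing the single line br_netfilter is the usual approach.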

Install kubelet, kubeadm, and kubectl

# Latest version
sudo apt install -y kubelet kubeadm kubectl

or

# Specific version
sudo apt install -y kubelet=1.23.0-00 kubeadm=1.23.0-00 kubectl=1.23.0-00

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  kubeadm kubectl kubelet
0 upgraded, 3 newly installed, 0 to remove and 85 not upgraded.
Need to get 37.0 MB of archives.
After this operation, 216 MB of additional disk space will be used.
Get:1 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubelet amd64 1.23.0-00 [19.5 MB]
Get:2 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubectl amd64 1.23.0-00 [8,932 kB]
Get:3 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubeadm amd64 1.23.0-00 [8,588 kB]
Fetched 37.0 MB in 2s (17.8 MB/s)    
Selecting previously unselected package kubelet.
(Reading database ... 133517 files and directories currently installed.)
Preparing to unpack .../kubelet_1.23.0-00_amd64.deb ...
Unpacking kubelet (1.23.0-00) ...
Selecting previously unselected package kubectl.
Preparing to unpack .../kubectl_1.23.0-00_amd64.deb ...
Unpacking kubectl (1.23.0-00) ...
Selecting previously unselected package kubeadm.
Preparing to unpack .../kubeadm_1.23.0-00_amd64.deb ...
Unpacking kubeadm (1.23.0-00) ...
Setting up kubectl (1.23.0-00) ...
Setting up kubelet (1.23.0-00) ...
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /lib/systemd/system/kubelet.service.
Setting up kubeadm (1.23.0-00) ...

Pin the versions so apt upgrades do not change them

sudo apt-mark hold kubelet kubeadm kubectl
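Before continuing, it can be worth confirming that the three packages really are at the same version. A small sketch: versions_match is a hypothetical helper that reads "package version" lines on stdin and succeeds only when every line carries the same version string.

```shell
# Read "package version" lines on stdin; exit 0 only if exactly one distinct
# version appears. versions_match is a hypothetical helper name.
versions_match() {
  awk '{ seen[$2] = 1 }
       END { n = 0; for (v in seen) n++; exit (n == 1 ? 0 : 1) }'
}
```

A possible use: dpkg-query -W -f='${Package} ${Version}\n' kubelet kubeadm kubectl | versions_match && echo "versions agree". apt-mark showhold lists the packages currently on hold.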

Run the preflight checks to confirm the environment is suitable for a Kubernetes cluster.

sudo kubeadm init phase preflight

I1210 22:20:42.740114  277592 version.go:255] remote version is much newer: v1.31.3; falling back to: stable-1.23
W1210 22:20:52.741562  277592 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.23.txt": Get "https://cdn.dl.k8s.io/release/stable-1.23.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W1210 22:20:52.741591  277592 version.go:104] falling back to the local client version: v1.23.0
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 27.3.1. Latest validated version: 20.10
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR Port-6443]: Port 6443 is in use  // port already in use
        [ERROR Port-10259]: Port 10259 is in use // port already in use
        [ERROR Port-10257]: Port 10257 is in use // port already in use
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists // file already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists // file already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists // file already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists // file already exists
        [ERROR Port-10250]: Port 10250 is in use // port already in use
        [ERROR Port-2379]: Port 2379 is in use // port already in use
        [ERROR Port-2380]: Port 2380 is in use // port already in use
        [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
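Errors like these (ports in use, manifests already present, /var/lib/etcd not empty) usually mean an earlier cluster is still on the host, and kubeadm reset is the clean fix. If specific checks must be skipped instead, --ignore-preflight-errors accepts a comma-separated list of check names, which is narrower than all. A sketch of a hypothetical helper, preflight_error_names, that pulls those names out of the preflight output:

```shell
# Read kubeadm preflight output on stdin and print the failing check names
# as one comma-separated line. preflight_error_names is a hypothetical helper.
preflight_error_names() {
  grep -oE '\[ERROR [^]]+\]' |
    sed -E 's/^\[ERROR (.*)\]$/\1/' |
    paste -sd, -
}
```

For example: sudo kubeadm init phase preflight 2>&1 | preflight_error_names. Review the list before passing it to --ignore-preflight-errors, since each entry is a real problem that is being hidden.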

Run kubeadm init. Change apiserver-advertise-address to your own IP and kubernetes-version to the version you installed.

sudo kubeadm init \
--apiserver-advertise-address=172.24.53.10 \
--apiserver-bind-port=6443 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all

[init] Using Kubernetes version: v1.23.0
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 27.3.1. Latest validated version: 20.10
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.24.53.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz localhost] and IPs [172.24.53.10 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz localhost] and IPs [172.24.53.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 10.012070 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.23" in namespace kube-system with the configuration for the kubelets in the cluster
NOTE: The "kubelet-config-1.23" naming of the kubelet ConfigMap is deprecated. Once the UnversionedKubeletConfigMap feature gate graduates to Beta the default name will become just "kubelet-config". Kubeadm upgrade will handle this transition transparently.
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node izuf6dy59yl2x7ri2b5dmqz as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node izuf6dy59yl2x7ri2b5dmqz as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: j1fth4.uofits58xvvkcfmm
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.24.53.10:6443 --token j1fth4.uofits58xvvkcfmm \
        --discovery-token-ca-cert-hash sha256:a2cf513f8205220c3a912467550cf607ac281716b7c0109a96db203898fe58f8
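The flags passed to kubeadm init above can also be kept in a configuration file, which is easier to review and reuse. A minimal sketch, assuming the kubeadm.k8s.io/v1beta3 config API that ships with v1.23. The helper name write_kubeadm_config and the default file name are hypothetical; the values are the ones used in these notes.

```shell
# Write a kubeadm configuration file equivalent to the init flags.
# Arguments: output path, advertise IP, Kubernetes version (all optional).
# write_kubeadm_config is a hypothetical helper name.
write_kubeadm_config() {
  out="${1:-kubeadm-config.yaml}"
  advertise_ip="${2:-172.24.53.10}"
  version="${3:-v1.23.0}"
  cat > "$out" <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: ${advertise_ip}
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: ${version}
imageRepository: registry.aliyuncs.com/google_containers
networking:
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
EOF
}
```

After generating the file, sudo kubeadm init --config kubeadm-config.yaml replaces the long flag list. The images can be pulled ahead of time with kubeadm config images pull --config kubeadm-config.yaml.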

Copy the admin kubeconfig into ~/.kube

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

List the Pods in all namespaces of the cluster

kubectl get pods --all-namespaces

NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-6d8c4cb4d-hrfqc                           0/1     Pending   0          24h
kube-system   coredns-6d8c4cb4d-xkvwn                           0/1     Pending   0          24h
kube-system   etcd-izuf6dy59yl2x7ri2b5dmqz                      1/1     Running   1          24h
kube-system   kube-apiserver-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   1          24h
kube-system   kube-controller-manager-izuf6dy59yl2x7ri2b5dmqz   1/1     Running   1          24h
kube-system   kube-proxy-666qb                                  1/1     Running   0          24h
kube-system   kube-scheduler-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   1          24h

List all nodes in the cluster

kubectl get nodes

NAME                      STATUS     ROLES                  AGE   VERSION
izuf6dy59yl2x7ri2b5dmqz   NotReady   control-plane,master   24h   v1.23.0

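Install Calico. Until a pod network add-on is installed, the node stays NotReady and the CoreDNS Pods stay Pending, as the output above shows.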

kubectl apply -f calico.yaml

kubectl get pods --all-namespaces

NAMESPACE     NAME                                              READY   STATUS    RESTARTS      AGE
kube-system   calico-kube-controllers-6d768559b-dpcmf           1/1     Running   0             85s
kube-system   calico-node-w8vlc                                 1/1     Running   0             85s
kube-system   coredns-6d8c4cb4d-hrfqc                           1/1     Running   0             25h
kube-system   coredns-6d8c4cb4d-xkvwn                           1/1     Running   0             25h
kube-system   etcd-izuf6dy59yl2x7ri2b5dmqz                      1/1     Running   1             25h
kube-system   kube-apiserver-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   1             25h
kube-system   kube-controller-manager-izuf6dy59yl2x7ri2b5dmqz   1/1     Running   2 (43s ago)   25h
kube-system   kube-proxy-666qb                                  1/1     Running   0             25h
kube-system   kube-scheduler-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   2 (42s ago)   25h
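With more Pods, scanning this table by eye gets tedious. A small sketch that filters the same output down to the Pods that are not fully up; not_ready_pods is a hypothetical helper and assumes the default column layout shown above (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# namespace/name for every Pod that is not Running with all containers ready.
# not_ready_pods is a hypothetical helper name.
not_ready_pods() {
  awk 'NR > 1 {
    split($3, ready, "/")
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}
```

Usage: kubectl get pods --all-namespaces | not_ready_pods. Empty output means every Pod is Running and ready. Pods of finished Jobs (status Completed) are listed too, which is expected.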

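Reinstalling kubelet

Uninstall kubelet, kubeadm, and kubectl. kubeadm reset below needs the kubeadm binary, so either run the reset before purging or reinstall the packages first.

sudo apt-get remove --purge kubelet kubeadm kubectl

Stop the kubelet service

sudo systemctl stop kubelet

Reset kubeadm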

sudo kubeadm reset

[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1209 23:08:52.957867  100146 reset.go:101] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: configmaps "kubeadm-config" not found
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1209 23:08:56.665154  100146 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.

Remove /var/lib/kubelet: kubeadm reset clears the contents of this directory, as the output above shows, but to be sure the node is back to a clean state, delete the directory manually.

sudo rm -rf /var/lib/kubelet

Remove ~/.kube

sudo rm -rf ~/.kube
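The kubeadm reset output above also names two things it leaves behind: the CNI configuration in /etc/cni/net.d and the iptables rules. A sketch that gathers the manual cleanup in one place; both function names are hypothetical, and the two optional parameters exist only so the cleanup can be tried against a scratch directory first.

```shell
# Remove the directories that kubeadm reset leaves behind.
# $1: root prefix (default: empty, meaning the real filesystem)
# $2: home directory holding .kube (default: $HOME)
# cleanup_after_reset is a hypothetical helper name.
cleanup_after_reset() {
  root_dir="${1:-}"
  home_dir="${2:-$HOME}"
  rm -rf "${root_dir}/etc/cni/net.d" \
         "${root_dir}/var/lib/kubelet" \
         "${home_dir}/.kube"
}

# Flush the filter, nat, and mangle tables and delete user-defined chains
# (run as root). flush_iptables is a hypothetical helper name.
flush_iptables() {
  iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
}
```

Note that flush_iptables removes every rule in those tables, not only the ones Kubernetes added, so use it only on a host where that is acceptable.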

Initialize the master (control-plane) node

sudo kubeadm init \
--apiserver-advertise-address=172.24.53.10 \
--apiserver-bind-port=6443 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all

ERROR

[init] Using Kubernetes version: v1.23.0
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 27.3.1. Latest validated version: 20.10
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.24.53.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz localhost] and IPs [172.24.53.10 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [izuf6dy59yl2x7ri2b5dmqz localhost] and IPs [172.24.53.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

View the detailed logs

Run journalctl -xeu kubelet, or rerun the init command with --v=5 appended, to see the detailed logs.

ERROR

"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""

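kubeadm configures the kubelet with the systemd cgroup driver by default (since v1.22), while Docker here is using cgroupfs. Edit the Docker configuration file so the two match:

vim /etc/docker/daemon.json

Add "exec-opts": ["native.cgroupdriver=systemd"] to switch Docker's cgroup driver to systemd. If the file already contains other settings, add this key alongside them rather than replacing the file.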

{
	"exec-opts": ["native.cgroupdriver=systemd"]
}

Reload the systemd unit files and restart the docker service

systemctl daemon-reload && systemctl restart docker

Check that the Docker configuration change took effect (look for "Cgroup Driver: systemd")

docker info

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 13
  Running: 10
  Paused: 0
  Stopped: 3
 Images: 44
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 57f17b0a6295a39009d861b89e3b3b87b005ca27
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
 Kernel Version: 5.4.0-182-generic
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 1.846GiB
 Name: iZuf6dy59yl2x7ri2b5dmqZ
 ID: b3e06c55-1127-4727-b716-ace762d5ba1b
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: true
 Insecure Registries:
  106.15.127.43
  172.24.53.10
  harbor.net.com
  127.0.0.0/8
 Registry Mirrors:
  https://lbccmm6e.mirror.aliyuncs.com/
  https://docker.registry.cyou/
  https://docker-cf.registry.cyou/
  https://dockercf.jsdelivr.fyi/
  https://docker.jsdelivr.fyi/
  https://dockertest.jsdelivr.fyi/
  https://mirror.aliyuncs.com/
  https://dockerproxy.com/
  https://mirror.baidubce.com/
  https://docker.m.daocloud.io/
  https://docker.nju.edu.cn/
  https://docker.mirrors.sjtug.sjtu.edu.cn/
  https://docker.mirrors.ustc.edu.cn/
  https://mirror.iscas.ac.cn/
  https://docker.rainbond.cc/
 Live Restore Enabled: false
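
The full docker info output is long, and only one line matters for this check. A minimal sketch that pulls out just the cgroup driver, assuming the "Cgroup Driver:" label shown above; docker_cgroup_driver is a hypothetical helper name.

```shell
# Print the value of the "Cgroup Driver" line from `docker info` text on stdin.
# docker_cgroup_driver is a hypothetical helper name.
docker_cgroup_driver() {
  awk -F': ' '/Cgroup Driver:/ { print $2 }'
}

# On the node (not run here):
#   docker info 2>/dev/null | docker_cgroup_driver
# Docker can also print the field directly:
#   docker info --format '{{.CgroupDriver}}'
```

If this prints cgroupfs after the restart, the daemon.json change was not picked up.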

Stop the kubelet

sudo systemctl stop kubelet

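Reset and clean up the cluster state created by kubeadm

sudo kubeadm reset reverts the changes that kubeadm init or kubeadm join made to a host. It is useful when tearing down an existing cluster or re-initializing one. Details:

What it does

Clears cluster state on the node:
- kubeadm reset deletes the Kubernetes configuration and state that kubeadm wrote to this node: the static Pod manifests of the control plane (API server, controller manager, scheduler), the certificates under /etc/kubernetes/pki, the generated kubeconfig files such as /etc/kubernetes/admin.conf, and the etcd data under /var/lib/etcd.
Returns the node to its initial state:
- The node ends up uninitialized, so kubeadm init or kubeadm join can be run on it again.
Stops the kubelet and removes its state:
- The command stops the kubelet service and deletes the contents of /var/lib/kubelet and /var/lib/cni. It does not clean the CNI configuration: if a network add-on such as Flannel or Calico was installed, /etc/cni/net.d has to be removed by hand.
Leaves iptables and IPVS alone:
- The command does not reset iptables rules or IPVS tables. Flush them manually with iptables (and ipvsadm --clear if the cluster used IPVS) so that no leftover network rules interfere with the next attempt.

When to use it

Re-initializing a cluster: in test or development environments you may reset often to try different settings.
Recovering from failures: when a cluster is broken beyond repair, resetting and starting over may be the quickest route.
Discarding accumulated configuration: after experimenting with many options and parameters, the command gives you a clean starting point.

Caveats

Data loss: kubeadm reset deletes the cluster's state, so on a control-plane node all Pods, ReplicaSets, Deployments, Services and other stored objects are lost.
Persistent storage: the command removes cluster state and configuration only; data on persistently mounted volumes may need to be cleaned up manually.
kubeconfig files: $HOME/.kube/config is not removed and has to be deleted manually.
The kubelet: it does not need to be running beforehand, because kubeadm reset stops the kubelet service itself.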

sudo kubeadm reset

[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1209 23:08:52.957867  100146 reset.go:101] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: configmaps "kubeadm-config" not found
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1209 23:08:56.665154  100146 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
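
The closing messages of kubeadm reset list several things it leaves behind. A minimal sketch of that manual cleanup follows, written as a dry run by default so nothing is deleted by accident. The helper names run and post_reset_cleanup and the DRY_RUN variable are assumptions for illustration. Note that the iptables commands flush every rule on the host, not only the ones Kubernetes created.

```shell
# Manual cleanup that kubeadm reset leaves to the operator.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run and post_reset_cleanup are hypothetical helper names.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

post_reset_cleanup() {
  run rm -rf /etc/cni/net.d        # CNI configuration
  run iptables -F                  # flush the filter table
  run iptables -t nat -F           # flush the nat table
  run iptables -t mangle -F        # flush the mangle table
  run iptables -X                  # delete user-defined chains
  run ipvsadm --clear              # only relevant if the cluster used IPVS
  run rm -f "$HOME/.kube/config"   # stale kubeconfig of the old cluster
}

post_reset_cleanup
```

Reviewing the printed commands first is the point of the dry run; on a host with its own firewall rules, the iptables lines may need to be narrowed or skipped.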

Run init again

sudo kubeadm init \
--apiserver-advertise-address=172.24.53.10 \
--apiserver-bind-port=6443 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.23.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all \
--v=5

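Check whether a kubelet.service file exists under /etc/systemd/system/

If it is missing, copy kubelet.service into /etc/systemd/system/. Here /root/kubelet.service is the path of a backup made earlier; creating a new kubelet.service by hand works as well.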

cp /root/kubelet.service /etc/systemd/system/

vim /etc/systemd/system/kubelet.service

[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/

[Service]
ExecStart=/usr/bin/kubelet
#ExecStartPre=/usr/bin/kubelet-pre-start.sh
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
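
The check-then-copy step can be sketched as a small function. This is only an illustration: ensure_kubelet_unit is a hypothetical helper, and its two default paths are the ones used in this walkthrough.

```shell
# Report whether kubelet.service exists in the unit directory; if it does not,
# restore it from a backup copy. ensure_kubelet_unit is a hypothetical helper,
# and both paths default to the ones used in this walkthrough.
ensure_kubelet_unit() {
  unit_dir="${1:-/etc/systemd/system}"
  backup="${2:-/root/kubelet.service}"
  if [ -f "$unit_dir/kubelet.service" ]; then
    echo "present"
  elif [ -f "$backup" ]; then
    cp "$backup" "$unit_dir/kubelet.service" && echo "restored"
  else
    echo "missing" >&2
    return 1
  fi
}

# On the node, as root (not run here):
#   ensure_kubelet_unit && systemctl daemon-reload && systemctl restart kubelet
```

systemd only picks up a new or changed unit file after systemctl daemon-reload, which is why the usage line reloads before restarting.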

Install Kubernetes Dashboard

Download the manifest and rename it

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml
mv recommended.yaml dashboard.yaml

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001  # added
  selector:
    k8s-app: kubernetes-dashboard
  type: NodePort  # added
---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
        - name: kubernetes-dashboard
          image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/dashboard:v2.7.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: dashboard-metrics-scraper
          image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/metrics-scraper:v1.0.8
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
          - mountPath: /tmp
            name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}

Check the images

grep "image:" dashboard.yaml

          image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/dashboard:v2.7.0
          image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/metrics-scraper:v1.0.8

Switch to a registry mirror reachable from mainland China

sed -i 's#kubernetesui/dashboard:v2.4.0#swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/dashboard:v2.7.0#' dashboard.yaml
sed -i 's#kubernetesui/metrics-scraper:v1.0.7#swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/kubernetesui/metrics-scraper:v1.0.8#' dashboard.yaml
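
A sed pattern that does not match changes nothing and reports no error, so it is worth confirming that no image line was missed. A minimal sketch, assuming the mirror prefix used above; images_outside_mirror is a hypothetical helper name.

```shell
# Print every image reference in a manifest that does NOT start with the
# expected mirror prefix. images_outside_mirror is a hypothetical helper.
images_outside_mirror() {
  manifest="$1"
  prefix="$2"
  awk -v p="$prefix" '{
    for (i = 1; i < NF; i++)
      if ($i == "image:" && index($(i + 1), p) != 1)
        print $(i + 1)
  }' "$manifest"
}

# Example (not run here):
#   images_outside_mirror dashboard.yaml swr.cn-north-4.myhuaweicloud.com/
# No output means every image line already points at the mirror.
```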

Deploy the dashboard

kubectl apply -f dashboard.yaml

namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created

Check the pod status

kubectl get pods -n kubernetes-dashboard

NAME                                         READY   STATUS    RESTARTS   AGE
dashboard-metrics-scraper-5b6f6c8f45-wp9dq   1/1     Running   0          13m
kubernetes-dashboard-6c685b564c-gqrf2        1/1     Running   0          13m

The dashboard URL is

https://106.15.127.43:30001/#/login

Get a token

# Create a service account
kubectl create serviceaccount dashboard-admin -n kube-system

serviceaccount/dashboard-admin created


# Bind the service account to the cluster-admin role
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin

clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created

# Get the service account's token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')

Name:         dashboard-admin-token-m2hps
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: 3802d503-3f30-439b-82c8-1a875e115f6c

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1099 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6Ind4R0daN0Rla3lFSEdzaWRMS3JpRl9vRWpkT1lieGtXeDVrbS1FdkhyYVkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tbTJocHMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMzgwMmQ1MDMtM2YzMC00MzliLTgyYzgtMWE4NzVlMTE1ZjZjIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.MB85CeNXsLW1vNH_mKwSFxFcl57RzRc9Ou9CoqrTEVSNC42B4O0qLPdzx6_WD5KLR3jWSB3ynmSgkMHV1uzuve7CeHDK6UeoyR4v4o0Sldk8PGGVaHykacVzKg14kX0qoBHHSXBLJZp8s_-dzzIBNeexUEZiIjoUVKgOJUjp2k-_LGzH4GN8d6oKfo5M37XsDZ5cFQvfedGRNAho-GQkVCssVXt1SJlDwCdefhlmJa_D-awLlE3khrYimt_1CmFIibBoSfHI2Q55CgNveAagCvBM0c6HvmtZDHwdRfT96XwTyN1hJDZwVGFHq3uv-QtLhEAPia1CMWxOvvBQQZQUA
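
The describe output mixes the token with other fields, while the dashboard login form wants only the token string. A small filter can isolate it, assuming the "token:" label shown above; token_from_describe is a hypothetical helper name.

```shell
# Print only the token value from `kubectl describe secret` text on stdin.
# token_from_describe is a hypothetical helper name.
token_from_describe() {
  awk '$1 == "token:" { print $2 }'
}

# On the cluster (not run here), reusing the lookup from the command above:
#   kubectl describe secrets -n kube-system \
#     "$(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')" \
#     | token_from_describe
```

Since this service account is bound to cluster-admin, the token grants full access to the cluster and is best kept out of shared notes.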

Deploy a test application

Deploy an nginx application and expose it as a NodePort Service. Create a file named nginx-deploy.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-deploy
  template:
    metadata:
      labels:
        app: nginx-deploy
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/xiaohh-docker/nginx:1.25.4
        name: nginx
        ports:
        - containerPort: 80

---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx-deploy
  name: nginx-svc
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
    nodePort: 30080
  selector:
    app: nginx-deploy
  type: NodePort

Deploy it

kubectl apply -f nginx-deploy.yaml

deployment.apps/nginx-deploy created
service/nginx-svc created

List the pods

kubectl get pods --all-namespaces

NAMESPACE     NAME                                              READY   STATUS    RESTARTS         AGE
default       nginx-deploy-69d69b77bb-gqm6s                     1/1     Running   0                10m // succeeded
kube-system   calico-kube-controllers-6d768559b-dpcmf           1/1     Running   3 (30m ago)      22h
kube-system   calico-node-w8vlc                                 1/1     Running   6 (15m ago)      22h
kube-system   coredns-6d8c4cb4d-hrfqc                           1/1     Running   0                2d
kube-system   coredns-6d8c4cb4d-xkvwn                           1/1     Running   0                2d
kube-system   etcd-izuf6dy59yl2x7ri2b5dmqz                      1/1     Running   1                2d
kube-system   kube-apiserver-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   4 (30m ago)      2d
kube-system   kube-controller-manager-izuf6dy59yl2x7ri2b5dmqz   1/1     Running   14 (8m47s ago)   2d
kube-system   kube-proxy-666qb                                  1/1     Running   0                2d
kube-system   kube-scheduler-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   16 (8m49s ago)   2d

Delete the Deployment

kubectl delete deployment nginx-deploy --namespace=default

deployment.apps "nginx-deploy" deleted

Delete the pod

kubectl delete pod nginx-deploy-69d69b77bb-gqm6s --namespace=default

pod "nginx-deploy-69d69b77bb-gqm6s" deleted

List the pods again

kubectl get pods --all-namespaces

NAMESPACE     NAME                                              READY   STATUS    RESTARTS         AGE
kube-system   calico-kube-controllers-6d768559b-dpcmf           1/1     Running   3 (30m ago)      22h
kube-system   calico-node-w8vlc                                 1/1     Running   6 (15m ago)      22h
kube-system   coredns-6d8c4cb4d-hrfqc                           1/1     Running   0                2d
kube-system   coredns-6d8c4cb4d-xkvwn                           1/1     Running   0                2d
kube-system   etcd-izuf6dy59yl2x7ri2b5dmqz                      1/1     Running   1                2d
kube-system   kube-apiserver-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   4 (30m ago)      2d
kube-system   kube-controller-manager-izuf6dy59yl2x7ri2b5dmqz   1/1     Running   14 (8m47s ago)   2d
kube-system   kube-proxy-666qb                                  1/1     Running   0                2d
kube-system   kube-scheduler-izuf6dy59yl2x7ri2b5dmqz            1/1     Running   16 (8m49s ago)   2d

master

View the current taints

kubectl describe node <master-node-name>

Look for the "Taints" section in the output. A master node usually carries a taint like this:

node-role.kubernetes.io/master:NoSchedule

Remove the taint

kubectl taint nodes <master-node-name> node-role.kubernetes.io/master:NoSchedule-

kubectl taint nodes izuf6dy59yl2x7ri2b5dmqz node-role.kubernetes.io/master:NoSchedule-

node/izuf6dy59yl2x7ri2b5dmqz untainted
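
The removal syntax is easy to get wrong: it is the taint exactly as it appears under "Taints", followed by a trailing "-". A minimal sketch of both steps, assuming the describe layout shown above; first_taint and taint_removal_arg are hypothetical helper names, and only the first listed taint is handled.

```shell
# Print the first entry of the "Taints:" field from `kubectl describe node`
# text on stdin. A node without taints shows "<none>".
first_taint() {
  awk '$1 == "Taints:" { print $2 }'
}

# The removal form of a taint is the same string with a trailing "-".
taint_removal_arg() {
  printf '%s-\n' "$1"
}

# On the cluster (not run here), with NODE set to the master node's name:
#   kubectl taint nodes "$NODE" \
#     "$(taint_removal_arg "$(kubectl describe node "$NODE" | first_taint)")"
```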

Restart the kubelet

sudo systemctl restart kubelet

