Table of Contents
- Configure the Aliyun repository
- Configure time synchronization
- Verify
- Install Docker
- Install required utilities
- Configure the Aliyun Docker repository
- Refresh the repository cache
- Remove the old podman
- Install Docker
- Enable Docker on boot
- Configure the hosts file
- Distribute the hosts file to the other nodes (optional)
- Disable the swap partition
- Configure iptables
- Configure the Kubernetes repository
- Initialize the master node
- Initialize the worker nodes
- Check the cluster status
[!warning] System: CentOS 8.4
Configure the Aliyun repository
cd /etc/yum.repos.d/
mkdir bak
mv *.repo bak
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo
yum install -y https://mirrors.aliyun.com/epel/epel-release-latest-8.noarch.rpm
sed -i 's|^#baseurl=https://download.example/pub|baseurl=https://mirrors.aliyun.com|' /etc/yum.repos.d/epel*
sed -i 's|^metalink|#metalink|' /etc/yum.repos.d/epel*
cd
Configure time synchronization
yum install -y chrony
cp /etc/chrony.conf /etc/chrony.conf.bak
sed -i '3ipool ntp.tencent.com iburst' /etc/chrony.conf
sed -i '4d' /etc/chrony.conf
systemctl restart chronyd.service
Verify
chronyc sources -v
Output such as
^* 106.55.184.199 2 6 277 40 +638us[ +711us] +/- 43ms
means time synchronization is working.
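For a more detailed view of the current offset and stratum, chronyc offers an extra check (optional, not required for the rest of the setup):
chronyc tracking        # reference ID, stratum and current system clock offset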
Install Docker
Remove any previously installed Docker packages first:
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
Install required utilities
yum install -y yum-utils device-mapper-persistent-data lvm2
Configure the Aliyun Docker repository
yum-config-manager \
--add-repo \
https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Refresh the repository cache
yum makecache
Remove the old podman
yum erase podman buildah -y
Install Docker
yum install -y docker-ce docker-ce-cli containerd.io
Enable Docker on boot
systemctl enable docker --now
Configure the hosts file
cat >> /etc/hosts << EOF
192.168.142.139 k8s-master
192.168.142.140 k8s-slave1
192.168.142.141 k8s-slave2
EOF
Distribute the hosts file to the other nodes (optional; only needed when several hosts work together):
for i in 140 141 ; do scp /etc/hosts 192.168.142.$i:/etc/hosts ; done
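The scp loop assumes SSH access from the master to the other hosts; if key-based login is not set up yet, a sketch like the following (using the same IPs as above) would prepare it:
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa                        # generate a key pair if none exists yet
for i in 140 141 ; do ssh-copy-id root@192.168.142.$i ; done    # push the public key to each node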
Disable the swap partition
swapoff -a ; sed -i '/swap/d' /etc/fstab
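To confirm that swap is really off, the Swap line of free should show all zeros:
free -h | grep -i swap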
Configure iptables
Make bridged traffic visible to iptables and enable IP forwarding:
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
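Note that the two bridge sysctls only exist once the br_netfilter kernel module is loaded (containerd additionally wants overlay). The original steps do not show this, so treat the following as a sketch of the standard kubeadm prerequisite, to be run before the sysctl command above if it complains about missing keys:
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
sysctl --system        # re-apply every sysctl file, including /etc/sysctl.d/k8s.conf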
Configure the Kubernetes repository
cat << EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/rpm/repodata/repomd.xml.key
EOF
Disable SELinux, then install kubelet, kubeadm and kubectl:
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
getenforce
Disable the firewall
systemctl disable firewalld.service --now
[!Note] To list the available Kubernetes versions:
yum list --showduplicates kubeadm --disableexcludes=kubernetes
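To pin a specific release, the version can be appended to each package name; 1.31.0 below is only an example and should match whatever the command above lists:
yum install -y kubelet-1.31.0 kubeadm-1.31.0 kubectl-1.31.0 --disableexcludes=kubernetes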
Configure containerd
Configure the containerd container runtime so that it interacts correctly with the Kubernetes cluster.
crictl config image-endpoint unix:///run/containerd/containerd.sock
containerd config default > /etc/containerd/config.toml
sed -i 's|config_path = ""|config_path = "/etc/containerd/certs.d"|g' /etc/containerd/config.toml
systemctl restart containerd.service
crictl completion >> /root/.bash_profile
grep -B1 'config_path' /etc/containerd/config.toml
Expected output:
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
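One further containerd setting that is often needed on systemd hosts, although it is not part of the original steps: recent kubeadm defaults the kubelet to the systemd cgroup driver, and containerd should agree, which the generated config.toml does not do by default. If pods later crash-loop with cgroup errors, the following is the usual fix:
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd.service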
Create one directory under certs.d per registry you want to accelerate; for example, for docker.io, create a docker.io directory under certs.d. The file inside it must be named hosts.toml:
mkdir -p /etc/containerd/certs.d/docker.io
cat << EOF | tee /etc/containerd/certs.d/docker.io/hosts.toml
server = "https://docker.io"
[host."https://fb273a16b77a4b0f8e84856a8043410d.mirror.swr.myhuaweicloud.com"]
capabilities = ["pull", "resolve"]
EOF
Verify
cat /etc/containerd/certs.d/docker.io/hosts.toml
Expected output (note the format):
server = "https://docker.io"
[host."https://fb273a16b77a4b0f8e84856a8043410d.mirror.swr.myhuaweicloud.com"]
capabilities = ["pull", "resolve"]
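To check that the mirror is actually used, a small test image can be pulled through crictl (busybox here is just an arbitrary example):
crictl pull docker.io/library/busybox:latest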
Initialize the master node
The required images cannot be pulled from the public registries here, so they are provided as pre-packaged archives:
k8s_image_master.tar.gz
k8s_image_node.tar.gz
flannel.tar
Import the images
Images are imported with ctr -n k8s.io images import; extracting k8s_image_master.tar.gz produces a root directory holding the individual image archives (the node archive works the same way).
Send the node archive to hosts 140 and 141:
for i in 140 141 ; do scp k8s_image_node.tar.gz 192.168.142.$i:/root ; done
Initialize the master
tar -zxf k8s_image_master.tar.gz
Extraction creates a root directory containing the individual image archives.
mv flannel.tar root
cd root
coredns:v1.11.3.tar.gz kube-apiserver:v1.31.0.tar.gz pause:3.10.tar.gz
etcd:3.5.15-0.tar.gz kube-controller-manager:v1.31.0.tar.gz pause:3.6.tar.gz
flannel-cni-plugin:v1.5.1-flannel2.tar.gz kube-proxy:v1.31.0.tar.gz
flannel:v0.25.6.tar.gz kube-scheduler:v1.31.0.tar.gz
flannel.tar
ctr -n k8s.io images import coredns:v1.11.3.tar.gz
ctr -n k8s.io images import kube-apiserver:v1.31.0.tar.gz
ctr -n k8s.io images import pause:3.10.tar.gz
ctr -n k8s.io images import pause:3.6.tar.gz
ctr -n k8s.io images import etcd:3.5.15-0.tar.gz
ctr -n k8s.io images import kube-controller-manager:v1.31.0.tar.gz
ctr -n k8s.io images import flannel-cni-plugin:v1.5.1-flannel2.tar.gz
ctr -n k8s.io images import flannel:v0.25.6.tar.gz
ctr -n k8s.io images import flannel.tar
ctr -n k8s.io images import kube-proxy:v1.31.0.tar.gz
ctr -n k8s.io images import kube-scheduler:v1.31.0.tar.gz
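The same imports can be done in one loop, assuming every archive in this directory is a ctr-importable image export, and the result can then be listed:
for f in *.tar.gz flannel.tar ; do ctr -n k8s.io images import "$f" ; done
ctr -n k8s.io images ls -q        # list the imported image references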
Initialize again
yum install iproute-tc -y        # avoids the 'tc not found in system path' preflight warning
If this is not the first initialization attempt, run the following first:
kubeadm reset
Output:
W1106 17:20:29.891394 41035 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1106 17:20:31.171391 41035 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
rm -rf $HOME/.kube/*
Initialize
kubeadm init --apiserver-advertise-address=192.168.142.139 --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16
Copy flannel.tar to the worker nodes as well:
for i in 140 141 ; do scp flannel.tar 192.168.142.$i:/root ; done
Configure the master node
mkdir /root/.kube
cp /etc/kubernetes/admin.conf /root/.kube/config
kubectl completion bash >> /root/.bash_profile
Note: kubeadm init prints a join command with a token and a CA certificate hash, for example:
kubeadm join 192.168.142.139:6443 --token esar7a.1zafybsw63nlugi7 \
--discovery-token-ca-cert-hash sha256:97c33c979b8b2de34f26d66c65cec740d46408e7f8d04a9a81cd3f78f5c6f858
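If that output was lost or the token has expired (tokens are valid for 24 hours by default), a fresh join command can be generated on the master at any time:
kubeadm token create --print-join-command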
Install flannel on the master
cat > kube-flannel.yml << EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
k8s-app: flannel
pod-security.kubernetes.io/enforce: privileged
name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: flannel
name: flannel
namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: flannel
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: flannel
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"EnableNFTables": false,
"Backend": {
"Type": "vxlan"
}
}
kind: ConfigMap
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-cfg
namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-ds
namespace: kube-flannel
spec:
selector:
matchLabels:
app: flannel
k8s-app: flannel
template:
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
containers:
- args:
- --ip-masq
- --kube-subnet-mgr
command:
- /opt/bin/flanneld
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
image: docker.io/flannel/flannel:v0.26.0
name: kube-flannel
resources:
requests:
cpu: 100m
memory: 50Mi
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
privileged: false
volumeMounts:
- mountPath: /run/flannel
name: run
- mountPath: /etc/kube-flannel/
name: flannel-cfg
- mountPath: /run/xtables.lock
name: xtables-lock
hostNetwork: true
initContainers:
- args:
- -f
- /flannel
- /opt/cni/bin/flannel
command:
- cp
image: docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
name: install-cni-plugin
volumeMounts:
- mountPath: /opt/cni/bin
name: cni-plugin
- args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
command:
- cp
image: docker.io/flannel/flannel:v0.26.0
name: install-cni
volumeMounts:
- mountPath: /etc/cni/net.d
name: cni
- mountPath: /etc/kube-flannel/
name: flannel-cfg
priorityClassName: system-node-critical
serviceAccountName: flannel
tolerations:
- effect: NoSchedule
operator: Exists
volumes:
- hostPath:
path: /run/flannel
name: run
- hostPath:
path: /opt/cni/bin
name: cni-plugin
- hostPath:
path: /etc/cni/net.d
name: cni
- configMap:
name: kube-flannel-cfg
name: flannel-cfg
- hostPath:
path: /run/xtables.lock
type: FileOrCreate
name: xtables-lock
EOF
kubectl apply -f kube-flannel.yml
Check the control-plane pods; if everything shows Running, the master is fine.
kubectl get pod -A
NAMESPACE      NAME                                 READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-kk4m2                1/1     Running   0          77s
kube-system    coredns-855c4dd65d-gvdxq             1/1     Running   0          12m
kube-system    coredns-855c4dd65d-wd97x             1/1     Running   0          12m
kube-system    etcd-k8s-master                      1/1     Running   0          12m
kube-system    kube-apiserver-k8s-master            1/1     Running   0          12m
kube-system    kube-controller-manager-k8s-master   1/1     Running   0          12m
kube-system    kube-proxy-d9z9b                     1/1     Running   0          12m
kube-system    kube-scheduler-k8s-master            1/1     Running   0          12m
If listing pods on the master fails with an error like this:
kubectl get pod -A
E1106 18:45:35.690658 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.692103 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.693780 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.695284 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.696826 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Solution
mkdir ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config
Then run kubectl get pod -A again.
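An equivalent fix, which kubeadm init itself suggests for the root user, is to point KUBECONFIG at the admin config instead of copying it:
export KUBECONFIG=/etc/kubernetes/admin.conf
echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> /root/.bash_profile   # optional, makes it persistent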
Initialize the worker nodes
tar -zxf k8s_image_node.tar.gz
mv flannel.tar root/
cd root
ctr -n k8s.io images import flannel-cni-plugin:v1.5.1-flannel2.tar.gz
ctr -n k8s.io images import flannel.tar
ctr -n k8s.io images import flannel:v0.25.6.tar.gz
ctr -n k8s.io images import kube-proxy:v1.31.0.tar.gz
ctr -n k8s.io images import pause:3.6.tar.gz
Run the join command printed by kubeadm init (with its token and hash) on each of the two worker nodes:
kubeadm join 192.168.142.139:6443 --token ya76as.8n04pysuhk5c3kxx \
--discovery-token-ca-cert-hash sha256:26e39428118d1d7a8c638e6a5503cc919302747330557b163e84ac0543e7812f
Output:
[preflight] Running pre-flight checks
[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.384149ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
If a node fails to join
kubeadm join 192.168.142.139:6443 --token esar7a.1zafybsw63nlugi7 \
--discovery-token-ca-cert-hash sha256:97c33c979b8b2de34f26d66c65cec740d46408e7f8d04a9a81cd3f78f5c6f858
# Error message
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution
mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.backup
sudo lsof -i :10250
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kubelet 42410 root 11u IPv6 180886 0t0 TCP *:10250 (LISTEN)
ss -tulnp | awk -F'[:,]' '/10250/ {match($0,/pid=[0-9]+/); if (RSTART)print substr($0, RSTART+4, RLENGTH-4)}' | xargs kill -9
mv /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/ca.crt.backup
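Instead of removing the leftovers one by one, running kubeadm reset on the worker usually cleans all of this up in a single step (the same command shown earlier for the master; -f skips the confirmation prompt), after which the join can simply be retried:
kubeadm reset -f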
Join the node again
kubeadm join 192.168.142.139:6443 --token ya76as.8n04pysuhk5c3kxx \
--discovery-token-ca-cert-hash sha256:26e39428118d1d7a8c638e6a5503cc919302747330557b163e84ac0543e7812f
Check the cluster status
After all nodes have joined, check the cluster from the master:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 3m36s v1.31.2
k8s-slave1 Ready <none> 54s v1.31.2
k8s-slave2 Ready <none> 5s v1.31.2
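As a final check it can help to confirm that a flannel pod is running on every node; this simply reuses the kube-flannel namespace created by the manifest above, and the exact pod names will differ:
kubectl get pods -n kube-flannel -o wide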