Table of Contents
- Configure the Aliyun repository
- Configure time synchronization
- Verify
- Install Docker
- Install required utilities
- Configure the Aliyun Docker repository
- Refresh the repository cache
- Remove the old podman
- Install Docker
- Enable Docker on boot
- Configure the hosts file
- Distribute the hosts file to the other nodes (optional)
- Disable the swap partition
- Configure iptables
- Configure the Kubernetes repository
- Initialize the master node
- Initialize the worker nodes
- Check the cluster status
[!warning] System: CentOS 8.4
Configure the Aliyun repository
cd /etc/yum.repos.d/
mkdir bak
mv *.repo bak
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo
yum install -y https://mirrors.aliyun.com/epel/epel-release-latest-8.noarch.rpm
sed -i 's|^#baseurl=https://download.example/pub|baseurl=https://mirrors.aliyun.com|' /etc/yum.repos.d/epel*
sed -i 's|^metalink|#metalink|' /etc/yum.repos.d/epel*
cd
Configure time synchronization
yum install -y chrony
cp /etc/chrony.conf /etc/chrony.conf.bak
sed -i '3ipool ntp.tencent.com iburst' /etc/chrony.conf
sed -i '4d' /etc/chrony.conf
systemctl restart chronyd.service
Verify
chronyc sources -v
Output such as
^* 106.55.184.199 2 6 277 40 +638us[ +711us] +/- 43ms
means time synchronization is working.
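For a more detailed view of the current offset and stratum, chronyc offers an extra check (optional, not required for the rest of the setup):
chronyc tracking        # reference ID, stratum and current system clock offset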
Install Docker
Remove any previously installed Docker packages first:
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
Install required utilities
yum install -y yum-utils device-mapper-persistent-data lvm2
Configure the Aliyun Docker repository
yum-config-manager \
--add-repo \
https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Refresh the repository cache
yum makecache
Remove the old podman
yum erase podman buildah -y
Install Docker
yum install -y docker-ce docker-ce-cli containerd.io
Enable Docker on boot
systemctl enable docker --now
Configure the hosts file
cat >> /etc/hosts << EOF
192.168.142.139 k8s-master
192.168.142.140 k8s-slave1
192.168.142.141 k8s-slave2
EOF
Distribute the hosts file to the other nodes (optional; only needed when several hosts work together):
for i in 140 141 ; do scp /etc/hosts 192.168.142.$i:/etc/hosts ; done
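The scp loop assumes SSH access from the master to the other hosts; if key-based login is not set up yet, a sketch like the following (using the same IPs as above) would prepare it:
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa                        # generate a key pair if none exists yet
for i in 140 141 ; do ssh-copy-id root@192.168.142.$i ; done    # push the public key to each node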
Disable the swap partition
swapoff -a ; sed -i '/swap/d' /etc/fstab
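To confirm that swap is really off, the Swap line of free should show all zeros:
free -h | grep -i swap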
Configure iptables
Make bridged traffic visible to iptables and enable IP forwarding:
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
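Note that the two bridge sysctls only exist once the br_netfilter kernel module is loaded (containerd additionally wants overlay). The original steps do not show this, so treat the following as a sketch of the standard kubeadm prerequisite, to be run before the sysctl command above if it complains about missing keys:
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
sysctl --system        # re-apply every sysctl file, including /etc/sysctl.d/k8s.conf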
Configure the Kubernetes repository
cat << EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/rpm/repodata/repomd.xml.key
EOF
Disable SELinux, then install kubelet, kubeadm and kubectl:
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
getenforce
Disable the firewall
systemctl disable firewalld.service --now
[!Note] To list the available Kubernetes versions:
yum list --showduplicates kubeadm --disableexcludes=kubernetes
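To pin a specific release, the version can be appended to each package name; 1.31.0 below is only an example and should match whatever the command above lists:
yum install -y kubelet-1.31.0 kubeadm-1.31.0 kubectl-1.31.0 --disableexcludes=kubernetes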
Configure containerd
Configure the containerd container runtime so that it interacts correctly with the Kubernetes cluster.
crictl config image-endpoint unix:///run/containerd/containerd.sock
containerd config default > /etc/containerd/config.toml
sed -i 's|config_path = ""|config_path = "/etc/containerd/certs.d"|g' /etc/containerd/config.toml
systemctl restart containerd.service
crictl completion >> /root/.bash_profile
grep -B1 'config_path' /etc/containerd/config.toml
Expected output:
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
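One further containerd setting that is often needed on systemd hosts, although it is not part of the original steps: recent kubeadm defaults the kubelet to the systemd cgroup driver, and containerd should agree, which the generated config.toml does not do by default. If pods later crash-loop with cgroup errors, the following is the usual fix:
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd.service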
Create one directory under certs.d per registry you want to accelerate; for example, for docker.io, create a docker.io directory under certs.d. The file inside it must be named hosts.toml:
mkdir -p /etc/containerd/certs.d/docker.io
cat << EOF | tee /etc/containerd/certs.d/docker.io/hosts.toml
server = "https://docker.io"
[host."https://fb273a16b77a4b0f8e84856a8043410d.mirror.swr.myhuaweicloud.com"]
capabilities = ["pull", "resolve"]
EOF
Verify
cat /etc/containerd/certs.d/docker.io/hosts.toml
Expected output (note the format):
server = "https://docker.io"
[host."https://fb273a16b77a4b0f8e84856a8043410d.mirror.swr.myhuaweicloud.com"]
capabilities = ["pull", "resolve"]
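To check that the mirror is actually used, a small test image can be pulled through crictl (busybox here is just an arbitrary example):
crictl pull docker.io/library/busybox:latest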
Initialize the master node
The required images cannot be pulled from the public registries here, so they are provided as pre-packaged archives:
k8s_image_master.tar.gz
k8s_image_node.tar.gz
flannel.tar
Import the images
Images are imported with ctr -n k8s.io images import; extracting k8s_image_master.tar.gz produces a root directory holding the individual image archives (the node archive works the same way).
Send the node archive to hosts 140 and 141:
for i in 140 141 ; do scp k8s_image_node.tar.gz 192.168.142.$i:/root ; done
Initialize the master
tar -zxf k8s_image_master.tar.gz
Extraction creates a root directory containing the individual image archives.
mv flannel.tar root
cd root
coredns:v1.11.3.tar.gz kube-apiserver:v1.31.0.tar.gz pause:3.10.tar.gz
etcd:3.5.15-0.tar.gz kube-controller-manager:v1.31.0.tar.gz pause:3.6.tar.gz
flannel-cni-plugin:v1.5.1-flannel2.tar.gz kube-proxy:v1.31.0.tar.gz
flannel:v0.25.6.tar.gz kube-scheduler:v1.31.0.tar.gz
flannel.tar
ctr -n k8s.io images import coredns:v1.11.3.tar.gz
ctr -n k8s.io images import kube-apiserver:v1.31.0.tar.gz
ctr -n k8s.io images import pause:3.10.tar.gz
ctr -n k8s.io images import pause:3.6.tar.gz
ctr -n k8s.io images import etcd:3.5.15-0.tar.gz
ctr -n k8s.io images import kube-controller-manager:v1.31.0.tar.gz
ctr -n k8s.io images import flannel-cni-plugin:v1.5.1-flannel2.tar.gz
ctr -n k8s.io images import flannel:v0.25.6.tar.gz
ctr -n k8s.io images import flannel.tar
ctr -n k8s.io images import kube-proxy:v1.31.0.tar.gz
ctr -n k8s.io images import kube-scheduler:v1.31.0.tar.gz
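The same imports can be done in one loop, assuming every archive in this directory is a ctr-importable image export, and the result can then be listed:
for f in *.tar.gz flannel.tar ; do ctr -n k8s.io images import "$f" ; done
ctr -n k8s.io images ls -q        # list the imported image references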
Initialize again
yum install iproute-tc -y        # avoids the 'tc not found in system path' preflight warning
If this is not the first initialization attempt, run the following first:
kubeadm reset
Output:
W1106 17:20:29.891394 41035 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1106 17:20:31.171391 41035 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
rm -rf $HOME/.kube/*
Initialize
kubeadm init --apiserver-advertise-address=192.168.142.139 --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16
Copy flannel.tar to the worker nodes as well:
for i in 140 141 ; do scp flannel.tar 192.168.142.$i:/root ; done
Configure the master node
mkdir /root/.kube
cp /etc/kubernetes/admin.conf /root/.kube/config
kubectl completion bash >> /root/.bash_profile
Note: kubeadm init prints a join command with a token and a CA certificate hash, for example:
kubeadm join 192.168.142.139:6443 --token esar7a.1zafybsw63nlugi7 \
--discovery-token-ca-cert-hash sha256:97c33c979b8b2de34f26d66c65cec740d46408e7f8d04a9a81cd3f78f5c6f858
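If that output was lost or the token has expired (tokens are valid for 24 hours by default), a fresh join command can be generated on the master at any time:
kubeadm token create --print-join-command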
Install flannel on the master
cat > kube-flannel.yml << EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
k8s-app: flannel
pod-security.kubernetes.io/enforce: privileged
name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: flannel
name: flannel
namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: flannel
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: flannel
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"EnableNFTables": false,
"Backend": {
"Type": "vxlan"
}
}
kind: ConfigMap
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-cfg
namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
name: kube-flannel-ds
namespace: kube-flannel
spec:
selector:
matchLabels:
app: flannel
k8s-app: flannel
template:
metadata:
labels:
app: flannel
k8s-app: flannel
tier: node
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
containers:
- args:
- --ip-masq
- --kube-subnet-mgr
command:
- /opt/bin/flanneld
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
image: docker.io/flannel/flannel:v0.26.0
name: kube-flannel
resources:
requests:
cpu: 100m
memory: 50Mi
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
privileged: false
volumeMounts:
- mountPath: /run/flannel
name: run
- mountPath: /etc/kube-flannel/
name: flannel-cfg
- mountPath: /run/xtables.lock
name: xtables-lock
hostNetwork: true
initContainers:
- args:
- -f
- /flannel
- /opt/cni/bin/flannel
command:
- cp
image: docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
name: install-cni-plugin
volumeMounts:
- mountPath: /opt/cni/bin
name: cni-plugin
- args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
command:
- cp
image: docker.io/flannel/flannel:v0.26.0
name: install-cni
volumeMounts:
- mountPath: /etc/cni/net.d
name: cni
- mountPath: /etc/kube-flannel/
name: flannel-cfg
priorityClassName: system-node-critical
serviceAccountName: flannel
tolerations:
- effect: NoSchedule
operator: Exists
volumes:
- hostPath:
path: /run/flannel
name: run
- hostPath:
path: /opt/cni/bin
name: cni-plugin
- hostPath:
path: /etc/cni/net.d
name: cni
- configMap:
name: kube-flannel-cfg
name: flannel-cfg
- hostPath:
path: /run/xtables.lock
type: FileOrCreate
name: xtables-lock
EOF
kubectl apply -f kube-flannel.yml
Check the control-plane pods; if everything shows Running, the master is fine.
kubectl get pod -A
NAMESPACE      NAME                                 READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-kk4m2                1/1     Running   0          77s
kube-system    coredns-855c4dd65d-gvdxq             1/1     Running   0          12m
kube-system    coredns-855c4dd65d-wd97x             1/1     Running   0          12m
kube-system    etcd-k8s-master                      1/1     Running   0          12m
kube-system    kube-apiserver-k8s-master            1/1     Running   0          12m
kube-system    kube-controller-manager-k8s-master   1/1     Running   0          12m
kube-system    kube-proxy-d9z9b                     1/1     Running   0          12m
kube-system    kube-scheduler-k8s-master            1/1     Running   0          12m
If listing pods on the master fails with an error like this:
kubectl get pod -A
E1106 18:45:35.690658 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.692103 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.693780 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.695284 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
E1106 18:45:35.696826 55755 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Solution
mkdir ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config
Then run kubectl get pod -A again.
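An equivalent fix, which kubeadm init itself suggests for the root user, is to point KUBECONFIG at the admin config instead of copying it:
export KUBECONFIG=/etc/kubernetes/admin.conf
echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> /root/.bash_profile   # optional, makes it persistent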
Initialize the worker nodes
tar -zxf k8s_image_node.tar.gz
mv flannel.tar root/
cd root
ctr -n k8s.io images import flannel-cni-plugin:v1.5.1-flannel2.tar.gz
ctr -n k8s.io images import flannel.tar
ctr -n k8s.io images import flannel:v0.25.6.tar.gz
ctr -n k8s.io images import kube-proxy:v1.31.0.tar.gz
ctr -n k8s.io images import pause:3.6.tar.gz
Run the join command printed by kubeadm init (with its token and hash) on each of the two worker nodes:
kubeadm join 192.168.142.139:6443 --token ya76as.8n04pysuhk5c3kxx \
--discovery-token-ca-cert-hash sha256:26e39428118d1d7a8c638e6a5503cc919302747330557b163e84ac0543e7812f
Output:
[preflight] Running pre-flight checks
[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.384149ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
If a node fails to join
kubeadm join 192.168.142.139:6443 --token esar7a.1zafybsw63nlugi7 \
--discovery-token-ca-cert-hash sha256:97c33c979b8b2de34f26d66c65cec740d46408e7f8d04a9a81cd3f78f5c6f858
# Error message
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution
mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.backup
sudo lsof -i :10250
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kubelet 42410 root 11u IPv6 180886 0t0 TCP *:10250 (LISTEN)
ss -tulnp | awk -F'[:,]' '/10250/ {match($0,/pid=[0-9]+/); if (RSTART)print substr($0, RSTART+4, RLENGTH-4)}' | xargs kill -9
mv /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/ca.crt.backup
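Instead of removing the leftovers one by one, running kubeadm reset on the worker usually cleans all of this up in a single step (the same command shown earlier for the master; -f skips the confirmation prompt), after which the join can simply be retried:
kubeadm reset -f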
Join the node again
kubeadm join 192.168.142.139:6443 --token ya76as.8n04pysuhk5c3kxx \
--discovery-token-ca-cert-hash sha256:26e39428118d1d7a8c638e6a5503cc919302747330557b163e84ac0543e7812f
Check the cluster status
After all nodes have joined, check the cluster from the master:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 3m36s v1.31.2
k8s-slave1 Ready <none> 54s v1.31.2
k8s-slave2 Ready <none> 5s v1.31.2
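As a final check it can help to confirm that a flannel pod is running on every node; this simply reuses the kube-flannel namespace created by the manifest above, and the exact pod names will differ:
kubectl get pods -n kube-flannel -o wide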