Deployment notes
- Steps 1-4 must be run on both the masters and the node.
- Step 5.1 is run on all three masters; step 5.2 is run on any one machine.
- Step 5.3: choose how to deploy etcd as needed. Stacked etcd is simpler and faster to deploy; external etcd is more work to deploy but easier to manage.
- Step 5.4: pick the cluster init method that matches the etcd deployment you chose. It is recommended to initialize against a VIP that already exists.
- Step 5.5: deploy calico on any one master.
- Step 5.6: verify the cluster certificate validity on any one master.
- Step 5.7: verify cluster high availability on all three masters.
- Step 6: join the node to the Kubernetes cluster.
- Step 7: enable automatic certificate signing on the master controller-manager, enable kubelet certificate rotation on both masters and node, then verify.
1. Deployment version information
Docker version
Name | Version |
---|---|
docker | 18.09.1 |
Kubernetes version information
Name | Version |
---|---|
kube-apiserver | v1.20.0 |
kube-controller-manager | v1.20.0 |
kube-scheduler | v1.20.0 |
kubeadm | v1.20.0 |
kube-proxy | v1.20.0 |
kubectl | v1.20.0 |
etcd | v3.4.13 |
VIP (virtual IP) | 192.168.3.100 |
Machine information
OS | IP address | Spec | Kernel version | K8s role |
---|---|---|---|---|
centos7.8 | 192.168.3.101 | 2C/4G | 3.10 | master01 |
centos7.8 | 192.168.3.102 | 2C/4G | 3.10 | master02 |
centos7.8 | 192.168.3.103 | 2C/4G | 3.10 | master03 |
centos7.8 | 192.168.3.104 | 2C/4G | 3.10 | node01 |
2. Initialization
2.1 Command completion
yum -y install bash-completion
source /usr/share/bash-completion/bash_completion
echo 'source /usr/share/bash-completion/bash_completion' >> ~/.bashrc
2.2 Set the hostnames
Run on the masters (adjust the hostname for master02 and master03 accordingly):
hostnamectl set-hostname master01
cat >> /etc/hosts << EOF
192.168.3.101 master01
EOF
Run on the node:
hostnamectl set-hostname node01
cat >> /etc/hosts << EOF
192.168.3.101 master01
192.168.3.104 node01
EOF
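Optionally, keep the full set of host entries from the machine table in section 1 on every machine, so that all nodes can resolve each other by name:
cat >> /etc/hosts << EOF
192.168.3.101 master01
192.168.3.102 master02
192.168.3.103 master03
192.168.3.104 node01
EOF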
2.3 Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
2.4 Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
2.5 Disable firewalld
systemctl stop firewalld
systemctl disable firewalld
2.6 Set the time zone
Check the current time zone
timedatectl
Set the time zone to Asia/Shanghai
Method 1:
sudo timedatectl set-timezone Asia/Shanghai
Method 2:
sudo ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
2.7 Time synchronization
yum -y install ntp
cat >> /etc/ntp.conf <<EOF
server ntp.aliyun.com
EOF
systemctl restart ntpd
systemctl enable ntpd
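To confirm that synchronization against the aliyun server is actually working (ntpq is installed together with the ntp package above):
ntpq -p        # the configured server should show a non-zero reach value
timedatectl    # "NTP synchronized: yes" once the clock has synced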
2.8 Kernel tuning
cat > /etc/sysctl.conf <<EOF
###################################################################
# enable IP forwarding
net.ipv4.ip_forward = 1
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
# enable SYN cookies to protect against SYN flood attacks
net.ipv4.tcp_syncookies = 1
# when the service is overloaded, actively send RST packets instead of queueing connection requests
net.ipv4.tcp_abort_on_overflow = 1
# maximum number of sockets kept in TIME_WAIT at the same time; beyond this, TIME_WAIT sockets are
# cleared immediately and a warning is printed. The default is 180000; lowered to 6000 so that servers
# such as Apache or Nginx are not dragged down by large numbers of TIME_WAIT sockets
net.ipv4.tcp_max_tw_buckets = 6000
# enable selective acknowledgements (SACK)
net.ipv4.tcp_sack = 1
# enable TCP window scaling; the classic TCP window is limited to 65535 bytes, which is too small for
# high-speed networks. Window scaling lets the window grow by several orders of magnitude and improves throughput
net.ipv4.tcp_window_scaling = 1
# TCP receive buffer (min, default, max)
net.ipv4.tcp_rmem = 4096 3145728 6291456
# TCP send buffer (min, default, max)
net.ipv4.tcp_wmem = 4096 66384 4194304
# memory thresholds for the TCP stack (out-of-socket memory)
net.ipv4.tcp_mem = 94500000 915000000 927000000
# maximum ancillary buffer size allowed per socket
net.core.optmem_max = 81920
# default send socket buffer size in bytes
net.core.wmem_default = 8388608
# maximum send socket buffer size in bytes
net.core.wmem_max = 16777216
# default receive socket buffer size in bytes
net.core.rmem_default = 8388608
# maximum receive socket buffer size in bytes
net.core.rmem_max = 16777216
# length of the SYN queue (default 1024); increased to hold more pending connections
net.ipv4.tcp_max_syn_backlog = 1020000
# maximum number of packets queued when an interface receives packets faster than the kernel can process them
net.core.netdev_max_backlog = 862144
# the backlog of listen() is capped by net.core.somaxconn (default 128), while e.g. nginx defines
# NGX_LISTEN_BACKLOG as 511, so this value needs to be raised
net.core.somaxconn = 65535
# maximum number of TCP sockets not attached to any user file handle (orphans); beyond this, orphaned
# connections are reset immediately and a warning is printed. This limit only protects against simple
# DoS attacks; do not rely on it or lower it artificially
net.ipv4.tcp_max_orphans = 327680
# TCP timestamps avoid sequence-number wrap-around; a 1 Gbps link will certainly see reused sequence
# numbers, and timestamps let the kernel accept such "abnormal" packets. Disabled here
net.ipv4.tcp_timestamps = 0
# number of SYN+ACK retransmissions (second step of the three-way handshake) before the kernel gives up
net.ipv4.tcp_synack_retries = 1
# number of SYN retransmissions before the kernel gives up establishing a connection
net.ipv4.tcp_syn_retries = 1
# allow TIME_WAIT sockets to be reused for new TCP connections (default 0, disabled)
net.ipv4.tcp_tw_reuse = 1
# reduce the default FIN timeout
net.ipv4.tcp_fin_timeout = 15
# how often TCP sends keepalive probes when keepalive is enabled (default 7200 seconds)
net.ipv4.tcp_keepalive_time = 30
# local port range for outgoing connections (default 32768-61000); do not set the lower bound too low
# or it may collide with ports used by normal services
net.ipv4.ip_local_port_range = 1024 65000
# use physical memory as much as possible before swapping
vm.swappiness = 0
# pass bridged IPv4/IPv6 traffic to the iptables chains
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
###################################################################
EOF
sysctl -p
# If sysctl -p fails with: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
modprobe br_netfilter    # load the module manually
# load the module automatically at boot
chmod +x /etc/rc.local
vim /etc/rc.local
# add the following before exit 0
modprobe br_netfilter
systemctl start rc-local.service
# raise the open-file limit (append to these files, do not overwrite them)
cat >> /etc/rc.local <<EOF
ulimit -SHn 655350
EOF
cat >> /etc/profile <<EOF
ulimit -SHn 655350
EOF
source /etc/profile
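A quick sanity check that the bridge settings and the raised file-descriptor limit are in effect (the values below are the ones configured above):
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward   # both should print 1
lsmod | grep br_netfilter                                       # the module should be loaded
ulimit -n                                                       # 655350 in a new login shell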
2.9 Check the DNS configuration
cat > /etc/resolv.conf <<EOF
nameserver 114.114.114.114
EOF
3. Install Docker
yum install -y yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast
yum -y install docker-ce-18.09.1-3.el7
mkdir -p /data1/msp
mkdir /etc/docker
cat >/etc/docker/daemon.json << EOF
{
"registry-mirrors": ["https://xxxxxx"],
"insecure-registries": ["http://xxxxxx"],
"data-root": "/xxxxxx",
"log-driver": "json-file",
"log-opts": {"max-size": "30m","max-file": "3"},
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
registry-mirrors: set to your own public registry mirror, or leave it out
insecure-registries: set to your own private registry, or leave it out
data-root: change the Docker data directory to your own path
log-driver: "json-file" sets the Docker log driver (the default)
log-opts: {"max-size": "30m","max-file": "3"} limits the size and number of container log files
systemctl daemon-reload && systemctl restart docker.service
systemctl enable docker.service
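Before installing the Kubernetes components, verify that Docker runs with the expected version, cgroup driver, and data root:
docker version --format '{{.Server.Version}}'            # expect 18.09.1
docker info | grep -E 'Cgroup Driver|Docker Root Dir'    # expect systemd and your data-root path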
4. Install the Kubernetes components
4.1 Install kubelet, kubeadm, and kubectl
Aliyun mirror of the legacy Kubernetes yum repository:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum makecache fast
yum install -y kubelet-1.20.0-0 kubeadm-1.20.0-0 kubectl-1.20.0-0
systemctl enable kubelet
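Verify that the installed versions match the plan in section 1:
kubeadm version -o short            # v1.20.0
kubelet --version                   # Kubernetes v1.20.0
kubectl version --client --short    # v1.20.0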
4.2 Install ipvsadm
# kernel modules required by IPVS
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
# load the IPVS modules manually
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack_ipv4
# load the IPVS modules at boot
vim /etc/rc.local
# add the following before exit 0
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack_ipv4
# verify that the modules are loaded
lsmod | grep ip_vs
lsmod | grep nf_conntrack_ipv4
# install ipvsadm
yum install -y ipvsadm
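lsmod only shows that the modules are loaded; once kube-proxy is running in IPVS mode (configured during cluster init in step 5.4), the actual virtual servers can be listed with:
ipvsadm -Ln    # should list the Kubernetes service addresses once the cluster is up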
5. Master operations
5.1 Deploy the high-availability components on the master nodes
Deploy keepalived + haproxy.
Deploy haproxy on all three master nodes
- haproxy.cfg is identical on all three master nodes
yum install -y haproxy
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
cat > /etc/haproxy/haproxy.cfg <<EOF
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
listen k8s-apiserver
    bind *:16443 # local listening port
mode tcp
balance roundrobin
timeout server 900s
timeout connect 15s
    server app1 192.168.3.101:6443 check port 6443 inter 5000 fall 5 # backend master01 apiserver on 6443
    server app2 192.168.3.102:6443 check port 6443 inter 5000 fall 5 # backend master02 apiserver on 6443
    server app3 192.168.3.103:6443 check port 6443 inter 5000 fall 5 # backend master03 apiserver on 6443
frontend stats-front
bind *:8081
mode http
default_backend stats-back
backend stats-back
mode http
balance roundrobin
stats hide-version
stats uri /haproxy/stats
stats auth admin:admin
EOF
systemctl start haproxy.service
systemctl enable haproxy.service
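Confirm on each master that haproxy is listening on the apiserver front-end port and the stats port defined in haproxy.cfg above:
ss -lntp | grep -E ':16443|:8081'
# the stats page is then reachable at http://<master ip>:8081/haproxy/stats (user admin, password admin)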
Deploy keepalived on all three master nodes
- All three masters run keepalived: master01 is the MASTER node, master02 and master03 are BACKUP nodes, with master02 given a higher priority than master03
yum install -y keepalived
mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
Configuration file for the primary node master01
cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_script chk_haproxy_port {           # haproxy health-check script definition
    script "/usr/local/scripts/haproxy_pid.sh"
    interval 2                           # run the health check every 2 seconds
    weight -15                           # lower the priority by 15 when the health check fails
}
vrrp_instance k8s {
    state MASTER                         # node role: MASTER on the primary, BACKUP on the backups
    interface ens33                      # host NIC name
    virtual_router_id 50                 # VRRP group id
    priority 100                         # priority of this node
    authentication {
        auth_type PASS                   # password authentication
        auth_pass 123456                 # authentication password
    }
    track_script {
        chk_haproxy_port                 # name of the health-check script defined above
    }
    virtual_ipaddress {
        192.168.3.100                    # VIP address
    }
}
EOF
Configuration file for the backup node master02
cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_script chk_haproxy_port {           # haproxy health-check script definition
    script "/usr/local/scripts/haproxy_pid.sh"
    interval 2                           # run the health check every 2 seconds
    weight -15                           # lower the priority by 15 when the health check fails
}
vrrp_instance k8s {
    state BACKUP                         # node role: MASTER on the primary, BACKUP on the backups
    interface ens33                      # host NIC name
    virtual_router_id 50                 # VRRP group id
    priority 90                          # priority of this node
    authentication {
        auth_type PASS                   # password authentication
        auth_pass 123456                 # authentication password
    }
    track_script {
        chk_haproxy_port                 # name of the health-check script defined above
    }
    virtual_ipaddress {
        192.168.3.100                    # VIP address
    }
}
EOF
Configuration file for the backup node master03
cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_script chk_haproxy_port {           # haproxy health-check script definition
    script "/usr/local/scripts/haproxy_pid.sh"
    interval 2                           # run the health check every 2 seconds
    weight -15                           # lower the priority by 15 when the health check fails
}
vrrp_instance k8s {
    state BACKUP                         # node role: MASTER on the primary, BACKUP on the backups
    interface ens33                      # host NIC name
    virtual_router_id 50                 # VRRP group id
    priority 80                          # priority of this node
    authentication {
        auth_type PASS                   # password authentication
        auth_pass 123456                 # authentication password
    }
    track_script {
        chk_haproxy_port                 # name of the health-check script defined above
    }
    virtual_ipaddress {
        192.168.3.100                    # VIP address
    }
}
EOF
Create the haproxy health-check script
mkdir -p /usr/local/scripts
cat > /usr/local/scripts/haproxy_pid.sh << 'EOF'
#!/bin/bash
A=`pgrep haproxy|wc -l`
if [ $A -eq 0 ];then
    systemctl start haproxy.service    # try to restart haproxy
    sleep 2                            # wait 2 seconds
    if [ `pgrep haproxy | wc -l` -eq 0 ]; then
        pkill haproxy                  # restart failed: make sure haproxy is dead so the VIP fails over to a backup node
    fi
fi
EOF
chmod +x /usr/local/scripts/haproxy_pid.sh
systemctl start keepalived.service
systemctl enable keepalived.service
Verification
1. Reboot master01 and check that the VIP moves to master02
2. Reboot master01 and master02 and check that the VIP moves to master03
3. Reboot master02 and master03 and check that the VIP moves back to master01
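During these failover tests, the node currently holding the VIP can be identified with (NIC name ens33 as used in the keepalived configuration):
ip addr show ens33 | grep 192.168.3.100    # the VIP is bound on exactly one master at a time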
5.2 Build kubeadm
Install Go and the build tools
yum install -y git make gcc gcc-c++ wget
Download and install Go
wget https://dl.google.com/go/go1.15.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.15.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
export GOPATH=$HOME/go
Download the source for the desired version: https://github.com/kubernetes/kubernetes/releases
wget https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.20.0.tar.gz
tar xf v1.20.0.tar.gz
mv kubernetes-1.20.0 kubernetes
cd kubernetes
Modify the certificate validity in the source
vim ./staging/src/k8s.io/client-go/util/cert/cert.go
NotAfter: now.Add(duration365d * 100).UTC(),
vim ./cmd/kubeadm/app/constants/constants.go
CertificateValidity = time.Hour * 24 * 365 * 100 // validity of the non-CA (leaf) certificates
Build kubeadm
cd /root/kubernetes
make WHAT=cmd/kubeadm
Replace the kubeadm binary on the system
sudo mv _output/bin/kubeadm /usr/local/bin/kubeadm
sudo chmod +x /usr/local/bin/kubeadm
Verify the version
kubeadm version
5.3 Deploy etcd
etcd can be deployed either stacked or external; choose whichever suits your needs.
Stacked etcd
- A stacked HA cluster is a topology in which the etcd distributed data store runs on the kubeadm-managed control-plane nodes, as a component of the control plane.
- Each control-plane node runs an instance of kube-apiserver, kube-scheduler, and kube-controller-manager, and kube-apiserver is exposed to the worker nodes through the load balancer.
- Each control-plane node creates a local etcd member, and that member talks only to the kube-apiserver of the same node; the same applies to the local kube-controller-manager and kube-scheduler instances.
Topology diagram (image not included)
Stacked etcd does not need to be deployed by hand; just add the following to the cluster init configuration:
# add the following to the kubeadm init configuration
vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
......
etcd:
  local:                          # run etcd locally on the control-plane nodes (stacked)
    dataDir: "/data1/etcd/data"   # etcd data directory
......
External etcd
- An HA cluster with external etcd is a topology in which the etcd distributed data store runs on hosts separate from the control-plane nodes.
- As in the stacked topology, each control-plane node runs kube-apiserver, kube-scheduler, and kube-controller-manager, and kube-apiserver is exposed to the workers through the load balancer. The etcd members, however, run on different hosts, and every etcd host talks to the kube-apiserver of every control-plane node.
- This topology decouples the control plane from the etcd members, so losing a control-plane instance or an etcd member has less impact and does not reduce cluster redundancy as much as the stacked topology does.
- It requires twice as many hosts as the stacked topology: at least three hosts for the control-plane nodes plus three hosts for the etcd nodes.
Topology diagram (image not included)
Create the etcd certificates
Using the cfssl 1.6.0 tooling
Download cfssl
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssl_1.6.0_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssljson_1.6.0_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssl-certinfo_1.6.0_linux_amd64
mv cfssl_1.6.0_linux_amd64 cfssl
mv cfssl-certinfo_1.6.0_linux_amd64 cfssl-certinfo
mv cfssljson_1.6.0_linux_amd64 cfssljson
chmod +x cfssl*
mv cfssl* /usr/bin/
cfssl version
Version: 1.6.0
Runtime: go1.12.12
Create a temporary working directory for the certificates and switch into it
mkdir /root/etcd
cd /root/etcd
# files needed to generate the CA; the CA validity is set to 100 years (876000h)
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "876000h"
},
"profiles": {
"server": {
"expiry": "876000h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
},
"client": {
"expiry": "876000h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"peer": {
"expiry": "876000h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
cat > ca-csr.json <<EOF
{
"CN": "etcd",
"key": {
"algo": "rsa",
"size": 2048
},
"ca": {
"expiry": "876000h"
}
}
EOF
Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
Check which files now exist
ls
ca-config.json ca-csr.json ca-key.pem ca.pem ca.csr
Generate the client certificate
cat > client.json <<EOF
{
"CN": "client",
"key": {
"algo": "ecdsa",
"size": 256
}
}
EOF
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client -
Check which files now exist
ls
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem client.csr client.json client-key.pem client.pem
Generate the etcd server and peer certificates; hosts must list the IPs of all etcd nodes
cat > etcd.json <<EOF
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"192.168.3.101",
"192.168.3.102",
"192.168.3.103"
],
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"L": "SH",
"ST": "SH"
}]
}
EOF
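The manifest above still has to be signed by the CA, once with the server profile and once with the peer profile defined in ca-config.json; this produces the server.* and peer.* files shown in the listing below:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server etcd.json | cfssljson -bare server
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd.json | cfssljson -bare peer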
Check which files now exist
ls
ca-config.json ca-csr.json ca.pem client.json client.pem peer.csr peer.pem server-key.pem
ca.csr ca-key.pem client.csr client-key.pem etcd.json peer-key.pem server.csr server.pem
Create the etcd certificate directory on all master nodes
mkdir -p /etc/kubernetes/pki/etcd
Distribute the certificates to every master (make sure every master ends up with the directory structure below)
scp /root/etcd/*.pem root@192.168.3.101:/etc/kubernetes/pki/etcd/
scp /root/etcd/*.pem root@192.168.3.102:/etc/kubernetes/pki/etcd/
scp /root/etcd/*.pem root@192.168.3.103:/etc/kubernetes/pki/etcd/
tree /etc/kubernetes/pki/etcd/
/etc/kubernetes/pki/etcd/
├── ca-key.pem
├── ca.pem
├── client-key.pem
├── client.pem
├── peer-key.pem
├── peer.pem
├── server-key.pem
└── server.pem
Check the certificate validity
openssl x509 -in /etc/kubernetes/pki/etcd/ca.pem -noout -text | grep Not
Not Before: Nov 10 08:47:00 2024 GMT
Not After : Oct 17 08:47:00 2124 GMT
openssl x509 -in /etc/kubernetes/pki/etcd/client.pem -noout -text | grep Not
Not Before: Nov 10 08:50:00 2024 GMT
Not After : Oct 17 08:50:00 2124 GMT
openssl x509 -in /etc/kubernetes/pki/etcd/server.pem -noout -text | grep Not
Not Before: Nov 10 08:54:00 2024 GMT
Not After : Oct 17 08:54:00 2124 GMT
openssl x509 -in /etc/kubernetes/pki/etcd/peer.pem -noout -text | grep Not
Not Before: Nov 10 08:54:00 2024 GMT
Not After : Oct 17 08:54:00 2124 GMT
Install etcd
Download etcd 3.4.13
wget https://github.com/etcd-io/etcd/releases/download/v3.4.13/etcd-v3.4.13-linux-amd64.tar.gz
Unpack the etcd archive
tar zxvf etcd-v3.4.13-linux-amd64.tar.gz
Copy the etcd binaries to all master nodes
scp /root/etcd-v3.4.13-linux-amd64/etcd root@192.168.3.101:/usr/local/bin/
scp /root/etcd-v3.4.13-linux-amd64/etcdctl root@192.168.3.101:/usr/local/bin/
scp /root/etcd-v3.4.13-linux-amd64/etcd root@192.168.3.102:/usr/local/bin/
scp /root/etcd-v3.4.13-linux-amd64/etcdctl root@192.168.3.102:/usr/local/bin/
scp /root/etcd-v3.4.13-linux-amd64/etcd root@192.168.3.103:/usr/local/bin/
scp /root/etcd-v3.4.13-linux-amd64/etcdctl root@192.168.3.103:/usr/local/bin/
Create the etcd data directory and working directory on all master nodes
mkdir -pv /data1/etcd/data
mkdir -pv /var/lib/etcd
Create the etcd systemd service files
master01
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
--data-dir=/data1/etcd/data \
--name=master01 \
--cert-file=/etc/kubernetes/pki/etcd/server.pem \
--key-file=/etc/kubernetes/pki/etcd/server-key.pem \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.pem \
--peer-key-file=/etc/kubernetes/pki/etcd/peer-key.pem \
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--listen-peer-urls=https://192.168.3.101:2380 \
--initial-advertise-peer-urls=https://192.168.3.101:2380 \
--listen-client-urls=https://192.168.3.101:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://192.168.3.101:2379 \
--initial-cluster-token=etcd-cluster \
--initial-cluster=master01=https://192.168.3.101:2380,master02=https://192.168.3.102:2380,master03=https://192.168.3.103:2380 \
--initial-cluster-state=new \
--heartbeat-interval=250 \
--election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
master02
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
--data-dir=/data1/etcd/data \
--name=master02 \
--cert-file=/etc/kubernetes/pki/etcd/server.pem \
--key-file=/etc/kubernetes/pki/etcd/server-key.pem \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.pem \
--peer-key-file=/etc/kubernetes/pki/etcd/peer-key.pem \
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--listen-peer-urls=https://192.168.3.102:2380 \
--initial-advertise-peer-urls=https://192.168.3.102:2380 \
--listen-client-urls=https://192.168.3.102:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://192.168.3.102:2379 \
--initial-cluster-token=etcd-cluster \
--initial-cluster=master01=https://192.168.3.101:2380,master02=https://192.168.3.102:2380,master03=https://192.168.3.103:2380 \
--initial-cluster-state=new \
--heartbeat-interval=250 \
--election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
master03
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
--data-dir=/data1/etcd/data \
--name=master03 \
--cert-file=/etc/kubernetes/pki/etcd/server.pem \
--key-file=/etc/kubernetes/pki/etcd/server-key.pem \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.pem \
--peer-key-file=/etc/kubernetes/pki/etcd/peer-key.pem \
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--listen-peer-urls=https://192.168.3.103:2380 \
--initial-advertise-peer-urls=https://192.168.3.103:2380 \
--listen-client-urls=https://192.168.3.103:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://192.168.3.103:2379 \
--initial-cluster-token=etcd-cluster \
--initial-cluster=master01=https://192.168.3.101:2380,master02=https://192.168.3.102:2380,master03=https://192.168.3.103:2380 \
--initial-cluster-state=new \
--heartbeat-interval=250 \
--election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
Start etcd on all nodes
systemctl daemon-reload && systemctl enable etcd && systemctl start etcd && systemctl status etcd
etcd service file explained
vim /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd # etcd working directory
ExecStart=/usr/local/bin/etcd \ # path to the etcd binary
--data-dir=/data1/etcd/data \ # etcd data directory
--name=master01 \ # etcd member name of this node; different on every node
--cert-file=/etc/kubernetes/pki/etcd/server.pem \ # path to server.pem
--key-file=/etc/kubernetes/pki/etcd/server-key.pem \ # path to server-key.pem
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \ # path to ca.pem
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.pem \ # path to peer.pem
--peer-key-file=/etc/kubernetes/pki/etcd/peer-key.pem \ # path to peer-key.pem
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \ # path to ca.pem
--listen-peer-urls=https://192.168.3.101:2380 \ # this node's IP, port 2380
--initial-advertise-peer-urls=https://192.168.3.101:2380 \ # this node's IP, port 2380
--listen-client-urls=https://192.168.3.101:2379,http://127.0.0.1:2379 \ # this node's IP, port 2379
--advertise-client-urls=https://192.168.3.101:2379 \ # this node's IP, port 2379
--initial-cluster-token=etcd-cluster \ # token identifying the initial cluster; must be identical on all etcd nodes
--initial-cluster=master01=https://192.168.3.101:2380,master02=https://192.168.3.102:2380,master03=https://192.168.3.103:2380 \ # cluster membership; list every etcd node as name=https://ip:2380
--initial-cluster-state=new \ # this is a newly created cluster
--heartbeat-interval=250 \
--election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Check that the etcd cluster is healthy
ETCDCTL_API=3 etcdctl --cert=/etc/kubernetes/pki/etcd/server.pem --key=/etc/kubernetes/pki/etcd/server-key.pem --cacert=/etc/kubernetes/pki/etcd/ca.pem --endpoints=https://192.168.3.102:2379 endpoint health
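The same check can be run against all three endpoints at once, and the member list confirms that all three nodes have joined (certificate paths and IPs as above):
ETCDCTL_API=3 etcdctl --cert=/etc/kubernetes/pki/etcd/server.pem --key=/etc/kubernetes/pki/etcd/server-key.pem --cacert=/etc/kubernetes/pki/etcd/ca.pem --endpoints=https://192.168.3.101:2379,https://192.168.3.102:2379,https://192.168.3.103:2379 endpoint health
ETCDCTL_API=3 etcdctl --cert=/etc/kubernetes/pki/etcd/server.pem --key=/etc/kubernetes/pki/etcd/server-key.pem --cacert=/etc/kubernetes/pki/etcd/ca.pem --endpoints=https://192.168.3.101:2379 member list -w table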
5.4 Initialize the Kubernetes cluster
Initializing the cluster with stacked etcd
Run on master01
Create the kubeadm init configuration file
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "192.168.3.101"   # apiserver IP of this node
  bindPort: 6443                      # local apiserver port
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master01                      # node name of this machine
  taints:                             # taint the master so normal workloads are not scheduled on it
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
certificatesDir: /etc/kubernetes/pki  # where the cluster certificates are stored
clusterName: kubernetes               # Kubernetes cluster name
controllerManager: {}
imageRepository: registry.aliyuncs.com/google_containers   # registry used to pull the cluster images
kubernetesVersion: v1.20.0            # Kubernetes version
controlPlaneEndpoint: "192.168.3.100:16443"                 # VIP address and haproxy port
apiServer:
  certSANs:                           # extra SANs for the apiserver certificate; include every master and the VIP
  - "192.168.3.101"
  - "192.168.3.102"
  - "192.168.3.103"
  - "192.168.3.100"
  - "master01"
  - "master02"
  - "master03"
etcd:
  local:                              # stacked etcd, managed by kubeadm on this node
    dataDir: "/data1/etcd/data"       # etcd data directory
networking:
  podSubnet: "10.244.0.0/16"          # Pod network CIDR
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"                          # run kube-proxy in IPVS mode
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: "systemd"               # make sure the kubelet uses the systemd cgroup driver
EOF
Initialize the cluster from kubeadm-config.yaml
kubeadm init --config kubeadm-config.yaml
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
Check the cluster status
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 17m v1.20.0
Generate the join command
kubeadm token create --print-join-command
kubeadm join 192.168.3.100:16443 --token xxxxxx --discovery-token-ca-cert-hash xxxxxx
To join a new master node, append --control-plane to the printed kubeadm join command; the master must already have the CA certificates copied over (see the master02/master03 steps below).
Worker nodes simply run the printed kubeadm join command as-is.
If something goes wrong, the node can be reset and the procedure repeated:
sudo kubeadm reset
A default init configuration can also be generated with kubeadm and then edited:
kubeadm config print init-defaults > /root/kubeadm-config.yaml
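Optionally, the control-plane images can be pulled ahead of time so that kubeadm init does not block on downloads; this uses the same configuration file:
kubeadm config images list --config kubeadm-config.yaml
kubeadm config images pull --config kubeadm-config.yaml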
Run on master02
Create the cluster certificate directory
mkdir -p /etc/kubernetes/pki/etcd
Copy the CA certificates from master01 to this node
scp root@192.168.3.101:/etc/kubernetes/pki/ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/sa.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/front-proxy-ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/etcd/ca.* /etc/kubernetes/pki/etcd
On master01, run kubeadm token create --print-join-command to obtain the join command,
then run that command on this node with --control-plane appended to join as a control-plane node.
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 42m v1.20.0
master02 NotReady control-plane,master 31m v1.20.0
Run on master03
Create the cluster certificate directory
mkdir -p /etc/kubernetes/pki/etcd
Copy the CA certificates from master01 to this node
scp root@192.168.3.101:/etc/kubernetes/pki/ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/sa.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/front-proxy-ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/etcd/ca.* /etc/kubernetes/pki/etcd
On master01, run kubeadm token create --print-join-command to obtain the join command,
then run that command on this node with --control-plane appended to join as a control-plane node.
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 42m v1.20.0
master02 NotReady control-plane,master 31m v1.20.0
master03 NotReady control-plane,master 7m25s v1.20.0
Initializing the cluster with external etcd
- Prerequisite: the etcd cluster from step 5.3 is already deployed
Run on master01
Create the kubeadm init configuration file
cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "192.168.3.101"   # apiserver IP of this node
  bindPort: 6443                      # local apiserver port
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master01                      # node name of this machine
  taints:                             # taint the master so normal workloads are not scheduled on it
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
certificatesDir: /etc/kubernetes/pki  # where the cluster certificates are stored
clusterName: kubernetes               # Kubernetes cluster name
controllerManager: {}
imageRepository: registry.aliyuncs.com/google_containers   # registry used to pull the cluster images
kubernetesVersion: v1.20.0            # Kubernetes version
controlPlaneEndpoint: "192.168.3.100:16443"                 # VIP address and haproxy port
apiServer:
  certSANs:                           # extra SANs for the apiserver certificate; include every master and the VIP
  - "192.168.3.101"
  - "192.168.3.102"
  - "192.168.3.103"
  - "192.168.3.100"
  - "master01"
  - "master02"
  - "master03"
etcd:
  external:                           # use the externally managed etcd cluster
    endpoints:
    - https://192.168.3.101:2379
    - https://192.168.3.102:2379
    - https://192.168.3.103:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem
    certFile: /etc/kubernetes/pki/etcd/client.pem
    keyFile: /etc/kubernetes/pki/etcd/client-key.pem
networking:
  podSubnet: "10.244.0.0/16"          # Pod network CIDR
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"                          # run kube-proxy in IPVS mode
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: "systemd"               # make sure the kubelet uses the systemd cgroup driver
EOF
Initialize the cluster from kubeadm-config.yaml
kubeadm init --config kubeadm-config.yaml
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
Check the cluster status
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 17m v1.20.0
Generate the join command
kubeadm token create --print-join-command
kubeadm join 192.168.3.100:16443 --token xxxxxx --discovery-token-ca-cert-hash xxxxxx
To join a new master node, append --control-plane to the printed kubeadm join command; the master must already have the CA certificates copied over (see the master02/master03 steps below).
Worker nodes simply run the printed kubeadm join command as-is.
If something goes wrong, the node can be reset and the procedure repeated:
sudo kubeadm reset
A default init configuration can also be generated with kubeadm and then edited:
kubeadm config print init-defaults > /root/kubeadm-config.yaml
Run on master02
Create the cluster certificate directory
mkdir -p /etc/kubernetes/pki/etcd
Copy the CA certificates from master01 to this node
scp root@192.168.3.101:/etc/kubernetes/pki/ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/sa.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/front-proxy-ca.* /etc/kubernetes/pki
On master01, run kubeadm token create --print-join-command to obtain the join command,
then run that command on this node with --control-plane appended to join as a control-plane node.
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 42m v1.20.0
master02 NotReady control-plane,master 31m v1.20.0
Run on master03
Create the cluster certificate directory
mkdir -p /etc/kubernetes/pki/etcd
Copy the CA certificates from master01 to this node
scp root@192.168.3.101:/etc/kubernetes/pki/ca.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/sa.* /etc/kubernetes/pki
scp root@192.168.3.101:/etc/kubernetes/pki/front-proxy-ca.* /etc/kubernetes/pki
On master01, run kubeadm token create --print-join-command to obtain the join command,
then run that command on this node with --control-plane appended to join as a control-plane node.
Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 42m v1.20.0
master02 NotReady control-plane,master 31m v1.20.0
master03 NotReady control-plane,master 7m25s v1.20.0
5.5 Install the calico network plugin
- Run on any one master node
- calico stores its data in etcd
Download the calico manifest
curl https://docs.projectcalico.org/archive/v3.20/manifests/calico-etcd.yaml -o calico.yaml
vim calico.yaml
# Edit as follows so the etcd credentials are mounted into the calico containers
kind: Secret
......
  etcd-key:   # paste the output of: cat /etc/kubernetes/pki/etcd/server-key.pem | base64 | tr -d '\n'
  etcd-cert:  # paste the output of: cat /etc/kubernetes/pki/etcd/server.pem | base64 | tr -d '\n'
  etcd-ca:    # paste the output of: cat /etc/kubernetes/pki/etcd/ca.pem | base64 | tr -d '\n'
......
kind: ConfigMap
......
data:
  # list all etcd cluster endpoints; all master nodes must already be initialized
  etcd_endpoints: "https://192.168.3.101:2379,https://192.168.3.102:2379,https://192.168.3.103:2379"
  calico_backend: "bird"
  ipip_enabled: "false"   # disable IPIP mode and use BGP
  etcd_ca: "/calico-secrets/etcd-ca"
  etcd_cert: "/calico-secrets/etcd-cert"
  etcd_key: "/calico-secrets/etcd-key"
kind: DaemonSet
......
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"   # Pod network; must match the podSubnet used at cluster init
kubectl apply -f calico.yaml
If pulling the default images is slow or fails, pull the calico images from the mirror below, point the image references in calico.yaml at them, and re-apply the manifest:
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/cni:v3.20.6
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/pod2daemon-flexvol:v3.20.6
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/node:v3.20.6
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/kube-controllers:v3.20.6
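A sketch of the image replacement, assuming the image references in this version of calico.yaml carry the docker.io/calico/ prefix (check the manifest first and adjust the pattern if they are written as calico/... instead):
sed -i 's#docker.io/calico/#swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/#g' calico.yaml
grep 'image:' calico.yaml    # confirm all four images now point at the mirror
kubectl apply -f calico.yaml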
Verify the cluster status
kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6bc4c7656b-mkmpf 1/1 Running 0 47s
kube-system calico-node-mtpcb 1/1 Running 0 48s
kube-system calico-node-n2jjv 1/1 Running 0 48s
kube-system calico-node-rdztc 1/1 Running 0 48s
kube-system coredns-7f89b7bc75-ggfns 1/1 Running 0 70m
kube-system coredns-7f89b7bc75-gw9cx 1/1 Running 0 70m
kube-system etcd-master01 1/1 Running 0 70m
kube-system etcd-master02 1/1 Running 0 59m
kube-system etcd-master03 1/1 Running 0 35m
kube-system kube-apiserver-master01 1/1 Running 0 70m
kube-system kube-apiserver-master02 1/1 Running 0 59m
kube-system kube-apiserver-master03 1/1 Running 0 35m
kube-system kube-controller-manager-master01 1/1 Running 1 70m
kube-system kube-controller-manager-master02 1/1 Running 0 59m
kube-system kube-controller-manager-master03 1/1 Running 0 35m
kube-system kube-proxy-fldxf 1/1 Running 0 70m
kube-system kube-proxy-gmx4t 1/1 Running 0 35m
kube-system kube-proxy-l4wv7 1/1 Running 0 59m
kube-system kube-scheduler-master01 1/1 Running 0 70m
kube-system kube-scheduler-master02 1/1 Running 0 59m
kube-system kube-scheduler-master03 1/1 Running 0 35m
5.6 Check the cluster certificate validity
List the validity of all cluster certificates
kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Oct 15, 2124 12:43 UTC 99y no
apiserver Oct 15, 2124 12:43 UTC 99y ca no
apiserver-etcd-client Oct 15, 2124 12:43 UTC 99y etcd-ca no
apiserver-kubelet-client Oct 15, 2124 12:43 UTC 99y ca no
controller-manager.conf Oct 15, 2124 12:43 UTC 99y no
etcd-healthcheck-client Oct 15, 2124 12:43 UTC 99y etcd-ca no
etcd-peer Oct 15, 2124 12:43 UTC 99y etcd-ca no
etcd-server Oct 15, 2124 12:43 UTC 99y etcd-ca no
front-proxy-client Oct 15, 2124 12:43 UTC 99y front-proxy-ca no
scheduler.conf Oct 15, 2124 12:43 UTC 99y no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Oct 15, 2124 12:43 UTC 99y no
etcd-ca Oct 15, 2124 12:43 UTC 99y no
front-proxy-ca Oct 15, 2124 12:43 UTC 99y no
Check the validity of a single certificate
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep Not
Not Before: Jul 23 06:47:01 2024 GMT
Not After : Jun 29 06:52:01 2124 GMT
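To print the expiry of every certificate under /etc/kubernetes/pki in one pass, a small helper loop (not part of kubeadm) can be used:
for cert in $(find /etc/kubernetes/pki -name '*.crt' -o -name '*.pem' | grep -v -- '-key'); do
    echo "== ${cert}"
    openssl x509 -in "${cert}" -noout -enddate
done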
5.7 Verify that the apiserver is reached through the VIP
# from any master node, query the cluster version through the VIP and haproxy port
curl -k https://192.168.3.100:16443/version
{
"major": "1",
"minor": "20",
"gitVersion": "v1.20.0",
"gitCommit": "af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38",
"gitTreeState": "clean",
"buildDate": "2020-12-08T17:51:19Z",
"goVersion": "go1.15.5",
"compiler": "gc",
"platform": "linux/amd64"
master01 关闭haproxy、keepalived,先关闭haproxy不然keepalived健康检查脚本会启动haproxy
systemctl stop haproxy.service
systemctl stop keepalived.service
Result:
1. The VIP fails over to master02
2. kubectl commands on master01 keep working
On master01 and master02, stop haproxy and keepalived; again stop haproxy first, otherwise the keepalived health-check script will simply restart it
systemctl stop haproxy.service
systemctl stop keepalived.service
Result:
1. The VIP fails over to master03
2. kubectl commands on master01 keep working
Check that the kubelet and kubectl configuration point at the VIP
grep -r '192.168.3.100' /etc/kubernetes/
/etc/kubernetes/admin.conf: server: https://192.168.3.100:16443
/etc/kubernetes/kubelet.conf: server: https://192.168.3.100:16443
6. Node operations
Generate a temporary join command on a master node
kubeadm token create --print-join-command
Run the join command on the node
kubeadm join 192.168.3.100:16443 --token xxxxxx --discovery-token-ca-cert-hash xxxxxx
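Back on a master, confirm that the new node has registered (it stays NotReady until the calico pods are running on it):
kubectl get nodes -o wide
kubectl get pod -n kube-system -o wide | grep node01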
7. Enable certificate rotation in the cluster
The kubelet client certificate is only valid for one year by default; on older versions the RBAC rules for automatic approval may also need to be created.
Configure the controller-manager on the masters
Configure the controller-manager on all master nodes to sign kubelet client certificates automatically
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
spec:
  containers:
  - command:
    ......
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt   # make sure this flag is present
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key    # make sure this flag is present
    - --experimental-cluster-signing-duration=87600h0m0s       # add: validity of newly signed kubelet certificates, here 10 years (adjust as needed)
    - --feature-gates=RotateKubeletServerCertificate=true      # add: enable automatic signing/rotation of kubelet serving certificates
Restart the controller-manager on every master node, preferably leaving about one minute between nodes
docker restart <kube-controller-manager container ID>    # or: kubectl delete pod kube-controller-manager-<node name> -n kube-system
Configure the kubelet
Enable certificate rotation on the kubelet
- must be enabled on both the masters and the node
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
......
# add the following line
Environment="KUBELET_EXTRA_ARGS=--feature-gates=RotateKubeletServerCertificate=true"
# restart the kubelet
systemctl daemon-reload && systemctl restart kubelet.service
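Right after the kubelet restart, a quick check from a master that new certificate signing requests are created and approved automatically:
kubectl get csr
ls -l /var/lib/kubelet/pki/kubelet-client-current.pem    # symlink to the newest client certificate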
Verification
The node is used for the verification below
Check the validity of the current kubelet client certificate
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-2024-11-10-15-20-07.pem -noout -text | grep Not
Not Before: Nov 10 07:15:07 2024 GMT
Not After : Nov 10 07:15:07 2025 GMT
Set the node's clock to a time close to the certificate expiry; never set it past the current certificate's expiry date
date -s '2025-11-09 12:00'
Restart the kubelet
systemctl restart kubelet.service
Check the new kubelet client certificate (a new date-stamped certificate is generated and kubelet-client-current.pem is symlinked to it)
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-2025-11-09-12-00-04.pem -noout -text | grep Not
Not Before: Nov 10 07:21:30 2024 GMT
Not After : Nov 8 07:21:30 2034 GMT
ls -l /var/lib/kubelet/pki/
-rw------- 1 root root 1110 Nov 10  2024 kubelet-client-2024-11-10-15-20-07.pem
-rw------- 1 root root 1110 Nov  9 12:00 kubelet-client-2025-11-09-12-00-04.pem
lrwxrwxrwx 1 root root   59 Nov  9 12:00 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2025-11-09-12-00-04.pem
-rw-r--r-- 1 root root 2237 Nov 10  2024 kubelet.crt
-rw------- 1 root root 1679 Nov 10  2024 kubelet.key
The kubelet client certificate of a newly joined node is also issued with a 10-year validity
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-2024-11-10-14-52-34.pem -noout -text | grep Not
Not Before: Nov 10 06:47:34 2024 GMT
Not After : Nov 8 06:47:34 2034 GMT
8. Problems encountered during deployment
A master node removed from the cluster fails to rejoin
# error message
Failed to get etcd status for https://192.168.3.103:2379: failed to dial endpoint https://192.168.3.103:2379 with maintenance client: context deadline exceeded
# cause and fix
After a master node is removed, etcd still keeps its member entry. Exec into the etcd container on a healthy etcd node and remove the stale member:
docker exec -it <etcd container ID> sh
# select the v3 etcdctl API
export ETCDCTL_API=3
# set the etcdctl connection parameters
alias etcdctl='etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
List the etcd cluster members
etcdctl member list
Remove the stale member
etcdctl member remove <member ID of the removed node>
Rejoin the master node to the cluster
kubeadm join xxxxxx
etcd error: request cluster ID mismatch
Error message: request cluster ID mismatch (got 2a40defc84b50129 want 4c52452e0e3e69c8)
Fix:
Delete the data directory on all etcd nodes, then restart etcd
systemctl daemon-reload && systemctl restart etcd.service
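A sketch of that cleanup for the layout used in this guide (data directory /data1/etcd/data); run it on every etcd node, and note that it wipes all etcd data, so it is only appropriate when rebuilding the cluster:
systemctl stop etcd
rm -rf /data1/etcd/data/*
systemctl daemon-reload && systemctl restart etcd.service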