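Deploying a k8s Cluster on Ubuntu (Docker-based)
This post summarizes the pitfalls I hit while deploying a k8s cluster, along with the deployment steps.
Versions used: docker-v27.4.1, cri-dockerd-v0.3.16, kubeadm-v1.28.15
Note that my machines are arm64; the steps should also serve as a reference for x86 (amd64).
So far this only covers the basic configuration stage (up to cluster initialization).
Introduction to k8s
Kubernetes is an open-source container orchestration engine for automating the deployment, scaling, and management of containerized applications.
Put simply, k8s manages a cluster made up of Docker containers running on different machines, so that many containers can be managed in one place.
The following container runtimes are commonly used to build a cluster; this post uses Docker.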
- containerd
- CRI-O
- Docker Engine
- Mirantis Container Runtime
Cluster preparation
Cluster plan
Hostname | Node IP | Role |
---|---|---|
k8s-master | 192.168.223.129 | k8s-master |
k8s-slave1 | * | k8s-slave1 |
k8s-slave2 | * | k8s-slave2 |
Network plan
Type | Network range |
---|---|
Pod network | 10.244.0.0/16 |
Service network | 10.96.0.0/12 |
Node network | 192.168.223.0/24 |
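The three ranges in the plan must not overlap, otherwise Pod, Service, and node traffic can be routed to the wrong place. Here is a minimal sketch of an overlap check in plain shell arithmetic. The function names `ip_to_int`, `cidr_bounds`, and `cidrs_overlap` are made up for this illustration and are not part of any k8s tooling.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.244.0.0/16.
cidr_bounds() {
  prefix=${1#*/}
  base=$(ip_to_int "${1%/*}")
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  first=$(( base & mask ))
  last=$(( first | (~mask & 4294967295) ))
  echo "$first $last"
}

# Succeed (exit status 0) if the two CIDRs share at least one address.
cidrs_overlap() {
  b1=$(cidr_bounds "$1")
  b2=$(cidr_bounds "$2")
  [ "${b1% *}" -le "${b2#* }" ] && [ "${b2% *}" -le "${b1#* }" ]
}

# The ranges from the network plan above.
pod=10.244.0.0/16
svc=10.96.0.0/12
node=192.168.223.0/24

for pair in "$pod $svc" "$pod $node" "$svc $node"; do
  a=${pair% *}
  b=${pair#* }
  if cidrs_overlap "$a" "$b"; then
    echo "OVERLAP: $a $b"
  else
    echo "ok: $a $b"
  fi
done
```

If you change any of the ranges for your own network, rerun the loop with your values before going further.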
Time synchronization (on all three hosts)
# Step1: Check the current time
date
> Thu Sep 7 05:39:21 AM UTC 2024
# Step2: Change the time zone
timedatectl set-timezone Asia/Shanghai
date
> Thu Sep 7 01:39:51 PM CST 2024
# Step3: Install the ntpdate time synchronization tool
apt install ntpdate
# Step4: Schedule the sync with a cron job
crontab -e
> Select an editor. To change later, run 'select-editor'.
> 1. /bin/nano <---- easiest
> 2. /usr/bin/vim.basic
> 3. /usr/bin/vim.tiny
> 4. /bin/ed
> Choose 1-4 [1]: 2 (choose 2)
0 0 * * * ntpdate ntp.aliyun.com
# Step5: Check that the cron job was saved
crontab -l
> 0 0 * * * ntpdate ntp.aliyun.com
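Basic host configuration
Set the hostname and hosts entries
# Change the hostname (use k8s-slave1 / k8s-slave2 on the worker nodes)
hostnamectl set-hostname k8s-master
# Set up hosts resolution
vim /etc/hosts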
> 127.0.0.1 localhost
>
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
>
> 192.168.223.129 k8s-master
> 192.168.223.* k8s-slave1
> 192.168.223.* k8s-slave2
Verify
ping k8s-master
> PING k8s-master (192.168.223.129) 56(84) bytes of data.
> 64 bytes from k8s-master (192.168.223.129): icmp_seq=1 ttl=64 time=0.109 ms
> 64 bytes from k8s-master (192.168.223.129): icmp_seq=2 ttl=64 time=0.174 ms
> 64 bytes from k8s-master (192.168.223.129): icmp_seq=3 ttl=64 time=0.139 ms
> 64 bytes from k8s-master (192.168.223.129): icmp_seq=4 ttl=64 time=0.121 ms
> --- k8s-master ping statistics ---
> 4 packets transmitted, 4 received, 0% packet loss, time 3051ms
> rtt min/avg/max/mdev = 0.109/0.135/0.174/0.024 ms
# Ping the other two nodes as well
Configure iptables
iptables -P FORWARD ACCEPT
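Disable swap, the firewall, and SELinux
# Step1: Turn swap off
swapoff -a
# Keep the swap partition from being mounted again at boot
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Step2: Disable SELinux (only if it is installed; stock Ubuntu uses AppArmor, so this step usually does not apply)
sed -ri 's#(SELINUX=).*#\1disabled#' /etc/selinux/config
setenforce 0
# Step3: Turn off the firewall (firewalld only if it is installed; Ubuntu ships ufw)
systemctl disable firewalld && systemctl stop firewalld
ufw disable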
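One thing worth checking: the sed expression above matches the literal text " swap " with a space on each side, so it will not touch an fstab whose fields are separated by tabs. Below is a sketch of a whitespace-tolerant variant, tried on a throwaway sample file instead of the real /etc/fstab. The function name `comment_swap` and the sample entries are made up for this illustration.

```shell
# Comment out every active fstab line whose fields include "swap".
# [[:space:]] matches both spaces and tabs; lines that already start
# with '#' are left alone.
comment_swap() {
  sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Build a scratch file with one root entry and one tab-separated swap entry.
sample=$(mktemp)
printf '%s\n' 'UUID=aaaa-bbbb / ext4 defaults 0 1' > "$sample"
printf '/swap.img\tnone\tswap\tsw\t0\t0\n' >> "$sample"

comment_swap "$sample"
cat "$sample"

# On a real host, after editing /etc/fstab, this should print nothing:
#   swapon --show
```

If the real fstab still has an active swap line after the original command, running the same expression against /etc/fstab (with sudo) is one way to catch it.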
Adjust kernel parameters to allow traffic forwarding
# Load the required kernel modules at boot
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Set the required sysctl parameters; they persist across reboots and allow traffic forwarding
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Verify that the settings took effect
sudo sysctl --system
> * Applying /etc/sysctl.d/k8s.conf ...
> net.bridge.bridge-nf-call-ip6tables = 1
> net.bridge.bridge-nf-call-iptables = 1
> net.ipv4.ip_forward = 1
lsmod | grep br_netfilter
> br_netfilter 28672 0
> bridge 233472 1 br_netfilter
lsmod | grep overlay
> overlay 143360 10
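Reading through the `sysctl --system` output by eye is easy to get wrong, so here is a small sketch that reads the three values back from /proc/sys and flags anything that is not 1. The function name `check_sysctl` and its optional directory argument are made up for this illustration; the argument exists only so the loop can be tried against a scratch directory.

```shell
# Print each required kernel setting and fail if any of them is not 1.
check_sysctl() {
  root=${1:-/proc/sys}   # hypothetical override, defaults to the live kernel values
  rc=0
  for key in net/bridge/bridge-nf-call-iptables \
             net/bridge/bridge-nf-call-ip6tables \
             net/ipv4/ip_forward; do
    # A missing file usually means br_netfilter is not loaded.
    val=$(cat "$root/$key" 2>/dev/null || echo missing)
    echo "$key = $val"
    [ "$val" = "1" ] || rc=1
  done
  return $rc
}

check_sysctl || echo "some settings are missing or not 1"
```

A "missing" value for the two bridge keys points back at the modprobe step rather than at the sysctl file.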
Enable IPVS and load its modules
# Install the related tools
apt install -y ipset ipvsadm
# Load the related modules automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
# Load the modules now
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack
# Verify that the modules loaded
lsmod |grep -e ip_vs -e nf_conntrack
> ip_vs_sh 20480 0
> ip_vs_wrr 20480 0
> ip_vs_rr 20480 0
> ip_vs 192512 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
> nf_conntrack 184320 5 xt_conntrack,nf_nat,xt_nat,xt_MASQUERADE,ip_vs
> nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
> nf_defrag_ipv4 16384 1 nf_conntrack
> libcrc32c 16384 5 nf_conntrack,nf_nat,btrfs,raid456,ip_vs
Installing Docker
k8s exists to manage containers. The container engine used here is Docker.
Install dependencies
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
Add Docker's GPG public key
I use the Aliyun mirror; you can search for docker-ce on its mirror site.
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
Add the repository entry
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Install Docker
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
(Appendix) Installing a specific Docker version
apt-cache madison docker-ce
# Install the version you picked
apt-get -y install docker-ce=[VERSION]
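Edit the configuration file
# Create the configuration directory
sudo mkdir -p /etc/docker
# Write the following content (tee is used so the write itself runs with sudo)
sudo tee /etc/docker/daemon.json > /dev/null <<EOF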
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": [
"https://8xpk5wnt.mirror.aliyuncs.com"
]
}
EOF
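# Reload the configuration and restart Docker
sudo systemctl daemon-reload && sudo systemctl restart docker
- "exec-opts": ["native.cgroupdriver=systemd"]: Docker's cgroup driver has traditionally defaulted to cgroupfs; the official docs recommend systemd
- The second entry is an Aliyun registry mirror
Enable start on boot
systemctl enable --now docker
Verify
# Check the cgroup driver setting
docker info | grep Cgroup
# Pull an image and run it (Docker Hub is hard to pull from here, for reasons you can guess)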
docker run --name some-nginx -d -p 8080:80 registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
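The grep above prints every line containing "Cgroup", which leaves the actual comparison to the reader. Here is a sketch that pulls out just the driver name and compares it with systemd. The function name `cgroup_driver_of` is made up for this illustration, and it assumes `docker info` prints a "Cgroup Driver:" line as it does on my machine.

```shell
# Read `docker info` text on stdin and print the value after "Cgroup Driver:".
cgroup_driver_of() {
  sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p'
}

# Only query Docker when the CLI is actually installed.
if command -v docker >/dev/null 2>&1; then
  driver=$(docker info 2>/dev/null | cgroup_driver_of)
  if [ "$driver" = "systemd" ]; then
    echo "ok: cgroup driver is systemd"
  else
    echo "unexpected cgroup driver: ${driver:-unknown}"
  fi
fi
```

If this reports anything other than systemd, check /etc/docker/daemon.json and restart Docker again.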
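Installing cri-dockerd
Dockershim was deprecated in k8s v1.20 and removed in v1.24, so k8s can no longer talk to Docker directly; it has to go through the CRI API that cri-dockerd provides.
Put simply, using Docker directly fails, because the runtime k8s connects to by default is containerd.
Only the tarball installation is shown here; the rpm route should be simpler, see: https://aluopy.cn/docker/cri-dockerd-install/
Download the package
Release page: https://github.com/Mirantis/cri-dockerd/releases/tag/v0.3.16
If you have a proxy, just download it with wget. Pay attention to your server's architecture.
I will attach the amd64 tarball and the rpm package later; this section covers the tarball installation.
# Download into /opt (arm64 here; use the .amd64.tgz asset on x86)
sudo wget -P /opt https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.16/cri-dockerd-0.3.16.arm64.tgz
# Extract; the archive unpacks to a cri-dockerd/ directory that contains the binary
tar -xvzf /opt/cri-dockerd-0.3.16.arm64.tgz -C /tmp
sudo install -m 0755 /tmp/cri-dockerd/cri-dockerd /usr/local/bin/cri-dockerd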
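Since the release assets are named by architecture, it is easy to download one file and then try to extract another. Here is a small sketch that builds the download URL from a version and an architecture, so both steps use the same name. The function name `cri_dockerd_url` is made up for this illustration, and it assumes only the amd64 and arm64 tarballs are of interest.

```shell
# Print the release tarball URL for a given cri-dockerd version and architecture.
cri_dockerd_url() {
  version=$1   # e.g. 0.3.16
  arch=$2      # as printed by `dpkg --print-architecture`: amd64 or arm64
  case "$arch" in
    amd64|arm64) ;;
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  echo "https://github.com/Mirantis/cri-dockerd/releases/download/v${version}/cri-dockerd-${version}.${arch}.tgz"
}

# Example with the version used in this post.
cri_dockerd_url 0.3.16 arm64

# On a real host the architecture can be detected instead of typed:
#   url=$(cri_dockerd_url 0.3.16 "$(dpkg --print-architecture)")
#   sudo wget -P /opt "$url"
```

The file name to extract is then simply the last path component of the URL, for example `basename "$url"`.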
Write the unit files
Two unit files are needed so that systemctl can manage the service: cri-docker.service and cri-docker.socket
# cri-docker.service (the quoted 'EOF' keeps the shell from expanding $MAINPID)
cat > /lib/systemd/system/cri-docker.service << 'EOF'
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# cri-docker.socket
cat > /lib/systemd/system/cri-docker.socket << 'EOF'
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
Start and enable on boot
# Enable on boot, start now, and create the socket
systemctl enable --now cri-docker.socket cri-docker.service
# The socket now shows up under /var/run; this path is needed for initialization!
ll /var/run/cri-dockerd.sock
> srw-rw---- 1 root docker 0 Jan 10 19:53 /var/run/cri-dockerd.sock=
Installing and initializing k8s
Install k8s
mkdir -p /etc/apt/keyrings
# Download the GPG public key
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key |
gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
# Add the repository entry
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /" |
tee /etc/apt/sources.list.d/kubernetes.list
# Update the package index and install kubelet, kubeadm, and kubectl
apt-get update
apt-get install -y kubelet kubeadm kubectl
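# Hold the versions so automatic system upgrades do not break the cluster
sudo apt-mark hold kubelet kubeadm kubectl
Only the install commands are shown here; the process is similar to the Docker installation.
Adjust parameters (only needed on the master node; on the other nodes installing the packages is enough)
Whether startup succeeds depends on getting these parameters right.
1. Set the kubelet cgroup driver to systemd
vim /etc/default/kubelet
> KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
# Write the line above into the file, then press ESC and type ":wq" to save and quit
Generate the configuration file and edit it
# Generate the default configuration file (I put it directly under /opt; pick any location)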
kubeadm config print init-defaults > kubeadm.yaml
# Check the configuration file carefully; this matters!
# cat kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.223.129 # change this to the master node's IP address
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock # change this to the socket path created by cri-dockerd
  imagePullPolicy: IfNotPresent
  name: k8s-master # change this to your own hostname (master node)
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers # switch the registry here; the upstream one is unreachable without a proxy
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16 # add this line; it is the Pod network range
  serviceSubnet: 10.96.0.0/12
scheduler: {}
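- localAPIEndpoint.advertiseAddress: the master node's address
- nodeRegistration.criSocket: the socket path provided by cri-dockerd
- imageRepository: the registry images are pulled from
- networking.podSubnet: the network range used by Pods. You can copy it as is; just make sure it does not conflict with the host network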
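If you would rather not edit the file by hand, the four fields listed above can also be patched with sed. This is only a sketch: the function name `patch_kubeadm_yaml` is made up for this illustration, the values are the ones used in this post, and it relies on GNU sed as shipped with Ubuntu.

```shell
# Patch the fields discussed above in a kubeadm config file.
# $1: path to kubeadm.yaml, $2: master node IP, $3: master hostname
patch_kubeadm_yaml() {
  file=$1
  master_ip=$2
  node_name=$3
  sed -i -E \
    -e "s|^([[:space:]]*advertiseAddress:).*|\1 ${master_ip}|" \
    -e "s|^([[:space:]]*criSocket:).*|\1 unix:///var/run/cri-dockerd.sock|" \
    -e "s|^([[:space:]]*name:).*|\1 ${node_name}|" \
    -e "s|^(imageRepository:).*|\1 registry.aliyuncs.com/google_containers|" \
    "$file"
  # Add podSubnet right after dnsDomain, with the same indentation,
  # but only if the file does not define it yet.
  grep -q 'podSubnet:' "$file" ||
    sed -i -E 's|^([[:space:]]*)dnsDomain:.*|&\n\1podSubnet: 10.244.0.0/16|' "$file"
}

# Typical use, right after generating the defaults:
#   kubeadm config print init-defaults > kubeadm.yaml
#   patch_kubeadm_yaml kubeadm.yaml 192.168.223.129 k8s-master
```

Whichever way you edit the file, read through the result once before moving on, since a wrong value here is the most common reason initialization fails.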
Pull the images in advance
# List the required images
kubeadm config images list --config kubeadm.yaml
> registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.0
> registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.0
> registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.0
> registry.aliyuncs.com/google_containers/kube-proxy:v1.28.0
> registry.aliyuncs.com/google_containers/pause:3.9
> registry.aliyuncs.com/google_containers/etcd:3.5.15-0
> registry.aliyuncs.com/google_containers/coredns:v1.10.1
# Pull them ahead of time
kubeadm config images pull --config kubeadm.yaml
> [config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.0
> [config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.0
> [config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.0
> [config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.28.0
> [config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
> [config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.15-0
> [config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1
# After the pull succeeds, check with docker images
docker images
> REPOSITORY TAG IMAGE ID CREATED SIZE
> registry.aliyuncs.com/google_containers/etcd 3.5.15-0 27e3830e1402 5 months ago 139MB
> registry.aliyuncs.com/google_containers/kube-apiserver v1.28.0 00543d2fe5d7 17 months ago 119MB
> registry.aliyuncs.com/google_containers/kube-controller-manager v1.28.0 46cc66ccc7c1 17 months ago 116MB
> registry.aliyuncs.com/google_containers/kube-scheduler v1.28.0 762dce4090c5 17 months ago 57.8MB
> registry.aliyuncs.com/google_containers/kube-proxy v1.28.0 940f54a5bcae 17 months ago 68.3MB
> registry.aliyuncs.com/google_containers/coredns v1.10.1 97e04611ad43 23 months ago 51.4MB
> registry.aliyuncs.com/google_containers/pause 3.9 829e9de338bd 2 years ago 514kB
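Initialization
# Once the steps above have succeeded, start the initialization on the master node
kubeadm init --config kubeadm.yaml
On success, the following message is printed: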
···
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.223.129:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:c8954c91e9186ff1c485b5cd2701b69d5227ef33d98a5395268eb02332666611
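At this point the master node has been initialized successfully. This is as far as I have gotten so far, so I am recording it here.
Verify
Finally, run the commands from the success message and check that the master node is present.
# Run these as is; they come from the message printed after initialization and need no changes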
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
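# Check that the master node is listed and that the containers started (NotReady is expected until a pod network add-on is deployed)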
kubectl get nodes
> NAME STATUS ROLES AGE VERSION
> k8s-master NotReady control-plane 3h46m v1.28.15
docker ps
> CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
> 1218ac19cf38 940f54a5bcae "/usr/local/bin/kube…" 4 hours ago Up 4 hours k8s_kube-proxy_kube-proxy-qctdx_kube-system_1760cfa5-36f2-4cdf-8633-c21ea1b062e9_0
> b3048947a366 registry.aliyuncs.com/google_containers/pause:3.9 "/pause" 4 hours ago Up 4 hours k8s_POD_kube-proxy-qctdx_kube-system_1760cfa5-36f2-4cdf-8633-c21ea1b062e9_0
> 05c9b056d2ba 00543d2fe5d7 "kube-apiserver --ad…" 4 hours ago Up 4 hours k8s_kube-apiserver_kube-apiserver-k8s-master_kube-system_57f73696a474af51d70e9b1c94ce436a_0
> 8e2f18909852 762dce4090c5 "kube-scheduler --au…" 4 hours ago Up 4 hours k8s_kube-scheduler_kube-scheduler-k8s-master_kube-system_de1d2e45d6b9a2308036bffaaadd3eac_0
> 3ae551e86403 27e3830e1402 "etcd --advertise-cl…" 4 hours ago Up 4 hours k8s_etcd_etcd-k8s-master_kube-system_08cdc7707b74498e298e27f1832b2cac_0
> 8a68b0f2521a 46cc66ccc7c1 "kube-controller-man…" 4 hours ago Up 4 hours k8s_kube-controller-manager_kube-controller-manager-k8s-master_kube-system_d78de7bd32e364977b61756008eae129_0
> 1fbd55621266 registry.aliyuncs.com/google_containers/pause:3.9 "/pause" 4 hours ago Up 4 hours k8s_POD_kube-controller-manager-k8s-master_kube-system_d78de7bd32e364977b61756008eae129_0
> 65f41294aba6 registry.aliyuncs.com/google_containers/pause:3.9 "/pause" 4 hours ago Up 4 hours k8s_POD_etcd-k8s-master_kube-system_08cdc7707b74498e298e27f1832b2cac_0
> 2cd06e9a4ecc registry.aliyuncs.com/google_containers/pause:3.9 "/pause" 4 hours ago Up 4 hours k8s_POD_kube-apiserver-k8s-master_kube-system_57f73696a474af51d70e9b1c94ce436a_0
> f5a3742507bf registry.aliyuncs.com/google_containers/pause:3.9 "/pause" 4 hours ago Up 4 hours k8s_POD_kube-scheduler-k8s-master_kube-system_de1d2e45d6b9a2308036bffaaadd3eac_0