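Background
I previously installed clusters with kubespray's ansible playbooks. Although kubespray is an officially recommended tool, it has the following drawbacks:
- The playbook orchestration is hard to follow.
- Creating a cluster generates many configuration files and dependencies that the cluster then relies on.
As a result, clusters installed with kubespray are hard to maintain, and it is hard to build custom operations tooling on top of them. Cluster-related operations workflows end up locked in to kubespray.
So I have since moved to kubeadm-ha from the GitHub community. As the name suggests, it deploys the cluster directly with kubeadm, and my experience with it in production has been good.
This post revisits how to install a cluster by hand, without a pre-built playbook, to deepen my understanding of k8s.
Reference: Bootstrapping clusters with kubeadm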
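Implementation
Resource preparation
The official documentation lists the following requirements:
- A compatible Linux host. The Kubernetes project provides generic instructions for Linux distributions based on Debian and Red Hat, and for distributions without a package manager.
- 2 GB or more of RAM per machine (any less leaves little memory for your applications).
- 2 or more CPU cores.
- Full network connectivity between all machines in the cluster (a public or private network is fine).
- A unique hostname, MAC address, and product_uuid for every node. See the official documentation for details.
- Certain ports open on your machines. See the official documentation for details.
- Swap disabled. You must disable swap for the kubelet to work properly.
  - For example, sudo swapoff -a disables swap temporarily. To make the change persist across reboots, make sure swap is also disabled in configuration files such as /etc/fstab or systemd.swap, depending on how your system is configured.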
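The requirements above can be checked on each node before going further. The following is a minimal sketch, not an official tool: the function names, the optional file arguments, and the 1700000 kB threshold are my own assumptions (a VM sized at "2 GB" reports somewhat less than that in /proc/meminfo).

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; names and thresholds are illustrative assumptions.

# Succeeds when MemTotal in the given meminfo file is at least min_kb.
mem_ok() {
    local meminfo="${1:-/proc/meminfo}" min_kb="${2:-1700000}"
    local total_kb
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    [ "${total_kb:-0}" -ge "$min_kb" ]
}

# Succeeds when the core count (default: output of nproc) is at least min_cpus.
cpu_ok() {
    local count="${1:-$(nproc)}" min_cpus="${2:-2}"
    [ "$count" -ge "$min_cpus" ]
}

# Succeeds when no swap area is active. /proc/swaps always starts with a
# header line; any further non-empty line is an active swap area.
swap_off() {
    local swaps="${1:-/proc/swaps}"
    ! tail -n +2 "$swaps" | grep -q .
}

# Print a warning for every requirement that is not met.
preflight() {
    mem_ok   || echo "WARN: less than about 2 GB of RAM"
    cpu_ok   || echo "WARN: fewer than 2 CPU cores"
    swap_off || echo "WARN: swap is still enabled"
    echo "product_uuid: $(cat /sys/class/dmi/id/product_uuid 2>/dev/null || echo unreadable)"
}
```

Calling preflight as root on each node prints one warning per unmet requirement, plus the product_uuid so the values can be compared across nodes.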
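My test machines are virtual machines created with pve (Proxmox VE) on a local server. The goal is a minimal installation that still provides high availability.
master * 3 & worker * 2
OS: Ubuntu 20.04
CPU: 2 cores
Mem: 2 GB
Disk: 50 GB
Master IP: 192.168.2.21; 192.168.2.22; 192.168.2.23
Worker IP: 192.168.2.24; 192.168.2.25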
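Base environment
After installing the operating system from a stock community image, some system settings usually need to be changed. The steps below are the OS initialization required before installing k8s.
All five machines need these steps, so I recommend running them in batch with ansible as the root user.
Note: using the upstream kubernetes repositories requires working network access to them, and access from mainland China is poor. You can use a mirror instead, such as Alibaba Cloud's; this post does not cover that.
Run on all 5 VMs:
OS base configuration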
- ansible: configure the hosts inventory
    [root@ centos-ops /ansible/kubeadm-test] 14:55:00 # cat hosts
    [master]
    192.168.2.21
    192.168.2.22
    192.168.2.23
    [worker]
    192.168.2.24
    192.168.2.25
- Set the hostname
    # replace *** with the hostname of each VM
    hostnamectl set-hostname ***
- ansible: configure /etc/hosts
    # entries to distribute
    cat files/hosts
    192.168.2.21 k8s-master-1
    192.168.2.22 k8s-master-2
    192.168.2.23 k8s-master-3
    192.168.2.24 k8s-worker-1
    192.168.2.25 k8s-worker-2
    # distribute the file with ansible
    ansible all -i hosts -m copy -a "src=./files/hosts dest=/tmp/hosts"
    # append it to /etc/hosts (run once; repeating this step appends duplicate entries)
    ansible all -i hosts -m shell -a "cat /tmp/hosts >> /etc/hosts"
    # verify /etc/hosts
    ansible all -i hosts -m shell -a "cat /etc/hosts"
- ansible: disable the firewall
    ansible all -i hosts -m shell -a "systemctl stop ufw && systemctl disable ufw"
- ansible: disable selinux
    # an error or a "disabled" message means selinux is not enabled
    ansible all -i hosts -m shell -a "setenforce 0"
- ansible: configure ntp time synchronization
    # sync immediately
    ansible all -i hosts -m shell -a "ntpdate time.windows.com"
    # scheduled job
    ansible all -i hosts -m cron -a "name='Run ntpdate' minute=0 hour=0 job='ntpdate time.windows.com'"
- ansible: disable swap
    # turn swap off immediately
    ansible all -i hosts -m shell -a "swapoff --all"
    # stop swap from being mounted at boot (this pattern only removes the /swap.img entry; adjust it if swap is configured differently)
    ansible all -i hosts -m shell -a "sed -i '/^\/swap\.img/d' /etc/fstab"
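  Reference: Installing kubeadm. The official documentation requires swap to be turned off for performance and stability.
  Reference: swap analysis, which explains in detail how swap works.
  Reference: New feature: alpha support for using swap memory. Since v1.22, Kubernetes can run with swap enabled behind a feature gate; in later releases this requires cgroups v2 on the node. Note that cgroups v2 is a kernel feature and is not the same thing as the systemd cgroup driver, although the two are usually used together. In production, enabling swap is still not recommended.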
- Enable the br_netfilter and overlay modules
    # load the modules
    ansible all -i hosts -m shell -a "modprobe br_netfilter; modprobe overlay"
    # check that the modules are loaded
    ansible all -i hosts -m shell -a "lsmod | grep -E 'br_netfilter|overlay'"
- ansible: change some kernel parameters
    # kernel parameter file
    cat files/kubernetes.conf
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    # distribute the file
    ansible all -i hosts -m copy -a "src=./files/kubernetes.conf dest=/etc/sysctl.d/kubernetes.conf mode=0644"
    # load the settings
    ansible all -i hosts -m shell -a "sysctl -p /etc/sysctl.d/kubernetes.conf"
    # confirm the settings are active
    ansible all -i hosts -m shell -a "sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward"
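Two of the steps above do not survive a reboot as written: modprobe only loads the modules for the current boot, and the sed pattern for /etc/fstab only matches a swap file named /swap.img. The sketch below shows one way to make both persistent. The function names and the k8s.conf file name are my own choices, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; function names and the k8s.conf file name are assumptions.

# Comment out every active swap entry in an fstab-style file and keep a .bak copy.
# A swap entry is any non-comment line whose third field (filesystem type) is "swap".
disable_swap_in_fstab() {
    local fstab="${1:-/etc/fstab}"
    cp "$fstab" "$fstab.bak"
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}

# Write a modules-load.d file so overlay and br_netfilter are loaded at boot.
persist_modules() {
    local dir="${1:-/etc/modules-load.d}"
    mkdir -p "$dir"
    printf '%s\n' overlay br_netfilter > "$dir/k8s.conf"
}
```

On a node, as root, running disable_swap_in_fstab followed by persist_modules with no arguments applies both changes to the real system paths.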
Container environment configuration
- ansible: install docker and containerd
    # add Docker's official GPG key
    ansible all -i hosts -m shell -a "curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg"
    # add Docker's stable repository
    ansible all -i hosts -m shell -a 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null'
    # refresh the package lists
    ansible all -i hosts -m shell -a "apt update"
    # install docker
    ansible all -i hosts -m shell -a "apt-get install docker-ce docker-ce-cli -y"
    # start docker and enable it at boot
    ansible all -i hosts -m shell -a "systemctl start docker && systemctl enable docker"
    # check the docker version
    ansible all -i hosts -m shell -a "docker --version"
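- ansible: configure the parameters containerd needs to act as the container runtime
  Reference: Container Runtimes
    # set SystemdCgroup and the pause image
    ansible all -i hosts -m shell -a 'echo "SystemdCgroup = true\nsandbox_image = \"registry.k8s.io/pause:3.2\"" >> /etc/containerd/config.toml'
    # re-enable the cri plugin
    # the original line is disabled_plugins = ["cri"], which prevents Kubernetes from using containerd directly
    ansible all -i hosts -m shell -a "sed -i '/disabled_plugins/d' /etc/containerd/config.toml"
    # restart to load the configuration
    ansible all -i hosts -m shell -a 'systemctl restart containerd'
  Note: containerd only honors these two keys inside the matching tables of config.toml, so lines appended to the end of the file may be ignored. The kubeadm init output later in this post still reports pause:3.6 as the sandbox image, which suggests the sandbox_image line did not take effect here.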
- ansible: set up crictl
  Reference: crictl
    # write the configuration file
    vim files/crictl.yaml
    runtime-endpoint: unix:///run/containerd/containerd.sock
    image-endpoint: unix:///run/containerd/containerd.sock
    timeout: 10
    debug: false
    # distribute the configuration file
    ansible all -i hosts -m copy -a "src=./files/crictl.yaml dest=/etc/crictl.yaml mode=0644"
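For the containerd settings, a commonly used alternative to appending lines is to regenerate the full default configuration and edit the two keys where they already sit. The sketch below assumes a containerd 1.x style config.toml (where the default file contains SystemdCgroup = false and a sandbox_image line) and GNU sed; the function name is hypothetical, and the pause:3.9 default is taken from the kubeadm warning shown later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes a containerd 1.x config.toml and GNU sed.
# Patches the file in place:
#   - switches the runc runtime to the systemd cgroup driver
#   - points sandbox_image at the given pause image
patch_containerd_config() {
    local cfg="$1" pause_image="${2:-registry.k8s.io/pause:3.9}"
    sed -i \
        -e 's/^\([[:space:]]*\)SystemdCgroup = false/\1SystemdCgroup = true/' \
        -e "s|^\([[:space:]]*\)sandbox_image = .*|\1sandbox_image = \"$pause_image\"|" \
        "$cfg"
}

# Intended use on a node, as root (not executed here):
#   containerd config default > /etc/containerd/config.toml
#   patch_containerd_config /etc/containerd/config.toml
#   systemctl restart containerd
```

Because both keys keep their original indentation and position, they stay inside the tables where containerd looks for them.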
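Configuring all of this by hand is extremely tedious. I may write a separate post that covers OS initialization in detail and presents it as an ansible playbook.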
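Installing kubeadm and its dependencies
All five machines need the steps below, so I recommend running them in batch with ansible.
Run everything directly as the root user.
- Update the apt package index and install the packages needed to use the Kubernetes apt repository:
    apt-get update
    apt-get install -y apt-transport-https ca-certificates curl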
- Download the Google Cloud public signing key:
    mkdir /etc/apt/keyrings -p
    curl -fsSL https://dl.k8s.io/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-archive-keyring.gpg
- Add the Kubernetes apt repository:
    echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | tee /etc/apt/sources.list.d/kubernetes.list
  Note: the legacy apt.kubernetes.io repository used here has since been deprecated in favor of the community-owned pkgs.k8s.io repositories.
- Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions:
    apt-get update
    apt-get install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl
- Check the versions of kubelet, kubeadm, and kubectl. As the output shows, the install command above installs the newest available version (latest).
    [root@ centos-ops /ansible/kubeadm-test] 18:12:58 # ansible all -i hosts -m shell -a "apt-get install -y kubelet kubeadm kubectl"
    192.168.2.23 | CHANGED | rc=0 >>
    Reading package lists...
    Building dependency tree...
    Reading state information...
    kubeadm is already the newest version (1.27.4-00).
    kubectl is already the newest version (1.27.4-00).
    kubelet is already the newest version (1.27.4-00).
    0 upgraded, 0 newly installed, 0 to remove and 63 not upgraded.
    192.168.2.24 | CHANGED | rc=0 >>
    Reading package lists...
    Building dependency tree...
    Reading state information...
    kubeadm is already the newest version (1.27.4-00).
    kubectl is already the newest version (1.27.4-00).
    kubelet is already the newest version (1.27.4-00).
    0 upgraded, 0 newly installed, 0 to remove and 157 not upgraded.
    192.168.2.25 | CHANGED | rc=0 >>
    Reading package lists...
    Building dependency tree...
    Reading state information...
    kubeadm is already the newest version (1.27.4-00).
    kubectl is already the newest version (1.27.4-00).
    kubelet is already the newest version (1.27.4-00).
    0 upgraded, 0 newly installed, 0 to remove and 157 not upgraded.
    192.168.2.21 | CHANGED | rc=0 >>
    Reading package lists...
    Building dependency tree...
    Reading state information...
    kubeadm is already the newest version (1.27.4-00).
    kubectl is already the newest version (1.27.4-00).
    kubelet is already the newest version (1.27.4-00).
    0 upgraded, 0 newly installed, 0 to remove and 157 not upgraded.
    192.168.2.22 | CHANGED | rc=0 >>
    Reading package lists...
    Building dependency tree...
    Reading state information...
    kubeadm is already the newest version (1.27.4-00).
    kubectl is already the newest version (1.27.4-00).
    kubelet is already the newest version (1.27.4-00).
    0 upgraded, 0 newly installed, 0 to remove and 157 not upgraded.
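Since the install step above always picks up the newest packages, it can help to tie the apt source to one Kubernetes minor release. The sketch below builds the source line and key URL for the community-owned pkgs.k8s.io repositories, which publish one repository per minor version. The function names and the keyring path are my own assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; function names and the default keyring path are assumptions.

# Print the apt source line for one Kubernetes minor version, for example v1.27.
k8s_apt_source() {
    local minor="$1" keyring="${2:-/etc/apt/keyrings/kubernetes-apt-keyring.gpg}"
    echo "deb [signed-by=$keyring] https://pkgs.k8s.io/core:/stable:/$minor/deb/ /"
}

# Print the URL of the repository signing key for that minor version.
k8s_apt_key_url() {
    echo "https://pkgs.k8s.io/core:/stable:/$1/deb/Release.key"
}

# Intended use on a node, as root (not executed here):
#   curl -fsSL "$(k8s_apt_key_url v1.27)" | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
#   k8s_apt_source v1.27 > /etc/apt/sources.list.d/kubernetes.list
#   apt-get update
#   apt-cache madison kubeadm        # list the exact package versions on offer
```

With the source limited to one minor release, apt-get install only moves between patch versions of that release, and apt-mark hold keeps the packages from being upgraded unintentionally.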
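Creating the cluster with kubeadm
Reference: kubeadm init
There are two ways to drive kubeadm:
- Passing flags on the command line
- Loading a configuration file
Initializing the first master
Reference: Creating Highly Available Clusters with kubeadm
Reference: kubeadm init flag reference
kubeadm can take a YAML configuration file through --config, or it can be given flags directly. This post uses flags as the example.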
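For comparison, here is a sketch of what the configuration-file variant could look like for the values used in this walkthrough (node name k8s-master-1, endpoint 192.168.2.21:6443, pod CIDR 10.224.0.0/16, service CIDR 10.96.0.0/12). It assumes the kubeadm.k8s.io/v1beta3 config API that ships with v1.27; the function name and output file name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper; writes a kubeadm config file that mirrors the flags
# used in this post. Assumes the kubeadm.k8s.io/v1beta3 API.
write_kubeadm_config() {
    local out="$1"
    cat > "$out" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "192.168.2.21"
nodeRegistration:
  name: "k8s-master-1"
  criSocket: "unix:///run/containerd/containerd.sock"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.2.21:6443"
networking:
  podSubnet: "10.224.0.0/16"
  serviceSubnet: "10.96.0.0/12"
EOF
}

# Intended use on the first master, as root (not executed here):
#   write_kubeadm_config kubeadm-config.yaml
#   kubeadm init --config kubeadm-config.yaml --upload-certs
```

A file like this is easier to keep under version control than a long command line, which is the main reason to prefer it outside of a quick experiment.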
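To make the apiserver highly available on a cloud provider, you create a load balancer in advance and use a TCP listener to forward traffic to the apiserver on every master. In your own IDC you can use keepalived + haproxy instead.
This test environment does not set up real high availability. The apiserver of the first master stands in as the highly available endpoint.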
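To give an idea of the haproxy half of a keepalived + haproxy setup, the sketch below prints a minimal TCP front end and back end for the three masters. It is not used in this walkthrough. The function name, section names, and server names are illustrative, the global and defaults sections (including timeouts) are omitted, and the listen port is a parameter because haproxy cannot bind 6443 on a host where the apiserver already listens on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper; prints a minimal haproxy fragment for the apiservers.
# Usage: render_haproxy_cfg <listen_port> <master_ip>...
render_haproxy_cfg() {
    local port="$1"
    shift
    cat <<EOF
frontend kube-apiserver
    bind *:$port
    mode tcp
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    balance roundrobin
EOF
    # One backend server per master, health-checked on the apiserver port.
    local i=1 ip
    for ip in "$@"; do
        echo "    server master-$i $ip:6443 check"
        i=$((i + 1))
    done
}
```

For example, render_haproxy_cfg 8443 192.168.2.21 192.168.2.22 192.168.2.23 prints a fragment that could be appended to haproxy.cfg on the load balancer hosts; keepalived would then float a virtual IP across those hosts, and that IP and port would become the control-plane endpoint.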
- On the first master, 192.168.2.21, run
    kubeadm init --node-name "k8s-master-1" \
      --control-plane-endpoint "192.168.2.21:6443" \
      --apiserver-advertise-address "192.168.2.21" \
      --pod-network-cidr "10.224.0.0/16" \
      --service-cidr "10.96.0.0/12" \
      --cri-socket "unix:///run/containerd/containerd.sock" \
      --upload-certs
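  The flags above are the ones required to create a highly available cluster.
  Flag reference:
  - control-plane-endpoint: IP and port of the load balancer
    - set to the first master here
  - apiserver-advertise-address: primary IP of the current machine
    - set to the first master here
  - pod-network-cidr: pod network address range
  - service-cidr: service network address range
  - cri-socket: container runtime socket
    - the same endpoint as configured in /etc/crictl.yaml
  - upload-certs: upload the control plane certificates to the kubeadm-certs Secret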
  Output
    [init] Using Kubernetes version: v1.27.4
    [preflight] Running pre-flight checks
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    W0809 16:49:38.779897 250597 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [k8s-master-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.2.21]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    [certs] etcd/server serving cert is signed for DNS names [k8s-master-1 localhost] and IPs [192.168.2.21 127.0.0.1 ::1]
    [certs] Generating "etcd/peer" certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [k8s-master-1 localhost] and IPs [192.168.2.21 127.0.0.1 ::1]
    [certs] Generating "etcd/healthcheck-client" certificate and key
    [certs] Generating "apiserver-etcd-client" certificate and key
    [certs] Generating "sa" key and public key
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "kubelet.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Starting the kubelet
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [apiclient] All control plane components are healthy after 4.501651 seconds
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
    [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [upload-certs] Using certificate key: a7818254f2ecd6ba07022e9929b9b5d54128e6a931a62e6e8b0be0db52c31135
    [mark-control-plane] Marking the node k8s-master-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
    [mark-control-plane] Marking the node k8s-master-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
    [bootstrap-token] Using token: qw97bt.w0rcfkyao23r7gxf
    [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
    [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
    [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy

    Your Kubernetes control-plane has initialized successfully!

    To start using your cluster, you need to run the following as a regular user:

      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Alternatively, if you are the root user, you can run:

      export KUBECONFIG=/etc/kubernetes/admin.conf

    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/

    You can now join any number of the control-plane node running the following command on each as root:

      kubeadm join 192.168.2.21:6443 --token qw97bt.w0rcfkyao23r7gxf \
        --discovery-token-ca-cert-hash sha256:28865414ee96d6cdf32eb63e6a45d2f8fd51b0fd2113e21beacd8c0fa87bc9cc \
        --control-plane --certificate-key a7818254f2ecd6ba07022e9929b9b5d54128e6a931a62e6e8b0be0db52c31135

    Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
    As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
    "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

    Then you can join any number of worker nodes by running the following on each as root:

      kubeadm join 192.168.2.21:6443 --token qw97bt.w0rcfkyao23r7gxf \
        --discovery-token-ca-cert-hash sha256:28865414ee96d6cdf32eb63e6a45d2f8fd51b0fd2113e21beacd8c0fa87bc9cc
  This is the output of a successful run. It includes example commands for adding control-plane or worker nodes, which are used later in this post.
- Configure kubectl and check the status
    # mkdir -p $HOME/.kube
    # cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    # chown $(id -u):$(id -g) $HOME/.kube/config
    # kubectl get nodes
    NAME           STATUS     ROLES           AGE   VERSION
    k8s-master-1   NotReady   control-plane   13m   v1.27.4
    # kubectl get pod -o wide -A
    NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP             NODE           NOMINATED NODE   READINESS GATES
    kube-system   coredns-5d78c9869d-n6qtj               0/1     Pending   0          19m   <none>         <none>         <none>           <none>
    kube-system   coredns-5d78c9869d-t86ms               0/1     Pending   0          19m   <none>         <none>         <none>           <none>
    kube-system   etcd-k8s-master-1                      1/1     Running   3          19m   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-apiserver-k8s-master-1            1/1     Running   3          19m   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-controller-manager-k8s-master-1   1/1     Running   3          19m   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-proxy-npwc2                       1/1     Running   0          19m   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-scheduler-k8s-master-1            1/1     Running   3          19m   192.168.2.21   k8s-master-1   <none>           <none>
  - k8s-master-1 has been marked as the cluster's control-plane but is still NotReady, because no cluster network add-on has been installed yet.
  - coredns is Pending for the same reason, while etcd, apiserver, controller-manager, proxy, and scheduler are ready.
- Install the cluster network add-on
  Here we use Calico.
  Reference: calico github
  Reference: official quickstart
    # download
    wget https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
    # edit the manifest: uncomment the two lines below and set the CIDR to the pod network CIDR passed to kubeadm
    - name: CALICO_IPV4POOL_CIDR
      value: "10.224.0.0/16"
    # edit the manifest: use IPIP mode
    # (the stock manifest ships calico_backend: "bird" with CALICO_IPV4POOL_IPIP set to "Always", which already selects IPIP encapsulation, so this change may not be needed)
    calico_backend: "IPIP"
    # install
    kubectl apply -f calico.yaml
    # check pod status; this can take a while because the images have to be pulled
    # kubectl get pod -o wide -A
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP             NODE           NOMINATED NODE   READINESS GATES
    kube-system   calico-kube-controllers-85578c44bf-n5b4f   1/1     Running   0          57s     10.224.196.3   k8s-master-1   <none>           <none>
    kube-system   calico-node-bvd9d                          1/1     Running   0          57s     192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   coredns-5d78c9869d-sczxv                   1/1     Running   0          2m8s    10.224.196.2   k8s-master-1   <none>           <none>
    kube-system   coredns-5d78c9869d-z455n                   1/1     Running   0          2m8s    10.224.196.1   k8s-master-1   <none>           <none>
    kube-system   etcd-k8s-master-1                          1/1     Running   4          2m20s   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-apiserver-k8s-master-1                1/1     Running   4          2m20s   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-controller-manager-k8s-master-1       1/1     Running   4          2m20s   192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-proxy-vrjbp                           1/1     Running   0          2m8s    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-scheduler-k8s-master-1                1/1     Running   4          2m20s   192.168.2.21   k8s-master-1   <none>           <none>
    # check node status
    # kubectl get nodes
    NAME           STATUS   ROLES           AGE     VERSION
    k8s-master-1   Ready    control-plane   8m25s   v1.27.4
  As shown, once calico is installed, coredns goes from Pending to Running, and the node goes from NotReady to Ready.
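The manual edit of CALICO_IPV4POOL_CIDR can also be scripted. The sketch below assumes GNU sed and that the manifest still contains the two lines in their commented form (`# - name: CALICO_IPV4POOL_CIDR` followed by `#   value: "..."`), as in the downloaded calico.yaml; it does nothing if they were already uncommented. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes GNU sed and the commented-out form of the two lines.
# Uncomments CALICO_IPV4POOL_CIDR in a calico manifest and sets its value.
set_calico_pool_cidr() {
    local manifest="$1" cidr="$2"
    sed -i \
        -e 's|^\([[:space:]]*\)# \(- name: CALICO_IPV4POOL_CIDR\)$|\1\2|' \
        -e "/^[[:space:]]*- name: CALICO_IPV4POOL_CIDR\$/{n;s|^\([[:space:]]*\)#   value: .*|\1  value: \"$cidr\"|;}" \
        "$manifest"
}

# Intended use (not executed here):
#   set_calico_pool_cidr calico.yaml 10.224.0.0/16
#   kubectl apply -f calico.yaml
```

The first expression removes the comment marker from the name line; the second moves to the following line and rewrites the value while preserving the YAML indentation.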
Adding the other masters
- Run on master2 and master3
    kubeadm join 192.168.2.21:6443 --token qw97bt.w0rcfkyao23r7gxf \
      --discovery-token-ca-cert-hash sha256:28865414ee96d6cdf32eb63e6a45d2f8fd51b0fd2113e21beacd8c0fa87bc9cc \
      --control-plane --certificate-key a7818254f2ecd6ba07022e9929b9b5d54128e6a931a62e6e8b0be0db52c31135
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Check pod status
    # kubectl get pod -o wide -A
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS       AGE    IP             NODE           NOMINATED NODE   READINESS GATES
    kube-system   calico-kube-controllers-85578c44bf-nzlwh   1/1     Running   0              21m    10.224.196.2   k8s-master-1   <none>           <none>
    kube-system   calico-node-52p6r                          1/1     Running   0              21m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   calico-node-7frf4                          1/1     Running   0              9m7s   192.168.2.23   k8s-master-3   <none>           <none>
    kube-system   calico-node-wc6nd                          1/1     Running   0              19m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   coredns-5d78c9869d-hndg5                   1/1     Running   0              48m    10.224.196.1   k8s-master-1   <none>           <none>
    kube-system   coredns-5d78c9869d-kwl7s                   1/1     Running   0              48m    10.224.196.3   k8s-master-1   <none>           <none>
    kube-system   etcd-k8s-master-1                          1/1     Running   13             48m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   etcd-k8s-master-2                          1/1     Running   2 (19m ago)    19m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   etcd-k8s-master-3                          1/1     Running   0              9m7s   192.168.2.23   k8s-master-3   <none>           <none>
    kube-system   kube-apiserver-k8s-master-1                1/1     Running   1              48m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-apiserver-k8s-master-2                1/1     Running   2 (18m ago)    19m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   kube-apiserver-k8s-master-3                1/1     Running   0              9m     192.168.2.23   k8s-master-3   <none>           <none>
    kube-system   kube-controller-manager-k8s-master-1       1/1     Running   3 (19m ago)    48m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-controller-manager-k8s-master-2       1/1     Running   1              17m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   kube-controller-manager-k8s-master-3       1/1     Running   0              9m     192.168.2.23   k8s-master-3   <none>           <none>
    kube-system   kube-proxy-8hss9                           1/1     Running   0              19m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   kube-proxy-8pf6j                           1/1     Running   0              9m7s   192.168.2.23   k8s-master-3   <none>           <none>
    kube-system   kube-proxy-tvxk6                           1/1     Running   0              48m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-scheduler-k8s-master-1                1/1     Running   16 (18m ago)   48m    192.168.2.21   k8s-master-1   <none>           <none>
    kube-system   kube-scheduler-k8s-master-2                1/1     Running   1              17m    192.168.2.22   k8s-master-2   <none>           <none>
    kube-system   kube-scheduler-k8s-master-3                1/1     Running   0              9m     192.168.2.23   k8s-master-3   <none>           <none>
  As shown, etcd, apiserver, controller-manager, proxy, and scheduler are all ready on the new masters.
- List the images pulled on the machine
    # crictl images
    IMAGE                                     TAG       IMAGE ID        SIZE
    docker.io/calico/cni                      v3.26.1   9dee260ef7f59   93.4MB
    docker.io/calico/kube-controllers         v3.26.1   1919f2787fa70   32.8MB
    docker.io/calico/node                     v3.26.1   8065b798a4d67   86.6MB
    registry.k8s.io/coredns/coredns           v1.10.1   ead0a4a53df89   16.2MB
    registry.k8s.io/etcd                      3.5.7-0   86b6af7dd652c   102MB
    registry.k8s.io/kube-apiserver            v1.27.4   e7972205b6614   33.4MB
    registry.k8s.io/kube-controller-manager   v1.27.4   f466468864b7a   31MB
    registry.k8s.io/kube-proxy                v1.27.4   6848d7eda0341   23.9MB
    registry.k8s.io/kube-scheduler            v1.27.4   98ef2570f3cde   18.2MB
    registry.k8s.io/pause                     3.6       6270bb605e12e   302kB
    registry.k8s.io/pause                     3.9       e6f1816883972   322kB
  These images can also be downloaded before running init and join, which speeds up the installation.
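Pre-pulling can be done with the kubeadm config images subcommands, which the init output above also points to. The wrapper below is a minimal sketch: the function names and the KUBEADM override variable are my own additions (the variable exists only so the commands can be inspected without kubeadm installed), and the default socket is the one used throughout this post.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers; KUBEADM can be overridden (for example with "echo")
# to print the command line instead of running it.
: "${KUBEADM:=kubeadm}"

# List the control-plane images a given Kubernetes version needs.
list_images() {
    "$KUBEADM" config images list --kubernetes-version "$1"
}

# Pull those images through the given CRI socket.
prepull_images() {
    local version="$1" socket="${2:-unix:///run/containerd/containerd.sock}"
    "$KUBEADM" config images pull --kubernetes-version "$version" --cri-socket "$socket"
}
```

Running prepull_images v1.27.4 on each control-plane node before init or join moves the image download out of the critical path. The calico images are not covered by kubeadm and would have to be pulled separately.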
Adding the workers
- Run on worker-1 and worker-2
    kubeadm join 192.168.2.21:6443 --token qw97bt.w0rcfkyao23r7gxf \
      --discovery-token-ca-cert-hash sha256:28865414ee96d6cdf32eb63e6a45d2f8fd51b0fd2113e21beacd8c0fa87bc9cc
- Check pod status on a master
    kubectl get pod -o wide -A |grep worker
    kube-system   calico-node-89hqd   0/1   Init:2/3   0   106s   192.168.2.24   k8s-worker-1   <none>   <none>
    kube-system   calico-node-vhpbr   0/1   Init:0/3   0   104s   192.168.2.25   k8s-worker-2   <none>   <none>
    kube-system   kube-proxy-csdss    1/1   Running    0   106s   192.168.2.24   k8s-worker-1   <none>   <none>
    kube-system   kube-proxy-njfq8    1/1   Running    0   104s   192.168.2.25   k8s-worker-2   <none>   <none>
  As shown, a worker only starts calico and kube-proxy. The calico images are fairly large, so downloading them takes some time.
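The join commands printed by kubeadm init stop working once the bootstrap token expires (24 hours by default) or, for control-plane joins, once the uploaded certificates are deleted after two hours. The sketch below shows how fresh commands could be generated on an existing master. The function names and the KUBEADM override variable are hypothetical, and taking the certificate key from the last line of the upload-certs output is an assumption about that command's output format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; KUBEADM can be overridden for dry runs.
: "${KUBEADM:=kubeadm}"

# Print a fresh worker join command (this creates a new bootstrap token).
worker_join_command() {
    "$KUBEADM" token create --print-join-command
}

# Print a control-plane join command: the worker command plus a fresh
# certificate key. Assumption: the key is the last line that
# "kubeadm init phase upload-certs --upload-certs" prints.
control_plane_join_command() {
    local join key
    join=$(worker_join_command) || return 1
    key=$("$KUBEADM" init phase upload-certs --upload-certs | tail -n 1) || return 1
    echo "$join --control-plane --certificate-key $key"
}
```

Both functions are meant to run as root on a node that is already part of the control plane; the printed command is then executed on the node that should join.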
Checking cluster status
# show basic cluster information
kubectl cluster-info
# check that the nodes are ready
kubectl get nodes
# check that the pods are ready
kubectl get pods --all-namespaces
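Instead of re-running these commands until everything is up, the checks can be scripted. The sketch below is one possible approach: the function names and the KUBECTL override variable are hypothetical, and the 300s default timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; KUBECTL can be overridden for dry runs.
: "${KUBECTL:=kubectl}"

# Block until all nodes and all kube-system pods report Ready, or time out.
wait_for_cluster() {
    local timeout="${1:-300s}"
    "$KUBECTL" wait --for=condition=Ready nodes --all --timeout="$timeout" &&
        "$KUBECTL" wait --for=condition=Ready pods --all -n kube-system --timeout="$timeout"
}

# Read "kubectl get nodes --no-headers" output on stdin and print the names
# of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}
```

For example, kubectl get nodes --no-headers | not_ready_nodes prints nothing once every node is Ready. A cordoned node (status Ready,SchedulingDisabled) is also listed, which may or may not be what you want.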
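What does kubeadm do?
Reference: kubeadm init workflow
Reference: kubeadm join workflow
Summary
This post showed how to install a cluster by hand with kubeadm. This is rarely done in production, because configuring everything manually is too complex.
Kubernetes iterates quickly, and the version we run in production has already fallen well behind. Doing this installation entirely from the official documentation also taught me about some newer developments, such as how the defaults for swap, the container runtime, and kube-proxy have evolved.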