Deploying a Kubernetes Cluster with Kubespray
1. Introduction to Kubespray
Kubespray is an open-source project for deploying production-grade Kubernetes clusters; it uses Ansible as its deployment tool.
It can deploy to AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal),
Oracle Cloud Infrastructure (experimental), or plain bare metal.
Key features:
Highly available clusters
Composable components (for example, the choice of network plugin)
Support for the most popular Linux distributions
Continuous-integration tests
Website: https://kubespray.io
Project repository: https://github.com/kubernetes-sigs/kubespray
2. Online Deployment
Mainland China's network environment makes kubespray hard to use out of the box: some images must be pulled from gcr.io and some binaries downloaded from GitHub, so it helps to download them in advance and import the images.
Note: a highly available etcd deployment requires 3 nodes, so an HA cluster needs at least 3 nodes.
Kubespray needs a deployment node, which can be any node of the cluster. Here kubespray is installed on the first master node (192.168.54.211), and all subsequent steps are run there.
2.1 Environment Preparation
1. Server plan
IP              hostname
192.168.54.211  master
192.168.54.212  slave1
192.168.54.213  slave2
2. Set the hostname
# run on each of the three hosts respectively
$ hostnamectl set-hostname master
$ hostnamectl set-hostname slave1
$ hostnamectl set-hostname slave2
# check the current hostname
$ hostname
3. Map IPs to hostnames
# run on all three hosts
$ cat >> /etc/hosts << EOF
192.168.54.211 master
192.168.54.212 slave1
192.168.54.213 slave2
EOF
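To confirm that name resolution works before going further, an optional quick check (run on any of the hosts):
# each hostname should resolve to the IP configured above
$ for h in master slave1 slave2; do ping -c 1 $h; done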
2.2 Download kubespray
# run on the master node
# download an official release
$ wget https://github.com/kubernetes-sigs/kubespray/archive/v2.16.0.tar.gz
$ tar -zxvf v2.16.0.tar.gz
# or clone the repository directly
$ git clone https://github.com/kubernetes-sigs/kubespray.git -b v2.16.0 --depth=1
2.3 Install Dependencies
# run on the master node
$ cd kubespray-2.16.0/
$ yum install -y epel-release python3-pip
$ pip3 install -r requirements.txt
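If downloading the Python dependencies is slow from mainland China, pip can optionally be pointed at a domestic mirror first; the Tsinghua mirror below is only an example:
$ pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple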
Possible errors:
# error 1
Complete output from command python setup.py egg_info:
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to
successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most
users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-3w9d_1bk/cryptography/setup.py", line 17, in <module>
from setuptools_rust import RustExtension
ModuleNotFoundError: No module named 'setuptools_rust'
----------------------------------------
# fix
$ pip3 install --upgrade cryptography==3.2
# error 2
Exception: command 'gcc' failed with exit status 1
# fix
# python2
$ yum install gcc libffi-devel python-devel openssl-devel -y
# python3
$ yum install gcc libffi-devel python3-devel openssl-devel -y
2.4 Update the Ansible Inventory
Check the Ansible version:
[root@master kubespray-2.16.0]# ansible --version
ansible 2.9.20
config file = /root/kubespray-2.16.0/ansible.cfg
configured module search path = ['/root/kubespray-2.16.0/library']
ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
Update the Ansible inventory file; the IPS array holds the internal IPs of the three instances:
# run on the master node
[root@master kubespray-2.16.0]# cp -rfp inventory/sample inventory/mycluster
[root@master kubespray-2.16.0]# declare -a IPS=( 192.168.54.211 192.168.54.212 192.168.54.213)
[root@master kubespray-2.16.0]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube_control_plane
DEBUG: Adding group kube_node
DEBUG: Adding group etcd
DEBUG: Adding group k8s_cluster
DEBUG: Adding group calico_rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube_control_plane
DEBUG: adding host node2 to group kube_control_plane
DEBUG: adding host node1 to group kube_node
DEBUG: adding host node2 to group kube_node
DEBUG: adding host node3 to group kube_node
2.5 Edit the Node Information
Inspect the auto-generated hosts.yaml; kubespray plans node roles automatically based on the number of nodes provided. In this setup 2 nodes act as master (control-plane) nodes, all 3 nodes act as worker nodes, and all 3 nodes also run etcd.
Edit the inventory/mycluster/hosts.yaml file:
# run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    slave1:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    slave2:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
2.6 Modify the Global Variables (the defaults are fine)
[root@master kubespray-2.16.0]# cat inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
2.7 Cluster Installation Settings
The default Kubernetes version is fairly old, so pin the desired version:
# run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.20.7
For any other needs, edit inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml accordingly.
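If you prefer to script the version change instead of editing the file by hand, a minimal sed sketch against the same file and key shown above:
$ sed -i 's/^kube_version: .*/kube_version: v1.20.7/' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
$ grep '^kube_version' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml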
2.8 Cluster Add-ons
Add-ons such as the Kubernetes dashboard and the ingress controller are configured in the following file:
$ vim inventory/mycluster/group_vars/k8s_cluster/addons.yml
No changes are made to that file here.
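For reference, if you did want to enable some common add-ons, a hedged sketch (the key names below are taken from the kubespray 2.16 sample addons.yml; verify the keys and their current values in your copy before applying):
$ sed -i 's/^metrics_server_enabled: false/metrics_server_enabled: true/' inventory/mycluster/group_vars/k8s_cluster/addons.yml
$ sed -i 's/^helm_enabled: false/helm_enabled: true/' inventory/mycluster/group_vars/k8s_cluster/addons.yml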
2.9 Passwordless SSH
Set up passwordless SSH so that the kubespray (Ansible) node can log in to every node without a password.
# run on the master node
ssh-keygen
ssh-copy-id 192.168.54.211
ssh-copy-id 192.168.54.212
ssh-copy-id 192.168.54.213
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
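After the keys are distributed, you can optionally verify that Ansible reaches every node through the inventory generated earlier:
# run on the master node
$ ansible -i inventory/mycluster/hosts.yaml all -m ping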
2.10 Switch to Mirror Sources
# run on the master node
[root@master kubespray-2.16.0]# cat > inventory/mycluster/group_vars/k8s_cluster/vars.yml << EOF
gcr_image_repo: "registry.aliyuncs.com/google_containers"
kube_image_repo: "registry.aliyuncs.com/google_containers"
etcd_download_url: "https://ghproxy.com/https://github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
cni_download_url: "https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
calicoctl_download_url: "https://ghproxy.com/https://github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "https://ghproxy.com/https://github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
crictl_download_url: "https://ghproxy.com/https://github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
nodelocaldns_image_repo: "cncamp/k8s-dns-node-cache"
dnsautoscaler_image_repo: "cncamp/cluster-proportional-autoscaler-amd64"
EOF
2.11 Install the Cluster
Run the kubespray playbook to install the cluster:
# run on the master node
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
The installation downloads a large number of binaries and container images.
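Because the playbook can run for a long time, it can help to run it detached from the SSH session and keep a log; a simple sketch (the log file name is arbitrary):
$ nohup ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml > cluster-install.log 2>&1 &
$ tail -f cluster-install.log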
Output like the following indicates a successful run:
PLAY RECAP *************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=584 changed=109 unreachable=0 failed=0 skipped=1160 rescued=0 ignored=1
slave1 : ok=520 changed=97 unreachable=0 failed=0 skipped=1008 rescued=0 ignored=0
slave2 : ok=438 changed=76 unreachable=0 failed=0 skipped=678 rescued=0 ignored=0
Saturday 31 December 2022 20:07:57 +0800 (0:00:00.060) 0:59:12.196 *****
===============================================================================
container-engine/docker : ensure docker packages are installed ----------------------------------------------- 2180.79s
kubernetes/preinstall : Install packages requirements --------------------------------------------------------- 487.24s
download_file | Download item ---------------------------------------------------------------------------------- 58.95s
download_file | Download item ---------------------------------------------------------------------------------- 50.40s
download_container | Download image if required ---------------------------------------------------------------- 44.25s
download_file | Download item ---------------------------------------------------------------------------------- 42.65s
download_container | Download image if required ---------------------------------------------------------------- 38.06s
download_container | Download image if required ---------------------------------------------------------------- 32.38s
kubernetes/kubeadm : Join to cluster --------------------------------------------------------------------------- 32.29s
download_container | Download image if required ---------------------------------------------------------------- 30.67s
download_file | Download item ---------------------------------------------------------------------------------- 25.82s
kubernetes/control-plane : Joining control plane node to the cluster. ------------------------------------------ 25.60s
download_container | Download image if required ---------------------------------------------------------------- 25.34s
download_container | Download image if required ---------------------------------------------------------------- 22.49s
kubernetes/control-plane : kubeadm | Initialize first master --------------------------------------------------- 20.90s
download_container | Download image if required ---------------------------------------------------------------- 20.14s
download_file | Download item ---------------------------------------------------------------------------------- 19.50s
download_container | Download image if required ---------------------------------------------------------------- 17.84s
download_container | Download image if required ---------------------------------------------------------------- 13.96s
download_container | Download image if required ---------------------------------------------------------------- 13.31s
2.12 Inspect the Cluster
# run on the master node
[root@master ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane,master 10m v1.20.7 192.168.54.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave1 Ready control-plane,master 9m38s v1.20.7 192.168.54.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave2 Ready <none> 8m40s v1.20.7 192.168.54.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master ~]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c5b64bf96-wtmxn 1/1 Running 0 8m41s
calico-node-c6rr6 1/1 Running 0 9m6s
calico-node-l59fj 1/1 Running 0 9m6s
calico-node-n9tg6 1/1 Running 0 9m6s
coredns-f944c7f7c-n2wzp 1/1 Running 0 8m26s
coredns-f944c7f7c-x2tfl 1/1 Running 0 8m22s
dns-autoscaler-557bfb974d-6cbtk 1/1 Running 0 8m24s
kube-apiserver-master 1/1 Running 0 10m
kube-apiserver-slave1 1/1 Running 0 10m
kube-controller-manager-master 1/1 Running 0 10m
kube-controller-manager-slave1 1/1 Running 0 10m
kube-proxy-czk9s 1/1 Running 0 9m17s
kube-proxy-gwfc8 1/1 Running 0 9m17s
kube-proxy-tkxlf 1/1 Running 0 9m17s
kube-scheduler-master 1/1 Running 0 10m
kube-scheduler-slave1 1/1 Running 0 10m
nginx-proxy-slave2 1/1 Running 0 9m18s
nodelocaldns-4vd75 1/1 Running 0 8m23s
nodelocaldns-cr5gg 1/1 Running 0 8m23s
nodelocaldns-pmgqx 1/1 Running 0 8m23s
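At this point you can optionally run the same kind of smoke test used for the offline cluster in section 3: create an nginx deployment, expose it as a NodePort, and curl it from any node.
$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort
$ kubectl get pods,svc
# then: curl http://<node-ip>:<node-port>/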
2.13 Inspect the Installed Images
# run on the master node
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# run on the slave1 node
[root@slave1 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# run on the slave2 node
[root@slave2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
Export the images for offline use:
# run on the master node
docker save -o kube-proxy.tar registry.aliyuncs.com/google_containers/kube-proxy:v1.20.7
docker save -o kube-controller-manager.tar registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.7
docker save -o kube-apiserver.tar registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.7
docker save -o kube-scheduler.tar registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.7
docker save -o nginx.tar nginx:1.19
docker save -o node.tar quay.io/calico/node:v3.17.4
docker save -o cni.tar quay.io/calico/cni:v3.17.4
docker save -o kube-controllers.tar quay.io/calico/kube-controllers:v3.17.4
docker save -o k8s-dns-node-cache.tar cncamp/k8s-dns-node-cache:1.17.1
docker save -o etcd.tar quay.io/coreos/etcd:v3.4.13
docker save -o cluster-proportional-autoscaler-amd64.tar cncamp/cluster-proportional-autoscaler-amd64:1.8.3
docker save -o coredns.tar registry.aliyuncs.com/google_containers/coredns:1.7.0
docker save -o pause_3.3.tar registry.aliyuncs.com/google_containers/pause:3.3
docker save -o pause_3.2.tar registry.aliyuncs.com/google_containers/pause:3.2
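On an offline machine the exported archives can later be imported back with docker load; a minimal sketch, assuming the .tar files were copied into the images/ directory shown in the tree below:
$ for f in images/*.tar; do docker load -i $f; done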
Inspect the generated files:
# run on the master node
[root@master ~]# tree Kubespray-2.16.0/
Kubespray-2.16.0/
├── calicoctl
├── cni-plugins-linux-amd64-v0.9.1.tgz
├── images
│ ├── cluster-proportional-autoscaler-amd64.tar
│ ├── cni.tar
│ ├── coredns.tar
│ ├── etcd.tar
│ ├── k8s-dns-node-cache.tar
│ ├── kube-apiserver.tar
│ ├── kube-controller-manager.tar
│ ├── kube-controllers.tar
│ ├── kube-proxy.tar
│ ├── kube-scheduler.tar
│ ├── nginx.tar
│ ├── node.tar
│ ├── pause_3.2.tar
│ └── pause_3.3.tar
├── kubeadm-v1.20.7-amd64
├── kubectl-v1.20.7-amd64
├── kubelet-v1.20.7-amd64
└── rpm
├── docker
│ ├── audit-libs-python-2.8.5-4.el7.x86_64.rpm
│ ├── b001-libsemanage-python-2.5-14.el7.x86_64.rpm
│ ├── b002-setools-libs-3.3.8-4.el7.x86_64.rpm
│ ├── b003-libcgroup-0.41-21.el7.x86_64.rpm
│ ├── b0041-checkpolicy-2.5-8.el7.x86_64.rpm
│ ├── b004-python-IPy-0.75-6.el7.noarch.rpm
│ ├── b005-policycoreutils-python-2.5-34.el7.x86_64.rpm
│ ├── b006-container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
│ ├── b007-containerd.io-1.3.9-3.1.el7.x86_64.rpm
│ ├── d001-docker-ce-cli-19.03.14-3.el7.x86_64.rpm
│ ├── d002-docker-ce-19.03.14-3.el7.x86_64.rpm
│ └── d003-libseccomp-2.3.1-4.el7.x86_64.rpm
└── preinstall
├── a001-libseccomp-2.3.1-4.el7.x86_64.rpm
├── bash-completion-2.1-8.el7.noarch.rpm
├── chrony-3.4-1.el7.x86_64.rpm
├── e2fsprogs-1.42.9-19.el7.x86_64.rpm
├── ebtables-2.0.10-16.el7.x86_64.rpm
├── ipset-7.1-1.el7.x86_64.rpm
├── ipvsadm-1.27-8.el7.x86_64.rpm
├── rsync-3.1.2-10.el7.x86_64.rpm
├── socat-1.7.3.2-2.el7.x86_64.rpm
├── unzip-6.0-22.el7_9.x86_64.rpm
├── wget-1.14-18.el7_6.1.x86_64.rpm
└── xfsprogs-4.5.0-22.el7.x86_64.rpm
4 directories, 43 files
2.14 Uninstall the Cluster
To tear down the cluster:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
2.15 Add Nodes
1. Add the new node's information to inventory/mycluster/hosts.yaml
2. Run the following command:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root scale.yml -v -b --private-key=~/.ssh/id_rsa
2.16 Remove Nodes
There is no need to modify hosts.yaml; simply run the following command:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -v -b --extra-vars "node=slave1"
2.17 Upgrade the Cluster
[root@master kubespray-2.16.0]# ansible-playbook upgrade-cluster.yml -b -i inventory/mycluster/hosts.yaml -e kube_version=v1.22.0
3. Offline Deployment
An online deployment can fail because of network issues, so the Kubernetes cluster can also be deployed offline.
The following is an offline deployment example found online.
Kubespray GitHub repository: https://github.com/kubernetes-sigs/kubespray
This example uses the release-2.15 branch; the main component and OS versions are as follows:
kubernetes v1.19.10
docker v19.03
calico v3.16.9
centos 7.9.2009
Kubespray offline bundle download link:
https://www.mediafire.com/file/nyifoimng9i6zp5/kubespray_offline.tar.gz/file
After downloading, extract the offline bundle into the /opt directory:
# run on the master node
[root@master opt]# tar -zxvf /opt/kubespray_offline.tar.gz -C /opt/
Inspect the extracted files:
# run on the master node
[root@master opt]# ll /opt/kubespray_offline
total 4
drwxr-xr-x.  4 root root   28 Jul 11  2021 ansible_install
drwxr-xr-x. 15 root root 4096 Jul  8  2021 kubespray
drwxr-xr-x.  4 root root  240 Jul  9  2021 kubespray_cache
The three machines' IP addresses are 192.168.43.211, 192.168.43.212, and 192.168.43.213.
Set up the Ansible server first:
# run on the master node
[root@master opt]# yum install /opt/kubespray_offline/ansible_install/rpm/*
[root@master opt]# pip3 install /opt/kubespray_offline/ansible_install/pip/*
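Before continuing, it is worth confirming that the offline-installed Ansible and Python are usable:
$ ansible --version
$ python3 -V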
Configure passwordless SSH between the hosts:
# run on the master node
[root@master ~]# ssh-keygen
[root@master ~]# ssh-copy-id 192.168.43.211
[root@master ~]# ssh-copy-id 192.168.43.212
[root@master ~]# ssh-copy-id 192.168.43.213
[root@master ~]# ssh-copy-id master
[root@master ~]# ssh-copy-id slave1
[root@master ~]# ssh-copy-id slave2
Generate the Ansible host groups:
# run on the master node
[root@master opt]# cd /opt/kubespray_offline/kubespray
[root@master kubespray]# declare -a IPS=(192.168.43.211 192.168.43.212 192.168.43.213)
[root@master kubespray]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3.6 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube-master
DEBUG: Adding group kube-node
DEBUG: Adding group etcd
DEBUG: Adding group k8s-cluster
DEBUG: Adding group calico-rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube-master
DEBUG: adding host node2 to group kube-master
DEBUG: adding host node1 to group kube-node
DEBUG: adding host node2 to group kube-node
DEBUG: adding host node3 to group kube-node
The inventory/mycluster/hosts.yaml file is generated automatically.
Edit the inventory/mycluster/hosts.yaml file:
# run on the master node
[root@master kubespray]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.43.211
      ip: 192.168.43.211
      access_ip: 192.168.43.211
    slave1:
      ansible_host: 192.168.43.212
      ip: 192.168.43.212
      access_ip: 192.168.43.212
    slave2:
      ansible_host: 192.168.43.213
      ip: 192.168.43.213
      access_ip: 192.168.43.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Modify the configuration so that it uses the offline packages and images:
# run on the master node
[root@master kubespray]# vim inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
#
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
## note: vault is removed
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
kube_apiserver_node_port_range: "1-65535"
kube_apiserver_node_port_range_sysctl: false
download_run_once: true
download_localhost: true
download_force_cache: true
download_cache_dir: /opt/kubespray_offline/kubespray_cache # changed
preinstall_cache_rpm: true
docker_cache_rpm: true
download_rpm_localhost: "{{ download_cache_dir }}/rpm" # changed
tmp_cache_dir: /tmp/k8s_cache # changed
tmp_preinstall_rpm: "{{ tmp_cache_dir }}/rpm/preinstall" # changed
tmp_docker_rpm: "{{ tmp_cache_dir }}/rpm/docker" # changed
image_is_cached: true
nodelocaldns_dire_coredns: true
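The cache paths referenced above must exist in the offline bundle; a quick sanity check, assuming the bundle layout described earlier:
$ ls /opt/kubespray_offline/kubespray_cache
$ ls /opt/kubespray_offline/kubespray_cache/rpm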
Start deploying Kubernetes:
# run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=762 changed=178 unreachable=0 failed=0 skipped=1060 rescued=0 ignored=1
slave1 : ok=648 changed=159 unreachable=0 failed=0 skipped=918 rescued=0 ignored=0
slave2 : ok=462 changed=104 unreachable=0 failed=0 skipped=584 rescued=0 ignored=0
Sunday 18 June 2023 12:36:42 +0800 (0:00:00.059) 0:13:16.912 ***********
===============================================================================
kubernetes/master : kubeadm | Initialize first master ------------------------------------ 136.25s
kubernetes/master : Joining control plane node to the cluster. --------------------------- 110.63s
kubernetes/kubeadm : Join to cluster ------------------------------------------------------ 37.66s
container-engine/docker : Install packages docker with local rpm|Install RPM -------------- 29.70s
download_container | Load image into docker ----------------------------------------------- 11.72s
reload etcd ------------------------------------------------------------------------------- 10.62s
Gen_certs | Write etcd master certs -------------------------------------------------------- 9.13s
Gen_certs | Write etcd master certs -------------------------------------------------------- 8.84s
kubernetes/master : Master | wait for kube-scheduler --------------------------------------- 8.03s
download_container | Load image into docker ------------------------------------------------ 7.07s
download_container | Upload image to node if it is cached ---------------------------------- 6.72s
download_container | Load image into docker ------------------------------------------------ 6.69s
download_container | Load image into docker ------------------------------------------------ 6.36s
kubernetes/preinstall : Install packages requirements with local rpm|Install RPM ----------- 6.00s
wait for etcd up --------------------------------------------------------------------------- 5.76s
download_file | Copy file from cache to nodes, if it is available -------------------------- 5.64s
download_container | Load image into docker ------------------------------------------------ 5.57s
network_plugin/calico : Wait for calico kubeconfig to be created --------------------------- 5.37s
Configure | Check if etcd cluster is healthy ----------------------------------------------- 5.25s
kubernetes-apps/ansible : Kubernetes Apps | Start Resources -------------------------------- 5.16s
The deployment takes roughly half an hour and requires no intervention. Once it finishes, check the cluster and Pod status:
# run on the master node
[root@master kubespray]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 5m46s v1.19.10 192.168.43.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave1 Ready master 3m50s v1.19.10 192.168.43.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.14
slave2 Ready <none> 2m49s v1.19.10 192.168.43.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master kubespray]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7fbf9b4bbb-nw7j5 1/1 Running 0 4m24s
calico-node-8bhct 1/1 Running 0 4m45s
calico-node-rbkls 1/1 Running 0 4m45s
calico-node-svphr 1/1 Running 0 4m45s
coredns-7677f9bb54-j8xx5 1/1 Running 0 3m58s
coredns-7677f9bb54-tzzpp 1/1 Running 0 4m2s
dns-autoscaler-5b7b5c9b6f-mx9dv 1/1 Running 0 4m
k8dash-77959656b-vsqfq 1/1 Running 0 3m55s
kube-apiserver-master 1/1 Running 0 7m47s
kube-apiserver-slave1 1/1 Running 0 5m58s
kube-controller-manager-master 1/1 Running 0 7m47s
kube-controller-manager-slave1 1/1 Running 0 5m58s
kube-proxy-ktvmd 1/1 Running 0 4m56s
kube-proxy-rcnhc 1/1 Running 0 4m56s
kube-proxy-slc7z 1/1 Running 0 4m56s
kube-scheduler-master 1/1 Running 0 7m47s
kube-scheduler-slave1 1/1 Running 0 5m58s
kubernetes-dashboard-758979f44b-xfw8x 1/1 Running 0 3m57s
kubernetes-metrics-scraper-678c97765c-k7z5c 1/1 Running 0 3m56s
metrics-server-8676bf5f99-nkrjr 1/1 Running 0 3m39s
nginx-proxy-slave2 1/1 Running 0 4m57s
nodelocaldns-bxww2 1/1 Running 0 3m58s
nodelocaldns-j2hvc 1/1 Running 0 3m58s
nodelocaldns-p2nx8 1/1 Running 0 3m58s
Verify the cluster:
# run on the master node
# install nginx
# create an nginx deployment
[root@master ~]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
# run on the master node
# expose the deployment via a NodePort
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
# run on the master node
[root@master ~]# kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-6799fc88d8-4t4mv 1/1 Running 0 72s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 11m
service/nginx NodePort 10.233.31.130 <none> 80:22013/TCP 53s
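The NodePort shown above (22013) is allocated dynamically; to fetch it in a script you can use standard kubectl jsonpath output:
$ kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'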
# run on the master node
# send curl requests to each node
[root@master ~]# curl http://192.168.43.211:22013/
[root@master ~]# curl http://192.168.43.212:22013/
[root@master ~]# curl http://192.168.43.213:22013/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
This confirms that the cluster is working correctly.
Uninstall the cluster:
# run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
......
PLAY RECAP ****************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=31 changed=17 unreachable=0 failed=0 skipped=24 rescued=0 ignored=0
slave1 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
slave2 : ok=30 changed=17 unreachable=0 failed=0 skipped=19 rescued=0 ignored=0
Sunday 18 June 2023 12:46:14 +0800 (0:00:01.135) 0:00:49.596 ***********
===============================================================================
Gather necessary facts (hardware) --------------------------------------------------------- 21.52s
reset | delete some files and directories ------------------------------------------------- 10.41s
reset | unmount kubelet dirs --------------------------------------------------------------- 1.73s
reset | remove all containers -------------------------------------------------------------- 1.63s
reset | remove services -------------------------------------------------------------------- 1.63s
reset | Restart network -------------------------------------------------------------------- 1.14s
download | Download files / images --------------------------------------------------------- 1.02s
reset : flush iptables --------------------------------------------------------------------- 0.89s
reset | stop services ---------------------------------------------------------------------- 0.80s
reset | restart docker if needed ----------------------------------------------------------- 0.77s
reset | remove docker dropins -------------------------------------------------------------- 0.76s
reset | remove remaining routes set by bird ------------------------------------------------ 0.57s
reset | stop etcd services ----------------------------------------------------------------- 0.53s
Gather minimal facts ----------------------------------------------------------------------- 0.48s
Gather necessary facts (network) ----------------------------------------------------------- 0.46s
reset | remove dns settings from dhclient.conf --------------------------------------------- 0.44s
reset | remove etcd services --------------------------------------------------------------- 0.41s
reset | systemctl daemon-reload ------------------------------------------------------------ 0.41s
reset | check if crictl is present --------------------------------------------------------- 0.30s
reset | Remove kube-ipvs0 ------------------------------------------------------------------ 0.25s
At this point, the offline Kubernetes cluster setup is complete.
Copyright notice: this is the blogger's original article, licensed under CC 4.0 BY-SA; reproduction must include the original source link and this notice.
Original article (CSDN blog): https://blog.csdn.net/qq_30614345/article/details/131264385