安装注意事项
1、保证能联网
2、开启Selinux
3、操作系统语言不能是中文
4、infra节点会自动部署router,lb不要放在infra节点上,所以80端口不能冲突
5、如果web console访问端口改成443,lb不能放一起,端口冲突
6、硬盘格式XFS才支持overlay2
7、如果lb和master在一个节点上,会有8443端口已被占用的问题,建议安装时lb不要放在master节点上
8、如果etcd放在master节点上,会以静态pod形式启动。如果放在node节点上,会以系统服务的形式启动。我在安装过程中,一个etcd放在了master上,另一个放在了node上,导致etcd启动失败。建议安装时etcd要么全放在master节点上,要么全放在node节点上。
9、安装过程中,直接安装了带有nfs持久存储的监控,需要提前安装java-1.8.0-openjdk-headless python-passlib,这一点官网没有提及,不提前装安装会报错。
10、docker 启用配置参数–selinux-enabled=false ,但是操作系统Selinux必须开启,否则安装报错
前奏
本篇基于openshift3.11进行的集群安装,安装环境是在VMWare16 pro上安装centos7.9虚拟机,一台master两台slave。通过该方式模拟实际服务器的集群搭建openshift
从OCP3.10开始,如果您使用RHEL作为主机的底层操作系统则使用RPM方法在该主机上安装OCP组件,在RHEL上使用RPM时,所有服务都通过包管理从外部源安装和更新,这些包修改了同一用户空间中主机的现有配置
如果您使用RHEL Atomic Host则在该主机上使用系统容器方法, 该方法使OCP的每个组件都作为一个容器,在一个独立的包中,使用主机的内核运行。更新后的新容器会替换您主机上的任何现有容器
这两种安装类型都为集群提供了相同的功能,但您使用的操作系统决定了您管理服务和主机更新的方式
从OCP3.10开始,Red Hat Enterprise Linux系统不再支持容器化安装方法
生产级硬件要求
master节点
在etcd集群单独部署时,高可用性OCP集群master节点需要满足最低要求:
每1000个pod具有1个CPU和1.5GB内存
2000个pod的OCP集群建议master节点是4个CPU内核和19GB内存
node节点
node节点的大小取决于其工作负载的预期大小。作为OCP集群管理员,您需要计算预期工作负载并增加大约10%的开销。对于生产环境请分配足够的资源,以便节点主机故障不会影响您的最大容量,| 超额调度node节点上的物理资源会影响 Kubernetes调度程序在pod放置期间的资源保证
安装过程分为5部分:
1、集群主从机环境准备
2、安装依赖包,提前安装使用Ansible安装OpenShift集群所依赖的第三方包
3、Ansible执行安装:使用Ansible Playbook进行自动化安装
4、OpenShift系统配置:在使用Ansible执行安装完成之后的系统配置。
5、测试问题处理
环境准备
IP | OS | CPU | Mem | Disk | |
---|---|---|---|---|---|
master.lab.example.com | 172.16.186.110/24 | CentOS7.9 | 2vCPU | 6G | 20G(OS) / 30G(Docker) |
node1.lab.example.com | 172.16.186.111/24 | CentOS7.9 | 2vCPU | 6G | 20G(OS) / 30G(Docker) |
node2.lab.example.com | 172.16.186.112/24 | CentOS7.9 | 2vCPU | 6G | 20G(OS) / 30G(Docker) |
注:infra节点可单独部署,hosts中设置好,即可在/etc/ansible/hosts中加入infra.ocp.cn openshift_node_group_name='node-config-infra'
存储管理
目录 | 作用 | 规格 | 预期增长 |
---|---|---|---|
/var/lib/openshift | 仅在单主模式下用于etcd存储且etcd嵌入在atomic-openshift-master进程中 | 小于10GB | 仅存储元数据(随着环境慢慢成长) |
/var/lib/etcd | 在多主模式下或当etcd由管理员独立时用于etcd存储 | 小于20GB | 仅存储元数据(随着环境慢慢成长) |
/var/lib/docker | 当运行时为docker时这是挂载点, 用于活动容器运行时(包括 pod)和本地图像存储(不用于注册表存储)的存储。挂载点应由docker-storage管理而不是手动管理 | 当节点内存为16GB时为50GB,每增加8GB内存就增加20-25GB | 增长受限于运行容器的容量 |
/var/lib/container | 当运行时为CRI-O时这是挂载点,用于活动容器运行时(包括pod)和本地图像存储(不用于注册表存储)的存储 | 当节点内存为16GB时为50GB,每增加8GB内存就增加20-25GB | 增长受限于运行容器的容量 |
/var/lib/origin/openshift.local.volumes | 用于pod的临时卷存储。这包括在运行时挂载到容器中的任何外部内容。包括不受持久存储PV支持的环境变量、kube机密和数据卷 | 变化 | 如果需要存储的pod使用持久卷则最小化 如果使用临时存储这可快速增长 |
/var/logs | 所有组件的日志文件 | 10~30GB | 日志文件可以快速增长;大小可通过不断增长的磁盘来管理,也可用日志轮换来管理 |
安装
前期配置
# 关于所有节点的SELinux
[root@ * ~]# grep -v ^# /etc/selinux/config | grep -v ^$
SELINUX=enforcing # 官方文档推荐开启SELINUX,否则会导致安装失败
SELINUXTYPE=targeted
# 所有节点关闭防火墙
[root@ * ~]# systemctl stop firewalld && systemctl disable firewalld
# 修改hosts文件
[root@master ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.186.110 master.lab.example.com master
172.16.186.111 node1.lab.example.com node1
172.16.186.112 node2.lab.example.com node2
[root@master ~]# for i in 1 2;do scp /etc/hosts root@172.16.186.11$i:/etc;done
# 配置免密登录
[root@master ~]# ssh-keygen -t rsa -P ''
[root@master ~]# ssh-copy-id master
[root@master ~]# ssh-copy-id master.lab.example.com
[root@master ~]# for i in {0..2};do ssh-copy-id 172.16.186.11$i;done
[root@master ~]# for i in {1..2};do ssh-copy-id node$i;done
[root@master ~]# for i in {1..2};do ssh-copy-id node$i.lab.example.com;done
安装依赖包
# 所有主机下安装OpenShift依赖的软件包
yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct bash-completion.noarch python-passlib NetworkManager vim
# 所有主机安装docker
yum -y install docker-1.13.1
# 所有主机配置docker使用的存储
vim /etc/sysconfig/docker-storage-setup
DEVS=/dev/nvme0n2
VG=DOCKER
SETUP_LVM_THIN_POOL=yes
DATA_SIZE="100%FREE"
[root@* ~]# rm -rf /var/lib/docker/
[root@* ~]# wipefs --all /dev/nvme0n2
# 就是创建vg、Thinpool、lv的过程
[root@* ~]# docker-storage-setup
INFO: Volume group backing root filesystem could not be determined
INFO: Writing zeros to first 4MB of device /dev/nvme0n2
记录了4+0 的读入
记录了4+0 的写出
4194304字节(4.2 MB)已复制,0.00452444 秒,927 MB/秒
INFO: Device node /dev/nvme0n2p1 exists.
Physical volume "/dev/nvme0n2p1" successfully created.
Volume group "DOCKER" successfully created
Rounding up size to full physical extent 24.00 MiB
Thin pool volume with chunk size 512.00 KiB can address at most 126.50 TiB of data.
Logical volume "docker-pool" created.
Logical volume DOCKER/docker-pool changed.
[root@* ~]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
docker-pool DOCKER twi-a-t--- <29.93g 0.00 10.14
[root@* ~]# cat > /etc/docker/daemon.json<<EOF
{
"registry-mirrors" : [
"https://registry.docker-cn.com",
"https://docker.mirrors.ustc.edu.cn",
"http://hub-mirror.c.163.com",
"https://cr.console.aliyun.com/",
"https://0trl8ny5.mirror.aliyuncs.com"
]
}
EOF
[root@* ~]# systemctl daemon-reload && systemctl restart docker && systemctl enable docker
master主机上安装Ansible
[root@master ~]# yum -y install epel-release
# 全局禁用EPEL存储库,以便在以后的安装步骤中不会意外使用它
[root@master ~]# sed -i -e "s/^enabled=1/enabled=0/" /etc/yum.repos.d/epel.repo
[root@master ~]# yum -y install centos-release-ansible-28.noarch
[root@master ~]# yum list ansible --showduplicates
[root@master ~]# yum -y install ansible-2.8.6 # ansible版本太高会报错
[root@master ~]# ansible --version
ansible 2.8.6
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Oct 14 2020, 14:45:30) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
===================================== 备用 ======================================
# 下载ansible26
wget http://mirror.centos.org/centos/7/extras/x86_64/Packages/centos-release-ansible26-1-3.el7.centos.noarch.rpm
# 安装依赖并安装
yum install -y centos-release-configmanagement
rpm -ivh centos-release-ansible26-1-3.el7.centos.noarch.rpm
# 下载ansible-2.6.5
wget https://releases.ansible.com/ansible/rpm/release/epel-7-x86_64/ansible-2.6.5-1.el7.ans.noarch.rpm
# 先yum安装ansible-2.6.5所需依赖
yum install -y python-jinja2 python-paramiko python-setuptools python-six python2-cryptography sshpass PyYAML
# 安装
rpm -ivh ansible-2.6.5-1.el7.ans.noarch.rpm
===============================================================================
安装/配置openshift
[root@master ~]# wget https://github.com/openshift/openshift-ansible/archive/refs/tags/openshift-ansible-3.11.764-1.tar.gz
[root@master ~]# tar -zxvf openshift-ansible-3.11.764-1.tar.gz
[root@master ~]# mv openshift-ansible-openshift-ansible-3.11.764-1/ openshift-ansible
[root@master ~]# cd openshift-ansible
[root@master openshift-ansible]# cp roles/openshift_repos/templates/CentOS-OpenShift-Origin311.repo.j2{,-bak}
[root@master openshift-ansible]# vim roles/openshift_repos/templates/CentOS-OpenShift-Origin311.repo.j2
[centos-openshift-origin311]
name=CentOS OpenShift Origin
# 改为aliyun的源
baseurl=http://mirrors.aliyun.com/centos/7/paas/x86_64/openshift-origin311/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-PaaS
# 编写Inventory文件
[root@master openshift-ansible]# vim /etc/ansible/hosts
# Create an OSEv3 group that contains the masters and nodes groups
[OSEv3:children]
#目前配置标准的三个角色
masters
nodes
etcd
# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root
#使用origin社区版
openshift_deployment_type=origin
#指定安装版本
openshift_release=3.9
#openshift_node_group_name= master
#指定默认域名,访问的时候需要使用该域名,没有dns服务器,需要手动添加本地hosts文件
openshift_master_default_subdomain=master
#禁止磁盘、内存和镜像检查
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability
#disk_availability:报错信息是推荐的master磁盘空间剩余量大于40GB。测试环境无法满足,跳过检测。
#memory_availability:报错信息是推荐的master内存为16GB,node内存为8GB,测试环境无法满足,跳过检测。
#docker_image_availability:报错信息是需要的几个镜像未找到,选择跳过,装完集群后,在使用的时候再自行下载。
#docker_storage:报错信息是推荐选择一块磁盘空间存储镜像,这里选择跳过。采用docker默认的方式存储镜像。
# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name':'htpasswd_auth','login':'true','challenge':'true','kind':'HTPasswdPasswordIdentityProvider','filename':'/etc/origin/master/htpasswd'}]
#ntp时间同步
openshift_clock_enabled=true
#节点配额
openshift_node_kubelet_args={'pods-per-core': ['10']}
# host group for masters
[masters]
master openshift_schedulable=True
# host group for nodes, includes region info
[nodes]
master openshift_node_labels="{'region': 'infra'}"
node1 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node2 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
master openshift_node_group_name='node-config-master'
node1 openshift_node_group_name='node-config-compute'
node2 openshift_node_group_name='node-config-compute'
[etcd]
master
部署前检测
这一步如报错需排错后运行多次
[root@master openshift-ansible]# cd
[root@master ~]# ansible-playbook -i /etc/ansible/hosts /root/openshift-ansible/playbooks/prerequisites.yml
....
....
PLAY RECAP *************************************************************************************************************************
localhost : ok=11 changed=0 unreachable=0 failed=0 skipped=5 rescued=0 ignored=0
master : ok=94 changed=22 unreachable=0 failed=0 skipped=89 rescued=0 ignored=0
node1 : ok=65 changed=21 unreachable=0 failed=0 skipped=71 rescued=0 ignored=0
node2 : ok=65 changed=21 unreachable=0 failed=0 skipped=71 rescued=0 ignored=0
INSTALLER STATUS *******************************************************************************************************************
Initialization : Complete (0:01:33)
正式部署
# 提前创建密码保存文件,不然待会会报错
[root@master ~]# mkdir -p /etc/origin/master/
[root@master ~]# touch /etc/origin/master/htpasswd
[root@master ~]# ansible-playbook -i /etc/ansible/hosts /root/openshift-ansible/playbooks/deploy_cluster.yml
....
....
PLAY RECAP *************************************************************************************************************************
localhost : ok=11 changed=0 unreachable=0 failed=0 skipped=5 rescued=0 ignored=0
master : ok=692 changed=221 unreachable=0 failed=0 skipped=1061 rescued=0 ignored=0
node1 : ok=123 changed=19 unreachable=0 failed=0 skipped=166 rescued=0 ignored=0
node2 : ok=123 changed=19 unreachable=0 failed=0 skipped=166 rescued=0 ignored=0
INSTALLER STATUS *******************************************************************************************************************
Initialization : Complete (0:00:11)
Health Check : Complete (0:00:04)
Node Bootstrap Preparation : Complete (0:02:13)
etcd Install : Complete (0:00:12)
Master Install : Complete (0:03:15)
Master Additional Install : Complete (0:00:19)
Node Join : Complete (0:03:23)
Hosted Install : Complete (0:00:29)
Cluster Monitoring Operator : Complete (0:04:40)
Web Console Install : Complete (0:02:19)
Console Install : Complete (0:01:33)
Service Catalog Install : Complete (0:04:53)
# 安装结果检查
[root@master ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
master.lab.example.com Ready master 1h v1.11.0+d4cacc0
node1.lab.example.com Ready compute 1h v1.11.0+d4cacc0
node2.lab.example.com Ready infra 1h v1.11.0+d4cacc0
注:二个node节点的roles类型不对,需要将node2打上compute标签,因为它也是计算节点
[root@master ~]# oc label node node2.lab.example.com node-role.kubernetes.io/compute=true
[root@master ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
master.lab.example.com Ready master 46m v1.11.0+d4cacc0
node1.lab.example.com Ready compute 42m v1.11.0+d4cacc0
node2.lab.example.com Ready compute,infra 42m v1.11.0+d4cacc0
# 查看pods
[root@master ~]# oc get pods
NAME READY STATUS RESTARTS AGE
docker-registry-1-t259n 1/1 Running 1 1h
registry-console-1-4bw98 1/1 Running 1 1h
router-6-rcxwv 1/1 Running 0 8m
# 查看svc
[root@master ~]# oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry ClusterIP 172.30.162.101 <none> 5000/TCP 1h
kubernetes ClusterIP 172.30.0.1 <none> 443/TCP,53/UDP,53/TCP 2h
registry-console ClusterIP 172.30.20.229 <none> 9000/TCP 1h
router ClusterIP 172.30.220.154 <none> 80/TCP,443/TCP,1936/TCP 1h
# 查看路由
[root@master ~]# oc get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
docker-registry docker-registry-default.master docker-registry <all> passthrough None
registry-console registry-console-default.master registry-console <all> passthrough None
# 查看bc
[root@master ~]# oc get dc
NAME REVISION DESIRED CURRENT TRIGGERED BY
docker-registry 1 1 1 config
registry-console 1 1 1 config
router 6 1 1 config
# 查看版本
[root@master ~]# oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://master.lab.example.com:8443
openshift v3.11.0+07ae5a0-606
kubernetes v1.11.0+d4cacc0
注: 现在/etc/origin/master/htpasswd还是空的,所以要先创建管理账号和普通账号
# 使用管理员登录集群并创建管理员账号
[root@master ~]# oc login -u system:admin
[root@master ~]# oc whoami
system:admin
# 创建管理员账号
格式:htpasswd -b /etc/origin/master/htpasswd <用户名> <密码>
[root@master ~]# htpasswd -b /etc/origin/master/htpasswd admin redhat
# 授权(集群级权限有点大,需要注意)
[root@master ~]# oc adm policy add-cluster-role-to-user cluster-admin admin
# 给某个用户添加角色
格式: oc adm policy add-role-to-user <角色> <用户名> -n <项目名>
[root@master ~]# oc adm policy add-role-to-user admin admin -n webapp
# 使用新用户登录集群
[root@master ~]# oc login -u admin -p redhat
Login successful.
You don't have any projects. You can try to create a new project, by running
oc new-project <projectname>
# 确定"我是谁"
[root@master ~]# oc whoami
admin
# 使用管理员创建项目并授权给admin用户
[root@master ~]# oc login -u system:admin
# 创建项目
[root@master ~]# oc new-project webapp
==================================================
# 删除用户
[root@master ~]# vim /etc/origin/master/htpasswd # 将admin用户删除即可
[root@master ~]# oc delete user admin
==================================================
浏览器验证
账密: admin/redhat(也就是上面自己创建的账密)
管理员用户创建项目
注: system:admin用户是系统自带的用户,是权限最高的用户,不支持远程登录,只有在集群中能登录且该用户没有密码
[root@master ~]# oc whoami
system:admin
[root@master ~]# oc login -u admin -p redhat
[root@master ~]# oc whoami
admin # 虽然也是admin但和上面的system:admin用户 !!!完全!!!不同
# 创建一个项目(在K8S中这一步相当于namespace)
[root@master ~]# oc new-project test
Now using project "test" on server "https://master.lab.example.com:8443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app centos/ruby-25-centos7~https://github.com/sclorg/ruby-ex.git
to build a new example application in Ruby.
# 创建应用(按上一步的回显提示进行)
格式:oc new-app <image>:[tag]~<source code> [--name <container_name>]
[root@master ~]# oc new-app centos/ruby-25-centos7~https://github.com/sclorg/ruby-ex.git
....
....
Tags: builder, ruby, ruby25, rh-ruby25
* An image stream tag will be created as "ruby-25-centos7:latest" that will track the source image
* A source build using source code from https://github.com/sclorg/ruby-ex.git will be created
* The resulting image will be pushed to image stream tag "ruby-ex:latest"
* Every time "ruby-25-centos7:latest" changes a new build will be triggered
* This image will be deployed in deployment config "ruby-ex"
* Port 8080/tcp will be load balanced by service "ruby-ex"
* Other containers can access this service through the hostname "ruby-ex"
--> Creating resources ...
imagestream.image.openshift.io "ruby-25-centos7" created
imagestream.image.openshift.io "ruby-ex" created
buildconfig.build.openshift.io "ruby-ex" created
deploymentconfig.apps.openshift.io "ruby-ex" created
service "ruby-ex" created
--> Success
Build scheduled, use 'oc logs -f bc/ruby-ex' to track its progress.
Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
'oc expose svc/ruby-ex'
Run 'oc status' to view your app.
# 查看各组件状态
[root@master ~]# oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ruby-ex ClusterIP 172.30.15.243 <none> 8080/TCP 2m
[root@master ~]# oc get pod
ruby-ex-1-build 0/1 Init:Error 0 3m # 这里错误我未修复,只为说明使用步骤
[root@master ~]# oc logs -f po/ruby-ex-1-build
[root@master ~]# oc describe pod
# 查看所有事件
[root@master ~]# oc get events
======================= 删除应用相关 ===========================
[root@master ~]# oc get all # 获得所有bc、dc、is、svc
[root@master ~]# oc delete deploymentconfig test111
[root@master ~]# oc delete buildconfig test111
[root@master ~]# oc delete build test111-1 # test111-1是应用名
[root@master ~]# oc delete build test-1
[root@master ~]# oc delete imagestreams test111
[root@master ~]# oc delete svc test111
或者(以下一条等于上面6条)
[root@master ~]# oc delete all -l app=test111
=============================================================
部署应用
部署一个nginx为例,在default这个namespace下
创建yaml文件
oc create deployment web --image=nginx:1.14 --dry-run -o yaml > nginx.yaml
# nginx.yaml内容如下
apiVersion: apps/v1
kind: Deployment
metadata:
creationTimestamp: null
labels:
app: web
name: web
spec:
replicas: 1
selector:
matchLabels:
app: web
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
app: web
spec:
containers:
- image: nginx:1.14
name: nginx
resources: {}
status: {}
oc apply -f nginx.yaml
此时会发现容器起不来,用docker logs查看对应节点上的容器日志会发现一堆Permission Denied,这是因为OpenShift默认不给root权限,而nginx是需要root权限的
参考:https://stackoverflow.com/questions/42363105/permission-denied-mkdir-in-container-on-openshift
将anyuid这个scc( Security Context Constraints)赋给default这个命名空间,再重新部署下就可以了
oc adm policy add-scc-to-user anyuid -z default
oc apply -f nginx.yaml
OC常用命令合集
# 获取集群所有节点
oc get nodes
# 查看对应节点详细信息,可以看到运行在该节点下的pod
oc describe node node-name
# 查看对应namespace下pod
oc get pods -n namespace-name
# 查看当前project下的所有资源
oc get all
# 查看登录在哪个project下
oc status
# 查看对应namespace下pod详情
oc get pods -o wide -n namespace-name
# 查看pod详细信息
oc describe pod pod-name -n namespace-name
# 获取对应namespace的limitrange配置文件
oc get limitrange -n namespace-name
# 查看配置文件详情
oc describe limitrange limitrange.config -n namespace-name
# 修改limitrange配置
oc edit limitrange limitrange.config -n namespace-name
# 切换到project
oc project project-name
# 为该project下default账户开启anyuid,可以使用root权限,一般是安装/运行某些软件时需要
oc adm policy add-scc-to-user anyuid -z default
# 删除default的anyuid权限
oc adm policy remove-scc-from-user anyuid -z default
# 查看该project下的pod
oc get pod
# 查看该project下的service
oc get service
# 查看该project下的Endpoints
oc get endpoints
# 重启pod
oc delete pod pod-name -n namespace-name
# 查看dc历史版本
oc rollout history DeploymentConfig/dc-name
# 回滚到5版本
oc rollout history DeploymentConfig/dc-name --revision=5
# 设置副本数为3
oc scale dc pod-name --replicas=3 -n namespace
# 设置自动伸缩最小2,最大10
oc autoscale dc dc-name --min=2 --max=10 --cpu-percent=80
# 查看scc
oc get scc
# 查看anyuid详细信息,user即包含已经开启anyuid的project
oc describe scc anyuid
# 查看集群管理员角色及权限
oc describe clusterrole.rbac
# 查看用户组及绑定的角色
oc describe clusterrolebinding.rbac
# 添加username为cluster-admin
oc adm policy add-cluster-role-to-user cluster-admin username
# 查看所有namespace的route
oc get routes --all-namespaces
# 查看pod log
oc logs -f pod-name
# 查看pod对应containerID
docker ps -a|grep pod-name
# 登录到container
docker exec -it containerID /bin/sh
# 创建project
oc new-project my-project
# 查看当前项目状态
oc status
# 查看server支持的api资源
oc api-resources
# 收集当前集群的状态信息
oc adm must-gather
# 查看pod资源状态
oc adm top pods
# 查看节点资源状态
oc adm top node
# 查看images使用情况
oc adm top images
# 标记node1为SchedulingDisabled
oc adm cordon node1
# 标记node1为unschedulable
oc adm manage-node <node1> --schedulable = false
# 标记node1为schedulable
oc adm manage-node node1 --schedulable
# 将node1进入维护模式
oc adm drain node1
# 删除node
oc delete node
# 查询CSR(certificate signing requests)
oc get csr
# 同意和拒绝CSR
oc adm certificate { approve | deny } csr-name
# approve所有csr
oc get csr|xargs oc adm certificate approve csr
# 设置自动approve csr
echo 'openshift_master_bootstrap_auto_approve=true' >> /etc/ansible/hosts
# 获取project
oc get project <projectname>
# 查看project信息
oc describe project projectname
# 查看pod-name yaml文件
oc get pod pod-name -o yaml
# 查看节点label
oc get nodes --show-labels
# 给node添加标签,pod也要有对应label(在第二层spec下添加nodeSelector),pod就会运行在指定的node上
oc label nodes node-name label-key=label-value
# 更新指定key值的label
oc label nodes node-name key=new-value --overwrite
# 删除指定key值的label
oc label nodes node-name key-
# 查看运行在某node上的所有pod
oc adm manage-node node-name --list-pods
# mark node as unschedulable
oc adm manage-node node-name --schedulable=false
# 使用命令行登录openshift,token是用你自己的账户在登录网址时生成的token
oc login --token=iz56jscHZp9mSN3kHzjayaEnNo0DMI_nRlaiJyFmN74 --server=https://console.qa.c.sm.net:8443
# 获取/查看token
TOKEN=$(oc get secret $(oc get serviceaccount default -o jsonpath='{.secrets[0].name}') -o jsonpath='{.data.token}' | base64 --decode )
# 获取/查看apiserver
APISERVER=$(oc config view --minify -o jsonpath='{.clusters[0].cluster.server}')
# 命令行访问API
curl $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure
# 使用命令行登录pod
oc rsh pod-name
# 查看当前登录的服务地址
oc whoami --show-server
# 查看当前登录账户
oc whoami
# 查看dc(deployment config)
oc get dc -n namespace
# 查看deploy
oc get deploy -n namespace
# 编辑deploy yaml文件
oc edit deploy/deployname -o yaml -n namespace
# 查看cronjob
oc get cronjob
# 编辑cronjob
oc edit cronjob/cronjob-name -n namespace-name
# 查看cronjob
oc describe cm/configmap-name -n namespace-name
# 查询configmap
oc get configmap -n namespace-name
# 查询configmap
oc get cm -n namespace-name
# 查看集群pod网段规划
cat /etc/origin/master/master-config.yaml|grep cidr
# 编辑VM yaml文件
oc edit vm appmngr54321-poc-msltoibh -n appmngr54321-poc -o yaml
# 获取sa-name的token
oc serviceaccounts get-token sa-name
# 使用token登录
oc login url --token=token
# 扩展pod副本数量
oc scale deployment deployment-name --replicas 5
# 查看config
oc config view
# get api versions
oc api-versions
# get api-resources
oc api-resources
# 查询HPA
oc get hpa --all-namespaces
# 查看HPA,可以看到Metrics,Events
oc describe hpa/hpaname -n namespace
# 创建sa
oc create serviceaccount caller
# 赋予cluster-admin权限
oc adm policy add-cluster-role-to-user cluster-admin -z caller
# get sa token
oc serviceaccounts get-token caller
# 获取kubeconfig文件位置
echo $KUBECONFIG
# 根据labelselector筛选cm
oc get cm --all-namespaces -l app=deviation
# 查询pod CPU/mem usage,openshift4.X适用
oc describe PodMetrics podname
# 查看shortnames、apiGroup、verbs
oc api-resources -o wide
# 按selector删除
oc delete pod --selector logging-infra=fluentd
# watch 某个pod状态
oc get pods -n logger -w
# 查看资源对象的定义
oc explain pv.spec
# 查看MutatingWebhook
oc get MutatingWebhookConfiguration
# 查看ValidatingWebhook
oc get ValidatingWebhookConfiguration
# 给validatingwebhook注入CA
oc annotate validatingwebhookconfigurations <validating_webhook_name> service.beta.openshift.io/inject-cabundle=true
# 给mutatingwebhook注入CA
oc annotate mutatingwebhookconfigurations <mutating_webhook_name> service.beta.openshift.io/inject-cabundle=true
# 根据label查询pv
oc get pv --selector=='path=testforocp'
# 根据自定义annotations查询,返回cm name
oc get cm --all-namespaces -o=jsonpath='{.items[?(@..metadata.annotations.cpuUsage=="0")].metadata.name}'
ns=$(oc get cm --selector='app=dev' --all-namespaces|awk '{print $1}'|grep -v NAMESPACE)
for i in $ns;do oc get cm dev -oyaml -n $i >> /tmp/test.txt; done;
标签:社区,name,get,部署,oc,openshift3.11,master,root,openshift From: https://www.cnblogs.com/oodcloud/p/16739873.html