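Case description:
In some production environments built on general-purpose hosts, the hosts are not allowed to communicate over ssh, or the root user is not allowed to set up ssh trust or log in over ssh. By default, deploying a KingbaseES V8R3 cluster on general-purpose hosts requires ssh trust between the cluster nodes for both the database user and the root user. Where the production environment forbids this, the es_server tool that ships with the cluster can provide the inter-node communication needed to deploy the cluster.
Applicable versions:
KingbaseES V8R3 (this case uses the relatively recent build V008R003C002B0370; earlier builds do not support this method.)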
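Cluster node information (as used in the commands in this case): node101 at 192.168.1.101 and node102 at 192.168.1.102.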
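I. Deploy the cluster software environment
1. Install the database software (on any one cluster node)
2. Check the files required for cluster installation
# The archives required for cluster deployment are listed below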
[kingbase@node101 Lin64]$ pwd
/opt/Kingbase/ES/V8R3_370/DeployTools/zip/Lin64
[kingbase@node101 Lin64]$ ls
db.zip kingbasecluster.zip
3. Create the cluster installation directory and distribute the required files (all nodes)
# Create the cluster installation directory
[kingbase@node102 ~]$ mkdir -p /home/kingbase/cluster/HAR3/
# Copy the installation packages into the cluster installation directory
[kingbase@node101 Lin64]$ cp *.zip /home/kingbase/cluster/HAR3/
[kingbase@node101 Lin64]$ cp /data/soft/license_V8R3_2022-01-04-365.dat /home/kingbase/cluster/HAR3/license.dat
[kingbase@node101 r3_install]$ ls -lh /home/kingbase/cluster/HAR3/
total 37M
-rw-r--r-- 1 kingbase kingbase 32M Sep 28 13:13 db.zip
-rw-r--r-- 1 kingbase kingbase 5.3M Sep 28 13:13 kingbasecluster.zip
-rw-r--r-- 1 kingbase kingbase 3.1K Sep 28 13:14 license.dat
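Before unpacking, it can help to confirm that the install directory on each node really holds the three files listed above. The following is a minimal sketch, not part of the KingbaseES tooling: the function name check_cluster_files is hypothetical, and only the three file names are taken from the listing.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with KingbaseES): verify that the three
# files copied above are present in a cluster install directory.
# Prints each missing file and returns non-zero if any is absent.
check_cluster_files() {
    dir=$1
    missing=0
    for f in db.zip kingbasecluster.zip license.dat; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running check_cluster_files /home/kingbase/cluster/HAR3 on each node should print nothing and return 0 when the directory is complete.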
4. Unpack the cluster deployment archives
[kingbase@node101 HAR3]$ unzip db.zip
[kingbase@node101 HAR3]$ unzip kingbasecluster.zip
[kingbase@node101 HAR3]$ ls
db db.zip kingbasecluster kingbasecluster.zip license.dat
[kingbase@node101 db]$ ls
bin data es_server etc kb_scripts lib share
5. Run the NEWHA.sh script (it updates the cluster deployment scripts so that they work with es_server)
[kingbase@node101 es_server]$ sh NEWHA.sh
[CHECK] check old files in /home/kingbase/cluster/HAR3/db/bin ...
[INFO] /home/kingbase/cluster/HAR3/db/bin/install.conf exist, will rename it
......
[UPDATE] update files in /home/kingbase/cluster/HAR3/kingbasecluster ... OK
[UPDATE] DONE
II. Deploy and configure the es_server service (as root)
1. Review the es_server configuration file (es_server uses port 8890 by default; edit this file to change the port)
[kingbase@node101 share]$ cat esHA.conf
# it can be 'systemd' or 'crontab'
# systemd: start the es_server by service of systemctl
# crontab: start the es_server by crontab
start_method=systemd
# the port of es_server
# if it is null, will be default 8890
es_port=8890
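Other scripts, such as a port check, may need the configured port. The sketch below reads es_port from a file laid out like the esHA.conf shown above and falls back to 8890 when the key is missing or empty, which mirrors the comment in the file. The function name get_es_port is hypothetical, and the parsing assumes a simple key=value line with a numeric value.

```shell
#!/bin/sh
# Hypothetical helper: print the es_port value from an esHA.conf-style file.
# Falls back to 8890 (the default named in the file's own comment) when the
# key is absent or has no value. Comment lines starting with '#' are ignored
# because the pattern is anchored at the start of the line.
get_es_port() {
    conf=$1
    port=$(sed -n 's/^[[:space:]]*es_port[[:space:]]*=[[:space:]]*"\{0,1\}\([0-9]*\).*/\1/p' "$conf" | tail -n 1)
    echo "${port:-8890}"
}
```

Calling get_es_port esHA.conf from the share directory prints the port that es_server is expected to listen on.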
2. Initialize and start the es_server service
[root@node101 bin]# sh esHAservice.sh --help
Usage: esHAservice.sh { init | start | stop | status }
# Initialize the es_server service environment
[root@node101 bin]# sh esHAservice.sh init
successfully initialized the es_server, please use "esHAservice.sh start" to start the es_server
[root@node101 bin]# sh esHAservice.sh start
Created symlink from /etc/systemd/system/multi-user.target.wants/es_server.service to /etc/systemd/system/es_server.service.
[root@node101 bin]# ps -ef |grep es_server
root 20196 1 0 13:28 ? 00:00:00 /home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf
[root@node101 bin]# netstat -antlp |grep 8890
tcp 0 0 0.0.0.0:8890 0.0.0.0:* LISTEN 20196/es_server
[root@node101 bin]#
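The manual netstat check above can be turned into a scripted test. This sketch assumes the column layout of the netstat -antlp line shown above (local address in column 4, state in column 6); the function name is_listening is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -antlp`-style lines on stdin and succeed
# only if some socket in LISTEN state is bound to the given port.
is_listening() {
    port=$1
    awk -v p="$port" '
        $6 == "LISTEN" {
            n = split($4, a, ":")   # local address, e.g. 0.0.0.0:8890
            if (a[n] == p) found = 1
        }
        END { exit !found }
    '
}
```

A typical call would be: netstat -antlp | is_listening 8890 && echo "es_server port is open".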
3. Manage es_server with systemctl
# Stop the es_server service
[root@node101 ~]# systemctl stop es_server
[root@node101 ~]# systemctl status es_server
● es_server.service - KingbaseES - es_server daemon
Loaded: loaded (/etc/systemd/system/es_server.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2022-09-28 14:12:02 CST; 2s ago
Process: 20795 ExecStart=/home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf (code=exited, status=0/SUCCESS)
Main PID: 20795 (code=exited, status=0/SUCCESS)
Sep 28 14:11:50 node101 systemd[1]: Started KingbaseES - es_server daemon.
Sep 28 14:12:02 node101 systemd[1]: Stopping KingbaseES - es_server daemon...
Sep 28 14:12:02 node101 systemd[1]: Stopped KingbaseES - es_server daemon.
# Start the es_server service
[root@node101 ~]# systemctl start es_server
[root@node101 ~]# systemctl status es_server
● es_server.service - KingbaseES - es_server daemon
Loaded: loaded (/etc/systemd/system/es_server.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2022-09-28 14:12:10 CST; 3s ago
Main PID: 20819 (es_server)
Tasks: 1
CGroup: /system.slice/es_server.service
└─20819 /home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf
Sep 28 14:12:10 node101 systemd[1]: Started KingbaseES - es_server daemon.
4. Test communication between nodes
[root@node102 es_server]# ./es_client root@192.168.1.101 'hostname'
node101
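With more than two nodes, the same test can be looped over every node. The sketch below reuses the es_client invocation shown above (user@ip followed by a command). The function name check_nodes and the ES_CLIENT variable are hypothetical, and because the exit-status behaviour of es_client is not documented here, the check also requires non-empty output.

```shell
#!/bin/sh
# Sketch: run `hostname` on each node through es_client and report failures.
# ES_CLIENT defaults to ./es_client, as invoked above; override it to point
# at another path. A node counts as reachable only if the command succeeds
# AND prints something (an assumption about how es_client reports errors).
ES_CLIENT=${ES_CLIENT:-./es_client}

check_nodes() {
    failed=0
    for ip in "$@"; do
        if name=$("$ES_CLIENT" "root@$ip" 'hostname' 2>/dev/null) && [ -n "$name" ]; then
            echo "$ip ok ($name)"
        else
            echo "$ip unreachable"
            failed=1
        fi
    done
    return "$failed"
}
```

From the es_server directory, check_nodes 192.168.1.101 192.168.1.102 would report one line per node and return non-zero if any node did not answer.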
5. Review the es_server service configuration
# systemd unit configuration
[root@node101 bin]# cat /etc/systemd/system/es_server.service
[Unit]
Description=KingbaseES - es_server daemon
After=network.target
[Service]
Type=simple
ExecStart=/home/kingbase/cluster/HAR3/db/bin/es_server -f /home/kingbase/cluster/HAR3/db/share/es_server.conf
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
# Per-user key authentication files
[root@node101 ~]# ls .es/ -lh
total 8.0K
-rw------- 1 root root 381 Sep 28 13:28 accept_hosts
-rw------- 1 root root 1.7K Sep 28 13:28 key_file
[root@node101 ~]# ls -lh /home/kingbase/.es
total 8.0K
-rw------- 1 kingbase kingbase 381 Sep 28 13:28 accept_hosts
-rw------- 1 kingbase kingbase 1.7K Sep 28 13:28 key_file
III. Deploy the cluster with the script
1. Deployment configuration file
[kingbase@node101 bin]$ cat install.conf |grep -v ^$|grep -v ^#
on_bmj=0
all_node_ip=(192.168.1.101 192.168.1.102)
cluster_path="/home/kingbase/cluster/HAR3"
db_package="/home/kingbase/cluster/HAR3/db.zip"
cluster_package="/home/kingbase/cluster/HAR3/kingbasecluster.zip"
license_file=(license.dat)
db_user="SYSTEM" # the user name of database
db_password="123456" # the password of database, since the R3 has a special feature that password cannot be stored in clear text, please delete the password after the cluster deployment is complete.
db_port="54321" # the port of database, defaults is 54321
trust_ip="192.168.1.1"
db_vip="192.168.1.204"
cluster_vip="192.168.1.205"
net_device=(enp0s3 enp0s3)
ipaddr_path="/sbin"
arping_path="/home/kingbase/cluster/HAR3/db/bin/"
super_user="root"
cluster_user="kingbase"
use_sshd=0 # choose whether to use sshd service, 0 means not to use, 1 means to use, default value is 0
wd_deadtime="30" # cluster heartbeats timeout, unit: seconds
check_retries="6" # number of detection retries in case of database failure
check_delay="10" # detection retry interval in case of database failure
connect_timeout="10000" # timeout value in milliseconds before giving up to connect to backend
auto_primary_recovery="0" # automatic recovery parameter of cluster primary host, default value is 0
ssh_port="22" # the port of sshd [if on_bmj=1 or use_sshd=0, you do not need to configure this parameter]
es_port="8890" # the port of es_server
case_sensitive="on" # select whether database is case sensitive, off means case insensitive, on means case sensitive,the default value is on
max_available_level="1" # when all databases are down, should the cluster be automatically started. 1 means yes, 0 means no, default value is 1
As the configuration above shows, use_sshd=0 turns off ssh-based distribution for this deployment.
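Since the deployment depends on use_sshd=0 and a valid es_port, a quick pre-flight check of install.conf can catch a wrong setting before the installer runs. This is a minimal sketch: conf_value and check_es_deploy_conf are hypothetical names, and the parsing assumes the key="value" # comment layout shown above (a value that itself contains a # character would be truncated).

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from an install.conf-style
# file, with any trailing "# comment" and surrounding double quotes removed.
conf_value() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | tail -n 1 |
        sed -e 's/[[:space:]]*#.*$//' -e 's/^"//' -e 's/"$//'
}

# Succeed only when the file selects es_server (use_sshd=0) and sets es_port.
check_es_deploy_conf() {
    file=$1
    [ "$(conf_value use_sshd "$file")" = "0" ] || { echo "use_sshd is not 0"; return 1; }
    [ -n "$(conf_value es_port "$file")" ] || { echo "es_port is not set"; return 1; }
}
```

Running check_es_deploy_conf install.conf in the bin directory before the installer should return 0 for the configuration listed above.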
2. Run the deployment script
[kingbase@node101 bin]$ sh V8R3_cluster_install.sh
[INFO]-Check if the cluster_vip "192.168.1.205" is already exist ...
.......
[INSTALL] start up the slave on "192.168.1.102" ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.101
SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
(slot_node1,)
(1 row)
[INSTALL] Create physical_replication_slot on 192.168.1.101 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.101
SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
(slot_node2,)
(1 row)
[INSTALL] Create physical_replication_slot on 192.168.1.101 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.102
SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
(slot_node1,)
(1 row)
[INSTALL] Create physical_replication_slot on 192.168.1.102 ... OK
[INSTALL] Create physical_replication_slot on 192.168.1.102
SYS_CREATE_PHYSICAL_REPLICATION_SLOT
--------------------------------------
(slot_node2,)
(1 row)
[INSTALL] Create physical_replication_slot on 192.168.1.102 ... OK
[INSTALL] start up the whole cluster ...
-----------------------------------------------------------------------
2022-09-28 14:26:38 KingbaseES automation beging...
......
2022-09-28 14:26:48 Del kingbase VIP [192.168.1.204/24] ...
DEL VIP NOW AT 2022-09-28 14:26:46 ON enp0s3
No VIP on my dev, nothing to do.
2022-09-28 14:26:48 Done...
......................
all stop..
ping trust ip 192.168.1.1 success ping times :[3], success times:[2]
......
Redirecting to /bin/systemctl restart crond.service
Redirecting to /bin/systemctl restart crond.service
......................
all started..
...
now we check again
=======================================================================
| ip | program| [status]
[ 192.168.1.101]| [kingbasecluster]| [active]
[ 192.168.1.102]| [kingbasecluster]| [active]
[ 192.168.1.101]| [kingbase]| [active]
[ 192.168.1.102]| [kingbase]| [active]
=======================================================================
[INSTALL] start up the whole cluster ... OK
--- As shown above, the cluster was deployed successfully.
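When the installer output is captured to a file, the final status table can be checked automatically. The sketch below assumes the table layout printed above (ip | program | status, with [active] marking a healthy program). The function name all_active is hypothetical, and the labels the installer prints for unhealthy programs are not known here, so anything other than [active] counts as a failure.

```shell
#!/bin/sh
# Hypothetical helper: read the installer's final status table on stdin and
# succeed only if at least one program row exists and every row is [active].
all_active() {
    awk -F'|' '
        /\[kingbase(cluster)?\]/ {
            rows++
            if ($3 !~ /\[active\]/) bad++
        }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

For example, with the installer output saved to a hypothetical install.log: all_active < install.log && echo "cluster programs are all active".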
IV. Verify the cluster
1. Check the cluster node status
[kingbase@node101 bin]$ ./ksql -U SYSTEM -W 123456 TEST -p 9999
ksql (V008R003C002B0370)
Type "help" for help.
TEST=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
0 | 192.168.1.101 | 54321 | up | 0.500000 | primary | 0 | false | 0
1 | 192.168.1.102 | 54321 | up | 0.500000 | standby | 0 | true | 0
(2 rows)
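The show pool_nodes output can also be checked by a script rather than by eye. This sketch parses the pipe-delimited table as printed above (status in the fourth column) and lists any node that is not up. The function name down_nodes is hypothetical, and how the query output is captured is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: read `show pool_nodes;` output on stdin and print the
# hostname of every node whose status column is not "up". Succeeds only if
# node rows were seen and none was down.
down_nodes() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 9 && trim($1) ~ /^[0-9]+$/ {
            rows++
            if (trim($4) != "up") { print trim($2); bad++ }
        }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

With the query output saved to a hypothetical file pool_nodes.txt, down_nodes < pool_nodes.txt prints nothing and returns 0 for the two-node result shown above.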
2. Check the streaming replication status
[kingbase@node101 bin]$ ./ksql -U SYSTEM -W 123456 TEST
ksql (V008R003C002B0370)
Type "help" for help.
TEST=# select * from sys_stat_replication;
PID | USESYSID | USENAME | APPLICATION_NAME | CLIENT_ADDR | CLIENT_HOSTNAME | CLIENT_PORT | BACKEND_START |BACKEND_XMIN | STATE | SENT_LOCATION | WRITE_LOCATION | FLUSH_LOCATION | REPLAY_LOCATION | SYNC_PRIORITY | SYNC_STATE
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
27905 | 10 | SYSTEM | node2 | 192.168.1.102 | | 59898 | 2022-09-28 14:26:58.204189+08 | | streaming | 0/30000D0 | 0/30000D0 | 0/30000D0 | 0/30000D0 | 2 | sync
(1 row)
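V. Summary
With es_server, a cluster can be deployed conveniently on general-purpose hosts where ssh is not available. Production environments with strict security requirements can follow the case above to deploy their clusters.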