一、关于MogHA
MogHA 是云和恩墨基于 MogDB 同步异步流复制技术自研的一款保障数据库主备集群高可用的企业级软件系统
(适用于 MogDB 和 openGauss 数据库)
MogHA 能够自主探测故障实现故障转移,虚拟IP自动漂移等特性,使得数据库的故障持续时间从分钟级降到秒级(RPO=0,RTO<30s),确保数据库集群的高可用服务。
二、为什么数据库支持主备,还需要 MogHA
首先我们需要理解一下什么是高可用,高可用的目的是为了让数据库尽可能提供连续服务,以保证上层业务的稳定运行。数据库虽然支持主备库的部署结构,其目的是防止单点故障。但数据库并不提供故障检测以及自动化切换主备的功能,这也不属于数据库的处理范畴。所以需要有 MogHA 这样的一套高可用系统,来保证数据库服务的连续性。
三、功能特性
自主发现数据库实例角色
自主故障转移
支持网络故障检测
支持磁盘故障检测
虚拟IP自动漂移
感知双主脑裂,自动选主
数据库进程和CPU绑定
HA自身进程高可用
支持单机并行部署多套 MogHA
支持 x86_64 和 aarch64
四、系统架构
五、支持的模式
Lite 模式(推荐)
Lite 模式,顾名思义即轻量级模式,该模式仅需在主库和一台同步备机器上启动 MogHA 服务,此时 MogHA 服务可以保证这两台机器上数据库实例的高可用,当主库发生不可修复的问题或者网络隔离时,MogHA 可以自主地进行故障切换和虚拟IP漂移。
Full 模式
Full模式相较于 lite 模式,需要在所有实例机器上运行 MogHA 服务,且所有的实例有由 MogHA 来自动管理,当出现主库故障时,会优先选择本机房同步备进行切换,如果本机房同步备也是故障的情况,会选择同城备机房的同步备进行切换。为了达到RPO=0,MogHA 不会选择异步备库进行切换,以防止数据丢失。该模式会在主备切换时,会自动修改数据库的复制连接及同步备列表配置。
举例:两地三中心【1主6备】
六、实施步骤
6.1 已安装好mogdb一主二备
6.2 omm需要有sudo权限
[root@mogdb2 ~]# chmod +w /etc/sudoers
[root@mogdb2 ~]# which systemctl
/bin/systemctl
[root@mogdb2 ~]# which ifconfig
/sbin/ifconfig
[root@mogdb2 ~]#
[root@mogdb2 ~]# vi /etc/sudoers
#追加下面两行到文件末尾
omm ALL=(ALL) NOPASSWD:/bin/systemctl
omm ALL=(ALL) NOPASSWD:/sbin/ifconfig
[root@mogdb1 ~]# chmod -w /etc/sudoers
6.3 安装 MogHA 服务
[root@mogdb2 ~]# cd /home/omm/mogha/
[root@mogdb2 mogha]# sudo ./install.sh omm /opt/mogdb/install/data/dn
[2022-11-02 18:06:09]: MogHA installation directory: /home/omm/mogha
[2022-11-02 18:06:09]: runtime user: omm, user group: dbgrp
[2022-11-02 18:06:09]: PGDATA=/opt/mogdb/install/data/dn
[2022-11-02 18:06:09]: LD_LIBRARY_PATH=/opt/mogdb/install/app/lib:/opt/mogdb/install/om/lib:/opt/mogdb/install/om/lib:/opt/mogdb/install/om/script/gspylib/clib:
[2022-11-02 18:06:09]: GAUSSHOME=/opt/mogdb/install/app
[2022-11-02 18:06:09]: database port: 15400
[2022-11-02 18:06:09]: architecture: x86_64
[2022-11-02 18:06:09]: modify owner for installation dir...
[2022-11-02 18:06:09]: generate mogha.service file...
[2022-11-02 18:06:09]: move mogha.service to /usr/lib/systemd/system/
[2022-11-02 18:06:09]: reload systemd
[2022-11-02 18:06:09]: mogha service register successful
[2022-11-02 18:06:09]: add sudo permissions to /etc/sudoers.d/omm for omm
[2022-11-02 18:06:09]: not found node.conf in current directory, generate it
[2022-11-02 18:06:09]: node.conf generated, location: /home/omm/mogha/node.conf
[2022-11-02 18:06:09]: MogHA install successfully!
Please edit /home/omm/mogha/node.conf first before start MogHA service!!!
Manage MogHA service by systemctl command:
Start service: sudo systemctl start mogha
Stop service: sudo systemctl stop mogha
Restart service: sudo systemctl restart mogha
Uninstall MogHA service command:
sudo ./uninstall.sh mogha
6.4node.conf 配置
[config]
db_port=15400
db_user=omm
db_datadir=/opt/mogdb/install/data/dn
primary_info=/home/omm/mogha/primary_info
standby_info=/home/omm/mogha/standby_info
meta_file_type=json
lite_mode=false
agent_port=8081
http_req_timeout=3
heartbeat_interval=3
primary_lost_timeout=10
primary_lonely_timeout=10
double_primary_timeout=10
taskset=false
logger_format=%(asctime)s %(levelname)s [%(filename)s:%(lineno)d]: %(message)s
log_dir=/home/omm/mogha
log_max_size=512MB
log_backup_count=10
allow_ips=
handle_down_primary=true
handle_down_standby=true
primary_down_handle_method=restart
restart_strategy=10/3
uce_error_detection=true
uce_detect_max_lines=200
debug_mode=false
[host1]
ip=192.168.3.69
heartbeat_ips=
[host2]
ip=192.168.3.68
heartbeat_ips=
[host3]
ip=192.168.3.70
heartbeat_ips=
[zone1]
vip=192.168.3.222
hosts=host1,host2,host3
ping_list=192.168.3.1
cascades=
6.5启动服务
三台服务器都需要启动MogHA服务,启动服务指令如下:
sudo systemctl start mogha
6.6 查看运行状态
[omm@mogdb2 mogha]$ sudo systemctl status mogha
● mogha.service - MogHA High Available Service
Loaded: loaded (/usr/lib/systemd/system/mogha.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2022-11-02 20:48:10 CST; 7min ago
Docs: https://docs.mogdb.io/zh/mogha/v2.0/installation-and-depolyment
Main PID: 14450 (mogha)
Tasks: 22
Memory: 82.3M
CGroup: /system.slice/mogha.service
├─14450 /home/omm/mogha/mogha -c /home/omm/mogha/node.conf
├─14471 mogha: watchdog
├─14474 mogha: http-server
├─14475 mogha: heartbeat
├─22856 ping -c 3 -i 0.5 192.168.3.1
├─22930 ping -c 3 -i 0.5 192.168.3.68
├─22932 ping -c 3 -i 0.5 192.168.3.70
└─22934 ping -c 3 -i 0.5 192.168.3.1
Nov 02 20:48:10 mogdb2 systemd[1]: Started MogHA High Available Service.
Nov 02 20:48:10 mogdb2 mogha[14450]: MogHA Version: Version: 2.3.8
Nov 02 20:48:10 mogdb2 mogha[14450]: GitHash: 9fc105e
Nov 02 20:48:10 mogdb2 mogha[14450]: config loaded successfully
6.7 设置开机启动
sudo systemctl enable mogha
6.8 查看VIP是否绑定在主机
[omm@mogdb2 mogha]$ ifconfig
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.3.69 netmask 255.255.255.0 broadcast 192.168.3.255
inet6 fe80::4cf3:db2a:1070:e254 prefixlen 64 scopeid 0x20<link>
inet6 fe80::8ad9:5e73:e1cf:5d08 prefixlen 64 scopeid 0x20<link>
ether 08:00:27:86:f6:22 txqueuelen 1000 (Ethernet)
RX packets 669541 bytes 87133424 (83.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 713940 bytes 221250135 (211.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp0s3:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.3.222 netmask 255.255.255.0 broadcast 192.168.3.255
ether 08:00:27:86:f6:22 txqueuelen 1000 (Ethernet)
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 777680 bytes 352174236 (335.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 777680 bytes 352174236 (335.8 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
ether 52:54:00:88:a4:65 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[omm@mogdb2 mogha]$