PostgreSQL一主一从高可用搭建

标签：PostgreSQL log app msp pgsql 一主一 conf repmgr 搭建

1. PostgreSQL一主一从高可用搭建（VIP）

IP	主机名	角色
192.168.198.146	test11	主服务器
192.168.198.143	test12	备用服务器
192.168.198.144	test13	见证服务器

一、基本环境配置

1、hosts文件配置（所有节点）

vim /etc/hosts
# 新增
192.168.198.146   msp-app01
192.168.198.143   msp-app02
192.168.198.144   msp-app03

2、配置SSH免密登录

root用户下免密登录设置（所有节点都执行）

ssh-keygen
ssh-copy-id 192.168.198.146
ssh-copy-id 192.168.198.143
ssh-copy-id 192.168.198.144

postgres用户下免密登录设置，postgresql需要postgres用户

#创建postgres用户
groupadd postgres
useradd -g postgres postgres
passwd postgres

su - postgres
ssh-keygen
ssh-copy-id 192.168.198.146
ssh-copy-id 192.168.198.143
ssh-copy-id 192.168.198.144

3、添加sudo权限

主节点与备用节点postgres用户添加sudo权限，用于执行VIP自动切换脚本

sudo visudo

postgres ALL=(ALL)       NOPASSWD:ALL

如果上面的方法无法使用：

创建完用户后，需要给用户添加权限。

[root@localhost ~]# chmod -v u+w /etc/sudoers  
mode of '/etc/sudoers' changed from 0440 (r--r-----) to 0640 (rw-r-----)

修改sudoers权限，使其可编辑，编辑sudoers文件。

vim /etc/sudoers
## Allow root to run any commands anywhere 
root    ALL=(ALL)       ALL
postgres ALL=(ALL)       NOPASSWD:ALL

将sudoers文件权限修改为不可编辑。

[root@localhost ~]# chmod -v u-w /etc/sudoers   
mode of "/etc/sudoers" changed from 0640 (rw-r-----) to 0440 (r--r-----)

可能出现问题：

1、使用test用户进行切换（su root）root用户，提示鉴定故障。

可能在权限不足带来的问题。

检查办法如下：

1）检查/etc目录下passwd的权限，该文件应该具备的权限如下：

[root@localhost ~]# ll /etc/passwd
-rw-r--r--. 1 root root 2347 4f  16 21:00 /etc/passwd

如果该权限不足，则使用chmod 补充完成。

2）检查/bin/su文件是否有s位权限也有可能是所属用户为普通用户，该文件应该具备的权限如下：

[root@localhost ~]# ll /bin/su
-rwsrwxrwx. 1 root root 32128 8f   9 2019 /bin/su

如果该权限不足，则使用chmod u+s /bin/su补充完成即可。

二、安装配置PostgreSQL

1、下载postgresql-16源码包（所有节点）

wget https://ftp.postgresql.org/pub/source/v16.0/postgresql-16.0.tar.gz --no-check-certificate

2、安装依赖包（所有节点）

yum install gcc gcc-c++ readline-devel readline readline-dev zlib-devel

3、解压编译（所有节点）

tar -zxvf postgresql-16.0.tar.gz -C /app

cd /app/postgresql-16.0

./configure --prefix=/app/pgsql
或
./configure --prefix=/app/pgsql --without-icu

make && make install

mv /app/postgresql-16.0 /app/pgsql

4、配置数据、日志目录（所有节点）

mkdir -p /app/pgsql/{data,log}

chown -R postgres:postgres /app/pgsql

5、初始化数据库（主节点操作）

只需要在主数据库服务器上对 PostgreSQL数据库服务进行初始化并且正常启动即可，其余备服务器数据库可利用 repmgr 工具进行初始化搭建部署。

su - postgres

/app/pgsql/bin/initdb -D /app/pgsql/data/

6、启动数据库（主节点操作）

/app/pgsql/bin/pg_ctl -D /app/pgsql/data/ -l logfile start

7、验证测试（主节点操作）

#软链接
ln -s /app/pgsql/bin/psql /bin/psql
ln -s /app/pgsql/bin/pg_config /bin/pg_config

su - postgres

#测试登录
psql -U postgres -d postgres

8、配置环境变量（所有节点）

vim /etc/profile

# PG安装目录
export PGHOME=/app/pgsql/
# PG数据目录
export PGDATA=/app/pgsql/data/
export PATH=$PGHOME/bin:$PATH
export PG_CONFIG=/bin/pg_config

source /etc/profile

三、安装repmgr

1、下载源码包

wget --no-check-certificate https://www.repmgr.org/download/repmgr-5.4.1.tar.gz

2、解压编译

yum install flex libcurl-devel json-c-devel

tar -zxvf repmgr-5.4.1.tar.gz -C /app/

cd /app/repmgr-5.4.1

./configure && make install

make install 成功后，pg_bin_path 里会有 repmgr、repmgrd 两个可执行文件

四、主节点配置

1、postgresql.conf配置文件

vim /app/pgsql/data/postgresql.conf

#直接添加在配置文件最下面就可以
listen_addresses = '*'
port = 5432
max_wal_senders = 10
max_replication_slots = 10
wal_level = 'hot_standby'
hot_standby = on
archive_mode = on
archive_command = '/bin/true'
wal_log_hints = on

2、pg_hba.conf配置文件

vim /app/pgsql/data/pg_hba.conf

#直接在文件最下面添加
local   replication   repmgr                              trust
host    replication   repmgr      127.0.0.1/32            trust
host    replication   repmgr      msp-app01               trust
host    replication   repmgr      msp-app02               trust
host    replication   repmgr      msp-app03               trust
local   repmgr        repmgr                              trust
host    repmgr        repmgr      127.0.0.1/32            trust
host    repmgr        repmgr      msp-app01               trust
host    repmgr        repmgr      msp-app02               trust
host    repmgr        repmgr      msp-app03               trust

3、创建repmgr用户和库

[root@msp-app01 src]# su - postgres
Last login: Fri Jan 26 10:09:52 CST 2024 on pts/0
[postgres@msp-app01 ~]$ createuser -s repmgr					#创建repmgr用户
[postgres@msp-app01 ~]$ createdb repmgr -O repmgr				#创建repmgr库，并将其所有者设置为repmgr用户
[postgres@msp-app01 ~]$ psql -U repmgr -d repmgr 				#以repmgr身份登录到repmgr库
psql (16.0)
Type "help" for help.

repmgr=# alter user repmgr with password '123456';			#修改repmgr密码
ALTER ROLE
repmgr=#

4、配置用户登录数据库免密

# root用户下添加以下内容到~/.pgpass文件,用户、数据库和密码修改为自己的即可

vim /home/postgres/.pgpass

# ip:port:repmgr:repmgr:repmgr
*:*:repmgr:repmgr:123456

# 修改 ~/.pgpass 文件权限
chmod 600 /home/postgres/.pgpass

5、重启生效

[root@msp-app01 src]#  su - postgres -c "pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log stop"
waiting for server to shut down.... done
server stopped
[root@msp-app01 src]#  su - postgres -c "pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log start"
waiting for server to start.... done
server started

6、创建repmgr.conf配置文件

mkdir -p /etc/repmgr/{log,conf}

chown -R postgres:postgres /etc/repmgr

配置文件内容：

vim /etc/repmgr/conf/repmgr.conf

# 节点ID,各节点标识
node_id=1
# 节点名称
node_name='msp-app01'
# 本节点数据库连接信息
conninfo='host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2'
# pg数据目录
data_directory='/app/pgsql/data'
# 流复制数据库用户,默认使用repmgr
replication_user='repmgr'
# repmgr软件目录
repmgr_bindir='/app/pgsql/bin'
# pg软件目录
pg_bindir='/app/pgsql/bin'

# 日志管理
log_level=INFO
# log文件需要提前创建
log_file='/etc/repmgr/log/repmgrd.log'
# 此设置导致repmgrd以指定的时间间隔(以秒为单位,默认为300)发出状态日志行,描述repmgrd的当前状态
log_status_interval=10
 
# pg、repmgr服务管理命令
service_start_command='pg_ctl -D /app/pgsql/data/ start -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_stop_command='pg_ctl -D /app/pgsql/data/ stop -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_restart_command='pg_ctl -D /app/pgsql/data/ restart -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_reload_command='pg_ctl reload'

# repmgrd运行时的pid文件
repmgrd_pid_file='/etc/repmgr/log/repmgrd.pid'
repmgrd_service_start_command='repmgrd  start'
repmgrd_service_stop_command='kill -9 `cat /etc/repmgr/log/repmgrd.pid`'

# failover设置
failover='automatic'
# 当repmgrd确定当前节点将成为新的主节点时,将在故障转移情况下执行promote_command中定义的程序或脚本
promote_command='repmgr standby promote   --log-to-file'
# %n将被替换repmgrd与新的主节点的ID,如果没有提供,repmgr standby follow将尝试自行确定新的主repmgr standby follow节点,但如果在新主节点提升后原主节点重新上线,则存在导致节点继续跟随原主节点的风险
follow_command='repmgr standby follow   --log-to-file --upstream-node-id=%n'
#发生特定事件时要执行的命令，比如升级备用节点为主节点
event_notification_command='bash /usr/local/bin/repmgr_promote_self.sh %n'
#要通知的事件类型，例如 'standby_promote' 表示备用节点升级为主节点时触发通知
event_notifications='standby_promote'

# 高可用参数设置
# 定义节点位置的任意字符串,在故障转移期间用于检查当前主节点的可见性
location='location1'
# 节点优先级,选主时可能使用到(lsn > priority > node_id),0代表该节点不会被提升为主节点
priority=100
# 是否将监控数据写入monitoring_history表
monitoring_history=yes
# 故障转移之前,尝试重新连接的间隔(以秒为单位)
reconnect_interval=5
# 故障转移之前,尝试重新连接的次数
reconnect_attempts=3
# ping: repmg使用PQPing()方法测试连接
# connection: 尝试与节点建立新的连接
# query:通过现有连接在节点上执行SQL语句
connection_check_type=ping
# 写入监控数据的间隔
monitor_interval_secs=5
use_replication_slots=true

7、VIP自动切换脚本

vim /usr/local/bin/repmgr_promote_self.sh

#!/usr/bin/env bash
#虚拟IP地址
VIP=192.168.198.145
#网络接口名称
IFACE=eno16777736
#postgresql的bin目录
PATH=/app/pgsql/bin:$PATH
VIP_CONF="${VIP}/23 dev ${IFACE} label ${IFACE}"
##IPDOWN
#通过SSH连接到另一台服务器上,删除指定的VIP配置
ssh postgres@msp-app02 sudo ip addr del ${VIP_CONF}
sleep 5;
##IPUP
#本地添加VIP
sudo /usr/sbin/ip addr add ${VIP_CONF}
sudo /usr/sbin/arping -b -A -c 3 -I ${IFACE} ${VIP}
exit 0

脚本授权

chmod +x /usr/local/bin/repmgr_promote_self.sh

8、注册本地节点为主服务器节点

su - postgres

repmgr -f /etc/repmgr/conf/repmgr.conf primary register

9、查看集群信息

repmgr -f /etc/repmgr/conf/repmgr.conf cluster show

10、数据库查看集群基础信息

[postgres@msp-app01 ~]$  psql -U repmgr -d repmgr
psql (16.0)
Type "help" for help.

repmgr=# \x 1
Expanded display is on.
repmgr=# SELECT * FROM repmgr.nodes;
-[ RECORD 1 ]----+---------------------------------------------------------------------------
node_id          | 1
upstream_node_id | 
active           | t
node_name        | msp-app01
type             | primary
location         | location1
priority         | 100
conninfo         | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
repluser         | repmgr
slot_name        | repmgr_slot_1
config_file      | /etc/repmgr/conf/repmgr.conf

11、启动repmgrd服务

修改postgresql.conf文件，加入 repmgr 共享库。

vim /app/pgsql/data/postgresql.conf

# 添加如下内容
shared_preload_libraries = 'repmgr'

重启数据库

pg_ctl -D /app/pgsql/data/ -l logfile restart

启动repmgrd服务

[postgres@msp-app01 ~]$ repmgrd -f /etc/repmgr/conf/repmgr.conf start
[2024-01-26 12:50:31] [NOTICE] redirecting logging output to "/etc/repmgr/log/repmgrd.log"

#查看/etc/repmgr/log/repmgrd.log启动是否有报错

12、添加VIP

sudo ip addr add 192.168.198.145/23 dev eno16777736

五、备用节点配置

1、创建repmgr.conf配置文件

mkdir -p /etc/repmgr/{log,conf}

chown -R postgres:postgres /etc/repmgr

配置文件内容：

vim /etc/repmgr/conf/repmgr.conf

# 节点ID,各节点标识
node_id=2
# 节点名称
node_name='msp-app02'
# 本节点数据库连接信息
conninfo='host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2'
# pg数据目录
data_directory='/app/pgsql/data'
# 流复制数据库用户,默认使用repmgr
replication_user='repmgr'
# repmgr软件目录
repmgr_bindir='/app/pgsql/bin'
# pg软件目录
pg_bindir='/app/pgsql/bin'

# 日志管理
log_level=INFO
# log文件需要提前创建
log_file='/etc/repmgr/log/repmgrd.log'
# 此设置导致repmgrd以指定的时间间隔(以秒为单位,默认为300)发出状态日志行,描述repmgrd的当前状态
log_status_interval=10
 
# pg、repmgr服务管理命令
service_start_command='pg_ctl -D /app/pgsql/data/ start -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_stop_command='pg_ctl -D /app/pgsql/data/ stop -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_restart_command='pg_ctl -D /app/pgsql/data/ restart -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_reload_command='pg_ctl reload'

# repmgrd运行时的pid文件
repmgrd_pid_file='/etc/repmgr/log/repmgrd.pid'
repmgrd_service_start_command='repmgrd  start'
repmgrd_service_stop_command='kill -9 `cat /etc/repmgr/log/repmgrd.pid`'

# failover设置
failover='automatic'
# 当repmgrd确定当前节点将成为新的主节点时,将在故障转移情况下执行promote_command中定义的程序或脚本
promote_command='repmgr standby promote   --log-to-file'
# %n将被替换repmgrd与新的主节点的ID,如果没有提供,repmgr standby follow将尝试自行确定新的主repmgr standby follow节点,但如果在新主节点提升后原主节点重新上线,则存在导致节点继续跟随原主节点的风险
follow_command='repmgr standby follow   --log-to-file --upstream-node-id=%n'
#发生特定事件时要执行的命令，比如升级备用节点为主节点
event_notification_command='bash /usr/local/bin/repmgr_promote_self.sh %n'
#要通知的事件类型，例如 'standby_promote' 表示备用节点升级为主节点时触发通知
event_notifications='standby_promote'

# 高可用参数设置
# 定义节点位置的任意字符串,在故障转移期间用于检查当前主节点的可见性
location='location1'
# 节点优先级,选主时可能使用到(lsn > priority > node_id),0代表该节点不会被提升为主节点
priority=100
# 是否将监控数据写入monitoring_history表
monitoring_history=yes
# 故障转移之前,尝试重新连接的间隔(以秒为单位)
reconnect_interval=5
# 故障转移之前,尝试重新连接的次数
reconnect_attempts=3
# ping: repmg使用PQPing()方法测试连接
# connection: 尝试与节点建立新的连接
# query:通过现有连接在节点上执行SQL语句
connection_check_type=ping
# 写入监控数据的间隔
monitor_interval_secs=5
use_replication_slots=true

2、VIP自动切换脚本

vim /usr/local/bin/repmgr_promote_self.sh

#!/usr/bin/env bash
#虚拟IP地址
VIP=192.168.198.145
#网络接口名称
IFACE=eno16777736
#postgresql的bin目录
PATH=/app/pgsql/bin:$PATH
VIP_CONF="${VIP}/23 dev ${IFACE} label ${IFACE}"
##IPDOWN
#通过SSH连接到另一台服务器上,删除指定的VIP配置
ssh postgres@msp-app01 sudo ip addr del ${VIP_CONF}
sleep 5;
##IPUP
#本地添加VIP
sudo /usr/sbin/ip addr add ${VIP_CONF}
sudo /usr/sbin/arping -b -A -c 3 -I ${IFACE} ${VIP}
exit 0

脚本授权

chmod +x /usr/local/bin/repmgr_promote_self.sh

3、检查备库是否可克隆

su - postgres

# --dry-run表示命令测试,并不会实际执行,可用于验证是否会出现一些基本错误
repmgr -h msp-app01 -U repmgr -d repmgr -f /etc/repmgr/conf/repmgr.conf standby clone --dry-run

# 实际执行pg的克隆操作
repmgr -h msp-app01 -U repmgr -d repmgr -f /etc/repmgr/conf/repmgr.conf standby clone

4、启动备库

pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log start

5、注册从库为备用服务器

repmgr -f /etc/repmgr/conf/repmgr.conf --upstream-node-id=1 standby register

6、查看当前集群信息

repmgr -f /etc/repmgr/conf/repmgr.conf cluster show


[postgres@msp-app02 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
 ID | Name      | Role    | Status    | Upstream  | Location  | Priority | Timeline | Connection string                                                         
----+-----------+---------+-----------+-----------+-----------+----------+----------+----------------------------------------------------------------------------
 1  | msp-app01 | primary | * running |           | location1 | 100      | 1        | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 2  | msp-app02 | standby |   running | msp-app01 | location1 | 100      | 1        | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2

7、启动repmgrd服务

# 启动 repmgrd 服务
[postgres@msp-app02 ~]$ repmgrd -f /etc/repmgr/conf/repmgr.conf start
[2024-01-26 13:11:12] [NOTICE] redirecting logging output to "/etc/repmgr/log/repmgrd.log"

#查看/etc/repmgr/log/repmgrd.log启动是否有报错

六、见证节点配置

1、启动节点postgres服务

su - postgres

#初始化数据库
initdb -D /app/pgsql/data

#启动
pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log start

2、创建repmgr.conf配置文件

mkdir -p /etc/repmgr/{log,conf}

chown -R postgres:postgres /etc/repmgr

配置文件内容：

vim /etc/repmgr/conf/repmgr.conf

# 基本信息
node_id=3
# 节点名称,高可用集群各节点名称
node_name='msp-app03'
# 本节点数据库连接信息
conninfo='host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2'
# pg数据目录
data_directory='/app/pgsql/data'
# 流复制数据库用户,默认使用repmgr
replication_user='repmgr'
# repmgr软件目录
repmgr_bindir='/app/pgsql/bin'
# pg软件目录
pg_bindir='/app/pgsql/bin'            

# 日志管理
log_level=INFO
log_file='/etc/repmgr/log/repmgrd.log'
log_status_interval=10

# pg、repmgr服务管理命令
service_start_command='pg_ctl -D /app/pgsql/data/ start -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_stop_command='pg_ctl -D /app/pgsql/data/ stop -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_restart_command='pg_ctl -D /app/pgsql/data/ restart -o \'-c config_file=/app/pgsql/data/postgresql.conf\' -l /app/pgsql/log/start.log'
service_reload_command='pg_ctl reload'
# repmgrd运行时的pid文件
repmgrd_pid_file='/etc/repmgr/log/repmgrd.pid'
repmgrd_service_start_command='repmgrd -f /etc/repmgr/conf/repmgr.conf start'
repmgrd_service_stop_command='kill -9 `cat /etc/repmgr/log/repmgrd.pid`'

# failover设置
failover='automatic'
promote_command='repmgr standby promote -f /etc/repmgr/conf/repmgr.conf --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr/conf/repmgr.conf --log-to-file --upstream-node-id=%n'

# 高可用参数设置
location='location1'
priority=100
monitoring_history=yes
reconnect_interval=5
reconnect_attempts=3
connection_check_type=ping
monitor_interval_secs=5
use_replication_slots=true

3、启动repmgrd服务

修改postgresql.conf，加入repmgr共享库。

vim /app/pgsql/data/postgresql.conf 

# 添加如下内容
listen_addresses = '*'
shared_preload_libraries = 'repmgr'

修改 pg_hba.conf 文件

vim /app/pgsql/data/pg_hba.conf 

# 添加如下内容
host    repmgr        repmgr      0.0.0.0/0               trust

重启数据库，创建repmgr用户和库。

su - postgres

#重启
pg_ctl -D /app/pgsql/data/ -l logfile restart

#创建repmgr用户和库
[postgres@msp-app03 ~]$ psql -U postgres -d postgres
psql (16.0)
Type "help" for help.

postgres=# create user repmgr with password '123456' superuser replication;
CREATE ROLE
postgres=# create database repmgr owner repmgr;
CREATE DATABASE
postgres=# \q

启动repmgrd服务

repmgrd -f /etc/repmgr/conf/repmgr.conf start

4、注册witness

repmgr -f /etc/repmgr/conf/repmgr.conf witness register -h msp-app01 -U repmgr -d repmgr

5、查看集群

repmgr -f /etc/repmgr/conf/repmgr.conf cluster show

[postgres@msp-app03 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
 ID | Name      | Role    | Status    | Upstream  | Location  | Priority | Timeline | Connection string                                                         
----+-----------+---------+-----------+-----------+-----------+----------+----------+----------------------------------------------------------------------------
 1  | msp-app01 | primary | * running |           | location1 | 100      | 1        | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 2  | msp-app02 | standby |   running | msp-app01 | location1 | 100      | 1        | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 3  | msp-app03 | witness | * running | msp-app01 | location1 | 0        | n/a      | host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2

6、集群最终效果

[postgres@msp-app01 pgsql]$  psql -U repmgr -d repmgr
psql (16.0)
Type "help" for help.

repmgr=# \x 1
Expanded display is on.
repmgr=# SELECT * FROM repmgr.nodes;
-[ RECORD 1 ]----+---------------------------------------------------------------------------
node_id          | 1
upstream_node_id | 
active           | t
node_name        | msp-app01
type             | primary
location         | location1
priority         | 100
conninfo         | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
repluser         | repmgr
slot_name        | repmgr_slot_1
config_file      | /etc/repmgr/conf/repmgr.conf
-[ RECORD 2 ]----+---------------------------------------------------------------------------
node_id          | 2
upstream_node_id | 1
active           | t
node_name        | msp-app02
type             | standby
location         | location1
priority         | 100
conninfo         | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2
repluser         | repmgr
slot_name        | repmgr_slot_2
config_file      | /etc/repmgr/conf/repmgr.conf
-[ RECORD 3 ]----+---------------------------------------------------------------------------
node_id          | 3
upstream_node_id | 1
active           | t
node_name        | msp-app03
type             | witness
location         | location1
priority         | 0
conninfo         | host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2
repluser         | repmgr
slot_name        | repmgr_slot_3
config_file      | /etc/repmgr/conf/repmgr.conf

七、开启守护进程（所有节点）

su - postgres

/app/pgsql/bin/repmgrd -f /etc/repmgr/conf/repmgr.conf -d  -p /etc/repmgr/log/repmgrd.pid

八、模拟故障自动转换

1、查看集群状态

[postgres@msp-app01 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
 ID | Name      | Role    | Status    | Upstream  | Location  | Priority | Timeline | Connection string                                                         
----+-----------+---------+-----------+-----------+-----------+----------+----------+----------------------------------------------------------------------------
 1  | msp-app01 | primary | * running |           | location1 | 100      | 1        | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 2  | msp-app02 | standby |   running | msp-app01 | location1 | 100      | 1        | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 3  | msp-app03 | witness | * running | msp-app01 | location1 | 0        | n/a      | host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2

主节点IP信息

[postgres@msp-app01 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:ad:ac:5b brd ff:ff:ff:ff:ff:ff
    inet 192.168.198.146/24 brd 192.168.198.255 scope global dynamic eno16777736
       valid_lft 1390sec preferred_lft 1390sec
    inet 192.168.198.145/23 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fead:ac5b/64 scope link 
       valid_lft forever preferred_lft forever

2、手动关闭主节点（primary），模拟故障宕机

pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log stop

3、通过见证服务器查看集群信息变化

[postgres@msp-app01 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
 ID | Name      | Role    | Status               | Upstream    | Location  | Priority | Timeline | Connection string                                                         
----+-----------+---------+----------------------+-------------+-----------+----------+----------+----------------------------------------------------------------------------
 1  | msp-app01 | primary | * running            |             | location1 | 100      | 1        | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 2  | msp-app02 | standby | ! running as primary |             | location1 | 100      | 2        | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 3  | msp-app03 | witness | * running            | ! msp-app02 | location1 | 0        | n/a      | host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2

#提升新的primary节点完成后,旧的primary节点被至为failed
[postgres@msp-app03 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
 ID | Name      | Role    | Status    | Upstream  | Location  | Priority | Timeline | Connection string                                                         
----+-----------+---------+-----------+-----------+-----------+----------+----------+----------------------------------------------------------------------------
 1  | msp-app01 | primary | - failed  | ?         | location1 | 100      |          | host=msp-app01 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 2  | msp-app02 | primary | * running |           | location1 | 100      | 2        | host=msp-app02 user=repmgr dbname=repmgr password=123456 connect_timeout=2
 3  | msp-app03 | witness | * running | msp-app02 | location1 | 0        | n/a      | host=msp-app03 user=repmgr dbname=repmgr password=123456 connect_timeout=2

WARNING: following issues were detected
  - unable to connect to node "msp-app01" (ID: 1)

HINT: execute with --verbose option to see connection error messages

4、重新将旧的 primary 节点以 standby 的身份加入到集群中

测试是否能重新加入集群

pg_ctl -D /app/pgsql/data/ -l /app/pgsql/log/start.log stop

repmgr node rejoin -f /etc/repmgr/conf/repmgr.conf -h msp-app02 -U repmgr -d repmgr --dry-run --force-rewind

加入集群

repmgr node rejoin -f /etc/repmgr/conf/repmgr.conf -h msp-app02 -U repmgr -d repmgr --force-rewind

5、查看最终集群状态

[postgres@msp-app02 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:40:16:78 brd ff:ff:ff:ff:ff:ff
    inet 192.168.198.143/24 brd 192.168.198.255 scope global dynamic eno16777736
       valid_lft 1563sec preferred_lft 1563sec
    inet6 fe80::20c:29ff:fe40:1678/64 scope link 
       valid_lft forever preferred_lft forever

查看集群信息

[postgres@msp-app03 ~]$ repmgr -f /etc/repmgr/conf/repmgr.conf cluster show
ID | Name      | Role    | Status    | Upstream  | Location  | Priority | Timeline | Connection string 
----+-----------+---------+-----------+-----------+-----------+----------+----------+------------------
1  | msp-app01 | standby |   running | msp-app02 | location1 | 100      | 4        | host=msp-app01 user=rep
2  | msp-app02 | primary | * running |           | location1 | 100      | 4        | host=msp-app02 user=rep
3  | msp-app03 | witness | * running | msp-app02 | location1 | 0        | n/a      | host=msp-app03 user=rep

标签：PostgreSQL,log,app,msp,pgsql,一主一,conf,repmgr,搭建
From： https://www.cnblogs.com/zreo2home/p/18353390

PostgreSQL一主一从高可用搭建

1. PostgreSQL一主一从高可用搭建（VIP）

一、基本环境配置

1、hosts文件配置（所有节点）

2、配置SSH免密登录

3、添加sudo权限

二、安装配置PostgreSQL

1、下载postgresql-16源码包（所有节点）

2、安装依赖包（所有节点）

3、解压编译（所有节点）

4、配置数据、日志目录（所有节点）

5、初始化数据库（主节点操作）

6、启动数据库（主节点操作）

7、验证测试（主节点操作）

8、配置环境变量（所有节点）

三、安装repmgr

1、下载源码包

2、解压编译

四、主节点配置

1、postgresql.conf配置文件

2、pg_hba.conf配置文件

3、创建repmgr用户和库

4、配置用户登录数据库免密

5、重启生效

6、创建repmgr.conf配置文件

7、VIP自动切换脚本

8、注册本地节点为主服务器节点

9、查看集群信息

10、数据库查看集群基础信息

11、启动repmgrd服务

五、备用节点配置

1、创建repmgr.conf配置文件

2、VIP自动切换脚本

3、检查备库是否可克隆

4、启动备库

5、注册从库为备用服务器

6、查看当前集群信息

7、启动repmgrd服务

六、见证节点配置

1、启动节点postgres服务

2、创建repmgr.conf配置文件

3、启动repmgrd服务

4、注册witness

5、查看集群

6、集群最终效果

七、开启守护进程（所有节点）

八、模拟故障自动转换

1、查看集群状态

2、手动关闭主节点（primary），模拟故障宕机

3、通过见证服务器查看集群信息变化

4、重新将旧的 primary 节点以 standby 的身份加入到集群中

5、查看最终集群状态

相关文章

赞助商

阅读排行