环境:
OS:Centos7
DB:pg13
repmgr:5.3
192.168.1.105 主库 pg1
192.168.1.106 从库 pg2
192.168.1.107 从库 pg3
1.在每台机器上都安装好pg
安装请参考
https://www.cnblogs.com/hxlasky/p/16844609.html
说明:主库安装的时候根据机器配置需要对配置文件进行修改,从库可以不修改,因为部署repmgr会做一个pg_basebackup
会将主库的postgresql.conf和pg_hba.conf拷贝到从库.
2.配置/etc/hosts
192.168.1.105
hostnamectl set-hostname pg1
192.168.1.106
hostnamectl set-hostname pg2
192.168.1.107
hostnamectl set-hostname pg3
每个节点配置一致
vi /etc/hosts
192.168.1.105 pg1
192.168.1.106 pg2
192.168.1.107 pg3
3.配置等效连接
3个节点之间postgres用户能互相免密登录
部署参考:
https://www.cnblogs.com/hxlasky/p/17036903.html
4.添加sudo用户 --done
每个节点上执行,root用户下执行
echo 'postgres ALL=(ALL) NOPASSWD: ALL'> /etc/sudoers.d/postgres
5.创建repmgr用户,以及创建repmgr数据库
我这里已经提前安装好了主库,直接登录到主库上执行,在主库上192.168.1.105执行即可
[postgres@pg2 ~]$ psql
psql (14.6)
Type "help" for help.
postgres=# create database repmgr;
CREATE DATABASE
postgres=# create user repmgr with password 'repmgr' superuser login;
CREATE ROLE
postgres=# alter database repmgr owner to repmgr;
ALTER DATABASE
6.repmgr安装
每个节点上都需要安装
curl https://dl.2ndquadrant.com/default/release/get/13/rpm | bash
yum list repmgr*
yum install repmgr13
7.配置pg_hba.conf和postgresql.conf
在主库192.168.1.105上配置即可,在原来的基础上添加如下参数
su - postgres
vi /opt/pg13/data/postgresql.conf
shared_preload_libraries = 'repmgr'
wal_log_hints = on #开启支持pg_rewind
vi /opt/pg13/data/pg_hba.conf
host replication repmgr 192.168.1.105/32 trust
host replication repmgr 192.168.1.106/32 trust
host replication repmgr 192.168.1.107/32 trust
重新启动主库
su - root
systemctl stop postgresql-13
systemctl start postgresql-13
8.在postgres用户下创建.pgpass密码文件
每个节点上都要创建,可以尝试在一个节点上创建,然后scp到另外的机器,
我这里是在主库192.169.1.105上创建,然后scp到另外2个节点
su - postgres
cd ~/
vi .pgpass
#hostname:port:database:username:password
192.168.1.105:5432:repmgr:repmgr:repmgr
192.168.1.106:5432:repmgr:repmgr:repmgr
192.168.1.107:5432:repmgr:repmgr:repmgr
.pgpass文件权限必须是0600
chmod 0600 .pgpass
拷贝到另外2个节点
su - postgres
scp /home/postgres/.pgpass [email protected]:/home/postgres/
scp /home/postgres/.pgpass [email protected]:/home/postgres/
9.主库修改repmgr配置文件
机器192.168.1.105上操作
su - root
vi /etc/repmgr/13/repmgr.conf
node_id=1
node_name='pg1'
conninfo='host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/pg13/data'
log_file='/var/log/repmgr/log'
passfile='/home/postgres/.pgpass'
pg_bindir='/usr/pgsql-13/bin'
10.主库注册
机器192.168.1.105上操作
[root@pg1 14]# su - postgres [postgres@pg1 ~]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf primary register INFO: connecting to primary database... NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed NOTICE: primary node record (ID: 1) registered 查看 [postgres@pg1 ~]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------- 1 | pg1 | primary | * running | | default | 100 | 1 | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2
#######################从库1(192.168.1.106)############################
1.修改repmgr配置文件
修改repmgr.conf文件
su - root
vi /etc/repmgr/13/repmgr.conf
node_id=2
node_name='pg2'
conninfo='host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/pg13/data'
passfile='/home/postgres/.pgpass'
pg_bindir='/usr/pgsql-13/bin'
2.停掉从库上的pg,并清空data目录
[root@pg2 14]# systemctl stop postgresql-13
su - postgres
[postgres@pg2 pg14]$ cd /opt/pg13
[postgres@pg2 pg14]$ mv data bakdata
[postgres@pg2 pg14]$ mkdir data
[root@localhost bin]#chown -R postgres:postgres /opt/pg13
[root@localhost bin]#chmod 0700 /opt/pg13/data
3.测试(未真正执行)
su - postgres [postgres@pg3 pg13]$/usr/pgsql-13/bin/repmgr -h 192.168.1.105 -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone --dry-run NOTICE: destination directory "/opt/pg13/data" provided INFO: connecting to source node DETAIL: connection string is: host=192.168.1.105 user=repmgr dbname=repmgr DETAIL: current installation size is 32 MB INFO: "repmgr" extension is installed in database "repmgr" INFO: replication slot usage not requested; no replication slot will be set up for this standby INFO: parameter "max_wal_senders" set to 10 NOTICE: checking for available walsenders on the source node (2 required) INFO: sufficient walsenders available on the source node DETAIL: 2 required, 9 available NOTICE: checking replication connections can be made to the source server (2 required) INFO: required number of replication connections could be made to the source server DETAIL: 2 replication connections required NOTICE: standby will attach to upstream node 1 HINT: consider using the -c/--fast-checkpoint option INFO: would execute: pg_basebackup -l "repmgr base backup" -D /opt/pg13/data -h 192.168.1.105 -p 5432 -U repmgr -X stream INFO: all prerequisites for "standby clone" are met
4.clone主节点数据
su - postgres
/usr/pgsql-13/bin/repmgr -h 192.168.1.105 -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone
5.启动从库
[root@pg2 ~]#systemctl start postgresql-13
查看进程
[root@pg2 ~]# ps -ef|grep postgres root 2407 1551 0 Jan09 pts/0 00:00:00 su - postgres postgres 2408 2407 0 Jan09 pts/0 00:00:00 -bash postgres 3076 1 4 00:58 ? 00:00:00 /usr/pgsql-13/bin/postmaster -D /opt/pg13/data/ postgres 3078 3076 0 00:58 ? 00:00:00 postgres: logger postgres 3079 3076 0 00:58 ? 00:00:00 postgres: startup recovering 000000010000000000000004 postgres 3080 3076 0 00:58 ? 00:00:00 postgres: checkpointer postgres 3081 3076 0 00:58 ? 00:00:00 postgres: background writer postgres 3082 3076 0 00:58 ? 00:00:00 postgres: stats collector postgres 3083 3076 1 00:58 ? 00:00:00 postgres: walreceiver streaming 0/40001F0 root 3085 3026 0 00:58 pts/1 00:00:00 grep --color=auto postgres
6.注册从节点
su - postgres [postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby register INFO: connecting to local node "pg2" (ID: 2) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1) INFO: standby registration complete NOTICE: standby node "pg2" (ID: 2) successfully registered
7.查看
可以在主从的机器上查看
[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------- 1 | pg1 | primary | * running | | default | 100 | 1 | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2 2 | pg2 | standby | running | pg1 | default | 100 | 1 | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2
#######################从库2(192.168.1.107)############################
步骤跟部署从库1一致
从库2也加入到集群后查看
[postgres@pg3 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------- 1 | pg1 | primary | * running | | default | 100 | 1 | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2 2 | pg2 | standby | running | pg1 | default | 100 | 1 | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2 3 | pg3 | standby | running | pg1 | default | 100 | 1 | host=192.168.1.107 user=repmgr dbname=repmgr connect_timeout=2 [postgres@pg3 data]$
########################常规主从(当前主从都正常)切换##################
在从库1(192.168.1.106)上执行
先检查可行性:
su - postgres
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewind
真正执行切换:
[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby switchover --siblings-follow NOTICE: executing switchover on node "pg2" (ID: 2) NOTICE: attempting to pause repmgrd on 3 nodes NOTICE: local node "pg2" (ID: 2) will be promoted to primary; current primary "pg1" (ID: 1) will be demoted to standby NOTICE: stopping current primary node "pg1" (ID: 1) NOTICE: issuing CHECKPOINT on node "pg1" (ID: 1) DETAIL: executing server command "/usr/pgsql-13/bin/pg_ctl -D '/opt/pg13/data' -W -m fast stop" INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout") INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout") NOTICE: current primary has been cleanly shut down at location 0/7000028 NOTICE: promoting standby to primary DETAIL: promoting server "pg2" (ID: 2) using pg_promote() NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server "pg2" (ID: 2) was successfully promoted to primary NOTICE: node "pg2" (ID: 2) promoted to primary, node "pg1" (ID: 1) demoted to standby NOTICE: executing STANDBY FOLLOW on 1 of 1 siblings INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes NOTICE: switchover was successful DETAIL: node "pg2" is now primary and node "pg1" is attached as standby NOTICE: STANDBY SWITCHOVER has completed successfully
查看当前的集群状态
[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------- 1 | pg1 | standby | running | pg2 | default | 100 | 1 | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2 2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2 3 | pg3 | standby | running | pg2 | default | 100 | 1 | host=192.168.1.107 user=repmgr dbname=repmgr connect_timeout=2 [postgres@pg2 data]$
########################repmgr管理#####################
1.启动守护进程 repmgrd
su - postgres
/usr/pgsql-13/bin/repmgrd -f /etc/repmgr/13/repmgr.conf
2.查看状态node
/usr/pgsql-13/bin/repmgrd -f /etc/repmgr/13/repmgr.conf node status
3.集群相关
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster matrix
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster crosscheck
4.service 相关
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf service status
5.暂停
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf service pause
-- The End --
标签:13,postgres,部署,pg13,00,192.168,pg2,repmgr From: https://www.cnblogs.com/hxlasky/p/17040784.html