repmgr安装部署(pg13)

标签：13 postgres 部署 pg13 00 192.168 pg2 repmgr

环境:
OS:Centos7
DB:pg13
repmgr:5.3

192.168.1.105 主库 pg1
192.168.1.106 从库 pg2
192.168.1.107 从库 pg3

1.在每台机器上都安装好pg
安装请参考
https://www.cnblogs.com/hxlasky/p/16844609.html
说明:主库安装的时候根据机器配置需要对配置文件进行修改,从库可以不修改,因为部署repmgr会做一个pg_basebackup
会将主库的postgresql.conf和pg_hba.conf拷贝到从库.

2.配置/etc/hosts
192.168.1.105
hostnamectl set-hostname pg1

192.168.1.106
hostnamectl set-hostname pg2

192.168.1.107
hostnamectl set-hostname pg3

每个节点配置一致
vi /etc/hosts
192.168.1.105 pg1
192.168.1.106 pg2
192.168.1.107 pg3

3.配置等效连接
3个节点之间postgres用户能互相免密登录
部署参考:
https://www.cnblogs.com/hxlasky/p/17036903.html

4.添加sudo用户 --done
每个节点上执行,root用户下执行
echo 'postgres ALL=(ALL) NOPASSWD: ALL'> /etc/sudoers.d/postgres

5.创建repmgr用户,以及创建repmgr数据库
我这里已经提前安装好了主库,直接登录到主库上执行,在主库上192.168.1.105执行即可
[postgres@pg2 ~]$ psql
psql (14.6)
Type "help" for help.

postgres=# create database repmgr;
CREATE DATABASE
postgres=# create user repmgr with password 'repmgr' superuser login;
CREATE ROLE
postgres=# alter database repmgr owner to repmgr;
ALTER DATABASE

6.repmgr安装
每个节点上都需要安装
curl https://dl.2ndquadrant.com/default/release/get/13/rpm | bash
yum list repmgr*
yum install repmgr13

7.配置pg_hba.conf和postgresql.conf
在主库192.168.1.105上配置即可,在原来的基础上添加如下参数
su - postgres
vi /opt/pg13/data/postgresql.conf
shared_preload_libraries = 'repmgr'
wal_log_hints = on #开启支持pg_rewind

vi /opt/pg13/data/pg_hba.conf
host replication repmgr 192.168.1.105/32 trust
host replication repmgr 192.168.1.106/32 trust
host replication repmgr 192.168.1.107/32 trust

重新启动主库
su - root
systemctl stop postgresql-13
systemctl start postgresql-13

8.在postgres用户下创建.pgpass密码文件
每个节点上都要创建,可以尝试在一个节点上创建,然后scp到另外的机器,
我这里是在主库192.169.1.105上创建,然后scp到另外2个节点

su - postgres
cd ~/
vi .pgpass

#hostname:port:database:username:password
192.168.1.105:5432:repmgr:repmgr:repmgr
192.168.1.106:5432:repmgr:repmgr:repmgr
192.168.1.107:5432:repmgr:repmgr:repmgr

.pgpass文件权限必须是0600
chmod 0600 .pgpass

拷贝到另外2个节点
su - postgres
scp /home/postgres/.pgpass [email protected]:/home/postgres/
scp /home/postgres/.pgpass [email protected]:/home/postgres/

9.主库修改repmgr配置文件
机器192.168.1.105上操作
su - root
vi /etc/repmgr/13/repmgr.conf

node_id=1
node_name='pg1'
conninfo='host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/pg13/data'
log_file='/var/log/repmgr/log'
passfile='/home/postgres/.pgpass'
pg_bindir='/usr/pgsql-13/bin'

10.主库注册
机器192.168.1.105上操作

[root@pg1 14]# su - postgres
[postgres@pg1 ~]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf primary register
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

查看
[postgres@pg1 ~]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | pg1  | primary | * running |          | default  | 100      | 1        | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2

#######################从库1(192.168.1.106)############################

1.修改repmgr配置文件
修改repmgr.conf文件
su - root
vi /etc/repmgr/13/repmgr.conf
node_id=2
node_name='pg2'
conninfo='host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/pg13/data'
passfile='/home/postgres/.pgpass'
pg_bindir='/usr/pgsql-13/bin'

2.停掉从库上的pg,并清空data目录
[root@pg2 14]# systemctl stop postgresql-13
su - postgres
[postgres@pg2 pg14]$ cd /opt/pg13
[postgres@pg2 pg14]$ mv data bakdata
[postgres@pg2 pg14]$ mkdir data

[root@localhost bin]#chown -R postgres:postgres /opt/pg13
[root@localhost bin]#chmod 0700 /opt/pg13/data

3.测试(未真正执行)

su - postgres
[postgres@pg3 pg13]$/usr/pgsql-13/bin/repmgr -h 192.168.1.105 -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/opt/pg13/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.1.105 user=repmgr dbname=repmgr
DETAIL: current installation size is 32 MB
INFO: "repmgr" extension is installed in database "repmgr"
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
INFO: parameter "max_wal_senders" set to 10
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 9 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  pg_basebackup -l "repmgr base backup"  -D /opt/pg13/data -h 192.168.1.105 -p 5432 -U repmgr -X stream 
INFO: all prerequisites for "standby clone" are met

4.clone主节点数据
su - postgres
/usr/pgsql-13/bin/repmgr -h 192.168.1.105 -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone

5.启动从库
[root@pg2 ~]#systemctl start postgresql-13
查看进程

[root@pg2 ~]# ps -ef|grep postgres
root      2407  1551  0 Jan09 pts/0    00:00:00 su - postgres
postgres  2408  2407  0 Jan09 pts/0    00:00:00 -bash
postgres  3076     1  4 00:58 ?        00:00:00 /usr/pgsql-13/bin/postmaster -D /opt/pg13/data/
postgres  3078  3076  0 00:58 ?        00:00:00 postgres: logger 
postgres  3079  3076  0 00:58 ?        00:00:00 postgres: startup recovering 000000010000000000000004
postgres  3080  3076  0 00:58 ?        00:00:00 postgres: checkpointer 
postgres  3081  3076  0 00:58 ?        00:00:00 postgres: background writer 
postgres  3082  3076  0 00:58 ?        00:00:00 postgres: stats collector 
postgres  3083  3076  1 00:58 ?        00:00:00 postgres: walreceiver streaming 0/40001F0
root      3085  3026  0 00:58 pts/1    00:00:00 grep --color=auto postgres

6.注册从节点

su - postgres
[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby register
INFO: connecting to local node "pg2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
INFO: standby registration complete
NOTICE: standby node "pg2" (ID: 2) successfully registered

7.查看
可以在主从的机器上查看

[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | pg1  | primary | * running |          | default  | 100      | 1        | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2
 2  | pg2  | standby |   running | pg1      | default  | 100      | 1        | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2

#######################从库2(192.168.1.107)############################
步骤跟部署从库1一致

从库2也加入到集群后查看

[postgres@pg3 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | pg1  | primary | * running |          | default  | 100      | 1        | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2
 2  | pg2  | standby |   running | pg1      | default  | 100      | 1        | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2
 3  | pg3  | standby |   running | pg1      | default  | 100      | 1        | host=192.168.1.107 user=repmgr dbname=repmgr connect_timeout=2
[postgres@pg3 data]$

########################常规主从(当前主从都正常)切换##################
在从库1(192.168.1.106)上执行
先检查可行性:
su - postgres
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewind

真正执行切换:

[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf standby switchover --siblings-follow
NOTICE: executing switchover on node "pg2" (ID: 2)
NOTICE: attempting to pause repmgrd on 3 nodes
NOTICE: local node "pg2" (ID: 2) will be promoted to primary; current primary "pg1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "pg1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "pg1" (ID: 1) 
DETAIL: executing server command "/usr/pgsql-13/bin/pg_ctl  -D '/opt/pg13/data' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/7000028
NOTICE: promoting standby to primary
DETAIL: promoting server "pg2" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "pg2" (ID: 2) was successfully promoted to primary
NOTICE: node "pg2" (ID: 2) promoted to primary, node "pg1" (ID: 1) demoted to standby
NOTICE: executing STANDBY FOLLOW on 1 of 1 siblings
INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
NOTICE: switchover was successful
DETAIL: node "pg2" is now primary and node "pg1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

查看当前的集群状态

[postgres@pg2 data]$ /usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | pg1  | standby |   running | pg2      | default  | 100      | 1        | host=192.168.1.105 user=repmgr dbname=repmgr connect_timeout=2
 2  | pg2  | primary | * running |          | default  | 100      | 2        | host=192.168.1.106 user=repmgr dbname=repmgr connect_timeout=2
 3  | pg3  | standby |   running | pg2      | default  | 100      | 1        | host=192.168.1.107 user=repmgr dbname=repmgr connect_timeout=2
[postgres@pg2 data]$

########################repmgr管理#####################
1.启动守护进程 repmgrd
su - postgres
/usr/pgsql-13/bin/repmgrd -f /etc/repmgr/13/repmgr.conf

2.查看状态node
/usr/pgsql-13/bin/repmgrd -f /etc/repmgr/13/repmgr.conf node status

3.集群相关
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster show
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster matrix
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf cluster crosscheck

4.service 相关
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf service status

5.暂停
/usr/pgsql-13/bin/repmgr -f /etc/repmgr/13/repmgr.conf service pause

-- The End --

标签：13,postgres,部署,pg13,00,192.168,pg2,repmgr
From： https://www.cnblogs.com/hxlasky/p/17040784.html

相关文章

赞助商

阅读排行