首页 > 数据库 >repmgr+keepalived+pgbouncer构建postgresql集群

repmgr+keepalived+pgbouncer构建postgresql集群

时间:2024-06-03 18:23:19浏览次数:34  
标签:node2 postgresql pgbouncer keepalived usr node1 repmgr local

安装环境

操作系统:openeuler 22 LTS SP3
数据库:postgresql 16.3
两台已经安装了数据库的服务器

10.11.110.46	node1
10.11.110.47	node2

配置node1和node2之间的免密登录,以及postgres用户需要无密执行systemd启停数据库服务。

echo "postgres ALL = (root) NOPASSWD:/usr/bin/systemctl start postgresql.service,/usr/bin/systemctl restart postgresql.service,/usr/bin/systemctl stop postgresql.service,/usr/bin/systemctl reload postgresql.service,/usr/bin/systemctl status postgresql.service" | sudo tee /etc/sudoers.d/postgres

chmod 0440 /etc/sudoers.d/postgres
visudo -c

注意这里配置了sudo密码执行systemctl命令,手动运行的时候也必须指定命令全路径
/etc/hosts配置ip与主机名映射关系

10.11.110.46 node1
10.11.110.47 node2

repmgr源码包下载
https://www.repmgr.org/download/repmgr-5.4.1.tar.gz

安装repmgr

su - postgres
tar  zxvf repmgr-5.4.1.tar.gz
export PG_CONFIG=/usr/local/postgresql/bin/pg_config
./configure
make
make install

如果编译报错如下,是因为缺少包导致

/usr/bin/ld: 找不到 -lcurl
/usr/bin/ld: 找不到 -ljson-c

安装依赖包

dnf install libcurl-devel json-c-devel -y

配置repmgr

postgresql配置

选择node1作为主节点,修改配置文件/usr/local/postgresql/data/postgresql.conf

max_wal_senders = 10
max_replication_slots = 10
wal_level = replica
hot_standby = on
archive_mode = on
wal_log_hints = on
archive_command = '/bin/true'
shared_preload_libraries = 'repmgr'

修改配置文件/usr/local/postgresql/data/pg_hba.conf,配置repmgr用户的认证权限,确保其有适当的权限,并且能够以复制模式连接

local   replication   repmgr                                    trust
host   replication   repmgr            127.0.0.1/32            trust
host    replication   repmgr            10.11.110.46/32         trust
host    replication   repmgr            10.11.110.47/32         trust
local   repmgr        repmgr                                    trust
host    repmgr        repmgr            127.0.0.1/32            trust
host    repmgr        repmgr            10.11.110.46/32         trust
host    repmgr        repmgr            10.11.110.47/32         trust

修改完成后重启数据库systemctl restart postgresql
在node1上创建repmgr超级用户和repmgr数据库

su - postgres
createuser  -s repmgr
createdb repmgr -O repmgr
psql
postgres=# ALTER USER repmgr ENCRYPTED PASSWORD 'repmgr';

备节点配置

在node2上停止postgresql,并删除其数据目录,如果之前没有初始化数据库就不用管。
从备节点测试是否能够正常连接主库
psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'

创建repmgr配置文件

在node1和node2上创建/usr/local/postgresql/data/repmgr.conf
node1的配置文件

node_id=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/usr/local/postgresql/data'
log_file='/usr/local/postgresql/log/repmgr.log'
pg_bindir='/usr/local/postgresql/bin'
ssh_options='-q -o ConnectTimeout=10'
failover='automatic'
promote_command='/usr/local/postgresql/bin/repmgr standby promote -f /usr/local/postgresql/repmgr/repmgr.conf --log-to-file'
follow_command='/usr/local/postgresql/bin/repmgr standby follow -f /usr/local/postgresql/repmgr/repmgr.conf --log-to-file --upstream-node-id=%n'
service_start_command='sudo systemctl start postgresql.service'
service_stop_command='sudo systemctl stop postgresql.service'
service_restart_command='sudo systemctl restart postgresql.service'
service_reload_command='sudo systemctl reload postgresql.service'

node2的配置文件

node_id=2
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/usr/local/postgresql/data'
log_file='/usr/local/postgresql/log/repmgr.log'
pg_bindir='/usr/local/postgresql/bin'
ssh_options='-q -o ConnectTimeout=10'
failover='automatic'
promote_command='/usr/local/postgresql/bin/repmgr standby promote -f /usr/local/postgresql/repmgr/repmgr.conf --log-to-file'
follow_command='/usr/local/postgresql/bin/repmgr standby follow -f /usr/local/postgresql/repmgr/repmgr.conf --log-to-file --upstream-node-id=%n'
service_start_command='sudo systemctl start postgresql.service'
service_stop_command='sudo systemctl stop postgresql.service'
service_restart_command='sudo systemctl restart postgresql.service'
service_reload_command='sudo systemctl reload postgresql.service'

其他配置项参考样例:https://raw.githubusercontent.com/EnterpriseDB/repmgr/master/repmgr.conf.sample

注册主节点

在主节点node1上注册repmgr主节点信息,将安装repmgr扩展和元数据对象,并为主服务器添加一个元数据记录。

repmgr -f /usr/local/postgresql/repmgr/repmgr.conf primary register

INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered

查看集群状态

repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=node1 user=repmgr dbname=repmgr connect_timeout=2

repmgr元数据如下:

[postgres@node1 data]$ psql -Urepmgr
repmgr=# \x
Expanded display is on.
repmgr=# select * from repmgr.nodes;
-[ RECORD 1 ]----+-------------------------------------------------------
node_id          | 1
upstream_node_id | 
active           | t
node_name        | node1
type             | primary
location         | default
priority         | 100
conninfo         | host=node1 user=repmgr dbname=repmgr connect_timeout=2
repluser         | repmgr
slot_name        | 
config_file      | /usr/local/postgresql/repmgr/repmgr.conf

一个PostgreSQL复制集群中的每个服务器都将拥有自己的记录。在使用repmgrd时,当节点状态或角色发生变化时,upstream_node_id、active和type字段会更新。

备节点配置

在node2节点上测试(--dry-run)一下是否能正常clone主库数据:

[postgres@node2 data]$ repmgr -h node1 -U repmgr -d repmgr -f /usr/local/postgresql/repmgr/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/usr/local/postgresql/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=node1 user=repmgr dbname=repmgr
DETAIL: current installation size is 29 MB
INFO: "repmgr" extension is installed in database "repmgr"
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
INFO: parameter "max_wal_senders" set to 10
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 10 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  /usr/local/postgresql/pg_basebackup -l "repmgr base backup"  -D /usr/local/postgresql/data -h node1 -p 5432 -U repmgr -X stream 
INFO: all prerequisites for "standby clone" are met

没有报错的情况下,开始正式克隆

[postgres@node2 postgresql]$ repmgr  -h node1 -U repmgr -d repmgr -f /usr/local/postgresql/repmgr/repmgr.conf standby clone
NOTICE: destination directory "/usr/local/postgresql/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=node1 user=repmgr dbname=repmgr
DETAIL: current installation size is 30 MB
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: checking and correcting permissions on existing directory "/usr/local/postgresql/data"
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /usr/local/postgresql/bin/pg_basebackup -l "repmgr base backup"  -D /usr/local/postgresql/data -h node1 -p 5432 -U repmgr -X stream 
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: sudo systemctl start pgserver
HINT: after starting the server, you need to register this standby with "repmgr standby register"

实际上使用了pg_basebackup命令clone了主节点的数据目录文件。主节点数据目录中的配置文件也都被复制到了备节点的数据目录中,包括postgresql.conf, postgresql.auto.conf, pg_hba.conf和pg_ident.conf。 如果不需要针对备节点定制修改这些配置的话,就可以启动备节点的数据库服务了:
sudo /usr/bin/systemctl start postgresql.service

验证复制是否正常工作

在主库node1上查看复制情况

[postgres@node1 repmgr]$ psql   -Urepmgr
psql (16.3)
Type "help" for help.

repmgr=# \x
Expanded display is on.
repmgr=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 7665
usesysid         | 16388
usename          | repmgr
application_name | node2
client_addr      | 10.11.110.47
client_hostname  | 
client_port      | 40102
backend_start    | 2024-05-28 17:39:27.557961+08
backend_xmin     | 
state            | streaming
sent_lsn         | 0/7000328
write_lsn        | 0/7000328
flush_lsn        | 0/7000328
replay_lsn       | 0/7000328
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2024-05-28 21:48:53.074483+08

备库node2上查看复制情况

[postgres@node1 repmgr]$ psql   -Urepmgr
psql (16.3)
Type "help" for help.

repmgr=# \x
Expanded display is on.
repmgr=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 6255
status                | streaming
receive_start_lsn     | 0/7000000
receive_start_tli     | 1
written_lsn           | 0/7000328
flushed_lsn           | 0/7000328
received_tli          | 1
last_msg_send_time    | 2024-05-28 21:48:33.053217+08
last_msg_receipt_time | 2024-05-28 21:48:33.054117+08
latest_end_lsn        | 0/7000328
latest_end_time       | 2024-05-28 17:42:57.692486+08
slot_name             | 
sender_host           | node1
sender_port           | 5432
conninfo              | user=repmgr passfile=/home/postgres/.pgpass channel_binding=disable connect_timeout=2 dbname=replication host=node1 port=5432 application_name=node2 fallback_application_name=walreceiver sslmode=disable sslcompression=0 sslcertmode=disable sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable

注册备节点

在node2上使用repmgr命令将node2注册为备节点

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf standby register
INFO: connecting to local node "node2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
INFO: standby registration complete
NOTICE: standby node "node2" (ID: 2) successfully registered

此时再次查看集群状态

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | host=node2 user=repmgr dbname=repmgr connect_timeout=2

手动切换测试

在备节点node2上执行切换操作,手动将node2切换为主节点

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf standby switchover
NOTICE: executing switchover on node "node2" (ID: 2)
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "node1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "node1" (ID: 1) 
DETAIL: executing server command "sudo systemctl stop postgresql.service"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/9000028
NOTICE: waiting up to 30 seconds (parameter "wal_receive_check_timeout") for received WAL to flush to disk
INFO: sleeping 1 of maximum 30 seconds waiting for standby to flush received WAL to disk
INFO: sleeping 2 of maximum 30 seconds waiting for standby to flush received WAL to disk
INFO: sleeping 3 of maximum 30 seconds waiting for standby to flush received WAL to disk
[...]
INFO: sleeping 30 of maximum 30 seconds waiting for standby to flush received WAL to disk
WARNING: local node "node2" is behind shutdown primary "node1"
DETAIL: local node last receive LSN is 0/82A0000, primary shutdown checkpoint LSN is 0/9000028
NOTICE: aborting switchover
HINT: use --always-promote to force promotion of standby
[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status        | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+---------------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | primary | ? unreachable | ?        | default  | 100      |          | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running     | ? node1  | default  | 100      | 1        | host=node2 user=repmgr dbname=repmgr connect_timeout=2
 

第一次切换可能会遇到如上的错误,主节点停掉了,但是报错WARNING: local node "node2" is behind shutdown primary "node1"。查询资料得到的解决办法是,将postgresql.conf配置文件做如下修改

archive_command = '{ sleep 5;true; }'

重新加载配置sudo /usr/bin/systemctl reload postgresql.service ,再执行手动切换

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf standby switchover
NOTICE: executing switchover on node "node2" (ID: 2)
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "node1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "node1" (ID: 1) 
DETAIL: executing server command "sudo systemctl stop postgresql.service"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/A000028
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary
NOTICE: node "node2" (ID: 2) promoted to primary, node "node1" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "node2" is now primary and node "node1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

重新查看集群状态

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | standby |   running | node2    | default  | 100      | 1        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | * running |          | default  | 100      | 2        | host=node2 user=repmgr dbname=repmgr connect_timeout=2

使用repmgrd实现自动故障转移

启动repmgrd

在node1和node2上配置systemd配置文件/lib/systemd/system/repmgrd.service

[Unit]
Description=A replication manager, and failover management tool for PostgreSQL
After=syslog.target
After=network.target
After=postgresql.service

[Service]
Type=forking

User=postgres
Group=postgres

# PID file
PIDFile=/usr/local/postgresql/data/repmgrd.pid

# Location of repmgr conf file:
Environment=REPMGRDCONF=/usr/local/postgresql/repmgr/repmgr.conf
Environment=PIDFILE=/usr/local/postgresql/data/repmgrd.pid

# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog
ExecStart=/usr/local/postgresql/bin/repmgrd -f ${REPMGRDCONF} -p ${PIDFILE} -d --verbose
ExecStop=/usr/bin/kill -TERM $MAINPID
ExecReload=/usr/bin/kill -HUP $MAINPID

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

[Install]
WantedBy=multi-user.target

在node1和node2上启动repmgrd服务

systemctl enable repmgrd --now

自动故障转移测试

当前集群状态

[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | standby |   running | node2    | default  | 100      | 2        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | * running |          | default  | 100      | 2        | host=node2 user=repmgr dbname=repmgr connect_timeout=2

当前主节点在node2,模拟主节点宕掉

[postgres@node2 ~]$ sudo systemctl stop postgresql.service

等一会儿查看集群状态,node1节点已提升为主节点

[postgres@node1 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 3        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | - failed  | ?        | default  | 100      |          | host=node2 user=repmgr dbname=repmgr connect_timeout=2

重新启动node2节点postgresql,会出现下面的状况,这个也是这种主备高可用方案的缺点,出现了两个主节点

[postgres@node1 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                     
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 3        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | ! running |          | default  | 100      | 2        | host=node2 user=repmgr dbname=repmgr connect_timeout=2

node2的状态是不对的,先将postgresql再次停止,并重新以备节点的角色加入到集群

[postgres@node2 ~]$ sudo systemctl stop postgresql.service 
[postgres@node2 ~]$ repmgr -f /usr/local/postgresql/repmgr/repmgr.conf node rejoin  -d 'host=node1 dbname=repmgr user=repmgr' --force-rewind
NOTICE: rejoin target is node "node1" (ID: 1)
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 1
DETAIL: rejoin target server's timeline 3 forked off current database system timeline 2 before current recovery point 0/F000028
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/usr/local/postgresql/bin/pg_rewind -D '/usr/local/postgresql/data' --source-server='host=node1 user=repmgr dbname=repmgr connect_timeout=2'"
NOTICE: 0 files copied to /usr/local/postgresql/data
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=node2 user=repmgr dbname=repmgr connect_timeout=2"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "sudo systemctl start postgresql.service"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1

注意:这里使用了--force-rewind参数,正常情况下先不加该参数执行rejoin,不成功的情况下再添加该参数。-d 参数后面指定的host为当前的主节点。在正式进行操作前可以先加--dry-run参数测试一下能否正常执行。
因此,如果是生产上用的双节点主备集群,那么当发生主备切换后,手动将主节点启动的时候就直接rejoin
repmgr -f /usr/local/postgresql/repmgr/repmgr.conf node rejoin -d 'host=node1 dbname=repmgr user=repmgr'

keepalived实现VIP

repmgr自身没有vip配置功能,可以使用keepalived来实现vip功能。
使用yum安装keepalived即可,配置如下
MASTER节点/etc/keepalived/keepalived.conf

vrrp_script chk_myscript {
  script       "/usr/local/bin/is_postgres_primary.sh"
  interval 3   # check every 3 seconds
  fall 3       # require 3 failures for KO
  rise 3       # require 3 successes for OK
  user postgres postgres
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100   # make all keepalived have different priority
    nopreempt
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123465
    }
    virtual_ipaddress {
        10.11.88.44/16     # this is the standard VIP, change as needed
    }

    track_script {
        chk_myscript
    }
}

BACKUP节点/etc/keepalived/keepalived.conf

vrrp_script chk_myscript {
  script       "/usr/local/bin/is_postgres_primary.sh"
  interval 3   # check every 3 seconds
  fall 3       # require 3 failures for KO
  rise 3       # require 3 successes for OK
  user postgres postgres
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90   # make all keepalived have different priority
    nopreempt
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123465
    }
    virtual_ipaddress {
        10.11.88.44/16     # this is the standard VIP, change as needed
    }

    track_script {
        chk_myscript
    }
}

postgresql检查脚本/usr/local/bin/is_postgres_primary.sh,记得赋予执行权限

#!/bin/bash
export PATH=/usr/local/postgresql/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
export PGDATA=/usr/local/postgresql/data
export PGDATABASE=postgres
export PGUSER=postgres
export PGPORT=5432
psql -q -t -c 'select pg_is_in_recovery()' |grep 'f'

配置好后启动keepalived,只有在对应节点postgresql正常运行且为primary节点时,VIP才会在当前服务器。

pgbouncer安装

下载源码包
https://www.pgbouncer.org/downloads/files/1.22.1/pgbouncer-1.22.1.tar.gz
安装依赖
dnf install make make-devel libevent libevent-devel pkgconfig pkgconfig-devel openssl openssl-devel -y
编译安装

tar zxvf pgbouncer-1.22.1.tar.gz && cd pgbouncer-1.22.1
./configure --prefix=/usr/local/pgbouncer --with-systemd
make && make install

安装好以后样例配置文件及service配置文件等在路径/usr/local/pgbouncer/share/doc/pgbouncer下面。
pgbouncer.ini配置

[databases]
test = host=10.11.88.44 port=5432 pool_size=20 dbname=test
#如果懒得配指定数据库也可以用下面这种通配符方式
* = host=10.11.88.44 port=5432 pool_size=20

[pgbouncer]
logfile = /usr/local/pgbouncer/log/pgbouncer.log
pidfile = /usr/local/pgbouncer/pgbouncer.pid

listen_addr = *
listen_port = 16432

auth_type = md5
auth_file = /usr/local/pgbouncer/userlist.txt

admin_users = postgres
stats_users = stats, root

pool_mode = session
server_reset_query = DISCARD ALL
server_check_query = select 1

max_client_conn = 1000
default_pool_size = 20
min_pool_size = 10
reserve_pool_size = 10
reserve_pool_timeout = 5
max_db_connections = 100
max_user_connections = 100

auth_file中的用户即访问后端真实postgresql数据库中的所需要的用户。
auth_type 配置为md5是会自动兼容md5和scram-sha-256两种加密方式。PG14之后默认scram-sha-256。可通过select usename,passwd from pg_shadow order by 1;查看。
日志文件路径要提前创建mkdir /usr/local/pgbouncer/log
host 这里配置的VIP地址

userlist.txt文件的内容可以直接从数据库pg_shadow里查询出来粘贴进去,样例如下:

"bill" "SCRAM-SHA-256$4096:orUtCmrN6jsZDygp3sJuog==$JevG8fFxys35E4Qe+aonmOm52PvXEHhmMgmCumsL9TM=:6E1YwsQc4CMYIXsy2koQFD2sbbphZb1s+dENcotanTM="

配置systemd启动脚本/lib/systemd/system/pgbouncer.service

[Unit]
Description=connection pooler for PostgreSQL
Documentation=man:pgbouncer(1)
Documentation=https://www.pgbouncer.org/
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
User=postgres
ExecStart=/usr/local/pgbouncer/bin/pgbouncer /usr/local/pgbouncer/pgbouncer.ini
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT

[Install]
WantedBy=multi-user.target

启动pgbouncer后就可以连接数据库了
systemctl enable pgbouncer.service --now
psql -h 10.11.88.44 -p 16432 -U bill -d test
本地登录pgbouncer控制台,可以查看连接状态等信息。

[postgres@node1 ~]$  psql -p 16432 pgbouncer pgbouncer
psql (16.3, server 1.22.1/bouncer)
WARNING: psql major version 16, server major version 1.22.
         Some psql features might not work.
Type "help" for help.

pgbouncer=# show help;
NOTICE:  Console usage
DETAIL:  
        SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION
        SHOW PEERS|PEER_POOLS
        SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM|STATE
        SHOW DNS_HOSTS|DNS_ZONES
        SHOW STATS|STATS_TOTALS|STATS_AVERAGES|TOTALS
        SET key = arg
        RELOAD
        PAUSE [<db>]
        RESUME [<db>]
        DISABLE <db>
        ENABLE <db>
        RECONNECT [<db>]
        KILL <db>
        SUSPEND
        SHUTDOWN
        WAIT_CLOSE [<db>]
SHOW
pgbouncer=# show clients;
 type |   user    | database  | state  | addr | port  | local_addr | local_port |      connect_time       |      request_time       | wait | wait_us | close_needed |   ptr    | link | remote_pid | tls | application_name | prepared_stateme
nts 
------+-----------+-----------+--------+------+-------+------------+------------+-------------------------+-------------------------+------+---------+--------------+----------+------+------------+-----+------------------+-----------------
----
 C    | pgbouncer | pgbouncer | active | unix | 16432 | unix       |      16432 | 2024-05-29 21:27:14 CST | 2024-05-29 21:27:20 CST |    4 |  390026 |            0 | 0xf5a2a0 |      |      84252 |     | psql             |                 
  0
 C    | bill      | test      | active | unix | 16432 | unix       |      16432 | 2024-05-29 21:13:41 CST | 2024-05-29 21:13:41 CST |    0 |       0 |            0 | 0xf59ff0 |      |      82562 |     | psql             |                 
  0
(2 rows)

总结

通过repmgrd实现自动故障转移,结合keepalived使访问请求只会发到primary节点,再加上pgbouncer进行连接池控制,基本上实现了线上最少服务器情况下的postgresql高可用架构。
不足:在发生自动故障转移后,需要手动干预将故障节点重新加入集群。
特殊场景1:即使故障转移后原主节点postgresql被手动或自动的意外拉起来,造成双主的情况也不怕,因为keepalived配置的非抢占模式,vip还是停留在新主节点服务器上。
特殊场景2:手动把主节点的keepalived停掉,此时备节点keepalived检查脚本检测到本机postgresql仍是备节点,故不会将vip添加的备机,当前就处于没有vip的情况,无法对外提供数据库访问服务。

标签:node2,postgresql,pgbouncer,keepalived,usr,node1,repmgr,local
From: https://www.cnblogs.com/Before/p/18229403

相关文章

  • 修改postgresql管理员密码
    1.切换到postgres用户:sudo-i-upostgres2.进入PostgreSQLShell:psql3.设置postgres用户的密码:ALTERUSERpostgresPASSWORD'new_password';4.退出PostgreSQLShell和postgres用户的shell:\q5.使用新密码登录:psql-Upostgres-W输入新设置的密码以登录。如......
  • PostgreSQL的内存参数
    PostgreSQL的内存参数基础信息OS版本:RedHatEnterpriseLinuxServerrelease7.9(Maipo)DB版本:16.2pg软件目录:/home/pg16/softpg数据目录:/home/pg16/data端口:5777PostgreSQL提供了多种内存参数,可以通过调整这些参数来优化数据库的性能。以下是一些主要的内存参......
  • PostgreSQL启动报错“could not map anonymous shared memory: Cannot allocate memor
    PostgreSQL启动报错“couldnotmapanonymoussharedmemory:Cannotallocatememory”基础信息OS版本:RedHatEnterpriseLinuxServerrelease7.9(Maipo)DB版本:16.2pg软件目录:/home/pg16/softpg数据目录:/home/pg16/data端口:5777报错[pg16@test~]$pg_ctlst......
  • openeuler源码安装Postgresql 16
    准备条件OpenEuler(虚拟机):版本:22.03-LTS-SP3下载地址:https://www.openeuler.org/zh/download/PostgreSQL:版本:16.3源码包下载地址:https://www.postgresql.org/ftp/source/操作系统安装安装过程与centos基本一致,此处就省略了,安装的时候可以把需要的网络工具和开发工具包勾......
  • Mysql主主复制+keepalived
     MySQL的主从复制和主主复制可以参考一下两篇文章:MySQL主从同步-CSDN博客MySQL主主复制-CSDN博客master服务器IP192.168.7.67slave服务器IP192.168.7.711、keepalived部署(master和slave服务器都安装)安装依赖包:[root@dockermysql]#yuminstallgcclibnllibnl-devel......
  • 如何将 Langfuse 链接到自有 PostgreSQL 数据库并升级 PostgreSQL 版本
    在本文中,我们将介绍如何将Langfuse应用程序链接到自有的PostgreSQL数据库,并升级PostgreSQL以支持jsonb类型。前提条件运行CentOS7的服务器已安装的PostgreSQL9.2或更低版本需要将Langfuse连接到自有数据库,并升级PostgreSQL以支持jsonb类型1.......
  • PostgreSQL 小课推广-20240529
    目前PostgreSQL小课在持续更新中,PostgreSQL小课专栏新人优惠券到2024年6月1日到期,有需要的伙伴还请关注下。优惠券马上到期,领取券后,也就只需要30元(也就一杯咖啡的钱)就可以解锁专栏,先到先得。目前专栏的50元/年,后续到期不需要续费,等到专栏完成,会有一个......
  • 使用MySQL主主复制加Keepalived实现高可用
    目录资源列表基础环境关闭防火墙关闭内核安全机制修改主机名时间同步一、安装MySQL准备yum源清理mariadb相关的包安装启动二、MySQL主主复制两个节点授权用户两个节点使用授权用户进行连接启动同步三、安装keepalived安装修改配置文件启动服务验证  ......
  • PostgreSQL数据库实战:轻松修改字段名称
    哈喽,大家好,我是木头左!在本文中,将深入探讨PostgreSQL数据库中的一个强大功能,即如何轻松修改字段名称。无论你是一个新手开发者,还是一个经验丰富的DBA,这篇文章都将为你提供实用的技巧和建议。一、为什么需要修改字段名称?在开发过程中,经常会遇到需要修改表结构的情况,比如添加新......
  • CentOS7安装PostgreSQL15以及PostGIS3.3
     安装Postgresqlyuminstall-yhttps://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpmyuminstall-yhttps://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/l/libzstd-1.5.5-1.el7.x86_64.rpmyum-yinstallpos......