首页 > 其他分享 >部署repmgr PG14集群<无autofailover>

部署repmgr PG14集群<无autofailover>

时间:2023-05-07 09:45:23浏览次数:44  
标签:5432 14 -- PG14 192.168 host 集群 repmgr

系统环境

OS: CentOS7
192.168.56.11 node1
192.168.56.12 node2

部署操作流程

1. 系统环境初始化


    echo redhat | passwd root --stdin 

    echo -e "192.168.56.11  node1 " >>/etc/hosts

    echo -e "192.168.56.12  node2 " >>/etc/hosts

    sed -i 's/vault.centos.org/mirrors.aliyun.com/g' /etc/yum.repos.d/CentOS*
 

2. 所有节点安装PG14 <root执行>

    sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

    sudo yum -y install epel-release yum-utils
  
    sudo yum-config-manager --enable pgdg14

    sudo yum install postgresql14-server postgresql14 -y

    echo -e " export PATH=/usr/pgsql-14/bin/:$PATH" >>/var/lib/pgsql/.bash_profile


3. 所有节点安装repmgr<root执行>

## on all nodes
sudo yum -y  install repmgr_14*

4. 关闭防火墙或者放行PG端口(默认5432)<root执行>

systemctl stop firewalld

5. 主节点创建repmgr账户 , pg_hba.cfg 以及postgresql.conf 参数调整 , repmgr.conf文件配置 <PG owner执行>

5.1 创建repmgr

createuser repmgr -P -s 

createdb  repmgr -O repmgr;

5.2 调整pg_hba.cfg 以及postgresql.conf参数

##  -- postgresql.conf
max_wal_senders = 10
max_replication_slots = 10
wal_level = 'hot_standby'
hot_standby = on
archive_mode = on
archive_command = '/bin/true'
shared_preload_libraries = 'repmgr'

##  -- pg_hba.cfg
--- 这里采用了trust模式,因此无需.pgpass
local   replication   repmgr                              trust
host    replication   repmgr      127.0.0.1/32            trust
host    replication   repmgr      192.168.56.11/32       trust
host    replication   repmgr      192.168.56.12/32       trust

local   repmgr        repmgr                              trust
host    repmgr        repmgr      127.0.0.1/32            trust
host    repmgr        repmgr      192.168.56.11/32       trust
host    repmgr        repmgr      192.168.56.12/32       trust

若有需要 .pgpass ,则需要如下配置:su - postgres
cd /var/lib/pgsql
touch .pgpass
chmod 0600 .pgpass
echo -e "192.168.56.11:5432:repmgr:repmgr:passwd_123456">>.pgpass
echo -e "192.168.56.12:5432:repmgr:repmgr:passwd_123456">>.pgpass

5.3 repmgr.conf文件配置

#-- 主库

node_id=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/14/data/'

#-- 从库
node_id=2
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/14/data/'

6. 启动主节点PG 并 注册为primary server <PG owner执行>

# -- 首先启动主库PG
sudo /usr/pgsql-14/bin/postgresql-14-setup initdb

sudo systemctl enable postgresql-14

sudo systemctl start postgresql-14

# -- 先模拟运行注册
su - postgres
repmgr -f /etc/repmgr/14/repmgr.conf primary register --dry-run

# -- 将上面错误全部解决后,执行注册
repmgr -f /etc/repmgr/14/repmgr.conf primary register 


7. 清空备节点PG data目录, 搭建/clone备节点 <PG owner执行>

# - 确保从库PG以及PG 所有data目录是空的,如果不是,则手动删除即可

systemctl stop postgresql-14


rm -rf /var/lib/pgsql/14/data

8. 将备节点注册成standby server 并启动PG实例 <PG owner执行>

repmgr -f /etc/repmgr/14/repmgr.conf standby register  --force

9. 校验集群信息

-bash-4.2$ repmgr cluster show
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.11 port=5432 fallback_application_name=repmgr options=-csearch_path="
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.11 port=5432 fallback_application_name=repmgr options=-csearch_path="
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=node2 port=5432 fallback_application_name=repmgr options=-csearch_path="
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.11 port=5432 fallback_application_name=repmgr options=-csearch_path="
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=192.168.56.11 port=5432 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | host=node2 user=repmgr port=5432 dbname=repmgr connect_timeout=2

10. 常见错误

10.1 connection to server at "192.168.56.11", port 5432 failed: Connection refused

# -- 明明PG 实例都启动了,但是死活显示如题,最后排查到 默认情况下 postgresql只监听local,因此修改参数listen-addresses ='*'
-bash-4.2$ repmgr -f /etc/repmgr/14/repmgr.conf primary register --dry-run
INFO: connecting to primary database...
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.11 port=5432 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
connection to server at "192.168.56.11", port 5432 failed: Connection refused
        Is the server running on that host and accepting TCP/IP connections?

DETAIL: attempted to connect using:
  user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.11 port=5432 fallback_application_name=repmgr options=-csearch_path=

#-- solution 修改如下配置即可
查看 postgresql.conf 
listen-addresses ='*'

10.2.ERROR: "repmgr" is not a superuser and no superuser name supplied

# 需要确保做repmgr的账户拥有超级权限
postgres=# \du
                                   List of roles
 Role name |                         Attributes                         | Member of
-----------+------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 repmgr    | Superuser                                                  | {}

10.3. Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

# 系统级别的问题, 配置passwordless时,在节点node1上跑 ssh node2 date会直接拒绝

# - solution
修改如下 PasswordAuthentication = yes
PasswordAuthentication  yes

10.4.网络有问题

-- 各种查看,发现没有问题


-bash-4.2$ repmgr -hnode1 -p5432 -U repmgr -d repmgr -f /etc/repmgr/14/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/var/lib/pgsql/14/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=node1 port=5432 user=repmgr dbname=repmgr
ERROR: connection to database failed
DETAIL:
connection to server at "node1" (192.168.56.11), port 5432 failed: No route to host
        Is the server running on that host and accepting TCP/IP connections?

-- 最后发现是防火墙需要关掉 - systemctl stop firewalld 或者通过firewall-cmd放行PG 实例端口

标签:5432,14,--,PG14,192.168,host,集群,repmgr
From: https://www.cnblogs.com/Jeona/p/17378872.html

相关文章

  • MongoDB中缩减Shard集群(删除一个Shard)--删除一个分片
    关键字:MongoDB中缩减Shard集群(删除一个Shard)--删除一个分片对MongoDB的Shard集群来说,添加一个分片很简单,AddShard就可以了。但是缩减集群(删除分片)这种一般很少用到。由于某服务器挂了,所以想送修之前必须把它上面的数据自动迁移到其他Shard上。以下......
  • cassandra-0.6.1安装配置(与集群)
    关键字:cassandra-0.6.1安装配置(与集群)1下载Cassandrahttp://cassandra.apache.org/下载版本0.5.1。2安装Cassandra安装目录D:\apache-cassandra-0.5.1。在此需要修改两个配置文件1修改conf目录下的log4j.properties文件:log4j.appender.R.Fil......
  • 搭建keepalived+LVS+nginx高可用集群负载均衡
    在LVS服务器上安装Keepalived,参考(https://www.cnblogs.com/xiaodunan/p/17374699.html)修改核心配置文件cd/etc/keepalived/vimkeepalived.confglobal_defs{router_idLVS_1}vrrp_instanceVI_1{stateMASTERinterfaceeth0virtual_router_id41......
  • hadoop 3.3.5伪分布式集群部署
    hadoop包下载https://archive.apache.org/dist/hadoop/common/安装好jdk并配置环境变量下载hadoop压缩包并放至/data/hadoop目录解压tar-zxvfhadoop-3.3.5.tar.gz1配置1.1在Hadoop安装目录下进入到etc/hadoop目录,修改Hadoop相关配置文件。<property><name>f......
  • Jboss4集群配置之二:Jboss集群配置实例与负载均衡器配置
    1.前言 2.集群准备知识 3.Jboss集群配置实例概述4.Jboss集群负载均衡器mod_jk配置3.Jboss集群配置实例概述下文中,RubySun 将以实例来叙述Jboss集群配置。该实例包含3个Jboss节点。各节点被动接收负载均衡器转发的请求。各节点间没有横向的联系。4. Jboss集群负载均衡器配置步......
  • 使用kubeadm快速部署k8s集群
    目录使用kubeadm快速部署k8s集群一、环境准备二、kubeadm工具介绍三、实操1、所有节点(基础环境)2、master节点3、node节点(加入)4、查看集群使用kubeadm快速部署k8s集群一、环境准备操作系统CentOS7.x-86_x64硬盘:40G+角色IPmaster192.168.130.10node192.168.130.......
  • Kafka基础阶段与集群搭建详细教程
    Kafka第一天课堂笔记一.Kafka简介1.1消息队列消息队列——用于存放消息的组件程序员可以将消息放入到队列中,也可以从消息队列中获取消息很多时候消息队列不是一个永久性的存储,是作为临时存储存在的(设定一个期限:设置消息在MQ中保存10天)消息队列中间件:消息队列的组件,例如:Kafk......
  • elasticsearch集群及kibana安装
    系统配置创建一个用户elastic,不能使用root用户启动配置该用户环境变量,用户home目录.bash_profile文件#配置ES_JAVA_HOME使用es自带jdkexportES_JAVA_HOME=/data/es/elasticsearch/jdk#修改最大文件句柄数ulimit-n65535#修改最大线程数ulimit-u4096执行..bash_p......
  • k8s集群部署搭建
    一.搭建环境win11 16G+1t   VMware虚机 4G+2G+2G二.拓扑图 master+2node三.ip划分四.前置条件配置完成,按照文档进行搭建即可,最终实现效果如下图 ......
  • K8S集群管理工具、平台
    1RancherRancher中文官网:https://docs.rancher.cn/2KubeSphere官网:https://kubesphere.com.cn/面向云原生应用的容器混合云KubeSphere愿景是打造一个以Kubernetes为内核的云原生分布式操作系统,它的架构可以非常方便地使第三方应用与云原生生态组件进行即插即用(plug-a......