
Effective Strategies for Recovering MySQL Group Replication from Failures


I don't have an MGR environment of my own; this is just to learn from someone else's experience. Original article: https://www.percona.com/blog/effective-strategies-for-recovering-mysql-group-replication-from-failures/

 

Group Replication is a fault-tolerant/highly available replication topology that ensures that if the primary node goes down, one of the other candidate or secondary members takes over, so writes and reads continue uninterrupted.

 

However, in certain situations, group membership can be broken by failures, network partitions, or database crashes, or we may end up with some orphaned members. In such cases we have to perform a partial or full recovery to bring the whole topology back into an active state.

Consider the following scenario: all group nodes go down while the primary node [Node1] is serving read/write traffic.

 

1) Below is the Group Replication topology set up with the dbdeployer tool.

node1 [localhost:23637] {root} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00023637-1111-1111-1111-111111111111 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
| group_replication_applier | 00023638-2222-2222-2222-222222222222 | 127.0.0.1   |       23638 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
| group_replication_applier | 00023639-3333-3333-3333-333333333333 | 127.0.0.1   |       23639 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
3 rows in set (0.01 sec)

2) Run a workload against the primary node (127.0.0.1:23637).

sysbench \
--db-driver=mysql \
--mysql-user=sbtest_user \
--mysql-password=Sbtest@2022 \
--mysql-db=sbtest \
--mysql-host=127.0.0.1 \
--mysql-port=23637 \
--tables=10 \
--table-size=10000 \
--create_secondary=off \
--threads=50 \
--time=0 \
--events=0 \
--report-interval=1 /opt/homebrew/Cellar/sysbench/1.0.20_6/share/sysbench/oltp_read_write.lua run

Output:

[ 1s ] thds: 50 tps: 1726.75 qps: 35178.95 (r/w/o: 24710.96/6964.73/3503.26) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 2s ] thds: 50 tps: 2528.19 qps: 50605.04 (r/w/o: 35405.76/10142.90/5056.39) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 3s ] thds: 50 tps: 2316.05 qps: 46257.26 (r/w/o: 32348.96/9290.15/4618.15) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
...
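For completeness, the sbtest schema and tables used by the run above are normally created beforehand with a sysbench prepare step; a minimal sketch, assuming the same connection parameters and script path:

sysbench \
--db-driver=mysql \
--mysql-user=sbtest_user \
--mysql-password=Sbtest@2022 \
--mysql-db=sbtest \
--mysql-host=127.0.0.1 \
--mysql-port=23637 \
--tables=10 \
--table-size=10000 /opt/homebrew/Cellar/sysbench/1.0.20_6/share/sysbench/oltp_read_write.lua prepare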

3) Here, we deliberately stop all of the group nodes.

./stop_all 
# executing 'stop' on /Users/aniljoshi/sandboxes/group_msb_8_0_36
executing 'stop' on node3
stop /Users/aniljoshi/sandboxes/group_msb_8_0_36/node3
executing 'stop' on node2
stop /Users/aniljoshi/sandboxes/group_msb_8_0_36/node2
executing 'stop' on node1
stop /Users/aniljoshi/sandboxes/group_msb_8_0_36/node1

At this point, the cluster is completely down and all group members have been stopped. Now we will try to bring the cluster back up.

Bootstrapping/recovering the cluster nodes

4) First, let's start each node. Since I am using dbdeployer, I can manage the nodes (start/stop them and more) with a few of its built-in commands/scripts.

shell> ./start_all 
# executing 'start' on /Users/aniljoshi/sandboxes/group_msb_8_0_36
executing "start" on node 1
... sandbox server started
executing "start" on node 2
.. sandbox server started
executing "start" on node 3
.. sandbox server started

5) Connect to each node and verify its current state.

node1 [localhost:23637] {root} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier |           |             |        NULL | OFFLINE      |             |                |                            |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)

node2 [localhost:23638] {root} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier |           |             |        NULL | OFFLINE      |             |                |                            |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.01 sec)

node3 [localhost:23639] {root} ((none)) >  select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier |           |             |        NULL | OFFLINE      |             |                |                            |
+---------------------------+-----------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.01 sec)

At this point, all nodes are in the OFFLINE state.

6) We can recover the cluster by bootstrapping one node of the group; once the other nodes are started afterwards, they join via the distributed recovery process. Before doing that, however, it is important to know the state of certified/committed transactions on each node so that we do not bootstrap the wrong one. Bootstrapping the wrong node can lead to severe data loss!

 

Note: in our case, received_transaction_set is empty because all the nodes were restarted. However, if the nodes had not been restarted and group replication had only been broken by a network outage or some other event, we would see values here as well.

node1 [localhost:23637] {root} ((none)) > SELECT @@GLOBAL.GTID_EXECUTED;
+-----------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                        |
+-----------------------------------------------+
| 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148034 |
+-----------------------------------------------+
1 row in set (0.00 sec)

node1 [localhost:23637] {root} ((none)) > SELECT received_transaction_set FROM 
    ->       performance_schema.replication_connection_status WHERE 
    ->       channel_name="group_replication_applier";
+--------------------------+
| received_transaction_set |
+--------------------------+
|                          |
+--------------------------+
1 row in set (0.01 sec)

node2 [localhost:23638] {msandbox} ((none)) > SELECT @@GLOBAL.GTID_EXECUTED;
+----------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                       |
+----------------------------------------------+
| 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-96208 |
+----------------------------------------------+
1 row in set (0.00 sec)


node2 [localhost:23638] {msandbox} ((none)) > SELECT received_transaction_set FROM 
    ->       performance_schema.replication_connection_status WHERE 
    ->       channel_name="group_replication_applier";
+--------------------------+
| received_transaction_set |
+--------------------------+
|                          |
+--------------------------+
1 row in set (0.01 sec)

node3 [localhost:23639] {msandbox} ((none)) > SELECT @@GLOBAL.GTID_EXECUTED;
+----------------------------------------------+
| @@GLOBAL.GTID_EXECUTED                       |
+----------------------------------------------+
| 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-76208 |
+----------------------------------------------+
1 row in set (0.00 sec)


node3 [localhost:23639] {msandbox} ((none)) > SELECT received_transaction_set FROM 
    ->       performance_schema.replication_connection_status WHERE 
    ->       channel_name="group_replication_applier";
+--------------------------+
| received_transaction_set |
+--------------------------+
|                          |
+--------------------------+
1 row in set (0.00 sec)

From the gtid_executed information above, we can see that Node1 ("00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148034") has the most recent GTID set, so this is the node we should bootstrap.

 

Please note: after collecting the transaction sets from all group members, compare them to determine which member has the largest transaction set, including both executed transactions (gtid_executed) and certified transactions (on the group_replication_applier channel). You can compare the GTIDs manually, or you can compare the GTID sets using the functions GTID_SUBSET() and GTID_SUBTRACT().

mysql> SELECT GTID_SUBSET(set1, set2);
mysql> SELECT GTID_SUBTRACT(set1, set2);
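For example, comparing the executed sets observed above (the literal GTID sets below are simply the values from this walkthrough): GTID_SUBSET() returns 1 because node2's set is fully contained in node1's, and GTID_SUBTRACT() returns the range 96209-148034 that node2 is missing.

mysql> SELECT GTID_SUBSET('00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-96208', '00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148034') AS node2_contained_in_node1;
mysql> SELECT GTID_SUBTRACT('00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148034', '00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-96208') AS missing_on_node2;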

7) Let's start the bootstrap process on Node1.

mysql> SET GLOBAL group_replication_bootstrap_group=ON;
mysql> START GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;

Now, if we check the status of Node1, we can see that the member is ONLINE.

node1 [localhost:23637] {msandbox} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00023637-1111-1111-1111-111111111111 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)

8) Now we simply need to start group replication on the other nodes. They join via the distributed recovery mechanism, either through state transfer from the binary logs available on the source or, if the binary logs no longer exist on any of the available group nodes, through a full clone of the data using the clone plugin.

node2 [localhost:23638] {msandbox} ((none)) > start group_replication;

 node2 [localhost:23638] {root} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00023637-1111-1111-1111-111111111111 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
| group_replication_applier | 00023638-2222-2222-2222-222222222222 | 127.0.0.1   |       23638 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
2 rows in set (0.00 sec)

node3 [localhost:23639] {root} ((none)) > start group_replication;
Query OK, 0 rows affected (1.36 sec)

node3 [localhost:23639] {root} ((none)) >  select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00023637-1111-1111-1111-111111111111 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
| group_replication_applier | 00023638-2222-2222-2222-222222222222 | 127.0.0.1   |       23638 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
| group_replication_applier | 00023639-3333-3333-3333-333333333333 | 127.0.0.1   |       23639 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
3 rows in set (0.00 sec)

Note: if the binary logs have somehow been rotated or purged, we need to restore from a backup, or we can set up the clone plugin, which automatically performs a full snapshot.

 

We can also verify whether a node is still going through distributed recovery by running the command below. The replication threads [Slave_IO_Running/Slave_SQL_Running] stop once recovery completes and the node shows the SECONDARY state.

node2 [localhost:23638] {root} ((none)) > SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery'\G
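On MySQL 8.0.22 and later, the non-deprecated equivalent of the check above is the following; the corresponding fields are then named Replica_IO_Running/Replica_SQL_Running:

node2 [localhost:23638] {root} ((none)) > SHOW REPLICA STATUS FOR CHANNEL 'group_replication_recovery'\G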

At this stage, we have a fully functioning cluster again.

Recovering from a backup

In rare cases, when the data is completely corrupted, when we want to perform a partial recovery, or when we are adding a new node, we can go straight to a backup/restore operation.

 

Let's walk through this with the following scenario, in which we try to restore the cluster using a physical backup [Percona XtraBackup].

 

1) For demonstration purposes, we took a backup from one of the group nodes.

shell> xtrabackup --host=127.0.0.1 -usbtest_user -pSbtest@2022 -P23637 --backup --target-dir=/Users/aniljoshi/backup/

Backup output:

...
2024-08-03T19:54:50.102188+05:30 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (266657170) to (266657878) was copied.
2024-08-03T19:54:50.329784+05:30 0 [Note] [MY-011825] [Xtrabackup] completed OK!
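The GTID position captured by this backup can be read from the xtrabackup_binlog_info file that XtraBackup writes into the target directory (the same check is shown later for the second backup):

shell> cat /Users/aniljoshi/backup/xtrabackup_binlog_info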

2) To make the backup usable, we also need to prepare it.

shell> xtrabackup --prepare --use-memory=2G --target-dir=/Users/aniljoshi/backup/

Output:

...
2024-08-03T19:58:00.432140+05:30 0 [Note] [MY-012980] [InnoDB] Shutdown completed; log sequence number 266658326
2024-08-03T19:58:00.434701+05:30 0 [Note] [MY-011825] [Xtrabackup] completed OK!

Now, suppose we want to restore the cluster using this data.

3) First, we will clean up the data directory of the target member [Node1].
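Before wiping the data directory, the Node1 instance itself should be shut down; a sketch using dbdeployer's per-node script, assuming its standard sandbox layout:

shell> /Users/aniljoshi/sandboxes/group_msb_8_0_36/node1/stop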

shell> rm -rf  /Users/aniljoshi/sandboxes/group_msb_8_0_36/node1/data/      

4) Then we copy the prepared data into the Node1 data directory.

shell> cp -aR /Users/aniljoshi/backup/  /Users/aniljoshi/sandboxes/group_msb_8_0_36/node1/data/
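Once the prepared backup is in place, the server needs to be started again on the restored data directory before the group can be bootstrapped; again a sketch using dbdeployer's per-node script:

shell> /Users/aniljoshi/sandboxes/group_msb_8_0_36/node1/start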

5) Bootstrap the node [Node1].

mysql> SET GLOBAL group_replication_bootstrap_group=ON;
mysql> START GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
 
node1 [localhost:23637] {root} (performance_schema) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 11400bec-51a6-11ef-b950-29b40b1b64a7 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)

The other nodes could join via the distributed recovery/clone process (if it is already set up); here, however, we will demonstrate how to perform the restore/recovery using a backup.

 

6) Let's take a fresh backup from Node1 and note the GTID execution information.

shell> xtrabackup --host=127.0.0.1 -usbtest_user -pSbtest@2022 -P23637 --backup --target-dir=/Users/aniljoshi/backup/Primary/

Backup output:

...
2024-08-03T20:16:36.182978+05:30 0 [Note] [MY-011825] [Xtrabackup] Transaction log of lsn (266723771) to (266724479) was copied.
2024-08-03T20:16:36.408616+05:30 0 [Note] [MY-011825] [Xtrabackup] completed OK!

GTID details:

shell> cat /Users/aniljoshi/backup/Primary/xtrabackup_binlog_info
mysql-bin.000006    193    00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148040

7) Prepare the backup.

shell> xtrabackup --prepare --use-memory=2G --target-dir=/Users/aniljoshi/backup/Primary/

Output:

...
2024-08-03T20:17:47.817955+05:30 0 [Note] [MY-013072] [InnoDB] Starting shutdown...
2024-08-03T20:17:47.822229+05:30 0 [Note] [MY-013084] [InnoDB] Log background threads are being closed...
2024-08-03T20:17:47.865162+05:30 0 [Note] [MY-012980] [InnoDB] Shutdown completed; log sequence number 266724886
2024-08-03T20:17:47.867836+05:30 0 [Note] [MY-011825] [Xtrabackup] completed OK!

8) Clean up the Node2 data directory and copy the prepared backup over to Node2.

shell> cp -aR /Users/aniljoshi/backup/Primary/  /Users/aniljoshi/sandboxes/group_msb_8_0_36/node2/data/
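The cleanup step itself is not shown above; before the copy, Node2 would be stopped and its data directory cleared, mirroring step 3. A sketch, assuming the same sandbox layout:

shell> /Users/aniljoshi/sandboxes/group_msb_8_0_36/node2/stop
shell> rm -rf /Users/aniljoshi/sandboxes/group_msb_8_0_36/node2/data/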

9) After restarting the service, we can verify whether gtid_purged has been applied.

node2 [localhost:23638] {root} ((none)) > show global variables like '%gtid_purged%';
+---------------+-----------------------------------------------+
| Variable_name | Value                                         |
+---------------+-----------------------------------------------+
| gtid_purged   | 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148040 |
+---------------+-----------------------------------------------+
1 row in set (0.01 sec)

Otherwise, we can set gtid_purged with the command below.

MySQL> set global gtid_purged="00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148040";
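Note that gtid_purged can normally only be assigned like this while the server's own GTID history is empty; if the restored instance already has a non-empty gtid_executed set, the usual approach is to clear it first. A minimal sketch, assuming it is safe to reset the binary log state on this freshly restored node:

node2 [localhost:23638] {root} ((none)) > RESET MASTER;
node2 [localhost:23638] {root} ((none)) > SET GLOBAL gtid_purged="00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148040";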

10) Finally, we can configure replication and start the group replication process.

node2 [localhost:23638] {root} ((none)) > change master to master_user='rsandbox', master_password='rsandbox' FOR CHANNEL 'group_replication_recovery';

node2 [localhost:23638] {root} ((none)) > start group_replication;
Query OK, 0 rows affected (1.97 sec)

node2 [localhost:23638] {root} ((none)) > select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 11400bec-51a6-11ef-b950-29b40b1b64a7 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
| group_replication_applier | c2b74588-51a7-11ef-81da-5f8da38bd06f | 127.0.0.1   |       23638 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
2 rows in set (0.00 sec)

Similar steps can be performed later for Node3 as well. Besides Percona XtraBackup, we can also use logical backup/restore tools such as mydumper, mysqldump, or MySQL Shell to perform the restore and set up the group replication process. We will not cover them here; the procedure is generally the same as for any asynchronous replication setup. Once the restore has completed at the expected GTID coordinates, we can run the commands below.

MySQL> change master to master_user='rsandbox', master_password='rsandbox' FOR CHANNEL 'group_replication_recovery';
MySQL> start group_replication;
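On MySQL 8.0.23 and later, the same credentials can be configured with the non-deprecated syntax, which is also the form used in the clone section below:

MySQL> CHANGE REPLICATION SOURCE TO SOURCE_USER='rsandbox', SOURCE_PASSWORD='rsandbox' FOR CHANNEL 'group_replication_recovery';
MySQL> START GROUP_REPLICATION;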

Additionally, as of MySQL 8.0.17, the clone plugin can be used for remote cloning operations during distributed recovery in Group Replication. This requires some extra configuration and steps.

Recovering via the clone plugin

On the donor node [Node1]:

1) Install the plugin dynamically.

node1 [localhost:23637] {root} ((none)) > install plugin clone soname 'mysql_clone.so';
Query OK, 0 rows affected (0.16 sec)

2) Persist the change in the database configuration.

[mysqld]
plugin-load-add=mysql_clone.so
clone=FORCE_PLUS_PERMANENT

3) Grant the BACKUP_ADMIN privilege to the replication user "rsandbox", which is responsible for the distributed recovery process.

node1 [localhost:23637] {root} ((none)) > show grants for rsandbox@'%';
+---------------------------------------------+
| Grants for rsandbox@%                       |
+---------------------------------------------+
| GRANT USAGE ON *.* TO `rsandbox`@`%`        |
| GRANT BACKUP_ADMIN ON *.* TO `rsandbox`@`%` |
| GRANT `R_REPLICATION`@`%` TO `rsandbox`@`%` |
+---------------------------------------------+
3 rows in set (0.00 sec)
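For reference, if the BACKUP_ADMIN privilege were missing from the grants shown above, it could be added with a statement like the following (rsandbox is the dbdeployer replication account used throughout this post):

node1 [localhost:23637] {root} ((none)) > GRANT BACKUP_ADMIN ON *.* TO 'rsandbox'@'%';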

4) Verify the plugin status.

node1 [localhost:23637] {root} ((none)) > select plugin_name,plugin_status from information_schema.plugins where plugin_name='clone';
+-------------+---------------+
| plugin_name | plugin_status |
+-------------+---------------+
| clone       | ACTIVE        |
+-------------+---------------+
1 row in set (0.00 sec)

On the recipient node [Node3]:

1) Install the plugin dynamically.

node3 [localhost:23639] {root} ((none)) > install plugin clone soname 'mysql_clone.so';
Query OK, 0 rows affected (0.16 sec)

2) Persist the change in the database configuration.

[mysqld]
plugin-load-add=mysql_clone.so
clone=FORCE_PLUS_PERMANENT

3) Verify the plugin status.

node3 [localhost:23639] {root} ((none)) > select plugin_name,plugin_status from information_schema.plugins where plugin_name='clone';
+-------------+---------------+
| plugin_name | plugin_status |
+-------------+---------------+
| clone       | ACTIVE        |
+-------------+---------------+
1 row in set (0.00 sec)

4) Define the clone threshold for distributed recovery on all relevant group nodes. A node catches up using incremental state transfer unless the transaction gap exceeds the threshold below, in which case a full data clone is performed instead. For this test we use a very small value, although the default is very large.

node3 [localhost:23639] {root} ((none)) > set global group_replication_clone_threshold = 10;
Query OK, 0 rows affected (0.00 sec)
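For reference, the default value of group_replication_clone_threshold is the maximum permitted GTID sequence number (9223372036854775807), so with the default in place the clone path is normally chosen only when the required transactions are no longer present in any donor's binary logs. The current setting can be checked with:

node3 [localhost:23639] {root} ((none)) > SELECT @@GLOBAL.group_replication_clone_threshold;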

5) Finally, start Node3. In the MySQL log we can see that distributed recovery via the clone plugin has started. This process replaces (clones) the entire data directory from the donor/source [Node1].

2024-08-03T18:21:04.039835Z 0 [System] [MY-013471] [Repl] Plugin group_replication reported: 'Distributed recovery will transfer data using: Cloning from a remote group donor.'
2024-08-03T18:21:03.033878Z 0 [Warning] [MY-013469] [Repl] Plugin group_replication reported: 'This member will start distributed recovery using clone. It is due to the number of missing transactions being higher than the configured threshold of 10.'
2024-08-03T18:21:04.150730Z 132 [Warning] [MY-013460] [InnoDB] Clone removing all user data for provisioning: Started
...
/Users/aniljoshi/opt/mysql/8.0.36/bin/mysqld: Shutdown complete (mysqld 8.0.36)  MySQL Community Server - GPL.
2024-08-03T18:21:09.6NZ mysqld_safe mysqld restarted

6) Next, we can track the GTID execution and coordinate information using the file and command below.

shell> cat #view_status 
2 1
1722709264045254 1722709271116851
127.0.0.1:23637
0

./mysql-bin.000006
4836
00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148056

Or:

node3 [localhost:23639] {root} ((none)) > select * from performance_schema.clone_status\G
*************************** 1. row ***************************
             ID: 1
            PID: 0
          STATE: Completed
     BEGIN_TIME: 2024-08-03 23:51:04.045
       END_TIME: 2024-08-03 23:51:11.117
         SOURCE: 127.0.0.1:23637
    DESTINATION: LOCAL INSTANCE
       ERROR_NO: 0
  ERROR_MESSAGE: 
    BINLOG_FILE: mysql-bin.000006
BINLOG_POSITION: 4836
  GTID_EXECUTED: 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148056
1 row in set (0.00 sec)



node3 [localhost:23639] {root} ((none)) > show variables like 'gtid_purged';
+---------------+-----------------------------------------------+
| Variable_name | Value                                         |
+---------------+-----------------------------------------------+
| gtid_purged   | 00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148056 |
+---------------+-----------------------------------------------+
1 row in set (0.01 sec)

Otherwise, we can set gtid_purged with the command below, passing the GTID shown in the clone status above.

node3 [localhost:23639] {root} ((none)) > set global gtid_purged="00023636-bbbb-cccc-dddd-eeeeeeeeeeee:1-148056";

7) Run the CHANGE REPLICATION SOURCE command and start group replication.

node3 [localhost:23639] {root} ((none)) > CHANGE REPLICATION SOURCE TO SOURCE_USER='rsandbox', SOURCE_PASSWORD='rsandbox' FOR CHANNEL 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

node3 [localhost:23639] {root} ((none)) > start group_replication;
Query OK, 0 rows affected (10.69 sec)

8) Finally, we can see that Node3 appears in the group replication topology.

node3 [localhost:23639] {root} ((none)) >  select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 00023639-3333-3333-3333-333333333333 | 127.0.0.1   |       23639 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
| group_replication_applier | 11400bec-51a6-11ef-b950-29b40b1b64a7 | 127.0.0.1   |       23637 | ONLINE       | PRIMARY     | 8.0.36         | XCom                       |
| group_replication_applier | c2b74588-51a7-11ef-81da-5f8da38bd06f | 127.0.0.1   |       23638 | ONLINE       | SECONDARY   | 8.0.36         | XCom                       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
3 rows in set (0.01 sec)

 

