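Lossless semi-sync replication basics
Both "lossless semi-sync replication" and "enhanced semi-sync replication" refer to semi-synchronous replication in AFTER_SYNC mode. For the parameter rpl-semi-sync-master-wait-point, the official documentation says: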
This variable controls the point at which a semisynchronous source waits for replica acknowledgment of transaction receipt before returning a status to the client that committed the transaction. These values are permitted:
AFTER_SYNC (the default): The source writes each transaction to its binary log and the replica, and syncs the binary log to disk. The source waits for replica acknowledgment of transaction receipt after the sync. Upon receiving acknowledgment, the source commits the transaction to the storage engine and returns a result to the client, which then can proceed.
AFTER_COMMIT: The source writes each transaction to its binary log and the replica, syncs the binary log, and commits the transaction to the storage engine. The source waits for replica acknowledgment of transaction receipt after the commit. Upon receiving acknowledgment, the source returns a result to the client, which then can proceed.
The replication characteristics of these settings differ as follows:
With AFTER_SYNC, all clients see the committed transaction at the same time: After it has been acknowledged by the replica and committed to the storage engine on the source. Thus, all clients see the same data on the source.
In the event of source failure, all transactions committed on the source have been replicated to the replica (saved to its relay log). An unexpected exit of the source and failover to the replica is lossless because the replica is up to date. Note, however, that the source cannot be restarted in this scenario and must be discarded, because its binary log might contain uncommitted transactions that would cause a conflict with the replica when externalized after binary log recovery.
With AFTER_COMMIT, the client issuing the transaction gets a return status only after the server commits to the storage engine and receives replica acknowledgment. After the commit and before replica acknowledgment, other clients can see the committed transaction before the committing client.
If something goes wrong such that the replica does not process the transaction, then in the event of an unexpected source exit and failover to the replica, it is possible for such clients to see a loss of data relative to what they saw on the source.
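The main difference between AFTER_COMMIT and AFTER_SYNC is the point at which the MySQL source commits the transaction in the storage engine and releases the locks the transaction holds.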
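The ordering difference can be written down as a small toy model. This is only a sketch of the step order described in the quoted documentation, not MySQL code; the step names and both helper functions are made up for illustration.

```python
def commit_steps(wait_point):
    """Return the ordered steps the source performs for one transaction.

    The step names are illustrative labels, not MySQL internals.
    """
    if wait_point == "AFTER_SYNC":
        return ["write_binlog", "sync_binlog", "wait_for_ack",
                "engine_commit", "reply_to_client"]
    if wait_point == "AFTER_COMMIT":
        return ["write_binlog", "sync_binlog", "engine_commit",
                "wait_for_ack", "reply_to_client"]
    raise ValueError(f"unknown wait point: {wait_point}")


def visible_before_ack(wait_point):
    """True if other sessions can read the transaction before the replica ACK arrives."""
    steps = commit_steps(wait_point)
    # Data becomes visible (and locks are released) at the engine commit.
    return steps.index("engine_commit") < steps.index("wait_for_ack")


if __name__ == "__main__":
    for mode in ("AFTER_SYNC", "AFTER_COMMIT"):
        print(mode, "visible before ACK:", visible_before_ack(mode))
```

In this model only AFTER_COMMIT exposes data that the replica may not have received yet, which is the data-loss window the documentation describes.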
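Source/replica data inconsistency
Used correctly, lossless semi-sync replication keeps the source and the replica consistent. Inconsistency can still occur in the following scenarios: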
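Scenario 1:
If the source does not receive an ACK from a replica for a long time, it falls back from semi-sync to asynchronous replication. If the source crashes while replicating asynchronously, data may be lost.
Solutions:
- Set a large rpl_semi_sync_master_timeout so that semi-sync effectively never falls back to asynchronous mode.
- Add more semi-sync replicas to reduce the probability of falling back to asynchronous mode.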
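The two recommendations above can be turned into a small health check. The sketch below assumes the caller has already read the values with SHOW GLOBAL VARIABLES and SHOW GLOBAL STATUS and passes them in as plain dictionaries. The function name and both thresholds (MIN_TIMEOUT_MS, MIN_REPLICAS) are hypothetical and should be tuned per deployment.

```python
MIN_TIMEOUT_MS = 60_000  # hypothetical threshold, not a MySQL recommendation
MIN_REPLICAS = 2         # hypothetical threshold


def semi_sync_fallback_risks(variables, status):
    """Return a list of warnings; an empty list means no risk was detected.

    variables: dict built from SHOW GLOBAL VARIABLES
    status:    dict built from SHOW GLOBAL STATUS
    """
    warnings = []
    if status.get("Rpl_semi_sync_master_status") != "ON":
        warnings.append("semi-sync is not active; replication is asynchronous")
    # 10000 ms is the server default for rpl_semi_sync_master_timeout.
    timeout_ms = int(variables.get("rpl_semi_sync_master_timeout", 10000))
    if timeout_ms < MIN_TIMEOUT_MS:
        warnings.append(f"timeout of {timeout_ms} ms allows a quick fallback")
    clients = int(status.get("Rpl_semi_sync_master_clients", 0))
    if clients < MIN_REPLICAS:
        warnings.append(f"only {clients} semi-sync replica(s) connected")
    return warnings


if __name__ == "__main__":
    print(semi_sync_fallback_risks(
        {"rpl_semi_sync_master_timeout": "10000"},
        {"Rpl_semi_sync_master_status": "ON", "Rpl_semi_sync_master_clients": "1"},
    ))
```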
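Scenario 2:
Suppose the source fails after it has synced the binlog data of transaction A to local disk but before it has sent that data to the replica. During crash recovery on the source, MySQL decides from the Xid recorded in the binlog that transaction A can be committed, whereas the replica never received the complete binlog data for transaction A and did not apply it. The source and the replica are now inconsistent.
If the recovered source continues as the source, the replica pulls the missing binlog again and replays it, and the two nodes become consistent.
If the recovered source is converted to a replica and attached directly to the new source (the former replica), the two nodes may be inconsistent.
Because the MySQL server never reported success to the client that committed transaction A, and never let other clients read the data produced by transaction A, either of the following is acceptable when handling the failure:
- Discard transaction A: use a tool such as binlog2sql on the new replica to "roll back" transaction A, so that the new replica matches the new source.
- Replay transaction A: use a tool such as mysqlbinlog on the new replica to extract the binlog of transaction A and execute it on the new source, so that the two nodes match. If rows touched by transaction A have since been modified by other transactions on the new source, executing transaction A there may fail.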
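Before choosing between discarding and replaying, it helps to know which transactions exist only on the recovered source. The sketch below models each node's history as a set of opaque transaction identifiers. How those identifiers are obtained (for example from GTID sets or binlog positions) is an assumption left to the operator, and plan_rejoin is a hypothetical helper, not part of any MySQL tool.

```python
def plan_rejoin(old_source_txns, new_source_txns):
    """Decide whether a recovered source can be attached as a replica as-is.

    Both arguments are sets of opaque transaction identifiers.
    """
    extra = old_source_txns - new_source_txns    # like transaction A in scenario 2
    missing = new_source_txns - old_source_txns  # fetched by normal replication
    action = "attach" if not extra else "reconcile"
    return {"action": action, "extra": extra, "missing": missing}


if __name__ == "__main__":
    plan = plan_rejoin({"t1", "t2", "A"}, {"t1", "t2", "t3"})
    # "reconcile" means transaction A must be rolled back on the old source
    # or replayed on the new source before the node is attached.
    print(plan["action"], sorted(plan["extra"]), sorted(plan["missing"]))
```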
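Scenario 3:
When the source wakes the dump thread to send binlog events to the replica depends on the parameter sync_binlog:
- sync_binlog=1: the source syncs the binlog data to local disk before sending it to the replica.
- sync_binlog>1: the source sends the binlog data to the replica once it has been flushed to the file system cache.
If the source server crashes while sync_binlog>1, the following can happen:
- The binlog data on the source was not yet synced to local disk and is lost after the server restarts.
- The replica received and applied that binlog data normally, so the replica holds some data that the source does not.
A failover then takes place, and the recovered source is attached directly to the new source as a new replica. Because the "extra" data on the new source was generated by the original source, it is not replayed on the original source, which leaves the new source and the new replica inconsistent.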
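The window described in this scenario can be reproduced with a toy model. The class below is a deliberately simplified sketch: it ignores binlog group commit, treats sync_binlog=N as "fsync every N commits", and represents the replica as a plain list. ToySource and its attributes are invented for illustration.

```python
class ToySource:
    """Toy model of a source whose binlog is flushed first and fsynced later."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.page_cache = []         # flushed to the file system cache only
        self.disk = []               # fsynced, survives a server crash
        self.commits_since_sync = 0

    def commit(self, txn, replica):
        self.page_cache.append(txn)  # flush
        self.commits_since_sync += 1
        if self.commits_since_sync >= self.sync_binlog:
            self.disk.extend(self.page_cache)  # fsync
            self.page_cache.clear()
            self.commits_since_sync = 0
        # With sync_binlog=1 the event is durable before it is sent;
        # with sync_binlog>1 it may be sent while still only in the cache.
        replica.append(txn)

    def crash(self):
        """Simulate a server crash: cached data is lost, disk data remains."""
        self.page_cache.clear()
        return list(self.disk)


if __name__ == "__main__":
    replica = []
    source = ToySource(sync_binlog=3)
    source.commit("t1", replica)
    source.commit("t2", replica)
    print("source after crash:", source.crash(), "replica:", replica)
```

With sync_binlog=3 and only two commits, the source comes back empty while the replica already holds both transactions.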
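Solutions:
- Use a strict "double 1" configuration (sync_binlog=1 and innodb_flush_log_at_trx_commit=1) to avoid the data loss described above.
- Before attaching the recovered original source as a new replica, carefully compare the state of the two nodes to confirm that it can be attached directly. Otherwise, rebuild the data of the original source or run a source/replica data comparison.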
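The "double 1" requirement is easy to verify mechanically. The following sketch assumes the caller passes in a dictionary built from SHOW GLOBAL VARIABLES; is_double_one is a hypothetical helper, and a missing variable is conservatively treated as not meeting the requirement.

```python
def is_double_one(variables):
    """True only if both durability settings are set to 1."""
    sync_binlog = int(variables.get("sync_binlog", 0))
    flush_log = int(variables.get("innodb_flush_log_at_trx_commit", 0))
    return sync_binlog == 1 and flush_log == 1


if __name__ == "__main__":
    print(is_double_one({"sync_binlog": "1", "innodb_flush_log_at_trx_commit": "1"}))
    print(is_double_one({"sync_binlog": "100", "innodb_flush_log_at_trx_commit": "2"}))
```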
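Scenario 4:
When an administrator shuts down the MySQL service normally, for example:
- with mysqladmin shutdown
- with service mysqld stop
- with kill mysql_pid
MySQL turns off the semi-sync plugin and stops the semi-sync ACK receiver thread, so there is no guarantee that the binlog data of every committed transaction has reached the replica, and data may be lost.
Solutions:
- For an urgent failure, kill the source forcibly with kill -9 mysql_pid.
- For a non-urgent problem, fail over first, then shut down the instance normally once it has become a replica.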
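References
- Parameter rpl_semi_sync_master_timeout
- Parameter sync_binlog
- Parameter innodb_flush_log_at_trx_commit
- The story of shutting down MySQL (聊聊 MySQL 关机的故事)
- Is source/replica data consistent under lossless semi-sync replication? (无损半同步复制下主动数据一致么)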