
How to Reassign Data After Adding Nodes to a Kafka Cluster

Date: 2024-12-24 16:55:07
Tags: dirs, log, replicas, partition, Kafka, topic, cluster, test, node

Steps to add a new node

Copy the server.properties file from an existing node and modify the following parameters:

broker.id
log.dirs
zookeeper.connect
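As a sketch of this step (the file contents and values below are illustrative, not from a real cluster), copying and adjusting the config might look like:

```shell
# Sample server.properties copied from an existing broker (values are made up).
cat > server.properties <<'EOF'
broker.id=1
log.dirs=/data/kafka-logs
zookeeper.connect=10.19.29.50:2181
num.network.threads=3
EOF

# On the new node only broker.id must be unique; log.dirs changes only if
# the disk layout differs, and zookeeper.connect normally stays identical
# because the new broker joins the same ZooKeeper ensemble.
cp server.properties server-new.properties
sed -i 's/^broker.id=.*/broker.id=6/' server-new.properties

grep '^broker.id=' server-new.properties
```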

How data migration works

  1. Only newly created topics place their data on the new node. To distribute existing data onto the new node as well, you must migrate the topic's data to it.
  2. The migration is started manually but then runs fully automatically. Kafka adds the new node as a follower of each partition being migrated and lets it fully replicate the partition's existing data. Once the new node has fully replicated the partition and joined the in-sync replica set, one of the existing replicas deletes its copy of the partition's data.

About the partition reassignment tool

The partition reassignment tool moves partitions between brokers. An ideal assignment spreads data load and partition sizes evenly across all brokers. The tool cannot analyze the data distribution in the cluster on its own and move partitions around to even out the load, so the administrator must decide which topics or partitions should be moved.

The partition reassignment tool can run in three modes:

  • --generate: given a list of topics and a list of brokers, the tool generates a plan that reassigns all partitions (and their replicas) of the specified topics across the given brokers. This mode is simply a convenient way to produce a reassignment plan.
  • --execute: the tool starts the reassignment according to a user-supplied plan (passed via the --reassignment-json-file option). The plan can be hand-crafted by the administrator or produced with --generate.
  • --verify: the tool checks the status of every partition listed during the last --execute. The status is one of: completed successfully, failed, or in progress.

Example:

There are five existing brokers with broker.id 1, 2, 3, 4, 5; the newly added broker has broker.id 6.

The topic test has 24 partitions and a replication factor of 1 (see the --describe output at the end of the example).

Create a JSON file describing the topics to migrate

topics-to-move.json

{
    "topics": [
        {"topic": "test"}
    ],
    "version": 1
}
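The topics-to-move.json file can also list several topics at once. A minimal sketch that builds such a file from a shell variable (the topic names here are placeholders):

```shell
# Topics to include in the plan (placeholder names).
topics="test test2"

# Emit the same JSON layout the tool expects, comma-separating the entries.
{
  printf '{\n    "topics": [\n'
  first=1
  for t in $topics; do
    [ $first -eq 1 ] || printf ',\n'
    printf '        {"topic": "%s"}' "$t"
    first=0
  done
  printf '\n    ],\n    "version": 1\n}\n'
} > topics-to-move.json

cat topics-to-move.json
```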
Generate a reassignment plan

Note: --broker-list names the brokers the partitions should be spread across. To place data on the newly added broker 6, include it in the list (e.g. "1,2,3,4,5,6"); the session captured below was run with "1,2,3,4,5", which only rebalances across the original brokers.
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[5],"log_dirs":["any"]}]}

The output contains both the current partition assignment and the assignment Kafka proposes; in the proposed plan Kafka has already balanced the partitions across the target brokers as evenly as it can.

First back up the content under Current partition replica assignment so you can roll back to the original assignment if needed.

Then copy the content under Proposed partition reassignment configuration into a new file, here named reassignment.json (the file name and extension are arbitrary, but the content must stay valid JSON).
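Copying the two JSON blobs by hand is error-prone. Since each blob sits on the single line directly below its header in the --generate output, both files can be produced with grep. A sketch, assuming the --generate output was captured to generate.out (the sample here is truncated to one partition):

```shell
# Truncated sample of kafka-reassign-partitions.sh --generate output.
cat > generate.out <<'EOF'
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]}]}
EOF

# The JSON always sits on the line right after each header, so grep -A1 is enough:
# current assignment -> rollback file, proposed assignment -> plan to execute.
grep -A1 '^Current partition replica assignment' generate.out | tail -n1 > rollback.json
grep -A1 '^Proposed partition reassignment configuration' generate.out | tail -n1 > reassignment.json
```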

Execute the data migration
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute
Current partition replica assignment

{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started partition reassignments for test-0,test-1,test-2,test-3,test-4,test-5,test-6,test-7,test-8,test-9,test-10,test-11,test-12,test-13,test-14,test-15,test-16,test-17,test-18,test-19,test-20,test-21,test-22,test-23

Check the status of the reassigned partitions
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify
Status of partition reassignment:
Reassignment of partition test-0 is complete.
Reassignment of partition test-1 is complete.
Reassignment of partition test-2 is complete.
Reassignment of partition test-3 is complete.
Reassignment of partition test-4 is complete.
Reassignment of partition test-5 is complete.
Reassignment of partition test-6 is complete.
Reassignment of partition test-7 is complete.
Reassignment of partition test-8 is complete.
Reassignment of partition test-9 is complete.
Reassignment of partition test-10 is complete.
Reassignment of partition test-11 is complete.
Reassignment of partition test-12 is complete.
Reassignment of partition test-13 is complete.
Reassignment of partition test-14 is complete.
Reassignment of partition test-15 is complete.
Reassignment of partition test-16 is complete.
Reassignment of partition test-17 is complete.
Reassignment of partition test-18 is complete.
Reassignment of partition test-19 is complete.
Reassignment of partition test-20 is complete.
Reassignment of partition test-21 is complete.
Reassignment of partition test-22 is complete.
Reassignment of partition test-23 is complete.

Clearing broker-level throttles on brokers 5,1,2,3,4
Clearing topic-level throttles on topic test
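For a topic with many partitions it helps to summarize the --verify output rather than eyeball it. A small sketch against a captured (shortened) sample:

```shell
# Shortened sample of kafka-reassign-partitions.sh --verify output.
cat > verify.out <<'EOF'
Status of partition reassignment:
Reassignment of partition test-0 is complete.
Reassignment of partition test-1 is complete.
Reassignment of partition test-2 is still in progress.
EOF

# Count completed vs. in-progress partitions; rerun --verify until
# nothing is left in progress.
echo "complete=$(grep -c 'is complete' verify.out) pending=$(grep -c 'in progress' verify.out)"
```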

Describe the topic and check the Isr column
[root@k8s-node50 ~]#  kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic test
Topic: test	TopicId: iVbESSwBRlW-0yL-Lkyq6A	PartitionCount: 24	ReplicationFactor: 1	Configs: cleanup.policy=delete,flush.ms=50,segment.bytes=1073741824
	Topic: test	Partition: 0	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 1	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 2	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 3	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 4	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 5	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 6	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 7	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 8	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 9	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 10	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 11	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 12	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 13	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 14	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 15	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 16	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 17	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 18	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 19	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 20	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 21	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 22	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 23	Leader: 5	Replicas: 5	Isr: 5
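After the move it is worth confirming that every partition's ISR matches its replica list. A sketch that parses --describe output (the sample is shortened to two partitions; the real output is tab-separated, but awk splits on any whitespace, so spaces work the same way):

```shell
# Shortened sample of kafka-topics.sh --describe output.
cat > describe.out <<'EOF'
Topic: test Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 1 Leader: 3 Replicas: 3 Isr: 3
EOF

# Per-partition lines have fields: ... $7="Replicas:" $8=<replica list>
# $9="Isr:" $10=<isr list>; the topic summary line is skipped by the $7 guard.
# A partition is fully in sync when the two lists are identical.
awk '$7 == "Replicas:" && $8 != $10 { print "out of sync:", $0; bad = 1 }
     END { exit bad }' describe.out && echo "all partitions in sync" > sync.status
```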

Batch script for multiple topics

[root@k8s-node50 ~]#  cat for_topic.sh 
list='
alg_ocrservice
alg_ocrservice_result
alg_politic_dbtopic
alg_politicservice
alg_politicservice_result
alg_porn
alg_pornservice
alg_pornservice_result
alg_terror
alg_terrorservice
alg_terrorservice_result
alg_vfp_dbtopic
alg_vfpservice
alg_vfpservice_result
alg_video_transcode
alg_videoservice
copyright_count
copyright_vfp_operation_result
handle_into_db
handle_into_db_media
handle_report
handle_report_all
hcy_all_record
hot_spot_count_key
keyword_tactic_count
media_download
media_download_result
office_helper
office_helper_result
politic_operation_copy_result
politic_operation_result
rd031_alg_keyword_tactic
rd031_alg_replace_word
rich_media_count
risk_control_log
syncapi-log
uiall_all
uiall_audit
uiall_report
uploader_frequency_count_key
vfp_operation_result
video_asr_result
video_capture_result
video_ocr_result
video_transcode_result
'

for topic in $list;do
cat <<EOF> topics-to-move.json
{
    "topics": [
        {"topic": "$topic"}
    ],
    "version": 1
}
EOF

echo "Topic: $topic generating reassignment plan"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate| awk '/Proposed partition reassignment configuration/,/}/'|awk 'NR>1{print}' > reassignment.json

echo "Topic: $topic executing data migration"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute

echo "Topic: $topic checking reassignment status"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify

echo "Describing topic: $topic"
kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic $topic
done
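One weakness of the loop above is that it feeds the awk-extracted plan straight into --execute without checking it: if a topic name is misspelled, reassignment.json ends up empty. A sketch of a sanity check to run before --execute (assuming python3 is available for JSON validation):

```shell
# A plan file as produced by the extraction step (one partition shown).
cat > reassignment.json <<'EOF'
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]}]}
EOF

# Refuse to run --execute against an empty or malformed plan.
if [ -s reassignment.json ] && python3 -m json.tool reassignment.json > /dev/null 2>&1; then
  echo "plan OK" > plan.status
else
  echo "plan invalid or empty" >&2
  exit 1
fi
```

Note also that the script's --verify step runs immediately after --execute, so large topics may still show up as in progress; rerun --verify until every partition reports complete.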

Reprinted from https://cloud.tencent.com/developer/article/1964425; if this infringes your rights, please contact us for removal.

From: https://www.cnblogs.com/boradviews/p/18628085

    转自:https://www.cnblogs.com/binliubiao/p/15416631.htmlDMDSC概述DM共享存储数据库集群全称DMDataSharedCluster,简称DMDSCDMDSC特性DM共享存储数据库集群,允许多个数据库实例同时访问、操作同一数据库,具有高可用、高性能、负载均衡等特性。DMDSC支持故障自动切换和......