
Introducing the core concepts of Kafka

Date: 2023-11-09 21:01:36
Tags: Introducing, ActiveMQ, messages, RabbitMQ, kafka, concepts, message, Kafka, data

Introduction

I have been working with Kafka for five years, and I believe I have learned something along the way. It is also time to improve my English. So I decided to pick up my blog again and write about Kafka concepts, both to consolidate what I know and to practice my English. This will be a series, and I will explain the concepts one by one.

Today I want to introduce some basic information about Kafka, including the differences between Kafka and other messaging systems, and some of Kafka's core concepts.

What is a messaging system

This is a common question, and I do not want to define a messaging system too technically. Simply put, it is a server that handles messages. And what is a message? Almost anything can be a message, but here a message represents a piece of information that everyone in the business can understand.

That last point matters. You may say it is no big deal, but from a technical point of view it is. If you are the person in charge of technology, you usually spend a lot of effort on database modeling: the unordered data of the real world has to be put in order before it can be stored in a relational database. That means we must not only collect the data but also work out in advance which data is important, or which data the business people actually want. Kafka does not have this problem. It can store all kinds of information, so we do not need to know up front exactly what the business people want.

To recap: a messaging system can store many kinds of information, which makes it convenient for satisfying business requirements, even future ones, because all the messages are already stored in it. This is why we often use Kafka to collect business information, as a front-end collector. It is not the only reason, though. Kafka has many other useful features, and I will do my best to cover them. Now let's take a look.

Now that we have a working definition of a messaging system, I will highlight the features and characteristics of Apache Kafka. To help us understand where Kafka is better than a traditional message server, I will compare it with two of them: RabbitMQ and ActiveMQ.

RabbitMQ

RabbitMQ and Kafka are both message queue systems that you can use in stream processing, so both let producers send messages to consumers. Producers are applications that publish information, while consumers are applications that subscribe to and process it. But producers and consumers interact differently in the two systems. In RabbitMQ, the producer sends a message and can monitor whether it reaches the intended consumer. Kafka producers, on the other hand, publish messages to a topic regardless of whether consumers have retrieved them. A good metaphor: think of RabbitMQ as a post office that receives mail and delivers it to the intended recipients. Kafka is more like a library, which organizes the messages that producers publish on shelves by category; consumers then read from the shelves they care about and remember what they have already read.

In RabbitMQ, a routing key is a message attribute used to route a message from an exchange to a specific queue. When a producer sends a message to an exchange, it includes the routing key as part of the message, and the exchange uses that key to decide which queue the message should be delivered to. Kafka is a little simpler. A producer can assign a message key to each message, and the key determines which partition of the topic the message goes to; the broker that leads that partition then appends the message to it. None of this has anything to do with consumers or delivery.
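
To make the message key idea concrete, here is a minimal sketch of how a key can select a partition. It is only an illustration: the partition count, the `choose_partition` and `publish` helpers, and the use of CRC32 are my own assumptions, and Kafka's real clients use their own hash function.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count for one topic


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Map a message key to a partition index in a stable way."""
    # CRC32 is a stand-in hash for this sketch; it is stable across runs,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(partitions, key, value):
    """Append the value to the partition selected by the key."""
    index = choose_partition(key, len(partitions))
    partitions[index].append(value)
    return index


partitions = [[] for _ in range(NUM_PARTITIONS)]
first = publish(partitions, "order-42", "created")
second = publish(partitions, "order-42", "paid")
print(first == second)    # True: the same key always selects the same partition
print(partitions[first])  # ['created', 'paid']
```

Because both messages share a key, they land in the same partition in the order they were sent, and no consumer is involved in that decision.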

There are also differences in how Kafka and RabbitMQ handle messages; in fact, the two are designed for different use cases. Let us talk about message consumption. In RabbitMQ, the consumer application takes a passive role and waits for the broker to push messages from the queue to it. This means a consumer can miss data sent while it is inactive, for example when no durable queue is holding the messages for it. This does not happen in Kafka, because Kafka consumers are proactive in reading and tracking information: any consumer can pull data from a topic at any time. When a Kafka consumer reads data, it keeps track of the last message it has read and updates its offset accordingly. An offset is a counter that increases after each message is read. With Kafka, the producer is not aware of message retrieval by consumers.
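
The pull model and the offset counter can be sketched in a few lines. This is a toy in-memory model, not the Kafka client API; `TopicLog` and `PullConsumer` are hypothetical names I made up for the illustration.

```python
class TopicLog:
    """Toy append-only log standing in for one Kafka partition."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        # The producer only appends; it never learns who reads the message.
        self.messages.append(message)


class PullConsumer:
    """Toy consumer that pulls from the log and tracks its own offset."""

    def __init__(self, log):
        self.log = log
        self.offset = 0  # index of the next message to read

    def poll(self):
        """Return all unread messages and advance the offset past them."""
        batch = self.log.messages[self.offset:]
        self.offset += len(batch)
        return batch


log = TopicLog()
log.append("m0")
log.append("m1")

early = PullConsumer(log)
print(early.poll())  # ['m0', 'm1']
log.append("m2")     # sent before the late consumer exists
late = PullConsumer(log)
print(early.poll())  # ['m2']
print(late.poll())   # ['m0', 'm1', 'm2']
print(early.offset, late.offset)  # 3 3
```

The late consumer loses nothing, because each consumer keeps its own offset and the log itself is untouched by reads.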

Next, message priority. RabbitMQ brokers let producers escalate certain messages by using a priority queue, and the broker processes higher-priority messages ahead of normal ones. For example, a retail application might queue sales transactions every hour; if the system administrator then issues a high-priority database backup message, the broker sends it immediately. Kafka does not have priority queues. It treats all messages as equal when distributing them to their partitions. Ordering differs too. RabbitMQ sends and queues messages in a specific order: unless a higher-priority message is queued, consumers receive messages in the order they were sent. Kafka producers, meanwhile, send messages to a specific topic and partition. Because Kafka does not support direct producer-consumer exchanges, consumers pull the messages themselves; order is preserved within a partition, but messages from different partitions may be read in a different order from the one in which they were sent.
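
Here is a small sketch of the priority behaviour described above, using Python's standard `heapq` module. It models the idea only and is not RabbitMQ's implementation; the class name and the priority value 9 are assumptions for the example.

```python
import heapq
import itertools


class PriorityQueueSketch:
    """Toy priority queue: higher priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()  # tie-breaker that preserves send order

    def send(self, message, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._sequence), message))

    def receive(self):
        _, _, message = heapq.heappop(self._heap)
        return message


queue = PriorityQueueSketch()
queue.send("sales 09:00")
queue.send("sales 10:00")
queue.send("backup database", priority=9)

print([queue.receive() for _ in range(3)])
# ['backup database', 'sales 09:00', 'sales 10:00']
```

The backup message jumps ahead even though it was sent last, while the two normal messages keep their send order. A Kafka partition, by comparison, behaves like a plain append-only list with no way to jump the line.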

Finally, there is a difference in message deletion. A RabbitMQ broker routes the message to the destination queue. Once the message is read, the consumer sends an acknowledgement (ACK) reply to the broker, which then deletes the message from the queue. Unlike RabbitMQ, Kafka appends the message to a log file, where it remains until its retention period expires. That way, consumers can reprocess streamed data at any time within the stipulated period.
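
The two deletion models can be put side by side in a rough sketch. Both classes are hypothetical toy models, and the 60-second retention period is an arbitrary value chosen for the example; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class AckQueue:
    """Toy queue that drops a message once the consumer acknowledges it."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        self._pending.append(message)

    def consume_and_ack(self):
        # Reading plus acknowledging removes the message for good.
        return self._pending.popleft()

    def __len__(self):
        return len(self._pending)


class RetentionLog:
    """Toy log that keeps messages until they are older than the retention period."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._entries = []  # (timestamp, message) pairs in append order

    def append(self, message, now):
        self._entries.append((now, message))

    def read_all(self, now):
        # Expire old entries first, then return what is left; reading deletes nothing.
        self._entries = [
            (ts, msg) for ts, msg in self._entries
            if now - ts < self.retention_seconds
        ]
        return [msg for _, msg in self._entries]


queue = AckQueue()
queue.publish("invoice-1")
queue.consume_and_ack()
print(len(queue))  # 0: the message is gone after the acknowledgement

log = RetentionLog(retention_seconds=60)
log.append("invoice-1", now=0)
print(log.read_all(now=10))  # ['invoice-1']
print(log.read_all(now=30))  # ['invoice-1'] again: it can be reprocessed
print(log.read_all(now=61))  # []: the retention period has expired
```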

I think we now understand the differences between RabbitMQ and Kafka. Overall, RabbitMQ is a traditional message queue system; compared with it, Kafka offers higher performance. RabbitMQ, for its part, has client libraries for a range of programming languages. Now let's look at ActiveMQ.

ActiveMQ

ActiveMQ and Kafka are two of the most popular open-source messaging systems. In short, ActiveMQ is a message broker and Kafka is an event streaming platform. Why do I say that? Both a message broker and an event streaming platform can be used to implement asynchronous, scalable applications, but their use cases differ. So what is a message broker? A message broker is a software application that translates messages between formal messaging protocols. What does "formal messaging protocols" mean? It means that ActiveMQ supports a wide range of messaging protocols and APIs, including JMS, AMQP, and MQTT. Put another way, message brokers enable applications and services to exchange data effectively even if they are written in different languages or run on different platforms, because the broker provides a standardized flow of data.

If I say Kafka is an event streaming platform, what does that mean? In a nutshell, event stream processing is a programming technique that analyzes continuous data. And what is an event? An event is any change in state tracked by a business system. It can be anything from a transaction to a user navigating a website. Naturally, an event stream is the ordered sequence of these business events. Event stream processing manages and stores many related events together, not just one event at a time. Unlike message brokers, which often delete data after it is received, event stream data is processed and stored, allowing new consumers to replay events.
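
A short sketch may help show why replay is useful. Assuming each event is a small dictionary describing one change in state (the account names and amounts are invented for the example), a consumer that joins late can rebuild the current state just by reading the stored stream from the beginning.

```python
# Each event records one change in state; the values are made up for the example.
event_stream = [
    {"type": "deposit", "account": "alice", "amount": 100},
    {"type": "withdrawal", "account": "alice", "amount": 30},
    {"type": "deposit", "account": "bob", "amount": 50},
]


def replay(events, from_offset=0):
    """Rebuild account balances by applying events in order from an offset."""
    balances = {}
    for event in events[from_offset:]:
        sign = 1 if event["type"] == "deposit" else -1
        account = event["account"]
        balances[account] = balances.get(account, 0) + sign * event["amount"]
    return balances


# A consumer that joins late can still start from the first event.
print(replay(event_stream))  # {'alice': 70, 'bob': 50}
```

With a broker that deletes messages once they are received, the late consumer would have nothing to read and could not reconstruct these balances.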

One of the biggest differences between Kafka and ActiveMQ is how they handle messages. Kafka not only transfers messages but can also store them permanently for multiple applications. While permanent storage is possible, the retention time for a given topic can be set to whatever the use case dictates, even down to the millisecond. To avoid retaining data unnecessarily, the prevailing best practice is to set the retention time as short as the use case allows. Kafka can either preserve or ignore message order. This depends on whether a partition key is specified and which partitioning method is used. In some cases the order of messages is not maintained, which can be the preferable configuration depending on the use case.

ActiveMQ, by contrast, is a push-type platform in which providers push messages to consumers. Unlike Kafka, ActiveMQ can filter messages so that consumers receive only the ones they are interested in. To guarantee that messages are received, it is the responsibility of the producer to ensure the message is delivered.

It is also important to note that ActiveMQ cannot ensure that messages are received in the same order they were sent. In the event of a failure, a message can be duplicated, but it will always be received. ActiveMQ is not designed for long-term data storage: once consumed, a message is retained temporarily in virtual memory and then deleted. ActiveMQ can also be used to implement once-only message delivery easily.

Kafka is a distributed system, which allows it to process massive amounts of data. It does not slow down when new consumers are added, and thanks to the replication of partitions it scales easily and offers higher availability. But speed is not the only thing to consider. Kafka producers can be configured not to wait for a delivery acknowledgement from brokers (acks=0). Brokers can then accept messages at a high rate, which gives higher throughput, but data can potentially be lost. ActiveMQ is known for speed when delivering small amounts of data to numerous consumers, and it is often picked for systems that require lower message throughput.
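
The trade-off between throughput and safety can be sketched with a toy broker that fails on purpose. Everything here is hypothetical (the `FlakyBroker` class, the every-third-write failure, the retry limit); in a real Kafka producer the corresponding knob is the `acks` setting, which accepts `0`, `1`, or `all`.

```python
class FlakyBroker:
    """Toy broker that drops every third write to simulate a failure."""

    def __init__(self):
        self.stored = []
        self._attempts = 0

    def write(self, message):
        """Store the message and return True, or drop it and return False."""
        self._attempts += 1
        if self._attempts % 3 == 0:
            return False
        self.stored.append(message)
        return True


def send_without_ack(broker, messages):
    """Fire and forget: the result of each write is ignored."""
    for message in messages:
        broker.write(message)


def send_with_ack(broker, messages, max_retries=3):
    """Wait for each acknowledgement and retry when it does not arrive."""
    for message in messages:
        for _ in range(1 + max_retries):
            if broker.write(message):
                break


messages = [f"m{i}" for i in range(6)]

fast = FlakyBroker()
send_without_ack(fast, messages)
print(fast.stored)  # ['m0', 'm1', 'm3', 'm4']: m2 and m5 were lost silently

safe = FlakyBroker()
send_with_ack(safe, messages)
print(safe.stored)  # ['m0', 'm1', 'm2', 'm3', 'm4', 'm5']
```

The fire-and-forget producer finishes in six writes but loses two messages; the acknowledging producer needs eight writes and loses none.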

Conclusion

I think we now understand the differences between Kafka and RabbitMQ, and between Kafka and ActiveMQ. To sum up, we can compare the three along these aspects:

Scalability: Kafka is designed for high throughput, low latency, and high scalability. It uses a publish-subscribe model and is built on a distributed architecture, which lets it handle large amounts of data and high levels of concurrency. RabbitMQ excels in single-broker deployments and is typically used for simpler scenarios. ActiveMQ, on the other hand, is designed for more traditional messaging scenarios and may not be as well suited to extremely high scalability.

Durability: Kafka's messages are written to disk and replicated across multiple nodes, providing a high level of durability in case of node failures. Messages in RabbitMQ are acknowledgement-based, meaning they are deleted as soon as a consumer has successfully acknowledged them. For multiple consumers to get the same message, multiple queues have to be created. ActiveMQ also provides a high level of durability, but it may not be as robust as Kafka in certain scenarios.
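
The point about needing one queue per consumer can be illustrated with a toy exchange that copies each message into every bound queue. This is a simplified model under my own naming (`FanoutExchange`, the queue names `billing` and `email`), not RabbitMQ's API.

```python
from collections import deque


class FanoutExchange:
    """Toy exchange that copies each message into every bound queue."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        self.queues[queue_name] = deque()

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)

    def consume(self, queue_name):
        # Consuming removes the message from this queue only.
        return self.queues[queue_name].popleft()


exchange = FanoutExchange()
exchange.bind("billing")
exchange.bind("email")
exchange.publish("order-1 created")

print(exchange.consume("billing"))  # order-1 created
print(exchange.consume("email"))    # order-1 created
print(len(exchange.queues["billing"]), len(exchange.queues["email"]))  # 0 0
```

Each consumer gets its own copy because each has its own queue; once a copy is consumed it is gone, which is the opposite of a shared log that several consumers read at their own pace.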

Performance: Kafka is designed for very high performance and can handle millions of messages per second. RabbitMQ, I have to say, is not designed for that level of throughput. It follows a smart-broker, dumb-consumer model in which all the routing and delivery decisions are made in the broker, so it is not hard to understand why RabbitMQ does not reach the same performance. ActiveMQ also provides good performance, but it may not be as fast as Kafka in certain scenarios.

Latency: Kafka has lower latency than ActiveMQ because it uses a zero-copy design and memory-mapped files. RabbitMQ has low latency for small amounts of data, but for large amounts of data Kafka is better. ActiveMQ's latency is higher because it requires messages to be copied between different layers of the system.

Use case: Kafka is often used in big data and streaming scenarios, such as real-time data pipelines and event-driven architectures. RabbitMQ's message queuing capabilities make it an excellent choice for building decoupled, asynchronous systems, because it enables loose coupling between components and helps with fault tolerance and scalability. Applications that require reliable message delivery, like order processing systems, email notifications, and task scheduling systems, can use RabbitMQ to ensure message persistence and guaranteed delivery. Real-time chat systems, where users communicate with each other as messages are sent, are another common example. ActiveMQ is more commonly used in traditional messaging scenarios, such as enterprise application integration and message-oriented middleware.

Partitioning: Kafka supports partitioning a topic's messages across multiple servers, which allows it to scale horizontally. Neither RabbitMQ nor ActiveMQ has comparable built-in partitioning in its core queue model.

I hope this post helps you learn something about these messaging systems. See you!

From: https://www.cnblogs.com/boanxin/p/17810457.html
