Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache project. The latest version, 2.7, supports transactions, an Azure Blob storage offloader, topic-level policies, and more. It enables event streaming applications to consume, process, and produce messages in one atomic operation, and it allows Pulsar users to offload their historical data to Azure Cloud.
Main features in the new release include:
Transactional semantics enable event streaming applications to consume, process, and produce messages in one atomic operation. With transactions, Pulsar achieves exactly-once semantics both for a single partition and across multiple partitions. This enables new use cases in which a client (acting as a producer or a consumer) can work with messages across multiple topics and partitions and ensure that those messages are all processed as a single unit. This strengthens the message delivery semantics of Apache Pulsar and the processing guarantees of Pulsar Functions.
Currently, Pulsar transactions are in developer preview. The community will continue to enhance the feature so that it can be used in production environments soon.
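To make the idea of "consume, process, and produce as a single unit" concrete, here is a minimal in-memory sketch. It is not Pulsar's client API: ToyTopic, ToyTransaction, and every method name below are hypothetical and exist only for illustration. The sketch assumes the behavior described above, namely that produced messages become visible only on commit and that an abort leaves the consumed messages available for reprocessing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Small demo: an aborted transaction leaves no trace, a committed one applies fully.
class ToyTransactionDemo {
    public static void main(String[] args) {
        ToyTopic orders = new ToyTopic();
        ToyTopic receipts = new ToyTopic();
        orders.messages.add("order-1");

        ToyTransaction failed = new ToyTransaction(orders, receipts);
        failed.produce("receipt for " + failed.consume());
        failed.abort();
        System.out.println(orders.messages + " " + receipts.messages); // [order-1] []

        ToyTransaction ok = new ToyTransaction(orders, receipts);
        ok.produce("receipt for " + ok.consume());
        ok.commit();
        System.out.println(orders.messages + " " + receipts.messages); // [] [receipt for order-1]
    }
}

// Hypothetical in-memory topic used only for illustration; not a Pulsar class.
class ToyTopic {
    final Deque<String> messages = new ArrayDeque<>();
}

// Hypothetical transaction spanning one input topic and one output topic.
class ToyTransaction {
    private final ToyTopic input;
    private final ToyTopic output;
    private final List<String> consumed = new ArrayList<>();
    private final List<String> produced = new ArrayList<>();
    private boolean open = true;

    ToyTransaction(ToyTopic input, ToyTopic output) {
        this.input = input;
        this.output = output;
    }

    // Take the next input message and remember it so that an abort can return it.
    String consume() {
        requireOpen();
        String message = input.messages.pollFirst();
        if (message != null) {
            consumed.add(message);
        }
        return message;
    }

    // Buffer an output message; it stays invisible on the output topic until commit.
    void produce(String message) {
        requireOpen();
        produced.add(message);
    }

    // Publish every buffered output message; the consumed messages stay removed.
    void commit() {
        requireOpen();
        output.messages.addAll(produced);
        open = false;
    }

    // Drop the buffered output and put consumed messages back in their original order.
    void abort() {
        requireOpen();
        for (int i = consumed.size() - 1; i >= 0; i--) {
            input.messages.addFirst(consumed.get(i));
        }
        open = false;
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("transaction already finished");
        }
    }
}
```

In this toy model, either both effects happen (the input message is removed and the output message is published) or neither does. A real broker has to provide the same guarantee across partitions and machines, which is considerably harder than this single-process sketch suggests.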
Pulsar 2.7.0 adds an Azure Blob storage offloader. With this offloader, users can offload their historical data to Azure Blob Storage. This greatly benefits Azure Cloud users and effectively reduces the cost of managing massive amounts of historical data in BookKeeper. Pulsar will add more support for Azure Cloud in upcoming releases.
Pulsar 2.7.0 introduces the system topic, which maintains all policy change events to enable topic-level policies. All policies available at the namespace level are now also available at the topic level, so users can set different policies for individual topics without consuming large amounts of metadata service resources. Topic-level policies let users manage topics more flexibly and add no burden to ZooKeeper.
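The relationship between namespace-level and topic-level policies can be pictured with a small lookup sketch. It assumes that a policy set on a topic takes precedence over the one set on its namespace, which in turn takes precedence over a broker-wide default. The class ToyRetentionPolicies and its methods are hypothetical and are not part of Pulsar's admin API; retention time in minutes is used only as an example policy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical policy store for illustration only; not a Pulsar class.
class ToyRetentionPolicies {
    private final int brokerDefaultMinutes;
    private final Map<String, Integer> namespaceLevel = new HashMap<>();
    private final Map<String, Integer> topicLevel = new HashMap<>();

    ToyRetentionPolicies(int brokerDefaultMinutes) {
        this.brokerDefaultMinutes = brokerDefaultMinutes;
    }

    void setForNamespace(String namespace, int minutes) {
        namespaceLevel.put(namespace, minutes);
    }

    void setForTopic(String topic, int minutes) {
        topicLevel.put(topic, minutes);
    }

    // Removing the topic-level value makes the topic inherit from its namespace again.
    void removeForTopic(String topic) {
        topicLevel.remove(topic);
    }

    // Assumed precedence: topic level, then namespace level, then broker default.
    int effectiveMinutes(String namespace, String topic) {
        Integer topicValue = topicLevel.get(topic);
        if (topicValue != null) {
            return topicValue;
        }
        return namespaceLevel.getOrDefault(namespace, brokerDefaultMinutes);
    }
}
```

With a namespace value of 60 minutes and a value of 1440 minutes set on a single topic, that topic resolves to 1440 while every other topic in the namespace still resolves to 60.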
CSDN spoke with Penghui Li, an Apache Pulsar PMC member, about the Pulsar benchmark report they published recently.
Question: You recently wrote Benchmarking Pulsar and Kafka. Why did you want to conduct the benchmark?
Penghui Li: This year, Confluent ran a benchmark to evaluate how Kafka, Pulsar, and RabbitMQ compare in terms of throughput and latency. According to Confluent, Kafka was the "fastest" in all scenarios. Given our knowledge of Pulsar's capabilities, this did not seem accurate.
As for the community, we have met many users who want official benchmark results for reference, including performance comparisons with other messaging systems. We saw this as an opportunity to provide them, so we set out to repeat the benchmark.
Taking a deeper look at Confluent's benchmark, we noticed a number of issues with the setup, framework, and methodology. We identified and fixed these issues and also added test parameters that provide insight into more real-world use cases. You can read the full benchmark.
In the test results, Pulsar performs better than Kafka in many aspects of latency, but we do not think this covers all user scenarios. Different physical resource environments may produce completely different results. We also recommend that users gain a better understanding of Pulsar's design and of how to tune its performance, which will allow Pulsar to perform better in a real environment. We have published a whitepaper that introduces many aspects of Pulsar performance tuning. You can read the full whitepaper.
High performance is only one aspect of Pulsar. Pulsar also has an advanced architecture, better scalability, and easy operations and maintenance. We sincerely invite you to download Pulsar and try it out to get a better understanding of it. To download Apache Pulsar 2.7.0, click here.
For more information on the new release, check out the release notes on the Pulsar website.