Alpakka vs Project Reactor vs Kafka API for Kafka: The hardest thing in life is to know which bridge to cross and which to burn

The saying in the title belongs to classical guitarist David Russell, and it rings true in many ways. As for Kafka, come and let’s find out together which bridge to cross.

There are many ways to process your entities nowadays. Apache Kafka is one of the most popular message buses in the open source world. Popular, high-quality work attracts external contributions and improvements. Moreover, if you are on an open source platform, other providers may build a lot on top of your work, as we see with Apache Kafka.

Today we will compare the performance of two reactive streams libraries for Kafka (Project Reactor and Alpakka) with that of the vanilla Kafka API.

We will also touch on these topics:

  • Reactive Programming
  • A Microservice pattern (Transactional Outbox Pattern)

First, a short definition of what Reactive Programming is.

In computing, reactive programming is a declarative programming paradigm concerned with data streams and the propagation of change.

From Wikipedia

So what can we briefly deduce from this definition? We have a declarative programming paradigm that is built from pipelines and keeps track of the data changes flowing through them. There are several algorithms for tracking how those changes propagate. Let’s get back to our topic now.
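To make the definition more concrete, here is a minimal sketch using the `java.util.concurrent.Flow` interfaces that ship with Java 9 and later. It is not taken from the benchmark code: `ReactiveSketch`, `RangePublisher` and `squares` are hypothetical names, and the publisher is simplified (single-threaded, not fully compliant with the Reactive Streams rules). It only shows the core idea: data flows downstream, while demand (backpressure) flows upstream through `request(n)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class ReactiveSketch {

    /** Emits 1..count, but never more elements than the subscriber has requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;

        RangePublisher(int count) { this.count = count; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 1;
                private long demand = 0;
                private boolean emitting = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done || n <= 0) return;
                    demand += n;
                    if (emitting) return;      // re-entrant call from onNext: the loop below picks it up
                    emitting = true;
                    while (demand > 0 && next <= count && !done) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    emitting = false;
                    if (next > count && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Subscribes to the range, pulls one element at a time and collects the squares. */
    static List<Integer> squares(int count) {
        List<Integer> out = new ArrayList<>();
        new RangePublisher(count).subscribe(new Flow.Subscriber<Integer>() {
            private Flow.Subscription subscription;

            @Override public void onSubscribe(Flow.Subscription s) { subscription = s; s.request(1); }
            @Override public void onNext(Integer item) { out.add(item * item); subscription.request(1); }
            @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
            @Override public void onComplete() { /* nothing left to do */ }
        });
        return out;
    }
}
```

Because the subscriber asks for one element at a time, a slow subscriber would simply slow the publisher down instead of being flooded. Libraries such as Project Reactor and Akka Streams build their operators on this same contract.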

Before we dive into the benchmarking process, I want to describe the scenario I used to get the results. I used the Transactional Outbox Pattern, which is widely used in microservice architectures. In a nutshell, there are multiple services that need to communicate with each other, and a message broker serves as a third-party information channel between them. We need to keep track of each published message asynchronously to provide eventual consistency between the broker and the rest of the system.

Preparation For Benchmark

Yeah! Finally we can dive deep. Ready for that extraordinary journey?

The tests were run on a Dell laptop with an Intel® Core™ i7-6820HQ processor.

The version for Kafka API:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>2.6.0</version>
</dependency>

The version for Project Reactor (Reactor Kafka):

<dependency>
    <groupId>io.projectreactor.kafka</groupId>
    <artifactId>reactor-kafka</artifactId>
    <version>1.2.2.RELEASE</version>
</dependency>

The version for Akka Streams (Alpakka Kafka):

<dependency>
    <groupId>com.typesafe.akka</groupId>
    <artifactId>akka-stream-kafka_2.13</artifactId>
    <version>2.0.5</version>
</dependency>

There is a scheduled job that runs every 3 seconds and checks the producer table in the database for messages that have not yet been sent to Kafka. If there are any, it sends them to Kafka and updates their state in the database atomically.

We need a message entity that holds the content of the message and a unique message identifier, which is what lets us provide the eventual consistency mentioned above.
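The job and the entity described above can be sketched roughly as follows. This is an in-memory illustration, not the benchmark code: `OutboxSketch`, `OutboxMessage`, `pollOnce` and the field names are hypothetical, a list stands in for the producer table, and a synchronized method stands in for the database transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class OutboxSketch {

    /** Hypothetical stand-in for one row of the producer table. */
    static final class OutboxMessage {
        final String messageId;   // unique id, later used by consumers to detect duplicates
        final String content;
        boolean sent;             // false until the job has handed the message to the broker

        OutboxMessage(String messageId, String content) {
            this.messageId = messageId;
            this.content = content;
        }
    }

    private final List<OutboxMessage> table = new ArrayList<>();

    synchronized void save(OutboxMessage message) { table.add(message); }

    /** One pass of the job: send every unsent row, then flag it as sent. Returns the rows sent. */
    synchronized int pollOnce(Consumer<OutboxMessage> sender) {
        int count = 0;
        for (OutboxMessage m : table) {
            if (!m.sent) {
                sender.accept(m);   // if this throws, the row stays unsent
                m.sent = true;
                count++;
            }
        }
        return count;
    }

    /** Runs pollOnce every 3 seconds, the interval used in this article. */
    ScheduledExecutorService start(Consumer<OutboxMessage> sender) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pollOnce(sender);
            } catch (RuntimeException e) {
                // swallow so the schedule stays alive; unsent rows are retried on the next pass
            }
        }, 0, 3, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A caller would pass the actual Kafka send as the `sender` argument and call `shutdown()` on the returned scheduler when the service stops.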

Vanilla Kafka API

Let’s start with the non-reactive style, the first of its kind, breaker of the chains: the vanilla Kafka API. First we can implement our Producer. The configurations we need to know are:
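As an illustration, a producer configuration along these lines might look like the sketch below. Only `acks=all` is taken from this article; the broker address and the String serializers are assumptions, and `ProducerConfigSketch` is a hypothetical name. Plain string keys are used so the snippet compiles without the Kafka client on the classpath.

```java
import java.util.Properties;

final class ProducerConfigSketch {

    static Properties producerProperties() {
        Properties props = new Properties();
        // Assumption: a single local broker; replace with your own bootstrap servers.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Assumption: keys and values are plain strings.
        props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // From the article: wait for all in-sync replicas before a send counts as acknowledged.
        props.setProperty("acks", "all");
        return props;
    }
}
```

With kafka-clients on the classpath, these properties can be passed to `new KafkaProducer<>(props)`.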

For clarity, here is my Producer entity:

We will create a ProducerRecord from the persisted Producer entity and send it to Kafka (how you poll persistent storage for the messages to send is up to your implementation).

We face a crossroads at this point. It is like a riddle.

It changes according to our scenario,

We want to send messages to Kafka that way

So what is it?

The answer is: ordering. Yes, the order. We want to send messages to Kafka either in order or, using multiple threads, without any guaranteed order.

Let’s first send the messages to Kafka in order. The result is:
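The two options can be sketched as follows. This is a simplified illustration rather than the benchmark code: `SendOrderSketch` and its methods are hypothetical, and the `send` callback stands in for the real Kafka send. With the real `KafkaProducer`, `send()` returns a `Future`, and waiting on it before sending the next record is one way to get the sequential behaviour.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class SendOrderSketch {

    /** Ordered: a single thread hands over each message only after the previous one. */
    static void sendOrdered(List<String> messages, Consumer<String> send) {
        for (String message : messages) {
            send.accept(message);
        }
    }

    /** Unordered: a pool of threads sends concurrently, so arrival order is not guaranteed. */
    static void sendUnordered(List<String> messages, Consumer<String> send, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (String message : messages) {
                pool.submit(() -> send.accept(message));
            }
            pool.shutdown();                              // stop accepting new work
            pool.awaitTermination(1, TimeUnit.MINUTES);   // wait for the in-flight sends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();           // preserve the interrupt for the caller
        } finally {
            pool.shutdownNow();                           // no-op if the pool already terminated
        }
    }
}
```

The ordered variant trades throughput for a predictable sequence; the unordered variant does the opposite, which is why the choice depends on the scenario.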

The producer table after the messages were sent to Kafka

The screenshot above tells us that the group of 500 messages was sent sequentially to Kafka in 10876 ms. The processed_time column shows the time when the entry was first persisted to the database.

After sending the messages to Kafka, we can compare the produce times for 100, 200, 300, … 1000 messages (in ordered fashion):

We can see that the time increases fairly steadily with the number of messages published to Kafka.

Let’s look at the consumer side. The configurations for the consumers are:

Both the producer and consumer properties are close to what you would use in a production environment. For example, acks=all makes the leader wait until all in-sync replicas have received the message before acknowledging it to the producer; and enable.auto.commit is false, because we need to roll back our transactions in case of failure and handle offset commits in our own way.
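A consumer configuration in this spirit might look like the sketch below. Only `enable.auto.commit=false` comes from this article; the broker address, the group id, the String deserializers and the `auto.offset.reset` value are assumptions, and `ConsumerConfigSketch` is a hypothetical name.

```java
import java.util.Properties;

final class ConsumerConfigSketch {

    static Properties consumerProperties() {
        Properties props = new Properties();
        // Assumptions: local broker and an illustrative group id.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "benchmark-consumer-group");
        // Assumption: keys and values are plain strings.
        props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // From the article: offsets are committed manually, only after processing succeeded.
        props.setProperty("enable.auto.commit", "false");
        // Assumption: start from the oldest record when the group has no committed offset.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }
}
```

With kafka-clients on the classpath, these properties can be passed to `new KafkaConsumer<>(props)`.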

The consumer entity is:

We can consume messages with a configurable number of consumers, each backed by its own thread. In other words, each consumer in the consumer group works on a separate thread (how you poll the Kafka broker for messages is again up to your implementation). As the workload, I simply calculate the 46th number in the Fibonacci series recursively. Therefore the times you will see average 20–30 seconds.
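The workload itself is not shown here, but a naive recursive Fibonacci of the kind described would look roughly like this. `FibonacciSketch` is a hypothetical name, and the indexing (`fib(0) = 0`, `fib(1) = 1`) is an assumption.

```java
final class FibonacciSketch {

    /** Naive recursion: exponential time, which is what makes it a handy CPU-bound task. */
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }
}
```

Under that indexing, `fib(46)` is 1836311903. The repeated recomputation of the same subproblems is what keeps a core busy for a noticeable time per message.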

The result after the messages were consumed:

The consumer table after messages are consumed

Averaging over the different time_takes values gives us the overall result. The times tend to vary because the CPU is shared by threads that we create and manage ourselves, and because no reactive strategy such as backpressure or operator fusion is used here.

The end-to-end result of Kafka API is:

We simply calculate the time taken between the entries in the Producer and Consumer tables. The result gives us the end-to-end consumption times. Not bad, right?

Project Reactor Kafka Streams

So far so good. If you have made it this far, I have great news for you: the rest of the sections go straight to the results, because you already have the big picture. Let’s take a deep breath and jump into the second benchmark test.

Now we can move on to a reactive way of processing messages: the Project Reactor way. It fully follows the Reactive Streams specification, which was created by a group of software engineers from Netflix, Lightbend and Pivotal to solve common reactive programming problems (such as flow control, slow producer-fast consumer, fast producer-slow consumer, unbounded queues, etc.) and to provide a standard for the JVM community.

The scenario for the Reactor way is the same as for the vanilla Kafka API: publish a message, consume the message, calculate the Fibonacci value and write the value to the database. Before looking at the results, let’s see the general configuration for the Producer and Consumer.

For Producer:

Also we can review the Producer part:

At line 4: A ProducerRecord is created to be sent to Kafka.

At line 6: When a SenderRecord is created, we pass two arguments. The second one is the pass-through argument (correlation metadata), which we can read back downstream.

At line 13: messageIdsToBeUpdated is a CopyOnWriteArrayList that holds the ids of the messages sent to Kafka. We can use it to keep message handling idempotent.

At line 14: A CountDownLatch can be used to make the caller thread wait, up to a timeout, until all messages are sent to Kafka.
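The interplay between the thread-safe list and the latch can be sketched with plain JDK classes. This is not the Reactor pipeline itself: `LatchSketch` and `sendAndWait` are hypothetical, the pool size is arbitrary, and the worker task is a placeholder for the asynchronous send.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class LatchSketch {

    /**
     * Hands each id to a worker thread and blocks the caller until every send has
     * reported back or the timeout expires. Returns the ids confirmed so far.
     */
    static List<String> sendAndWait(List<String> messageIds, long timeoutMillis) {
        List<String> messageIdsToBeUpdated = new CopyOnWriteArrayList<>();
        CountDownLatch latch = new CountDownLatch(messageIds.size());
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (String id : messageIds) {
                pool.submit(() -> {
                    // Placeholder for the real asynchronous send to Kafka.
                    messageIdsToBeUpdated.add(id);   // remember what was sent
                    latch.countDown();               // one fewer message to wait for
                });
            }
            // The caller waits here, but never longer than the timeout.
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return List.copyOf(messageIdsToBeUpdated);   // snapshot for the database update
    }
}
```

The returned ids are the ones whose sent status can then be updated in the producer table; anything missing after a timeout stays unsent and is picked up by the next run of the job.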

Publish performance of Project Reactor is:

The publish results show that sending 500 unsent messages to Kafka with Project Reactor takes 10434 ms.

For Consumer:

As you can see, we don’t batch and commit messages automatically. We also need to look at the pipeline for the consuming operations (a specified number of threads execute the startConsumer() operation; in our case 10).

The only important part of the code block above is how the pipeline is created, because the functionality inside the pipeline is the same for all 3 benchmark tests.

At line 8: A flatMap() stage is created; it processes the messages one by one and sends them downstream.

At line 9: Check whether the message has been consumed before by querying the database with the unique message id. If it has not, send it downstream (to line 13).

At line 17: Do a transactional operation and run the Fibonacci function.

At line 20: Save the consumer’s result to the database.

At line 29: If an exception occurs during message processing, a back-off mechanism retries consuming the message that threw the exception.

At line 30: All records have been processed, so we commit their offsets manually.

To confirm that the processing order within a partition is preserved: we can see here that all the messages in partition 3 are consumed in order.
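The duplicate check described for line 9 is the heart of an idempotent consumer. A minimal in-memory sketch of the idea follows; it is not the pipeline code. `IdempotentConsumerSketch` and `handle` are hypothetical, and a concurrent set stands in for the database lookup by message id.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

final class IdempotentConsumerSketch {

    // Stand-in for the consumer table; the article queries the database instead.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    /** Runs the handler only for ids not seen before. Returns true if the message was processed. */
    boolean handle(String messageId, String content, Consumer<String> handler) {
        if (!processedIds.add(messageId)) {
            return false;                       // duplicate delivery: skip it
        }
        try {
            handler.accept(content);            // e.g. compute Fibonacci and store the result
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId);     // undo the marker so a retry can process it
            throw e;
        }
    }
}
```

Since offsets are committed manually and only after processing, a crash between processing and commit leads to redelivery, and this check is what keeps the redelivered message from being processed twice.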

The end-to-end total process time is:

Let’s end the journey with Alpakka Kafka Streams.

Alpakka Kafka Streams

I mentioned the Reactive Streams specification above. Lightbend took part in that effort, and Akka is a Lightbend platform.

We have talked a lot about the benchmark scenario. Now we can go straight to how the pipeline is created and to the results.

For Producer:

At line 3 (map operator summary): A ProducerRecord is created to be sent downstream.

At line 9 (map operator summary): After the record is sent to Kafka, it is added to a thread-safe collection so that its sent status can be updated.

At line 24: The CountDownLatch is counted down so that the caller thread can continue.

The results of publishing messages to Kafka in unordered fashion are:

For Consumer:

At lines 4, 5 and 6: The back-off strategy in case the stream fails. It is an exponential back-off, meaning the time interval between retries grows exponentially.

At line 9: Logging options for the internal state of the stream.

At line 15: Check whether the message has been consumed before by querying the database with the unique message id. If it has not, send it downstream.

At line 19: Process the message that comes from Kafka and update the state of its message id in the database.

At line 30: The record’s offset is committed.
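The exponential back-off mentioned for lines 4, 5 and 6 can be illustrated with a small calculation. This shows the general idea only, not how Akka implements it: `BackoffSketch` and `delay` are hypothetical, and real implementations usually add random jitter on top.

```java
import java.time.Duration;

final class BackoffSketch {

    /** Delay before the given retry (0-based): min * 2^attempt, capped at max. */
    static Duration delay(Duration min, Duration max, int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        long minMs = min.toMillis();
        long maxMs = max.toMillis();
        // Compare against max shifted down instead of shifting min up, to avoid overflow.
        if (attempt >= 62 || minMs > (maxMs >> attempt)) {
            return max;
        }
        return Duration.ofMillis(minMs << attempt);
    }
}
```

For example, with a minimum of 1 second and a maximum of 30 seconds the delays would be 1, 2, 4, 8 and 16 seconds, and 30 seconds from the sixth retry onwards.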

To confirm that the processing order within a partition is preserved: we can see here that all the messages in partition 3 are consumed sequentially.

The end-to-end processing performance:

Conclusion

As you can see, Project Reactor and the vanilla Kafka API give close results on a small dataset. But the larger the dataset, the larger the performance gap between the Kafka API and Project Reactor. According to the benchmark results, Alpakka delivers the best sustained performance across the different dataset sizes among the streaming libraries.

Here we are. Have you decided which bridge to cross?
