Server-Sent Events Using Spring WebFlux and Reactive Kafka
Let's walk through the architecture of servers pushing data to the clients using Server-Sent-Events and Reactive Kafka. This is the talk that was presented at Kafka Summit Europe in 2021.
Today, we're going to take a look at data streaming from a reactive Kafka-based WebFlux REST server to a Webflux client in a non-blocking manner.
Below designed architecture can be used to:
- Push data to external or internal apps in near real-time.
- Push data onto the files and securely copy them to cloud services.
- Push the same data over to multiple clients from a Kafka topic.
Let's get started!
Before we execute a sample application to demonstrate Server-Sent Events (SSE) using Spring WebFlux and Reactive Kafka, let's understand the fundamental concepts:
What are Server-Sent Events?
Server-Sent Events (SSE) is a Server Push technology that allows a client to receive automatic server updates through the HTTP connection.
The SSE can be used to:
- Replace Long polling (which creates a new connection for every pull) by maintaining a single connection and keeping a continuous event stream going through it.
- Enable apps that use one-way data communication (eg: e-commerce websites, live stock price updates).
What is Spring WebFlux?
Spring WebFlux framework is a fully asynchronous and non-blocking reactive web stack that enables the handle of a massive number of concurrent connections. WebFlux supports Reactive Streams back pressure and runs on such servers as Netty. It enables us to vertically scale the services to handle the greater load on the same hardware.
What is Reactive Kafka?
Reactive Kafka is a reactive API for Kafka based on project Reactor and the Kafka producer/Consumer API. It enables data to be published and consumed from Kafka using functional API with non-blocking back pressure and low overheads which allows reactive Kafka to be integrated with other reactor systems and provide an end-end reactive pipeline.
Note: to gain a thorough grasp of Webflux and Reactive Kafka, make sure to understand the terminology.
As per the architecture shown in image-1, we will build a WebFlux Server using the Spring WebFlux framework and reactive Kafka, exposing a REST API for the clients to make secure HTTP requests.
Once a secure connection is established between the client and the web flux server, it consumes messages from Kafka topics and pushes the data asynchronously without closing connection with the client unless required.
Rather than build a reactive Kafka producer, we will leverage the existing producer example on the reactor repository. Additionally, instead of building a web flux client, we will test the server's SSE response by using a curl command on the terminal.
- Java v1.8+
- Apache Kafka + basic understandings
- Intellij or Eclipse or Sprint Tool Suite
- Kafka Conduktor tool
Let's create a Spring boot application using the following dependencies shown below.
Let's create a Kafka Receiver Configuration, which is a consumer as shown below. It is configured with generic GROUP_ID_CONFIG since we are working on handling a single client for now by enabling auto commits & always reading the earliest messages but we can also update it to the _latest.
If we enable multiple clients, each client can receive messages from the same topic based on the last committed offset.
In order to keep things simple, we will be dealing with String Deserializers which can be extended to generic JSON/AVRO schemas.
After the configurations are ready, we will create a REST controller that consumes messages from Kafka topics and sends back responses as the flux of data. Use MediaType.TEXT_EVENT_STREAM_VALUE as the content-type. This tells the client that a connection will be established and the stream is open for sending events from the server to the client.
Before we test this application, here is a sample producer we're leveraging from the reactor repository with StringSerializer:
Now, start the Kafka server in the localhost:9092 and create a topic as shown in the above configurations.
The following commands would be helpful.
$ bin/zookeeper-server-start.sh config/zookeeper.properties $ bin/kafka-server-start.sh config/server.properties $ bin/kafka-topics.sh — create — bootstrap-server localhost:9092 — replication-factor 1 — partitions 1 — topic <topic_name> $ bin/kafka-topics.sh — list — bootstrap-server localhost:9092
The image below shows the topic created with partition 1 and without any message (Count=0) being produced from Kafka Sender.
Next, let's run the Spring-boot application on localhost:8080 and make a sample request to this server using a curl command on the terminal, which keeps the connection alive.
curl — location — request GET 'localhost:8080/sse' \\ — header 'Content-Type: text/event-stream;charset=UTF-8' \\ — header 'Accept: text/event-stream;charset=UTF-8'
Once the curl command is executed on the terminal, a Kafka receiver is registered (as shown in the console above). Now, let's push some data onto the Kafka topic by running Kafka Sender and see how the data is being received in the terminal, which is acting as the client here.
As we can see in Image-9, once messages are published onto the topic, the data gets pushed onto the terminal- which is Server-Sent Events (SSE).
Image-10 shows the Conduktor tool tracking the consumer and displaying the messages being consumed by the respective client.
We can see that 40 messages were published & all messages were consumed (Lag=0, End-Current=40–40) and sent to the client, as shown in the terminal.
This project can be expanded to support multiple clients by making GROUP_ID_CONFIG unique and setting offsets to the latest for each client, this creates a new consumer group to each client by keeping the connection alive and streaming the data asynchronously. In case if any client loses the connection with the server and is able to re-establish a secure connection post-interruption, the client will rejoin the existing consumer group and receive data from the previous offset committed.
This architecture can also be used to create a batch scheduler to consume & transfer the message to the files and securely copy them onto cloud services, allowing the client to access files for further processing.
If you enjoyed this basic concept walkthrough of SSE using Spring WebFlux and Reactive Kafka, please feel free to share & follow our publication!
Refer code here.
You might also like
Generative AI vs Traditional Machine Learning: What Sets Them Apart?
The future of AI is filled with endless possibilities. Understanding the differences in AI is crucial for businesses to make informed decisions about which approach best suits their needs.Read article
5 Key Ways Data is Transforming Healthcare and Life Sciences
Healthcare accounts for 30% of global data, growing at 36% annually. Don't let your data drown—with structured storage and BI tools. Discover our 5 real-life use cases.Read article
10 Reasons Why Businesses Consider Google Cloud Platform (GCP)
Unleash the power of Google Cloud Platform (GCP) for your business. From scalability and big data processing to cost reduction and application development, find out why GCP is a top choice.Read article
Navigating HIPAA Compliance: A Checklist for Healthcare Organizations
Ease your HIPAA compliance journey with our curated checklist— your trusty roadmap to navigate the complexities of data protection.Read article
Deep Learning & Computer Vision: A Hybrid Future
AI's potential has drawn vast interest, with many rushing to harness its prowess. Deep Learning & Computer Vision are now hot topics, yet their complex inner workings are often overlooked.Read article
How to Reduce AWS Costs: Strategies and Best Practices
Discover how to reduce your AWS bill without sacrificing functionality with our tried-and-test tips and expert guidance.Read article
How AI and Personalized Marketing are Transforming Retail Sales
How AI/ML, CDP, personalization, and BI are revolutionizing retail, fashion, and beauty. Dive into brand examples from Sephora, ThredUp, and H&M.Read article
19 Cloud Computing Statistics You Need to Know in 2023
By 2025, over 100 zettabytes of data will be stored in the cloud—50% of all global data storage.Read article
5 Ways to Transform Grocery Retail with an AI-Driven Data Strategy
Explore 5 AI-driven data strategies for grocery retail. Learn how to solve challenges like workforce management, pricing, and disconnected CX.Read article