Building Kafka Streams state stores? We have built our fair share of Kafka Streams applications, and there are a number of challenges and opportunities in building your own state store. Nitrite Database is a feasible alternative to the internal state stores: it integrates easily with existing systems and is straightforward for developers to understand. Let's dig into some actionable design decisions!
Using Nitrite Database as a Kafka Streams State Store
Managing Kafka Streams state stores on a distributed platform like Kubernetes is complex, yet many companies demand this level of high availability across multiple running pods. To achieve it, there has been a paradigm shift towards moving data into embedded databases instead of using the internal state stores. When restarting or deploying a new pod, querying the standard KeyValueStore by anything other than the key results in a full scan of the state store, filtering out the records that do not match the search criteria. For small state stores this is not an issue; for larger ones it becomes quite a performance bottleneck.
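To make the bottleneck concrete, here is a minimal sketch of the full-scan pattern described above. It uses a plain `TreeMap` as a stand-in for a Kafka Streams `KeyValueStore` (the record type and field names are hypothetical, not from any real project): querying by a non-key field forces every entry to be visited and filtered in application code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FullScanDemo {
    // Hypothetical value type; a real store would hold serialized values.
    record Order(String orderId, String customerId, double total) {}

    public static void main(String[] args) {
        // TreeMap stands in for a KeyValueStore keyed by orderId.
        TreeMap<String, Order> store = new TreeMap<>();
        store.put("o-1", new Order("o-1", "cust-42", 10.0));
        store.put("o-2", new Order("o-2", "cust-7", 25.0));
        store.put("o-3", new Order("o-3", "cust-42", 5.5));

        // Querying by a non-key field (customerId) means visiting and
        // filtering every entry: O(n) work per query.
        List<Order> matches = new ArrayList<>();
        for (Map.Entry<String, Order> entry : store.entrySet()) {
            if (entry.getValue().customerId().equals("cust-42")) {
                matches.add(entry.getValue());
            }
        }
        System.out.println(matches.size());
    }
}
```

With millions of records and a restart storm across pods, this per-query scan is exactly the cost an indexed embedded database avoids.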
The benefit of using something like H2, Lucene or NitriteDB backed by PVCs (Persistent Volume Claims) is the ability to build state stores on existing, well-understood technologies. By choosing a document store as the base technology, we can push querying down to the underlying database instead of having to handle it ourselves. You still have the flexibility to easily reset and restore from the changelog topic if any issues arise. A shared state store also makes it possible to run multiple instances of the same Kafka Streams application, something you cannot do when everything lives in a single pod. However, do not make the mistake of letting other applications write directly to the data store, as that makes it difficult to maintain and guarantee data integrity.
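As an illustration of pushing the query down, here is a minimal sketch using Nitrite's collection API (the collection name, field names, and in-memory setup are hypothetical, and this is not taken from our example project). The field we query on is indexed, so lookups avoid the full scan shown earlier; for a persistent store you would point `filePath(...)` at a PVC-backed mount.

```java
import org.dizitart.no2.Cursor;
import org.dizitart.no2.Document;
import org.dizitart.no2.IndexOptions;
import org.dizitart.no2.IndexType;
import org.dizitart.no2.Nitrite;
import org.dizitart.no2.NitriteCollection;
import org.dizitart.no2.filters.Filters;

public class NitriteStoreSketch {
    public static void main(String[] args) {
        // In-memory instance for the sketch; in a pod, add
        // .filePath("/data/statestore.db") on a PVC-backed volume.
        Nitrite db = Nitrite.builder().openOrCreate();

        NitriteCollection orders = db.getCollection("orders");
        // Index the field we query on so lookups use the index
        // instead of scanning the whole collection.
        orders.createIndex("customerId",
                IndexOptions.indexOptions(IndexType.NonUnique));

        orders.insert(Document.createDocument("orderId", "o-1")
                .put("customerId", "cust-42"));
        orders.insert(Document.createDocument("orderId", "o-2")
                .put("customerId", "cust-7"));

        // The filter is evaluated by Nitrite, not in application code.
        Cursor matches = orders.find(Filters.eq("customerId", "cust-42"));
        System.out.println(matches.size());

        db.close();
    }
}
```

In a real deployment, the Kafka Streams processor would write into the collection from `process()` and rebuild it from the changelog topic on restore.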
The important factor here is to offload this processing to a robust technology, leveraging existing infrastructure patterns that can handle a large volume of transactions per second.
We have open sourced an example project for anyone wishing to adopt this approach; please reach out to us for more information.
For more content:
How to take your Kafka projects to the next level with a Confluent preferred partner
Event driven Architecture: A Simple Guide
Watch Our Kafka Summit Talk: Offering Kafka as a Service in Your Organisation
Successfully Reduce AWS Costs: 4 Powerful Ways
Protecting Kafka Cluster
Apache Kafka Common Mistakes
Kafka Cruise Control 101
Kafka performance best practices for monitoring and alerting