Hey everyone, and thanks for joining us for the April issue of The Kafka Report. This month, we’re focusing on Debezium and Confluent Ansible updates, which you might find useful if you’re looking for ways to integrate Kafka into your day-to-day operations, plus some examples of Confluent and Kafka success. Don’t forget: there’s still time to join us on May 16th and 17th for Kafka Summit London 2023!
The latest Kafka updates: Debezium and Confluent Ansible
Debezium
- Debezium 2.2 has now been released, and it includes some cool new features. The team usually ships a new release every three months or so, but this time they deliberately held off until everything they wanted made it into the version: new schema history storage, sink adaptors, Quarkus 3 support, and a parallel snapshots feature. As always, please review the breaking changes before upgrading.
- For the next version, Debezium is focusing on adding a Kubernetes operator to Debezium Server, making metrics more visible, and overall improving the platform’s user experience. The code is open source (and on GitHub), so if you’re interested in contributing, head over and check out what they’re working on.
Confluent Ansible
- Also available on GitHub is the latest Confluent CP-Ansible version, 7.3.4. It adds a timeout while you’re deploying a Confluent connector, and fixes a few long-standing problems with the SSL rule logic and Confluent Platform deployments. (You’ll also see a few new optional variables for the Kerberos kdc_port and admin_port.)
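To illustrate where those new Kerberos options might slot in, here is a hypothetical inventory fragment. The exact variable names and structure below are assumptions based on the release notes, so check the cp-ansible documentation for your version before relying on them.

```yaml
# Hypothetical hosts.yml fragment (names are assumptions, not verified)
all:
  vars:
    kerberos:
      realm: EXAMPLE.COM
      kdc_hostname: kdc.example.com
      admin_hostname: kdc.example.com
      kdc_port: 88      # new optional override; 88 is the standard KDC port
      admin_port: 749   # new optional override; 749 is the standard kadmin port
```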
Twitter and Kafka
- You’ve read Kafka success stories before in this newsletter. But what you may not have known is that one of the most well-known social media companies of the decade—one that let journalists, politicians, celebrities, entrepreneurs, and regular people share their thoughts with the world—has used Apache Kafka for going on five years. Twitter.
- By adopting Kafka, Twitter cut resource costs (68–75% resource savings in two of their tests), and its developers liked the level of engagement in the Kafka community. This is typical: Kafka community members fix bugs, discuss updates, and make solutions easy to find, unlike platforms with only a handful of loyal contributors.
- That being said, switching over from EventBus (Twitter’s original system) to Kafka wasn’t a piece of cake. Whenever a huge organisation switches from one system to another, they usually run into a few configuration and debugging issues. This is pretty normal, and it’s just part of architectural change. Don’t let legacy systems stop you from implementing Kafka. There are ways to make the transition more seamless—just contact us if you’d like to learn more.
Confluent and Storyblocks
- For those of you who don’t want to manage a Kafka service yourselves, you might relate to Storyblocks, a media service for stock audio, video, and graphics. They had many of the traits of a good Kafka prospect, including tons of transaction events and a need to understand what their customers wanted in real time, but they didn’t want to get into the details of managing a data architecture.
- Storyblocks’ Director of Engineering liked that Confluent was cloud-native (which helps companies scale up and scale down, only paying for what they use and need), and like Twitter, Storyblocks’ team found that the Kafka community was really eager to support and help them work through roadblocks. They’re now focusing on pushing event-driven streaming even further and ideally shifting from batch data to live streaming, just like we recently did for Dufry, a travel retail business.
Understand How to Address Magic Byte Errors
- Recognise this message? “Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!” Want to fix it? Essentially, a magic byte error means the first byte of a record doesn’t match what the deserialiser expects. This usually happens when a consumer configured for schema-registry-encoded data receives messages that were produced without the matching serialiser, so you’ll need to go back and check your configuration. There’s now a super easy-to-follow, step-by-step guide: check out this tutorial.
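To make the error concrete, here is a minimal Python sketch of the check that produces it. Confluent’s schema-registry wire format prefixes every payload with a magic byte of 0 and a 4-byte schema ID; a plain JSON message produced without that framing fails the very first check. The function name here is our own, not part of any Kafka client library.

```python
import struct

MAGIC_BYTE = 0  # Confluent wire format: 1 magic byte + 4-byte schema ID + data


def parse_wire_format(payload: bytes):
    """Split a schema-registry-framed payload into (schema_id, data).

    Raises ValueError when the leading magic byte is missing, which is
    exactly the situation behind "Unknown magic byte!" errors.
    """
    if len(payload) < 5 or payload[0] != MAGIC_BYTE:
        raise ValueError(f"Unknown magic byte! first byte = {payload[0]:#x}")
    schema_id = struct.unpack(">I", payload[1:5])[0]
    return schema_id, payload[5:]


# A correctly framed payload: magic byte 0, schema ID 42, then the encoded data
good = b"\x00" + struct.pack(">I", 42) + b"\x02hi"
print(parse_wire_format(good))  # (42, b'\x02hi')

# Plain JSON produced without the schema-registry serialiser fails the check
bad = b'{"user": "alice"}'
```

Calling `parse_wire_format(bad)` raises the ValueError, which is the moral equivalent of the SerializationException: the consumer side expected framed data and got something else.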
Set up a Simple Kafka Stack on AWS
- Jan, one of our developers, recently wrote a blog post that walks you through the basics of setting up a single-node Apache Kafka cluster on Kubernetes. What’s nice about this one is that (after a few rounds of practice) the setup should only take about ten minutes, and Jan breaks down the steps one by one so that you can see each part of the process.
- You’ll create an EC2 instance, copy the Kubeconfig to your local machine, install Confluent for Kubernetes, create the Confluent Platform CRDs, and after waiting about sixty seconds for the components to come up, access the Confluent Control Center. Then you’ve done it! Jan includes some code samples and screenshots along the way, which is helpful if you’re testing the steps as you follow along.
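The later steps can be sketched roughly as the commands below. This is a hedged outline under stated assumptions, not Jan’s exact commands: the instance address, manifest filename, and namespace are placeholders, so follow the blog post itself for the real values.

```shell
# 1. Copy the kubeconfig from the EC2 instance to your local machine
#    (<ec2-public-ip> is a placeholder)
scp ec2-user@<ec2-public-ip>:~/.kube/config ~/.kube/config

# 2. Install Confluent for Kubernetes from Confluent's Helm repository
helm repo add confluentinc https://packages.confluent.io/helm
helm repo update
helm install confluent-operator confluentinc/confluent-for-kubernetes \
  --namespace confluent --create-namespace

# 3. Apply the Confluent Platform custom resources
#    (manifest filename is a placeholder)
kubectl apply -f confluent-platform-singlenode.yaml --namespace confluent

# 4. After the components come up (roughly sixty seconds), reach Control Center
kubectl port-forward controlcenter-0 9021:9021 --namespace confluent
# then open http://localhost:9021 in a browser
```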
Kafka Summit London
- Join OSO at stand U4. We’re currently making the final preparations for this event, and Rich and I would love to see you there.
Current 2023 | The Next Generation of Kafka Summit
- Registration for Current 2023 in San Jose, California is now live. The event doesn’t take place until September 26th and 27th of this year, but the early bird promo ends May 5th. You can grab an in-person pass, with access to all the keynotes and sessions, for US$699 on the Confluent website.
Kafka for your organisation
If you’re liking this newsletter content, email me or reach out on LinkedIn to learn more about how Kafka might apply to your context, or how you can start small with a Kafka proof of concept. We’ll be releasing a new case study in May that shows how Dufry trialled Confluent Kafka with great results.
Until then, that’s all for April! Wishing you warm weather and a productive month. ☀️