blog by OSO

The Kafka Report 007: Batch to Real-Time with Kafka, and Resources

Sion Smith 31 August 2023
Newsletter 007 - The Kafka Report

☀️Welcome to this month’s issue of the Kafka Report! We’ve started gearing up for Big Data LDN, added more info on migrating from batch to real-time to our Expert Resource Hub, and launched Kafka Support, a monthly service option where you can get guaranteed Kafka support for a fraction of the typical cost. 

If you’re already up to speed on Kafka technology, this newsletter should read smoothly. For those of you new to Kafka, some sections may be a little dense. If so, we invite you to check out our intro blog, Master Apache Kafka 101!

🤖 What’s New This Month | Customer 360 

💥Customer 360 is more important than ever.

What it is: Customer 360 is a holistic view of each of your customers that lets you tailor offers to their needs. Essentially, you’re combining your internal data on an individual (such as their account information, postcode, age, and contact information) with behavioural data, such as what types of customer service they’ve requested in the past, how often they interact with your firm on social media, and how they click through your website to explore services.

Why it matters: With Customer 360, you’re able to more accurately identify when clients are in the market for a particular product. Maybe they move locations and downsize, and they have additional capital they want to invest in the market. Maybe they express significant interest in your Socially Responsible Investing package. Maybe they buy a certain type of advisory support each year on a certain date, and you’re able to make them happy by anticipating their request. 

How Kafka helps: Kafka is an event streaming platform built for event-driven architecture (EDA), which means that every time an event occurs, such as a website click or transaction, the data is published and stored in real time. Whereas other systems might get out of sync, EDA is really good at taking information from many different sources and integrating it into a single source of truth.
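To make the ‘single source of truth’ idea a little more concrete, here is a minimal sketch that folds events from several sources into one profile per customer. It is plain Python, with an in-memory list standing in for a Kafka topic; the event fields (`customer_id`, `source`, `type`, `value`) and the function name `build_profiles` are hypothetical and not part of any Kafka API.

```python
from collections import defaultdict

# Hypothetical events, as they might arrive from different systems.
# In a real deployment these would be consumed from Kafka topics.
events = [
    {"customer_id": "c1", "source": "web", "type": "page_view", "value": "/sri-package"},
    {"customer_id": "c1", "source": "crm", "type": "postcode_changed", "value": "W14 8UX"},
    {"customer_id": "c2", "source": "support", "type": "ticket_opened", "value": "billing"},
    {"customer_id": "c1", "source": "web", "type": "page_view", "value": "/pricing"},
]


def build_profiles(events):
    """Fold a sequence of events into one profile per customer."""
    profiles = defaultdict(
        lambda: {"event_count": 0, "sources": set(), "last_seen": {}}
    )
    for event in events:
        profile = profiles[event["customer_id"]]
        profile["event_count"] += 1
        profile["sources"].add(event["source"])
        # Keep only the latest value per event type, so newer events win.
        profile["last_seen"][event["type"]] = event["value"]
    return dict(profiles)


profiles = build_profiles(events)
print(profiles["c1"]["last_seen"]["page_view"])
```

Because each profile is rebuilt purely from the events, replaying the same events in the same order always produces the same view, which is the property that keeps downstream systems in sync.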

🏆 Featured Success Story | Cambridge University Press

💥Meet Cambridge University Press! With OSO’s Kafka Support, the Press added a data streaming pipeline and submission service to enhance its online learning system.

Challenge: As a publishing organisation with a global mission, Cambridge University Press needed to be able to easily update its syllabi, tests, and assessments, as well as collect and store accurate data in real time. 

Solution: To implement a solid and reliable data pipeline and submission service, Cambridge University Press selected Kafka Support, a dedicated Kafka enterprise service that offered Kafka expertise at a fraction of the cost of hiring a full-time engineer on staff. 

Results: Though the project is still in its early phases, Cambridge University Press expects that the new data pipeline and submission service will improve the learning experience, enhance assessment feedback, and allow the Press to make more strategic decisions. 

👉 Read the full customer story here.


➡️ Kafka Essentials | Moving From Batch to Real-Time

💥More than 75% of the Fortune 500 has internally adopted Kafka. Now it’s your turn.

How batch to real-time works: Instead of processing data in ‘batches’, or discrete chunks, real-time processing publishes events as soon as they take place. To migrate to real-time, you’ll first select a batch process on which to pilot your proof of concept. You’ll then use a tool like Kafka Connect to integrate your current database with Kafka, set up a data streaming pipeline, and validate the pipeline to make sure it’s working properly. 
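As a rough sketch of the Kafka Connect step, the snippet below builds the JSON payload you would send to a Kafka Connect worker’s `POST /connectors` REST endpoint to stream one table into Kafka. It assumes the Confluent JDBC source connector is installed on the worker; the connector name, database URL, table, column, and topic prefix are all hypothetical placeholders for your own pilot.

```python
import json


def jdbc_source_connector(name, connection_url, table, id_column, topic_prefix):
    """Build the JSON body for Kafka Connect's POST /connectors endpoint.

    Assumes the Confluent JDBC source connector plugin is available
    on the Connect worker; every argument value is up to you.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": connection_url,
            "table.whitelist": table,
            # 'incrementing' mode picks up new rows via a strictly increasing column.
            "mode": "incrementing",
            "incrementing.column.name": id_column,
            "topic.prefix": topic_prefix,
            "tasks.max": "1",
        },
    }


# Hypothetical pilot: stream new rows from an 'orders' table.
payload = jdbc_source_connector(
    name="orders-pilot",
    connection_url="jdbc:postgresql://db.example.internal:5432/shop",
    table="orders",
    id_column="id",
    topic_prefix="pilot-",
)
print(json.dumps(payload, indent=2))
```

Starting with a single table in incrementing mode keeps the proof of concept small: you can compare the rows landing in the topic against the output of the existing batch job before migrating anything else.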

First steps: Use a framework to identify the first batch process you want to migrate. Will proving success on a smaller scale help you make the case for Kafka to senior leadership? If you’re not sure they’ll listen, think about how you can position real-time data in light of your company’s bigger strategic goals and objectives.

Where to learn more: Our comprehensive guide, ‘Convert Batch to Real-Time’. Here’s a sneak peek of what’s inside: 

📚 The Limitations of Batch Processing Unveiled

In the digital age, data volume is skyrocketing, and batch processing is falling short. Dive deep into the intricacies and understand why this method is no longer the gold standard.

💸 Legacy Systems: A Costly Affair

Ever wondered about the true cost of legacy platforms? From hefty licensing fees to the pitfalls of prepaid processing power, our eBook uncovers hidden expenses and inefficiencies.

🔄 Beyond Ingestion: The Magic of Real-Time Processing

Ingesting data in real-time is just the beginning. Our guide will introduce you to the world of streaming analytics, unlocking the true potential of each data point.

🔽 Ready to Revolutionise Your Data Strategy? Download the eBook now and embark on a transformative journey towards real-time data excellence. Elevate your knowledge, elevate your business.

📅 Upcoming Industry Events | Big Data LDN x Kafka MeetUp London Special Event

💥Register now for this September’s premier Kafka event, Big Data LDN x Kafka Meetup London!

What it is: A three-hour event featuring industry speakers from Quiz, OSO, and Aerospike, discussion panels about real-time data, pizza, networking, and Q&As about how Kafka might apply to the goals you’ve set for yourself and your business. 

Who should attend: You, if you want to connect with Kafka professionals, build or acquire knowledge of real-time data streaming, learn more about the industry, or attend an informal networking event. 

🗓️ Date and Time
   { September 20th, ⏰ 6:00 PM - 9:00 PM 🕘 }

🏟️ Venue
   { Fast Data Theatre, Big Data LDN, Olympia, London W14 8UX } 

🌐 Theme
   { Exploring Fast Data with Kafka: Moving from Batch to Real-Time }

How to register: Check out the Meetup page and sign up at Big Data LDN 2023 – Register.

🥳 Testimonials | Cambridge University Press

‘The successful integration of content from Contentful into our internal systems in real-time improved the overall user experience and data accuracy, setting a strong foundation for future data-related initiatives.’ 

— CAMBRIDGE UNIVERSITY PRESS —

📖  Expert Hub | Batch to Real-Time Data

💥We’ve added more blogs and guides on migrating your batch processes to real-time. Here’s a roundup of the resources we think you’ll find most relevant.

  • Batch versus real-time stream processing: A beginner’s guide — Understanding schema evolution can help you make informed decisions about whether to use batch or real-time processing. Typically, you’ll want to make incremental changes up until a fundamentally incompatible schema change occurs. GitOps can help you bridge this gap! 
  • The future of data streaming using shared data products — Building data products from external data sources raises questions about governance, data security, and data policy. If you want to collaborate with clients but want more control over access management and encryption controls, one option is Confluent’s feature for stream sharing.
  • Tips for moving from batch to real-time data streaming — One big tip for building a real-time data pipeline is to break your algorithm into smaller steps! If you’re using Kafka Streams, think about the most efficient way to process the data at each stage. If you’re using ksqlDB, implement multiple queries to optimise processing.
  • 5 steps for real-time data magic with DBT transformation — DBT is great at transforming data that’s already uploaded to your warehouse, but to avoid duplicating messages, subscribe to multiple topics. This double subscription lets you distribute your data load, enable parallel processing, and easily segment your data.  
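The ‘smaller steps’ tip from the roundup above can be sketched without any Kafka dependency. In this minimal example, Python generators stand in for the stages of a Kafka Streams topology or a chain of ksqlDB queries; the stage names and the sample records are hypothetical.

```python
def parse(lines):
    """Stage 1: turn raw 'customer,amount' strings into dicts, skipping bad rows."""
    for line in lines:
        parts = line.split(",")
        if len(parts) != 2:
            continue
        customer, amount = parts
        try:
            yield {"customer": customer.strip(), "amount": float(amount)}
        except ValueError:
            continue


def only_large(records, threshold):
    """Stage 2: keep only records at or above the threshold."""
    for record in records:
        if record["amount"] >= threshold:
            yield record


def running_totals(records):
    """Stage 3: emit an updated total per customer after each record."""
    totals = {}
    for record in records:
        customer = record["customer"]
        totals[customer] = totals.get(customer, 0.0) + record["amount"]
        yield customer, totals[customer]


raw = ["alice, 120.0", "bob, 15.5", "broken-row", "alice, 80.0", "bob, 200"]
results = list(running_totals(only_large(parse(raw), threshold=50)))
print(results)
```

Each stage does one thing and handles one record at a time, so it can be tested, tuned, or replaced on its own, which is the same benefit you get from splitting a streaming job into several small processors or queries.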

To read more, head to the Hub or download OSO’s comprehensive guide ‘Convert Batch to Real-Time.’

🧠 Interested in Kafka Support?

💥Get on-demand access to Kafka experts for a fraction of the cost.

What it is: Dedicated monthly enterprise support for Apache Kafka via Slack. Instead of hiring a full-time Kafka engineer, work with a team of Kafka experts to implement and troubleshoot your setup. If you have Kafka-related questions, you get a guaranteed response in 2 business days or less. 

How to prep: Consider the level of support you’ll need and what you’re planning to budget. If you need a plan of action to take to senior leadership, Sion can help you outline some of the specific outcomes and benefits of real-time data—like increased fraud detection, 360-degree visibility, and more accurate data for reporting and compliance. 

Next steps: Book a call or come talk to us at Big Data LDN on September 20th!

☀️That’s all for August. Enjoy the remainder of the summer months—and come see us in London next month!

Get started with OSO professional services for Apache Kafka

Have a conversation with a Kafka expert to discover how we can help you adopt Apache Kafka in your business.

Contact Us