
Multi-message types in Kafka topics: Lessons from a real-world pipeline

Sion Smith · 25 April 2025

When your Kafka pipeline must handle dozens of message types from diverse systems, the conventional “one message type per topic” approach quickly breaks down. This is especially true in complex domains like travel, where data arrives in many formats and structures – everything from room reservations to spa bookings and flight bookings.

At OSO, we recently worked with a travel company serving tens of thousands of hotels globally, and we faced exactly this challenge. What made the use case particularly demanding was the diversity of the data sources: we had to connect to hotel systems ranging from modern cloud-based APIs to legacy DOS-based platforms, with everything in between. These systems send data through various methods – webhooks, APIs, FTP transfers, and even email. Some data arrives in real time, while other data comes in batches.

In this post, I’ll share how OSO solved one of our biggest architectural challenges: creating a schema approach that could handle multiple message types while maintaining order, simplifying development, and ensuring consistent processing.

The Limitations of Traditional Topic Design

The conventional wisdom in Kafka suggests that each topic should contain just one message type. This approach functions as a form of “strong typing” for Kafka topics, which can prevent errors and simplify processing logic. Initially, we followed this guidance: room reservation messages went into a room reservation topic, restaurant reservation messages into a restaurant reservation topic, and so on.

However, as our schema expanded and grew more complex, we encountered several significant challenges:

Cross-Topic Ordering Problems

When related messages exist in different topics, enforcing message ordering becomes difficult. For example, if a spa reservation (in the spa reservation topic) can only be processed after a specific room reservation (in the room reservation topic), coordinating this dependency becomes unnecessarily complex.

Variations Within Message Types

Even within a single conceptual message type, variations exist. A room reservation update event might contain dozens of fields (reservation ID, account information, guest name, check-in/out dates), while a room reservation deletion event might contain only the reservation ID. Both are technically “room reservation events,” but they have different structures and processing needs.

Nested Message Structures

In hospitality data, complex relationships exist between entities. Room reservations and golf reservations can both contain guest profiles. These profiles are valid messages in their own right and might need separate processing, yet they’re nested within other message types. With the one-message-type-per-topic approach, should these embedded profiles be extracted and placed in a profile topic? This quickly creates confusion and complexity.

Service Configuration Overhead

Services often need to consume multiple message types. When adding a new message type in a traditional setup, you’d need to:

  1. Create a new topic for the message type
  2. Update your service to consume from this new topic
  3. Implement new serialisers/deserialisers
  4. Add logic to handle the new message type

This overhead becomes significant when your domain contains dozens of related message types, especially when many services could process new message types without code changes if only they were receiving them.

Designing a Unified Schema Approach

To address these challenges, we designed a generalised schema that could be used across all topics. We implemented this using Protocol Buffers (protobuf), which provided strong typing while allowing for flexibility.

The Hotel Event Pattern

At the core of our solution is the concept of a “Hotel Event” – a parent message type that can contain any of our specialised message types. Within protobuf, we used the “oneof” field type to implement this pattern. The sketch below illustrates the shape of the schema; the names and field numbers are simplified, not our exact production definitions:
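syntax = "proto3";

package hotel.events.v1;

// Illustrative sketch – names and field numbers are simplified.
// Parent envelope: every message on every topic is a HotelEvent.
message HotelEvent {
  string hotel_id = 1;  // common metadata shared by all event types

  oneof event_type {
    RoomReservation room_reservation = 10;
    RestaurantReservation restaurant_reservation = 11;
    SpaReservation spa_reservation = 12;
    // ... one entry per specialised message type
  }
}

// Each specialised type wraps its own action variants.
message RoomReservation {
  oneof event_action {
    RoomReservationUpdated updated = 1;
    RoomReservationDeleted deleted = 2;
  }
}

message RoomReservationUpdated {
  string reservation_id = 1;
  string guest_name = 2;
  string check_in = 3;
  string check_out = 4;
  // ... dozens of fields in practice
}

message RoomReservationDeleted {
  string reservation_id = 1;  // deletions carry only the ID
}

message RestaurantReservation {
  // analogous action variants
}

message SpaReservation {
  // analogous action variants
}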

This approach allows us to maintain strong typing (each message is a specific type with a defined structure) while unifying all messages under a common parent type. Every message on any topic is a HotelEvent, regardless of its specific type.

Advantages of the Unified Schema

This approach offered several immediate benefits:

  1. Simplified Topic Structure: Since every message is a HotelEvent, we could be more flexible with our topic organisation, grouping messages by function or processing requirements rather than rigidly by type.

  2. Preserved Ordering Across Types: When message ordering matters across different types (like ensuring a deletion event processes after its creation event), we can place these messages in the same topic and partition, guaranteeing ordered processing (see the sketch after this list).

  3. Consistent Processing: Components within messages (like profiles embedded in reservations) are processed consistently regardless of which container message they arrive in, because they’re always the same structured type.

  4. Simplified Service Development: Services can process any message by checking its type at runtime, rather than requiring separate connection logic for each topic and message type.
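
To make the second point concrete, here is a minimal sketch of partition keying; the topic name “hotel-events” and the publish helper are illustrative assumptions, not our production code:

// Keying by hotel ID routes all of a hotel's events – across every
// message type – to the same partition, so Kafka preserves their order.
void publish(KafkaProducer<String, HotelEvent> producer, HotelEvent event) {
  producer.send(new ProducerRecord<>("hotel-events", event.getHotelId(), event));
}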

Implementation Examples

While the unified schema approach solved many problems, it introduced some implementation challenges we had to address.

Handling Switch Statement Complexity

When using a unified schema with the “oneof” pattern, your processing code will contain switch statements to handle different message types:

public void process(HotelEvent event) {
  // First-level dispatch: which specialised type does this HotelEvent carry?
  switch (event.getEventTypeCase()) {
    case ROOM_RESERVATION:
      RoomReservation reservation = event.getRoomReservation();
      // Second-level dispatch: which action variant within that type?
      switch (reservation.getEventActionCase()) {
        case UPDATED:
          handleRoomReservationUpdate(reservation.getUpdated());
          break;
        case DELETED:
          handleRoomReservationDelete(reservation.getDeleted());
          break;
        // ... other cases
      }
      break;
    case RESTAURANT_RESERVATION:
      // Similar nested switch over the restaurant actions
      break;
    // ... other event types
  }
}

This can lead to deep nesting and verbose code. We mitigated this by:

  1. Using the visitor pattern where appropriate (sketched after this list)
  2. Creating utility functions to handle common processing paths
  3. Breaking down large switch statements into smaller, focused methods
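
As an illustration of the first technique, a visitor keeps the switch in exactly one place; the interface and class names below are hypothetical, not our production code:

// Hypothetical visitor: concrete handlers override only the cases they need.
public interface HotelEventVisitor {
  default void visitRoomReservationUpdated(RoomReservationUpdated e) {}
  default void visitRoomReservationDeleted(RoomReservationDeleted e) {}
  // ... one method per leaf type
}

public final class HotelEventDispatcher {
  // The nested switch lives here, once, instead of in every service.
  public static void dispatch(HotelEvent event, HotelEventVisitor visitor) {
    switch (event.getEventTypeCase()) {
      case ROOM_RESERVATION:
        RoomReservation r = event.getRoomReservation();
        switch (r.getEventActionCase()) {
          case UPDATED: visitor.visitRoomReservationUpdated(r.getUpdated()); break;
          case DELETED: visitor.visitRoomReservationDeleted(r.getDeleted()); break;
          default: break;
        }
        break;
      default:
        break; // types a service doesn't handle are simply skipped
    }
  }
}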

Serialisation/Deserialisation Strategies

With a unified schema, serialisation and deserialisation become more consistent. Every service can use the same serialiser/deserialiser for HotelEvent, rather than needing specialised ones for each message type. This significantly reduced boilerplate code.
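
For example, a single deserialiser along these lines can be shared by every consumer. This sketch assumes plain protobuf bytes on the wire; a Schema Registry-aware serde would differ:

import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import com.google.protobuf.InvalidProtocolBufferException;

// One deserialiser for every topic: everything on the wire is a HotelEvent.
public class HotelEventDeserializer implements Deserializer<HotelEvent> {
  @Override
  public HotelEvent deserialize(String topic, byte[] data) {
    try {
      return data == null ? null : HotelEvent.parseFrom(data);
    } catch (InvalidProtocolBufferException e) {
      throw new SerializationException("Invalid HotelEvent on topic " + topic, e);
    }
  }
}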

Schema Evolution Management

We used the buf CLI tool for protobuf linting, formatting, and detecting breaking changes. This was crucial for maintaining schema compatibility as our system evolved. The tool helped us:

  • Ensure consistent naming conventions
  • Detect accidental breaking changes before deployment
  • Auto-format protobuf files for readability
  • Generate code consistently across services
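
In CI, those checks boil down to a handful of buf invocations; the branch name below is an assumption about your repository layout:

# Lint for naming conventions and style
buf lint

# Auto-format .proto files in place
buf format -w

# Fail if the proposed schema breaks existing consumers
buf breaking --against '.git#branch=main'

# Generate code for all services from one definition
buf generate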

Real-world Results and Lessons Learned

After implementing the unified schema approach across our Kafka ecosystem, we observed several positive outcomes:

Development Velocity Improvements

Adding new message types became significantly easier. Instead of creating new topics and updating multiple services, we simply added new types to our protobuf schema and deployed the updated schema package. Services that needed to process the new types could update their logic, while others continued to function without changes.
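
For instance, supporting a new golf reservation type is a single additive change to the oneof; the type name and field number below are illustrative:

message HotelEvent {
  oneof event_type {
    RoomReservation room_reservation = 10;
    RestaurantReservation restaurant_reservation = 11;
    SpaReservation spa_reservation = 12;
    GolfReservation golf_reservation = 13;  // new type, previously unused number
  }
}

Existing consumers simply fall through to the default branch of their switch for the new case and keep running unchanged.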

Onboarding Experience

New developers found the system more intuitive. With a consistent message structure across the platform, they could understand the data flow more quickly and focus on business logic rather than the intricacies of Kafka topic configuration.

Performance Considerations

We did observe minor overheads from using a more complex schema structure, primarily in serialisation/deserialisation time and message size. However, these were minimal compared to the architectural benefits gained.

When This Approach Might Not Be Right

The unified schema approach isn’t suitable for every use case:

  • If your message types are fundamentally different with no conceptual relationship
  • If you have strong security requirements that necessitate physical separation of message types
  • If different message types have vastly different retention or processing requirements
  • If your organisation has strict team boundaries where different teams own different message types

Practical Takeaways for Your Kafka Architecture

If you’re considering implementing a unified schema approach, here are some practical steps to get started:

Questions to Ask Before Implementation

  1. Do your message types share conceptual relationships?
  2. Do you need to maintain order across different message types?
  3. Are you currently managing a complex web of topics that’s becoming difficult to maintain?
  4. Would your services benefit from more flexible consumption patterns?

Transitioning from Separate Topics

  1. Start with a schema audit: Document all your current message types and their relationships
  2. Design your unified schema: Create a common parent type with specialised subtypes
  3. Build a proof-of-concept: Implement the new schema in a limited scope
  4. Develop a migration strategy: Consider running dual systems temporarily
  5. Roll out incrementally: Convert one message flow at a time

Configuration Recommendations

  • Increase the default message size limit if your unified messages are larger (see the example after this list)
  • Adjust consumer configurations to account for potentially more complex processing
  • Configure serialisers to handle the unified schema efficiently
  • Consider compression if message size becomes an issue
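
As a starting point, those recommendations translate into settings like the following; every value here is an illustrative default to tune against your own message sizes:

# Producer: raise the 1 MB default if unified messages are larger
max.request.size=2097152
# Trade a little CPU for smaller payloads on the wire
compression.type=zstd

# Broker/topic: must be at least the producer's max.request.size
message.max.bytes=2097152

# Consumer: keep fetch sizes in step with the larger message limit
max.partition.fetch.bytes=2097152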

Managing Schema Evolution

  • Establish clear guidelines for adding new message types
  • Use tooling to detect breaking changes
  • Create a versioning strategy for your unified schema
  • Consider backward and forward compatibility requirements

Beyond One-Topic-Per-Type

The journey from a traditional one-topic-per-type Kafka architecture to a unified schema approach taught us that conventional wisdom doesn’t always apply to complex, real-world systems. By challenging the standard pattern and designing a solution tailored to our domain, we created a more flexible, maintainable, and developer-friendly system.

The key insight is that your Kafka architecture should reflect the natural structure of your domain. In hospitality, where data entities are richly interconnected and have complex relationships, a unified schema approach allowed us to model these relationships more accurately and process them more consistently.

As you evaluate your own Kafka architecture, consider whether your current topic structure is serving your domain needs or creating unnecessary complexity. Sometimes, the simplest solution is to embrace the complexity of your domain within your schema, rather than trying to flatten it into disconnected topics.

By thinking beyond the one-topic-per-type pattern, you can create event-driven architectures that are both powerful and manageable, even as they scale to handle complex, real-world domains.

Get started with Apache Kafka today

Have a conversation with one of our experts to discover how we can work with you to adopt multi-message structures in your Apache Kafka topics and keep your data-first strategy alive.
