
5 Kafka Error Handling Tips

Sion Smith 21 June 2024

When working with Kafka, it is important to handle errors gracefully to maintain data integrity and ensure message delivery. There are many ways to handle such errors in your application code when using Kafka. We have seen our fair share of failures; here are our top five tips:

1. Dead Letter Topic

The Dead Letter Topic design pattern is used to handle messages that cannot be processed successfully. In this pattern, messages with invalid payloads, or those that repeatedly throw exceptions, are routed to a separate topic called the Dead Letter Topic. Because records keep their original key, related failures land on the same partition of the Dead Letter Topic, preserving their relative order for later inspection or replay. This is especially important in Kafka Connect, where an unhandled failed message will cause the connector task to fail, so make sure you configure a Dead Letter Topic in production.
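The routing logic can be sketched in plain Java with an in-memory stand-in for the producer (all names here are illustrative, not a real Kafka API; in Spring Kafka the equivalent building block is the DeadLetterPublishingRecoverer):

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of dead-letter routing: records that throw during processing are
// forwarded to a ".DLT" topic instead of blocking the consumer.
class DeadLetterDemo {
    // Stand-in for a producer: topic name -> published values.
    static Map<String, List<String>> topics = new HashMap<>();

    static void handle(String topic, String value, Consumer<String> processor) {
        try {
            processor.accept(value);                      // normal processing
        } catch (RuntimeException e) {
            // Route the poison message to the dead letter topic; in real
            // Kafka the original key is reused so related failures stay
            // together on one partition.
            topics.computeIfAbsent(topic + ".DLT", t -> new ArrayList<>()).add(value);
        }
    }

    public static void main(String[] args) {
        handle("transfers", "ok", v -> {});
        handle("transfers", "bad-payload", v -> { throw new IllegalArgumentException(); });
        System.out.println(topics.get("transfers.DLT")); // [bad-payload]
    }
}
```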

2. Transactional Writes

Use Transactional Writes to ensure that writes to Kafka and a database are performed in a transactional manner. This means that either both write operations succeed, or neither does, ensuring data consistency across your systems. The exact code will vary depending on the programming language; contact us if you get stuck.

3. Retry Topics

Retry Topics are another option: if a message fails to be processed, it is retried a specified number of times before being considered a failure. This pattern is useful for handling transient failures that may resolve themselves after a few attempts.
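Spring Kafka offers this out of the box via the @RetryableTopic annotation (available since 2.7). Conceptually, a failed message escalates through a chain of retry topics before landing on a dead letter topic, which can be sketched in plain Java (topic names and the in-memory chain are illustrative, not a real API):

```java
import java.util.*;

// Sketch of the retry-topic escalation chain: a message that keeps failing
// moves through orders-retry-0..n and finally lands on a dead letter topic.
class RetryTopicsDemo {
    static final List<String> CHAIN =
        List.of("orders", "orders-retry-0", "orders-retry-1", "orders-dlt");

    // Returns the topic a failed message is republished to next.
    static String routeOnFailure(String currentTopic) {
        int i = CHAIN.indexOf(currentTopic);
        // The last hop is the dead letter topic; there is nothing beyond it.
        return i < CHAIN.size() - 1 ? CHAIN.get(i + 1) : currentTopic;
    }

    public static void main(String[] args) {
        String topic = "orders";
        // Simulate a message that fails processing on every attempt.
        while (!topic.endsWith("dlt")) {
            topic = routeOnFailure(topic);
        }
        System.out.println(topic); // orders-dlt
    }
}
```

Each retry topic usually has its own consumer delay, so later attempts happen after a back-off without blocking the main topic's consumer.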

4. Ordered Delivery

The Ordered Delivery approach ensures that messages are processed in the order they are sent, even in the presence of retries or failures. This is particularly important in systems where the order of operations is critical to the application’s correctness.

5. Stop on Error

There are times when an application should simply stop if an error occurs. An example is a CDC connector streaming from a database of financial transactions: when an error occurs, there is no other path the event can take, so the connector is stopped automatically and manual action must be taken before processing resumes.
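In Spring Kafka, this behaviour is available through the stock CommonContainerStoppingErrorHandler. A minimal container factory configuration might look like the following sketch (bean names and generics are illustrative, assuming a recent spring-kafka version):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.CommonContainerStoppingErrorHandler;

@Configuration
public class StopOnErrorConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Stop the listener container on the first unhandled error so no
        // further records are consumed until an operator intervenes.
        factory.setCommonErrorHandler(new CommonContainerStoppingErrorHandler());
        return factory;
    }
}
```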

Implementing Error Handling with Spring

Now, let’s dive into the practical implementation of these patterns using Spring Framework, which simplifies the integration with Kafka through its abstraction layer.

Transactional Writes Implementation

In Spring, transactional writes can be easily managed with the @Transactional annotation. Here’s a simplified example of how you can use this in your service:

@Service
public class PendingTransferService {

    @Autowired
    private KafkaTemplate<String, Transfer> kafkaTemplate;

    @Autowired
    private TransferRepository transferRepository;

    @Transactional
    public void processTransfer(Transfer transfer) {
        transferRepository.save(transfer);
        kafkaTemplate.send("transfers", transfer);
    }
}

In this code snippet, both the database operation (transferRepository.save()) and the Kafka send operation (kafkaTemplate.send()) are wrapped in a transaction, so a failure rolls both back and prevents data inconsistency. Note that the Kafka side only participates if the producer is transactional (in Spring Boot, setting spring.kafka.producer.transaction-id-prefix enables this), and even then the two commits are synchronised on a best-effort basis rather than atomically; the Transactional Outbox pattern described below gives a stronger guarantee.

Retry Topics Implementation

Handling retries in Spring can also be streamlined using annotations. For instance, you can configure a method to retry automatically if it throws a specific exception:

@Service
public class TransferService {

    @Retryable(value = RemoteServiceNotAvailableException.class,
               maxAttempts = 5,
               backoff = @Backoff(delay = 1000, multiplier = 2))
    public void processTransfer(Transfer transfer) throws RemoteServiceNotAvailableException {
        // Code to send the transfer to a remote service
    }
}

This configuration will retry the processTransfer method up to five times if a RemoteServiceNotAvailableException is thrown. The delay between retries increases exponentially due to the multiplier attribute, starting from one second. Note that @Retryable comes from Spring Retry and only takes effect if @EnableRetry is declared on a configuration class.

Handling Ordered Delivery

Implementing ordered delivery in Spring with Kafka can be challenging, as it requires maintaining the order of messages manually. Here’s a basic example of how you might attempt to handle this:

@Service
public class OrderMaintainingService {

    @Autowired
    private KafkaTemplate<String, Message> kafkaTemplate;

    public void sendOrderedMessages(List<Message> messages) {
        for (Message message : messages) {
            // Sending with the same key keeps related messages on one
            // partition, which is what preserves their order in Kafka.
            kafkaTemplate.send("orderedTopic", message.getKey(), message);
            // Additional error handling logic here
        }
    }
}

In this example, messages are sent one at a time in a loop, which can help maintain order but may not be efficient. Error handling needs to be robust to ensure that a failure in sending one message doesn’t disrupt the entire process.

Advanced Integration Patterns

Beyond basic error handling, integrating Kafka with other systems like databases and web services requires more sophisticated patterns. Here are two advanced patterns:

Transactional Outbox Pattern

The Transactional Outbox pattern is used to ensure reliable messaging when working with transactions spanning multiple systems. This pattern involves writing messages to an outbox table in a database as part of the transaction. A separate process then reads from this table and publishes the messages to Kafka, ensuring that messages are only published if the transaction commits successfully.
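The moving parts can be sketched in plain Java with in-memory stand-ins for the tables and the broker (all names are illustrative, not a real API):

```java
import java.util.*;

// In-memory sketch of the transactional outbox: the business write and the
// outbox row are committed together, and a separate relay later publishes
// pending rows to Kafka and marks them as sent.
class OutboxDemo {
    record OutboxRow(String topic, String payload) {}

    static final List<String> transfers = new ArrayList<>();   // business table
    static final Deque<OutboxRow> outbox = new ArrayDeque<>(); // outbox table
    static final List<String> published = new ArrayList<>();   // stand-in broker

    // Both writes succeed or fail together, as they would inside one
    // database transaction.
    static void saveTransfer(String transfer) {
        transfers.add(transfer);
        outbox.add(new OutboxRow("transfers", transfer));
    }

    // Relay process: drain pending outbox rows and publish them.
    static void relay() {
        while (!outbox.isEmpty()) {
            OutboxRow row = outbox.poll();
            published.add(row.topic() + ":" + row.payload()); // "send to Kafka"
        }
    }

    public static void main(String[] args) {
        saveTransfer("tx-42");
        relay();
        System.out.println(published); // [transfers:tx-42]
    }
}
```

In production the relay is typically a polling publisher or a change data capture tool such as Debezium reading the outbox table.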

 

Orchestration and Choreography

These patterns involve coordinating multiple services and transactions. Choreography uses events to trigger and manage interactions between services, while orchestration involves a central coordinator directing the interactions. Here’s how these might look in a Kafka-integrated system:

Kafka Choreography-Based Integration: This approach relies on events published and consumed by different services autonomously. Each service performs its part of the transaction and publishes an event upon completion, which the next service listens for and acts upon. The pattern is decentralised: each service knows what to do next.

Kafka Orchestration-Based Integration: In contrast, orchestration involves a central coordinator (often a service itself) that controls the interaction between services. The coordinator knows the complete business transaction flow and instructs each service when to perform its task. If a service reports a failure, the coordinator can trigger compensating transactions to roll back changes.

Kafka Error Handling Example: Failures in money transfer 

Let’s consider a practical scenario involving a money transfer between banks, which illustrates the use of these patterns:

  1. Initial Transaction: A transfer is initiated from a local bank to a foreign bank.
  2. Choreography: The local bank publishes an event stating that money has been debited. The foreign bank, upon receiving this event, attempts to credit the amount.
  3. Orchestration: If the foreign bank service is down, the orchestrator will retry the transaction based on predefined logic (like the Retry Topics pattern). If retries fail, it triggers a rollback at the local bank.

This example shows how both choreography and orchestration can be used to manage complex transactions across different systems and services, ensuring consistency and reliability.

For those interested in diving deeper into the code and practical implementations, the resources and examples can be found on our GitHub. Whatever your level of experience, reach out to us and we can advise you on the best approach for your use case.

Get started with OSO professional services for Apache Kafka

Have a conversation with a Kafka expert to discover how we can help you adopt Apache Kafka in your business.

CONTACT US