Key configurations of Kafka producers
- Metadata Cache: This is where information about the Kafka cluster, such as broker locations, topics, and partitions, is stored. It’s crucial for directing messages to the correct broker and handling changes in broker leadership.
- Record Accumulator: This component batches messages to optimise network usage. Batch sizing can significantly affect performance and is controlled by the following settings:
  - Batch Size (batch.size): Sets the maximum size of a single batch in bytes (16 KB by default), not the number of messages. Batches are built per partition. Larger batches reduce the number of network requests, which can improve throughput.
  - Linger.ms: Controls how long the producer waits before sending a batch that is not yet full. It defaults to 0 in Kafka clients before 4.0 (5 ms from 4.0 onward). A small delay lets more messages accumulate, reducing network calls. If linger.ms is set too high and throughput is low, batches rarely fill and every message pays the extra latency.
  - Max.request.size: Caps the size of a request the producer sends (1 MB by default), which in effect also caps the size of a single message. A message larger than the batch size is not grouped with others and is sent in a batch of its own. The broker has its own size limit (message.max.bytes), which can reject a message the producer would allow.
- Send Buffer: This is the memory the producer uses to hold messages that have not yet been sent (not to be confused with send.buffer.bytes, which sizes the TCP socket buffer). Increasing it lets your application queue more messages at once, but it also increases the memory footprint and can degrade performance if not managed correctly. It is controlled by the following settings:
  - Buffer.memory: Defaults to 32 MB (33,554,432 bytes). Increasing it does not by itself improve performance, because it does not affect the batch size. A bigger buffer means a bigger memory footprint for the producer application and increases the chance of messages staying in memory for longer.
  - max.block.ms: Controls how long send() blocks when the buffer is full before throwing a timeout exception (60 seconds by default). A larger buffer lets messages wait in memory for longer, so you may start seeing exceptions because messages have expired before being sent. If you increase the buffer size, revisit this value together with delivery.timeout.ms, which bounds how long a message may wait before it expires.
- Compression: Enabling compression can reduce the size of the messages sent over the network and stored on disk, potentially lowering costs. However, the choice of compression algorithm (GZIP, Snappy, LZ4, or zstd) can impact both performance and CPU usage.
- Acks (Message Delivery Assurance): The acks configuration determines how much acknowledgement the producer requires before considering a message sent. acks=all gives the highest reliability because the leader waits for all in-sync replicas to receive the message. acks=1 waits only for the leader, and acks=0 does not wait for any acknowledgement; both can improve performance at the cost of reliability.
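
To make the interaction between batch size and linger.ms more concrete, here is a minimal sketch in plain Python. It is a toy model, not the Kafka client's actual logic: `ToyAccumulator`, `tick`, and `run` are hypothetical names, time is passed in by hand, and a single partition is assumed. It only shows the two triggers described above: a batch is sent when it is full, or when it has waited for linger_ms.

```python
class ToyAccumulator:
    """Toy single-partition batcher (illustrative only, not Kafka's code)."""

    def __init__(self, batch_size, linger_ms):
        self.batch_size = batch_size   # bytes, plays the role of batch.size
        self.linger_ms = linger_ms     # plays the role of linger.ms
        self.open_batch = []           # records waiting to be sent
        self.open_bytes = 0
        self.opened_at = 0             # arrival time of the first open record
        self.sent = []                 # batches that have been "sent"

    def _flush(self):
        if self.open_batch:
            self.sent.append(self.open_batch)
            self.open_batch, self.open_bytes = [], 0

    def tick(self, now_ms):
        """Send the open batch if it has lingered for at least linger_ms."""
        if self.open_batch and now_ms - self.opened_at >= self.linger_ms:
            self._flush()

    def append(self, record, now_ms):
        self.tick(now_ms)
        # A record that does not fit closes the current batch first.
        if self.open_batch and self.open_bytes + len(record) > self.batch_size:
            self._flush()
        if not self.open_batch:
            self.opened_at = now_ms
        self.open_batch.append(record)
        self.open_bytes += len(record)
        # A full batch is sent without waiting for linger_ms.
        if self.open_bytes >= self.batch_size:
            self._flush()


def run(linger_ms):
    """Feed ten 30-byte records, 1 ms apart, and return the batch sizes."""
    acc = ToyAccumulator(batch_size=100, linger_ms=linger_ms)
    for t in range(10):
        acc.append(b"x" * 30, now_ms=t)
    acc.tick(now_ms=1_000)  # let the final batch linger out
    return [len(batch) for batch in acc.sent]


print(run(0))
print(run(5))
```

With these inputs, `run(0)` sends ten single-record batches, while `run(5)` sends batches of 3, 3, 3 and 1 records. The real producer can still batch with linger.ms at 0 when records arrive faster than they can be sent; the model leaves that out for simplicity.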
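
The relationship between buffer.memory and max.block.ms can be sketched in the same spirit. The class below is a hypothetical stand-in (`BoundedBuffer`, `allocate`, and `release` are illustrative names, not Kafka APIs) that assumes a caller reserves space before queuing a message and gives up after a bounded wait.

```python
import threading
import time


class BoundedBuffer:
    """Toy model of a fixed-size send buffer with a bounded wait."""

    def __init__(self, buffer_memory, max_block_ms):
        self.buffer_memory = buffer_memory  # bytes, like buffer.memory
        self.max_block_ms = max_block_ms    # like max.block.ms
        self.used = 0
        self._cond = threading.Condition()

    def allocate(self, size):
        """Reserve `size` bytes, waiting up to max_block_ms for free space."""
        deadline = time.monotonic() + self.max_block_ms / 1000
        with self._cond:
            while self.used + size > self.buffer_memory:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError(
                        f"no room for {size} bytes within {self.max_block_ms} ms"
                    )
                self._cond.wait(remaining)
            self.used += size

    def release(self, size):
        """Free space once a batch has been sent."""
        with self._cond:
            self.used -= size
            self._cond.notify_all()


buf = BoundedBuffer(buffer_memory=1024, max_block_ms=50)
buf.allocate(1024)       # the buffer is now full
try:
    buf.allocate(1)      # nothing is draining the buffer, so this gives up
    timed_out = False
except TimeoutError:
    timed_out = True
print("timed out:", timed_out)
```

The point of the sketch is that a larger buffer only delays the moment the caller blocks; if messages are produced faster than they are sent, the wait still happens, just later and with more memory in use.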
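
Because the producer compresses whole batches rather than individual messages, batching and compression reinforce each other. The sketch below uses gzip from the Python standard library on made-up JSON events to illustrate the effect; the message shape is an assumption, and actual ratios depend on your data and the chosen algorithm.

```python
import gzip
import json

# 200 small, similar messages (hypothetical event payloads).
messages = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/products"}).encode()
    for i in range(200)
]

raw_bytes = sum(len(m) for m in messages)

# Compressing each message on its own: little redundancy to exploit.
per_message = sum(len(gzip.compress(m)) for m in messages)

# Compressing the messages together, as a batch would be.
batched = len(gzip.compress(b"\n".join(messages)))

print(f"raw: {raw_bytes} B, per-message gzip: {per_message} B, "
      f"batched gzip: {batched} B")
```

Small messages compressed one at a time gain little, since each carries its own compression overhead, whereas a batch of similar messages shares repeated field names and values. This is one reason a slightly larger batch size or linger.ms can pay off when compression is enabled.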