Explore how Terraform-based GitOps helped Drivvn automate Kafka
Project Overview
- When digital platform provider Drivvn selected Kafka as the technology for its migration to real-time streaming data, OSO advocated a DevOps approach built on Terraform-based GitOps.
- To automate how Drivvn deployed Kafka, we used Terraform Cloud, a leading infrastructure as code (IaC) tool.
- We wanted Drivvn to achieve industry-standard best practices of IaC, so we deployed, monitored, and updated the infrastructure delivery process with Terraform-based GitOps.
Challenge: Automating Kafka deployment to introduce real-time streaming data
In our first Drivvn case study, “How Drivvn migrated from batch to real-time streaming data with Apache Kafka”, we explained that Drivvn’s developers decided to adopt Kafka to provide clients with faster insights into customer behaviours, decouple their site’s architecture, and create a single source of truth for their data.
Adopting Apache Kafka would give Drivvn a distributed software platform to help its developers build real-time applications. A Kafka cluster stores data events in topics, somewhat similar to how Google users store documents in folders. To add new data events, developers publish to a topic with producers; to read them, they subscribe to the topic with consumers.
To implement Apache Kafka, Drivvn could have installed and configured Kafka on its own server infrastructure. For this project, however, Drivvn’s developers didn’t want to manually operate and maintain the Kafka platform over time, so we recommended that Drivvn adopt and deploy Confluent’s managed Kafka solution, Confluent Cloud.
Advocating for a DevOps solution to streamline deployment
To deploy Kafka, we urged Drivvn to automate the deployment with DevOps practices. At its core, DevOps is a way of developing and delivering software that stresses collaboration, communication, and integration between software developers and operations professionals. OSO recommends that all our clients use DevOps, as the philosophy typically helps teams improve the quality of their software and speed up the software delivery process.
Decision-making: Choosing Confluent’s Terraform Provider to deploy the new Kafka resources
To automate Drivvn’s Kafka deployment, OSO needed to select an infrastructure as code tool. IaC tools help teams manage, provision, and deploy new computing infrastructure by automating manual tasks typically performed by system administrators.
First, we debated using Azure DevOps pipelines versus Terraform Cloud. We could use a custom Ruby Terraform wrapper to automate the build, test, and release process with Azure DevOps pipelines, but the Terraform wrapper would require upkeep, saddling Drivvn’s team with technical debt. So we selected Terraform Cloud, which handles edge cases more effectively, allows Drivvn’s technical team to collaborate at a reasonable cost, and stands out as the industry standard.
What is Terraform?
Terraform is a software tool created by HashiCorp that enables developers to create, manage, and provision resources in cloud environments such as Amazon Web Services (AWS), Google Cloud Platform, Microsoft Azure, and Confluent Cloud. Developers use Terraform to provision infrastructure components such as compute instances, storage disks and volumes, load balancers, DNS entries, and networking components such as subnets and gateways.
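As a minimal sketch of what this looks like for Confluent Cloud, the configuration below declares an environment and a basic Kafka cluster hosted on Azure. It assumes the resource names of the current `confluentinc/confluent` provider, which may differ from the provider version used on this project; the display names, region, and variable names are hypothetical.

```hcl
terraform {
  required_providers {
    confluent = {
      source = "confluentinc/confluent"
    }
  }
}

# Confluent Cloud API credentials are passed in as variables
# (hypothetical names) so that no secrets are committed to Git.
variable "confluent_cloud_api_key" {
  type      = string
  sensitive = true
}

variable "confluent_cloud_api_secret" {
  type      = string
  sensitive = true
}

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret
}

# An environment groups clusters and related resources.
resource "confluent_environment" "example" {
  display_name = "example-environment" # hypothetical name
}

# A basic, single-zone Kafka cluster running on Azure.
resource "confluent_kafka_cluster" "example" {
  display_name = "example-cluster" # hypothetical name
  availability = "SINGLE_ZONE"
  cloud        = "AZURE"
  region       = "westeurope" # assumed region
  basic {}

  environment {
    id = confluent_environment.example.id
  }
}
```

Because the cluster is described in a file rather than created by hand, the same definition can be reviewed, versioned, and re-applied.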
Terraform automates changes to existing infrastructure with plans. A Terraform plan describes the changes that will be made to infrastructure, including which resources will be created, modified, and destroyed. Terraform presents the plan as a diff in a format that closely resembles its configuration language, allowing developers to see what changes Terraform will make before it executes them. For this project, we could have let `terraform apply` run without review to completely automate Drivvn’s changes, but we chose to use `terraform plan` to preview changes and confirm they were correct first.
Roadmap: Using Terraform-based GitOps to deploy, monitor, and update Kafka
To help Drivvn achieve industry-standard best practices of infrastructure as code and automate its Kafka resource provisioning, we used Terraform-based GitOps. At each step, we aligned Drivvn’s project with a few fundamental GitOps principles, briefly shown below:
GitOps Principles
- Develop and deploy software using a single, unified workflow
- Automate the entire software delivery process
- Use Git as the source of truth for all configuration management
- Declaratively specify desired application states
- Continuously monitor and enforce compliance with desired states
- Use feedback loops to self-correct any drift from desired states
These GitOps principles are especially helpful when a client is deploying a new system for the first time—like Drivvn deploying Confluent’s Kafka solution—since they minimise human error and speed up software delivery. For Drivvn’s Kafka deployment on Confluent Cloud, therefore, we relied on the following GitOps workflow.
Our GitOps workflow
1. Update the Terraform module configuration
We opened the module’s configuration file in our favourite text editor, made the necessary changes, and then saved and closed the file.
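A change of this kind might look like the fragment below, which declares a Kafka topic as code. This is only a sketch: it assumes the `confluent_kafka_topic` resource of the current `confluentinc/confluent` provider and an already configured provider block, and the topic name, partition count, retention setting, and variable names are hypothetical rather than taken from Drivvn’s configuration.

```hcl
# Inputs describing an existing cluster (hypothetical variable names).
variable "kafka_cluster_id" {
  type = string
}

variable "kafka_rest_endpoint" {
  type = string
}

# Kafka API key and secret used to manage topics on that cluster.
variable "kafka_api_key" {
  type      = string
  sensitive = true
}

variable "kafka_api_secret" {
  type      = string
  sensitive = true
}

# The desired state of one topic, declared rather than created by hand.
resource "confluent_kafka_topic" "orders" {
  kafka_cluster {
    id = var.kafka_cluster_id
  }

  topic_name       = "orders" # hypothetical topic
  partitions_count = 6
  rest_endpoint    = var.kafka_rest_endpoint

  config = {
    "retention.ms" = "604800000" # 7 days, in milliseconds
  }

  credentials {
    key    = var.kafka_api_key
    secret = var.kafka_api_secret
  }
}
```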
2. Commit branch to Git
With GitOps, developers make all changes in their local clone of the Git repository, which is used to manage the entire software lifecycle from development to production. To commit the change on our branch, we opened a terminal window and ran the following command, replacing `updated Terraform module configuration` with our own commit message. The `-a` flag stages changes to files that Git already tracks.
```git commit -am "updated Terraform module configuration"```
After committing the branch to Git, we then pushed it to GitHub with this command:
```git push origin <branch name>```
3. Use Terraform Cloud to automatically run a plan against the newly pushed branch
Terraform Cloud automatically runs a plan on all new branches. After logging in to Terraform Cloud, you will see the newly pushed branch and its latest plan in the workspace settings, under “Branches”.
4. Output changes to the console, notifying console users
Once Terraform runs the plan, the console displays the delta (changes). We then notified users of the changes, using webhooks to request their review.
5. Perform a peer review, triggering Terraform Cloud to apply changes and updates
Once reviewers approve the changes in Git, the updated configuration is merged into the main branch. Terraform Cloud then automatically runs an apply.
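Behaviour like that in steps 3 to 5 can be configured in the Terraform Cloud UI, but it can also be expressed as code. The sketch below uses the `hashicorp/tfe` provider to describe a workspace connected to a Git repository that applies automatically after a merge to `main`, and that posts run events to a webhook. It is an illustration rather than the configuration used on this project, and the workspace name, notification name, and variable names are hypothetical.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical inputs; real values depend on the organisation's setup.
variable "tfc_organization" {
  type = string
}

variable "vcs_repo_identifier" {
  type        = string
  description = "Repository in <organisation>/<repository> form"
}

variable "vcs_oauth_token_id" {
  type = string
}

variable "review_webhook_url" {
  type = string
}

# Workspace that tracks the main branch and applies successful plans
# automatically once changes are merged.
resource "tfe_workspace" "kafka" {
  name         = "kafka-confluent-cloud" # hypothetical name
  organization = var.tfc_organization
  auto_apply   = true

  vcs_repo {
    identifier     = var.vcs_repo_identifier
    branch         = "main"
    oauth_token_id = var.vcs_oauth_token_id
  }
}

# Generic webhook that is called when a run is created, completes, or errors.
resource "tfe_notification_configuration" "run_events" {
  name             = "kafka-run-events" # hypothetical name
  enabled          = true
  destination_type = "generic"
  triggers         = ["run:created", "run:completed", "run:errored"]
  url              = var.review_webhook_url
  workspace_id     = tfe_workspace.kafka.id
}
```

Keeping the workspace settings in Git means the delivery pipeline itself goes through the same review process as the Kafka resources it manages.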
Throughout the Kafka infrastructure provisioning process, we also shared context from our past projects and provided in-depth documentation of Terraform best practices.
Problem-solving: Creating manual steps to fill the gaps in Confluent Cloud’s current Terraform provider
Although Terraform was OSO’s natural choice, we knew from the start that Confluent Cloud’s Terraform provider is very new and lacks some features. With the current Terraform provider, we couldn’t create ksqlDB and Schema Registry resources, so we built them manually. Even with a manual workaround, however, we still applied the principles of GitOps, repeatedly testing and thoroughly documenting all manual steps and capturing them in a Bitbucket repository.
Because these manual steps are based on Confluent’s command line interface (CLI), Drivvn’s technical team can choose to script them in the future. For now, we’ve decided to leave them as manual steps: Confluent eventually plans to introduce ksqlDB and Schema Registry resources to its Terraform provider, making a scripted approach redundant.
Result: Drivvn successfully navigates the complexities of adopting Kafka
During this project, Drivvn’s developers upskilled in DevOps and GitOps, learned Terraform best practices, and deployed Confluent Cloud’s managed Kafka solution with Terraform-based GitOps. This deployment allowed Drivvn to switch from batch to real-time streaming data and improve the quality and speed of its platform data.
For Drivvn, automating its Kafka deployment also helped the agency create a reliable, secure, and trusted software delivery process that minimised human error. Over time, the agency will be able to offload the maintenance and management of its Kafka solution to Confluent Cloud, allowing its developers to focus on more complex tasks.
Read more in our first case study, “How Drivvn migrated from batch to real-time streaming data with Apache Kafka”.
OSO’s contributions to the project
- Leveraged the latest Terraform infrastructure as code modules to integrate Confluent Cloud’s managed Apache Kafka solution with Microsoft Azure
- Introduced a repeatable Apache Kafka deployment process
- Upskilled Drivvn’s developers in DevOps and GitOps
- Fine-tuned Drivvn’s underlying infrastructure provisioning to be secure by default
- Laid the foundations for Drivvn to scale its adoption of real-time streaming data