Public sector

How the Department for Education became a data-synchronous organisation

After carefully analysing its options, the DfE selected a Kafka architecture to improve its digital infrastructure and generate new value from its data.

Discover how designing the right Apache Kafka architecture helped the Department for Education become a data-synchronous organisation.

The increasing cost and complexity of data 

In Boston Consulting Group’s recent study, A New Architecture to Manage Data Costs and Complexity, more than 50% of data leaders surveyed across the US and Europe described their data architecture as overly complex and costly. ‘Today’s reality is that data is everywhere’, commented Justin Borgman, CEO and co-founder of analytics startup Starburst, ‘and companies can’t afford the time, cost, and architectural complexity to centralise it.’

Of the study’s SVP, Director, and Senior Manager respondents, 55% reported that they planned to invest in more federated architecture moving forward. Notably, companies within the telecoms, industrial goods, education, healthcare, finance, retail, and energy sectors all faced similar challenges when it came to their data architecture. Across all represented sectors, companies struggled to manage vendors, identify a single source of truth, and generate concrete value.

An unexpected leader in Kafka data architecture: the Department for Education

Against this backdrop, the Department for Education (DfE) decided to modernise its data infrastructure and minimise data loss. Even though education is associated with slow technological change, the department chose to commission Sion Smith, the CTO of OSO, to analyse and recommend a new data architecture. Like the organisations in the Boston Consulting Group study, the DfE had a data problem. It managed data for thousands of secondary schools across the United Kingdom, collecting information related to curriculums, attendance, and performance ratings. Records languished offline, and when the source data inevitably changed over time, the DfE’s records fell out of sync. So when senior leaders had to make strategic decisions, they spent hours examining and comparing different records, uncertain if they could trust the data.

Who are the Department for Education?

The Department for Education (DfE) is a government department in the United Kingdom responsible for education and children’s services. Primarily, it develops policies and initiatives to enhance how the UK prepares children and young adults for the workforce. Here are a few ways it executes that mandate:

  1. Sets education policy. 
  2. Funds initiatives. 
  3. Regulates standards. 
  4. Provides oversight, guidance, and support. 
  5. Develops strategies to improve how students succeed post-graduation.

As Sion saw it, the DfE had four choices: it could synchronise its data using an API solution, an enterprise integration platform, a simple event-driven method, or a custom data integration pipeline.

The DfE’s Four Prospective Data Architecture Solutions

one

APIs

Typically straightforward to implement, but APIs have a number of failure scenarios and may not scale well when synchronising data across distributed systems. Adequate for some data use cases, but not optimal.

two

Enterprise integration platform

Allows the DfE to synchronise data but requires major architectural changes and introduces additional dependencies. It would likely be more expensive both to implement and to operate over time.

three

Kafka architecture

Synchronises data as soon as an event occurs and can broadcast updates to multiple places. More complex than an enterprise integration platform but not so complex that it can’t be managed through a single programme of work. Flexible and extremely scalable. (A minimal sketch of this event-driven pattern follows the list of options.)

four

Custom data integration pipelines

Fits the department’s data strategy but is more applicable to a data migration or analysis project. Like the enterprise integration platform, it can be costly to implement and manage and involves a large number of connectors.
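To make the event-driven option concrete, here is a minimal sketch of the pattern using Confluent’s Python client (confluent-kafka). The broker address, topic name, and record shown are illustrative assumptions rather than details of the DfE programme: a producer publishes a change event the moment a source record is updated, and downstream services subscribe to stay in sync.

```python
# Minimal sketch of the event-driven option using the confluent-kafka Python client.
# Broker address, topic name, and payload are illustrative assumptions.
import json
from confluent_kafka import Producer, Consumer

TOPIC = "school-record-updates"  # hypothetical topic name

# Producer side: publish a change event as soon as the source record changes.
producer = Producer({"bootstrap.servers": "localhost:9092"})
event = {"school_id": "SCH-1042", "field": "attendance_rate", "new_value": 0.94}
producer.produce(TOPIC, key=event["school_id"], value=json.dumps(event))
producer.flush()  # block until the broker has acknowledged the event

# Consumer side: a downstream service subscribes and stays in sync.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "reporting-service",  # each downstream service uses its own group id
    "auto.offset.reset": "earliest",
})
consumer.subscribe([TOPIC])
msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    print("received update:", json.loads(msg.value()))
consumer.close()
```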

Employing a decision matrix: how the DfE selected a Kafka architecture

But figuring out the pros and cons of each prospective solution was only the start of the decision-making process. The DfE also had to carefully consider factors such as delivery timescale and risk. As a large government department, the DfE shouldered a responsibility to justify its technology implementation decisions. From start to finish, the department needed to display a clear and rational line of thought, researching and rating each architecture before recommending a final solution.

First, to formalise the process and minimise bias, Sion drew up a decision matrix. He’d evaluate each architecture on five core categories (a simple illustration of how such a matrix can be scored follows the list):

1. How well it fits the DfE’s requirements
2. How well it aligns with the DfE’s strategic vision and data strategy
3. Whether its benefits balance out the drawbacks
4. How rapidly, and with how little risk, it could be rolled out within the department
5. Whether the solution delivers enough value to justify its upfront cost
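As a rough illustration of how such a matrix can be tallied, the sketch below assigns weights and ratings to the five categories. The numbers are invented purely for the example and are not the DfE’s actual scores.

```python
# Toy decision-matrix scoring. The weights and ratings below are invented purely
# to illustrate the mechanics; they are not the DfE's actual figures.
WEIGHTS = {
    "requirements fit": 0.30,
    "strategic alignment": 0.20,
    "technology pros and cons": 0.20,
    "delivery timescale and risk": 0.15,
    "value for money": 0.15,
}

# 0 = inadequate, 1 = adequate/viable, 2 = fully met or exceeded (the three-level scale)
RATINGS = {
    "APIs": [1, 1, 1, 1, 1],
    "Enterprise integration platform": [1, 1, 1, 1, 0],
    "Kafka architecture": [2, 2, 2, 1, 1],
    "Custom data integration pipelines": [1, 1, 1, 0, 0],
}

def score(ratings):
    """Weighted sum of an option's ratings, in the order the criteria are listed above."""
    return sum(weight * rating for weight, rating in zip(WEIGHTS.values(), ratings))

# Print the options from highest to lowest weighted score.
for option, ratings in sorted(RATINGS.items(), key=lambda kv: score(kv[1]), reverse=True):
    print(f"{option}: {score(ratings):.2f}")
```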

         

Measured trade-offs: analysing a data architecture’s value for cost

“Sion — Think twice before you assign a rating to the ‘value-for-money’ matrix box. With a platform-as-a-service solution, you’ll pay more on the front end to use the service…but you might spend less money overall. You’ve used fewer developer and vendor hours, and the solution is less likely to run into failure modes that cost time and money.”

Where a solution was clearly subpar, he marked the cell as inadequate. If it seemed sufficient but not optimal, it earned a single check; if it exceeded expectations, it earned two. He ended up with a finished table of positive and negative marks. As the matrix demonstrates, the APIs method had a good balance of technical pros and cons but would be difficult to scale, being more effectively applied to individual use cases. The enterprise integration platform failed the ‘value-for-money’ test, and the custom data pipelines were judged too costly and too risky to roll out. Only one column displayed a full line of positive ratings.

[Decision matrix: Appraising the Options – Data Architecture. The matrix rated each of the four options (APIs, enterprise integration platform, Kafka architecture, custom data integration pipeline) on requirements fit, strategic alignment, technology pros and cons, delivery timescale and risk, and value for money, then recorded an overall recommendation. Only the Kafka architecture column scored positively across every criterion and carried the final recommendation.]

To keep the ratings easy to present to non-technical team members, the DfE used a simple three-level rating system:

🙌 – Fully met or exceeded requirements

👍 – Adequate / viable / feasible

❌ – Inadequate

In the decision matrix, Option 3, the event-driven Kafka method, won out: it matched the DfE’s overall requirements, aligned with the strategic direction, and performed well across all five categories.

‘Overall, the Apache Kafka architecture approach was much less complex and introduced much less overhead’, Sion explains. ‘There’s a cost to deliver the programme, but the benefits, over time, pay off. Teams can adopt the new Kafka architecture at their own pace as they improve their knowledge and skill maturity.’

The event-driven Kafka architecture offered an acceptable delivery timescale, fit the department’s core requirements, and aligned with its broader strategy of streaming data from real-time applications. It could also broadcast events simultaneously to multiple subscribers, allowing the DfE to track curriculum and policy changes in individual schools nationwide, in real time.
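That broadcast behaviour comes from Kafka’s consumer-group model: every consumer group subscribed to a topic receives its own copy of each event. The sketch below is illustrative only; the topic and group names are assumptions, not the DfE’s.

```python
# Sketch of fan-out via consumer groups: each group receives every event independently.
# Topic and group names are illustrative assumptions.
from confluent_kafka import Consumer

def make_consumer(group_id: str) -> Consumer:
    """Create a consumer in its own group, subscribed to the change-event topic."""
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["policy-and-curriculum-changes"])
    return consumer

# Two independent subscribers: both see every change event published to the topic.
dashboard_consumer = make_consumer("leadership-dashboard")
audit_consumer = make_consumer("audit-log")

for consumer in (dashboard_consumer, audit_consumer):
    msg = consumer.poll(timeout=5.0)
    if msg is not None and msg.error() is None:
        print(msg.value())
    consumer.close()
```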

Confluent Enterprise Platform: The best fit for the DfE’s synchronised data aspirations

Once the department had chosen to move forward with the Kafka architecture, relevant internal stakeholders had to evaluate three data integration tool options: Microsoft’s Azure Integration resources, Confluent Enterprise Platform, and Informatica Integration Cloud Services (IICS). After much deliberation, they selected the Confluent Enterprise Platform for its accessibility, functionality, and scalability.

Confluent’s Enterprise Support Agreement raised the cost of owning and managing the system ever so slightly, but not enough to cancel out its overall value. ‘With Confluent and [Kafka], you have this confidence that your data is consistently up-to-date across platforms’, says Sion. ‘For Head of Data, CMO, Head of Innovation folks, that’s a huge deal. You have the ability to rely on real-time data to make enterprise decisions.’

Confluent Enterprise Platform: Capabilities and Characteristics

• Supports massive Kafka cluster scalability and high availability for enterprise platforms.
• Includes Kafka Connect, a connector framework with over 100 connectors for a variety of data sources and sinks (a sketch of registering a connector follows this list).
• Provides schema validation and strong typing, which limits messaging errors and consumer problems.
• Supports configurable durability guarantees for different client requirements.
• Offers a range of management and monitoring tools to run service operations across all producers and consumers.
• Cloud-platform-agnostic and suitable for integrating services in Gov.UK PaaS and MS Azure.
• Provides Kafka stream processing to enable real-time applications and data transformation.
After finalising his recommendation for Kafka architecture and Confluent Enterprise Platform, Sion felt confident in the department’s future. He’d given the DfE a broad, balanced set of options. Once the department shifts to an event-driven method in which producers push updated data to multiple brokers, its senior leaders can finally focus on using that data to drive improved standards across the UK’s education system. The ability to handle large volumes of data, segment each domain into its own Kafka topic, and use replication to push updates across multiple data sources also allows the team to scale with business demand.
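As a rough sketch of that last point, topics can be created per data domain with a replication factor so that every update is copied across multiple brokers. The topic names, partition counts, and replication factor below are illustrative assumptions.

```python
# Creating per-domain topics with replication using confluent-kafka's AdminClient.
# Topic names, partition counts, and replication factor are illustrative assumptions.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topics = [
    NewTopic("curriculum-changes", num_partitions=6, replication_factor=3),
    NewTopic("attendance-updates", num_partitions=6, replication_factor=3),
    NewTopic("performance-ratings", num_partitions=6, replication_factor=3),
]

# create_topics returns a dict of topic -> future; wait for each to complete.
for topic, future in admin.create_topics(topics).items():
    try:
        future.result()
        print(f"created {topic}")
    except Exception as exc:
        print(f"failed to create {topic}: {exc}")
```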

Cohesive data for a forward-looking organisation

After working with the DfE to optimise its decision-making process, Sion hopes that data and innovation leaders recognise that Kafka architecture is more accessible than ever for companies of all shapes and sizes and across sectors—from energy to education.

‘Implementing Kafka architecture isn’t just for early adopters’, he concludes. ‘It’s for everyone. If you’re struggling with siloed data, you can absolutely use it to generate value for your organisation.’

Accelerate your Kafka adoption with the right support

Get in touch with us today to discover how you can leverage your data to build a faster, more responsive business.

Book a call