Peter Birkholm-Buch

Stuff about Software Engineering

From DevOps to Platform Engineering: How Gaia Transformed Our Approach to Infrastructure, Alignment, and Developer Experience

Introduction

In the world of cloud development, managing infrastructure effectively while maintaining alignment across teams is a constant challenge. Historically, our DevOps team played a pivotal role in provisioning and managing cloud resources, ensuring developers had what they needed to build and deploy solutions. However, this model wasn’t sustainable as the number of projects grew and cloud environments became more complex. We needed a way to streamline infrastructure management without losing sight of alignment across teams and solutions, while also improving the overall Developer Experience (DevEx).

This realization led us to shift our DevOps team from a traditional support role into a platform engineering team, focused on building and maintaining tools that provide a golden path for developers. The result? Gaia—a platform that has radically transformed how we manage cloud infrastructure, maintain alignment throughout the organization, and drastically improve Developer Experience by embedding infrastructure creation into developers’ existing workflows.

The Evolution from DevOps to Platform Engineering

When we started, our DevOps team handled infrastructure provisioning manually and on a request basis. While this ensured quality control, it created bottlenecks as the number of requests grew, leading to slower project deliveries. Developers were often left waiting for infrastructure to be set up, while the DevOps team struggled to keep up with the workload.

This wasn’t a scalable model, so we pivoted. Rather than manually provisioning infrastructure, we built Gaia—a platform that automates infrastructure creation while maintaining alignment with company policies. Gaia represents our “golden path”—a set of pre-built modules that allow developers to provision cloud resources without needing to worry about governance, security, or configuration inconsistencies.

Not only did Gaia eliminate bottlenecks, but it also integrated directly into the GitHub workflow developers were already using, significantly improving Developer Experience. Developers now interact with the same tools they use for coding, making infrastructure requests feel like a natural extension of their development work.

The Remarkable Impact of Gaia on Developer Experience

Gaia’s impact has been nothing short of remarkable. By automating the infrastructure creation process, we’ve effectively removed the need for the DevOps team to manually create infrastructure for developers. Developers now have a self-service capability to quickly and easily provision what they need on their own, directly from within their existing GitHub workflows, without waiting for approval or intervention from the DevOps team.

This seamless integration has significantly improved Developer Experience in several key ways:

  • Familiarity: Developers don’t have to learn new tools or processes to request infrastructure. They use GitHub, the platform they are already familiar with, ensuring minimal friction when interacting with infrastructure.
  • Speed and Efficiency: With Gaia, infrastructure requests are submitted via GitHub pull requests (PRs), allowing developers to spin up resources quickly. This eliminates the lag time that often occurs when requests are handled through manual ticketing systems.
  • Embedded Governance: Developers no longer have to worry about compliance or governance rules. Every infrastructure resource created via Gaia is automatically aligned with company policies, freeing developers to focus entirely on building solutions without getting bogged down in regulatory details.

By embedding infrastructure creation into the developer workflow through GitHub, Gaia significantly boosts DevEx. Developers are empowered to take control of infrastructure setup, while still benefiting from built-in quality and governance checks that ensure alignment with the company’s standards.

The New Focus of Our Platform Engineering Team

With manual infrastructure creation largely eliminated, the role of the DevOps team has shifted to that of a platform engineering team. Their primary focus is now on maintaining Gaia and the shared modules that are used to provision infrastructure. Whenever new infrastructure resources or cloud services are introduced, the team ensures they are incorporated into Gaia in a way that adheres to company policies, ensuring alignment as our cloud architecture evolves.

This centralized approach allows the platform engineering team to ensure that the development process is as smooth as possible, enhancing the overall Developer Experience by constantly improving the tools developers rely on. Developers no longer need to spend time learning about the intricacies of cloud infrastructure or worry about whether their configurations meet governance requirements.

Integrating Infrastructure Creation into the Developer Workflow

One of the most significant achievements of Gaia is how seamlessly it integrates into the developer workflow. As mentioned, we built Gaia to work within a central repository in GitHub, where developers create pull requests to request infrastructure. These PRs are then reviewed and approved by the platform engineering team, ensuring that every infrastructure change aligns with company policies and best practices.

By embedding infrastructure creation into the PR process, we’ve achieved several goals:

  • Speed: Developers can request infrastructure as part of their normal workflow, without delays or waiting for separate approvals.
  • Quality Control: The PR process provides a natural checkpoint for the platform engineering team to ensure consistency and alignment across all teams and solutions.
  • Alignment: Centralizing infrastructure requests in a single repository ensures that all teams are working from the same set of standards, preventing silos and ensuring that every team follows best practices.
  • Enhanced Developer Experience: Since developers no longer need to switch between tools or wait for external teams, the process feels fluid and integrated. This reduces the cognitive load on developers and enables them to focus more on writing code and building features rather than managing infrastructure logistics.

Gaia’s GitHub-based process has streamlined how developers interact with infrastructure, further aligning infrastructure creation with developer workflows and enhancing their experience by reducing friction and improving productivity.

Conclusion

The transition from a traditional DevOps model to a platform engineering team centered around Gaia has been a game changer for us. By providing developers with a golden path for creating infrastructure, we’ve freed up their time to focus on what they do best: building innovative software solutions. At the same time, we’ve ensured that every infrastructure deployment is aligned with our policies and governance frameworks, without the need for constant oversight.

Gaia has made our infrastructure provisioning faster, more reliable, and more scalable, while allowing our platform engineering team to focus on higher-level work—maintaining the tools that enable this. By embedding infrastructure creation into GitHub workflows, we’ve also enhanced Developer Experience, making infrastructure provisioning a natural extension of the development process.

The future of DevOps, for us, lies in platform engineering, where teams enable developers rather than managing infrastructure requests. Alignment and Developer Experience are no longer afterthoughts—they’re built into the process, ensuring that as we scale, we do so efficiently, consistently, and with a developer-centric approach.

Gaia was built by:

Balancing Autonomy and Alignment in Engineering Teams

The Spotify model has often been referenced as a way to structure engineering teams for agility and independence. It promotes business-owned product teams, where engineers report into product owners, and uses guilds to ensure that teams stay aligned on best practices. However, guilds often become more like “book clubs,” where participation is optional and relies on personal time. This happens because business line managers prioritize deliverables over cross-organizational collaboration, making it difficult to maintain alignment at scale.

Meanwhile, Team Topologies offers a different focus, looking at how different types of teams interact and organize. It doesn’t rely on guilds but instead emphasizes reducing dependencies and clarifying team responsibilities.

One of the main reasons I organize engineers into a single reporting line, rather than under product ownership, is to avoid these pitfalls. By centralizing the reporting structure, I can prioritize time for engineers to focus on cross-organizational standards and collaboration, ensuring alignment across teams without relying on optional participation.

The Importance of Alignment and Shared Processes

While models like Spotify emphasize team independence, they sometimes miss the mark on alignment. It’s critical that teams don’t end up siloed, each solving the same problems in different ways, or worse, working against established company practices. This is where alignment on best practices, methods, and tools becomes crucial.

Take the US Navy SEAL teams as an example. They are known for their ability to operate independently, much like Scrum teams. However, what people tend to overlook is that all SEAL teams undergo the same training, use the same equipment, and follow standardized methods and processes. This shared foundation is what allows them to operate seamlessly when they come together, even though they work independently most of the time.

In the same way, my approach ensures that engineering teams can solve problems on their own, but they’re aligned on best practices, tools, and processes across the company. This alignment prevents the issues often seen in the Spotify model, where teams risk becoming too focused on their own product work, losing sight of the bigger organizational picture.

Scrum Teams Need Independence from Authority

In Scrum teams, the issue goes beyond just estimation—it’s about the entire collaboration model. Scrum is designed to foster equal collaboration, where team members work together, estimate tasks, and solve problems without a hierarchy influencing decisions. When someone on the team, such as a Product Owner, is also the boss, this balance is broken. The premise of Scrum, which relies on collective responsibility and open communication, collapses.

If the Product Owner or any other leader on the team has direct authority over the others, it can lead to a situation where estimates are overridden, team members feel pressured to work longer hours, and decisions are driven by power dynamics rather than collaboration. This undermines the core principles of Scrum, where the goal is for teams to self-organize and be empowered to make their own decisions.

By keeping authority structures out of the Scrum team, we ensure that collaboration is truly equal, and that decisions are made based on the team’s expertise and collective input—not on the directives of a boss.

How We Balance Autonomy and Alignment

Instead of organizing engineers strictly around product owners and budgets—like in the Spotify model—we’ve created a framework where engineers report through a central engineering line. This keeps everyone on the same page when it comes to methods and processes. Engineers still work closely with product teams, but they don’t lose sight of the bigger picture: adhering to company-wide standards.

This approach solves a problem common in both the Spotify and Team Topologies models. In Spotify, squads may go off and build things their way, leading to inconsistencies across the organization. In Team Topologies, stream-aligned teams can become too focused on optimizing their flow, which sometimes means diverging from company-wide practices. By maintaining a central engineering line, we keep our teams aligned while still giving them the autonomy they need to innovate and move quickly.

The Result

Our approach strikes a balance. Teams are free to innovate and adapt to the challenges of their product work, but they aren’t reinventing the wheel or deviating from best practices. We’ve managed to avoid the pitfalls of silos and fragmented processes by ensuring that every team operates within a shared framework—just like how SEAL teams can work independently, but they all share the same training, tools, and methods.

At the end of the day, it’s not about limiting autonomy; it’s about creating the right kind of autonomy. Teams should be able to act independently, but they should do so in a way that keeps the organization moving in the same direction. That’s the key to scaling effectively without losing sight of what makes us successful in the first place.

MACH Architecture: The Promise of Speed, The Reality of Integration Complexity

Introduction

In recent years, the MACH principles—Microservices, API-first, Cloud-native, and Headless—has been touted as the future of composable, agile digital platforms. Advocates of MACH argue that it allows organizations to innovate quickly by decoupling core systems and exposing functionality through APIs. However, as appealing as this modular approach is, it doesn’t introduce fundamentally new principles. In fact, it builds on ideas that have been around for decades, particularly from Service-Oriented Architecture (SOA). See also The Four Tenets of SOA.

Yet, the biggest challenge lies not in the principles or technologies themselves but in the complexity of integration. While MACH tends to gloss over integration issues by positioning APIs as the ultimate solution, the reality is much more nuanced—especially when viewed through the lens of Pace Layered Architecture.

MACH vs. SOA: New Technology, Old Principles

MACH principles are often positioned as cutting-edge compared to older architectures like SOA. However, if we break it down:

  • Microservices in MACH are essentially an extension of SOA’s focus on service decomposition.
  • API-first continues SOA’s emphasis on loose coupling through well-defined interfaces.
  • Cloud-native leverages modern cloud infrastructure, but the idea of distributed systems was central to SOA.
  • Headless separates the front-end from the back-end, much like SOA’s separation of presentation and business layers.

In essence, MACH doesn’t introduce new architectural principles but rather modernizes existing ones with updated technology stacks and operational patterns.

The Forgotten Challenge: Integration

While MACH focuses on modularity, flexibility, and speed, it tends to oversimplify the complexities of integration. Simply saying “APIs solve everything” is reductive. APIs are indeed critical, but the actual process of integrating systems—especially in complex, distributed environments—requires addressing far more nuanced challenges. Integration is not a one-size-fits-all solution; it involves handling synchronous, asynchronous, and event-driven communication depending on the use case.

For example:

  • Synchronous vs. Asynchronous: Synchronous APIs may introduce latency or timeout issues, while asynchronous patterns require careful coordination to ensure data consistency across systems.
  • Event-Driven Architectures: While event-driven patterns can reduce complexity at certain levels, they also introduce new challenges, like handling event sequencing and guaranteeing delivery in distributed systems.

For even more complexity watch “I Made Everything Loosely Coupled. Does My App Fall Apart?” by Gregor Hohpe.

Pace Layered Architecture: A Critical Lens

One of the most valuable ways to understand the complexity of integration is through the Pace Layered Architecture model. This model divides systems based on their rate of change:

  • Systems of Record: These foundational systems (like ERP and finance) change slowly and are stable. Integration at this layer is generally easier because the interfaces are well-defined and don’t change often.
  • Systems of Differentiation: These systems allow for more customization and differentiation, such as customer loyalty programs or product catalogs, and change moderately.
  • Systems of Innovation: Fast-moving, innovation-driven systems, such as user interfaces or headless CMS, evolve rapidly.

If we look at the MACH principles through the lens of a Pace Layered Architecture, then the further down the stack, the slower the systems change and the easier the integration becomes. Conversely, at the top of the stack—where MACH promises speed and flexibility—the challenge of integrating diverse systems and managing rapid change becomes significantly harder.

The Reality of Integration Complexity

As the Pace Layered Architecture model shows, systems lower in the stack (such as ERP or finance) are easier to integrate because they change less often. However, the MACH promise of speed, especially in the higher layers (front-end, APIs), requires solving difficult integration problems. 

In the real world, APIs alone don’t automatically resolve the complexities of integrating systems across different domains and layers. You need to account for:

  • Latency and data consistency: Ensuring real-time data syncs across systems that update at different speeds.
  • Governance and ownership: Defining who controls the data and how changes are propagated across services.
  • Error handling and recovery: Building resilience into systems so that failures in one service don’t cascade into others.

Conclusion

MACH architecture brings modern technology to the table, but it doesn’t fundamentally change the underlying principles of service orientation or integration. While APIs are a key enabler of composable systems, they don’t eliminate the complexities of integration—especially in a world where synchronous, asynchronous, and event-driven patterns coexist. 

Understanding these challenges through the lens of Pace Layered Architecture helps us see that the deeper we go into the stack, the easier integration becomes. But achieving the speed and flexibility that MACH promises at the higher layers requires solving some of the hardest integration problems in modern software architecture.

Do you have GitHub Copilot?

Is a question I’ve been getting more and more at job interviews over the past year and when I say yes we’ve been using it for almost two years I see happy faces.

So having access to GitHub Copilot not only is a key decision making factor for software engineers looking to join your organization but also GitHub Copilot Probably Saves 50% of Time for Developers and GitHub Copilot drives better Developer Experience.

GitHub was also named a Leader in the Gartner first-ever Magic Quadrant for AI Code Assistants: https://github.blog/news-insights/company-news/github-named-a-leader-in-the-gartner-first-ever-magic-quadrant-for-ai-code-assistants

So if you’re a Software Engineering Leader there’s really no (business) reason not to get GitHub Copilot (or any other AI Coding Assistant) for your developers – it will (soon) be a requirement by new hires.

Standard: Event Driven Architectures

This page is an example standard which explains how all systems in Software Engineering in Digital Products in Carlsberg must use event driven architectures.

For context please see: Software Architecture Patterns

This means that each system or service that masters data (Order Service, Customer Service etc.) must raise (create, produce, publish, etc) an appropriate event to the central event hub using a specific topic every time a pre-defined action (business or just plain CRUD) occurs on data in that system or service. Systems or services which listens to these events (consumers, subscribers etc) can then based on meta data in the event decide if it’s required to go to the system or service API and get the full data which is associated with the event.

This means that every system or service which we build must support the following architecture pattern:

This is the order of events:

  • The client performs an action through the API on Service A
  • The change of state on the data in the Service causes the Service to raise an event on the central event hub on a specific topic B
  • The event hub notifies all the consumers that an event on topic B has occurred
  • The consumer examines the event and decides if it’s required to call the producers API to get the entire data of the event
  • The consumer call the producers API with a direct link to get the entire data of the event (ideally using a unique identifier so that the data can be found in the in storage layer without having to search for it)

Building a Better Software Practice: A Guide to Policies, Rules, Standards, Processes and Guidelines

Introduction

When organizations add more people (scale up) it quickly becomes impossible to have everyone sitting in the same room to discuss and agree on matters and share the same approach to doing business.

This is my version of a Software Engineering Quality Handbook where it’s easy to get an overview of how an organization can use a well-structured hierarchy for describing how work is done which is crucial for ensuring clarity, consistency, and compliance within a team. The handbook is built up with 5 layers:

  • Policies: Broad statements that define the organization’s principles and compliance expectations.
  • Rules: Strict directives that must be followed to ensure specific outcomes in certain situations.
  • Standards: Mandatory technical and operational requirements that ensure consistency and quality.
  • Processes: Detailed, step-by-step instructions that specify how to perform specific tasks.
  • Guidelines: Recommended best practices that guide decision-making but are not mandatory.

Each layer of this structure supports the one above it, providing more detail and specificity. Policies and rules sets the foundation and create alignment within the goals of Software Engineering, while standards, procedures and guidelines provide the specifics on how to achieve those goals effectively.

This structure also facilitates easier updates and management. Policies and standards often require more thorough review and approval processes due to their impact and scope, whereas guidelines and procedures can be more dynamic, allowing for quicker adaptations to new technologies or methodologies..

Policies :scroll:

  • Definition: Broad, high-level statements of principles, goals, and overall expectations of the organization.
  • Purpose: To establish core values, company vision, and overarching compliance standards.
  • Lifecycle: Reviewed annually to ensure alignment with evolving legal, technological, and business conditions.
  • Compliance: Mandatory; non-compliance can result in significant legal and business risks.
  • Icon: :scroll: (scroll). It symbolizes official documents, which aligns well with the formal, foundational nature of policies. It’s often used to represent ancient laws and decrees, making it fitting for the foundational rules and standards within an organization.
  • Example: A policy might state that all software developed must comply with GDPR and other relevant data protection regulations.

Rules :scales:

  • Definition: Explicit, often granular directives that are compulsory and usually narrow in scope.
  • Purpose: To ensure specific outcomes or behaviors in particular scenarios.
  • Lifecycle: Reviewed frequently (e.g., annually) to refine and ensure they address current challenges effectively and are adhered to.
  • Compliance: Strictly mandatory; non-negotiable and must be followed exactly as prescribed.
  • Icon: :scales: (scales). It represents justice, balance, and fairness, aligning with the concept of rules ensuring specific outcomes and behaviors are maintained in a structured and equitable manner within an organization.
  • Example: A rule might state that commit messages must include a ticket number from the issue tracker.

Standards :straight_ruler:

  • Definition: Specific mandatory requirements for how certain policies are to be implemented.
  • Purpose: To ensure consistency and quality across all projects by defining technical and operational criteria that must be met.
  • Lifecycle: Updated biennially or as needed to reflect new industry practices and technological advancements.
  • Compliance: Mandatory; essential for maintaining quality and uniformity in outputs.
  • Icon: :straight_ruler: (ruler). It symbolizes measurement, precision, and consistency, which align closely with the idea of standards setting specific requirements and guidelines to ensure quality and uniformity across projects.
  • Example: A standard might specify that all code must undergo peer review or adhere to a particular coding standard like ISO/IEC 27001 for security.

Processes :tools:

  • Definition: Detailed, step-by-step instructions that must be followed in specific situations.
  • Purpose: To ensure activities are performed consistently and effectively, especially for complex or critical tasks.
  • Lifecycle: Regularly tested and updated, ideally after major project milestones or annually, to adapt to process improvements and feedback.
  • Compliance: Mandatory where specified; critical for ensuring consistency and reliability of specific operations.
  • Icon: :tools: (hammer and wrench). It symbolizes tools and construction, fitting for the concept of processes as they provide detailed, step-by-step instructions necessary to construct or execute specific tasks systematically and efficiently.
  • Example: A process might outline the steps for a release process, including code freezes, testing protocols, and deployment checks.

Guidelines :compass:

  • Definition: Recommended approaches that are not mandatory but are suggested as best practices.
  • Purpose: To guide developers in their decision-making processes by providing options that align with best practices.
  • Lifecycle: Evaluated and possibly revised every two to three years, or more frequently to incorporate innovative techniques and tools.
  • Compliance: Optional; best practice recommendations that are advisable but not required.
  • Icon: :compass: (compass). It symbolizes guidance, direction, and navigation, which aligns well with the purpose of guidelines to provide recommended approaches and best practices that help steer decisions in software development.
  • Example: Guidelines might suggest using certain frameworks or libraries that enhance productivity and maintainability but are not strictly required.

GitHub Copilot Probably Saves 50% of Time for Developers

Introduction

Recently GitHub released the GitHub Copilot Metrics API which provides customers the ability to view how Copilot is used and as usual someone created an Open Source tool to view the data: github-copilot-resources/copilot-metrics-viewer.

So let’s take a look at the usage of Copilot in Software Engineering in Carlsberg from end of May to end of June 2024.

I’m focusing on the following three metrics:

  • Total Suggestions
  • Total Lines Suggested
  • Acceptance Rate

As I think they are useful for understanding how effective Copilot is and I would like to get closer to an actual understanding of the usefulnes of Copilot rather than the broad statement offered by both GitHub and our own developers that it saves 50% of their time.

The missing data in the charts is due to an error in the GitHub data pipeline at the time of writing and data will be made available at a later stage.

The low usage in the middle of June is due to some public holidays with lots of people taking time off.

Total Suggestions

Total Lines Suggested: Showcases the total number of lines of code suggested by GitHub Copilot. This gives an idea of the volume of code generation and assistance provided.

Total Lines Suggested

Total Lines Accepted: The total lines of code accepted by users (full acceptances) offering insights into how much of the suggested code is actually being utilized incorporated to the codebase.

Acceptance Rate

Acceptance Rate: This metric represents the ratio of accepted lines to the total lines suggested by GitHub Copilot. This rate is an indicator of the relevance and usefulness of Copilot’s suggestions.

Conclusion

The overall acceptance rate is about 20% which resonates with my experience as Copilot tends to either slightly miss the objective and/or be verbose so that you have to trim/change a lot of code. So if Copilot suggests 100 lines of code you end up accepting 20.

Does this then align with the statements from developers in Software Engineering and GitHub which claim that you save 50% of time using Copilot?

Clearly reviewing and changing code is faster than writing, so even if you end up only using 20% of the suggested code, you will save time.

Unfortunately we don’t track actual time to complete tasks in Jira, so we don’t have hard data to prove the claim.

But is the claim true? Probably – however, I’m 100% convinced that GitHub Copilot drives better Developer Experience.

Patterns for Artificial Intelligence Solutions

Introduction

This is an attempt at trying to create some highlevel patterns for Artificial Intelligence (AI) solutions in order to be able to more easily choose a pattern based on type and problem area.

The goal is to have something like Software Architecture Patterns which again is based on Useful resources on Software & Systems Architecture so that we quickly can choose how to solve problems with AI. This page is also a companion to Four and not 3 Categories of AI Solutions as the patterns on this page is meant for AI-Brewers.

Emerging Types of AI Solution Patterns

  • Chatbots with LLMs: Automate interactions and provide instant, contextually relevant responses, enhancing customer service, information retrieval, and user engagement across various domains like e-commerce and healthcare.
  • Chaining LLMs: Link multiple LLMs in sequence, leveraging their specialized capabilities for more nuanced and accurate solutions, enabling sophisticated workflows where each model performs tasks it excels at.
  • State machines and directed graphs: This approach introduces cycles, meaning the system can loop back and reconsider previous steps, which allows for more complex decision-making and adaptive behaviors. The state machine can maintain a state based on previous interactions, which can influence future decisions and actions.
  • Orchestrating LLMs: Simplify the integration and management of multiple LLMs to work in harmony, improving the development, performance, and scalability of AI-driven applications by leveraging the strengths of diverse models.

Would be a neglect not to mention RAG here although more of a feature than a solution pattern:

  • Retrieval Augmentation Generation (RAG): RAG combines the power of LLMs with a retrieval mechanism to enhance response accuracy and relevance. By fetching information from a database or collection of documents before generating a response, RAG models can provide answers that are more detailed and contextually appropriate, drawing from a wide range of sources. This approach significantly improves performance on tasks requiring specific knowledge or factual information, making RAG models particularly useful for applications like question answering and content creation.

Graphs and orchestration is also commonly referred to as “Agentic” architectures.

Chatbots with LLMs

Bots and chat-based interfaces powered by Large Language Models (LLMs) address a wide array of problem areas by automating interactions and processing natural language inputs to provide instant, contextually relevant responses.

These AI-driven solutions revolutionize customer service, information retrieval, and interactive experiences by enabling scalable, 24/7 availability without the need for human intervention in every instance.

They excel in understanding and generating human-like text, making them ideal for answering queries, offering recommendations, facilitating transactions, and supporting users in navigating complex information landscapes.

Furthermore, they significantly enhance user engagement by providing personalized interactions, thereby improving satisfaction and efficiency in areas such as e-commerce, education, healthcare, and beyond. By harnessing the capabilities of LLMs, bots and chat interfaces can decode intricate user intents, engage in meaningful dialogues, and automate tasks that traditionally required human intelligence, thus solving key challenges in accessibility, scalability, and automation in digital services.

Chaining LLMs

Chaining LLMs involves linking multiple LLMs in sequence to process information or solve problems in a stepwise manner, where the output of one model becomes the input for the next. This technique utilizes the specialized capabilities of different LLMs to achieve more complex, nuanced, and accurate solutions than could be provided by any single LLM.

Through this approach, developers can create advanced workflows in which each model is tasked with a specific function it excels at, ranging from understanding context to generating content or refining answers. This method significantly enhances the effectiveness and efficiency of AI systems, allowing them to address a wider variety of tasks with greater precision and contextual relevance. Chaining LLMs thus represents a strategic approach to leveraging the complementary strengths of various models, paving the way for more intelligent, adaptable, and capable AI-driven solutions.

Chaining LLMs is particularly effective for solving problems that benefit from a multi-step approach, where each step might require a different kind of processing or expertise. Here are some examples of problems typically solved using chaining:

  • Complex Query Resolution: Simplifying and addressing multifaceted queries through a stepwise refinement process.
  • Content Creation and Refinement: Generating drafts and then improving them through editing, summarization, or styling in successive steps.
  • Decision Support Systems: Deriving insights and suggesting actions through a sequential analysis and decision-making process.
  • Educational Tutoring and Adaptive Learning: Providing personalized educational feedback and instruction based on initial assessments.

These examples highlight the versatility of chaining LLMs, enabling solutions that are not only more sophisticated and tailored but also capable of handling tasks that require depth, precision, and a layered understanding of context.

Directed Graphs

State machines (a directed graph) are abstract machines that can be in exactly one of a finite number of states at any given time. In the context of LLMs and LangChain, a state machine would manage the flow of interactions with the LLM, keeping track of the context and state of conversations or processes.

  • LangGraph is designed to facilitate the creation, expansion, and querying of knowledge graphs using language models. It uniquely combines the representational power of knowledge graphs—structures that encode information in a graph format where nodes represent entities and edges represent relationships between entities—with the generative and understanding capabilities of language models. This integration allows for sophisticated semantic reasoning, enabling applications to derive insights and answers from a rich, interconnected dataset.
  • The primary value of LangGraph lies in its ability to leverage the contextual awareness and depth of language models to enrich knowledge graphs. This makes it particularly well-suited for applications requiring complex query answering, semantic search, and dynamic knowledge base expansion. It’s about not just processing language but understanding and organizing information in a way that mirrors human cognition.

Orchestrating LLMs

A framework for orchestrating LLMs is aimed at tackling the intricate challenges of integrating and managing multiple LLMs to work in harmony. Such a framework simplifies the process of combining the capabilities of diverse LLMs, enabling developers to construct more complex and efficient AI-driven solutions. It offers tools and methodologies for seamless integration, enhancing the development process, and allowing for the creation of applications that leverage the strengths of various LLMs. This not only streamlines the development of sophisticated applications but also boosts their performance and scalability, facilitating the customization of AI solutions to meet specific needs and contexts.

Orchestrating LLMs involves coordinating multiple models to work together efficiently, often in parallel or in a dynamic sequence, to tackle complex tasks. This approach is particularly useful for problems that benefit from the combined capabilities of different LLMs, each bringing its unique strength to the solution. Here are some examples of problems typically solved using orchestration:

  • Multi-domain Knowledge Integration: Coordinating specialized LLMs to offer solutions that require expertise across various fields.
  • Personalized User Experiences: Dynamically combining LLM outputs to customize interactions according to user data.
  • Complex Workflow Automation: Utilizing different LLMs for distinct tasks within a broader workflow, optimizing for efficiency and effectiveness.
  • Advanced Customer Support Systems: Integrating various LLMs to understand, process, and respond to customer inquiries in a nuanced and effective manner.

Orchestration enables the leveraging of multiple LLMs’ strengths in a coordinated manner, offering solutions that are more versatile, scalable, and capable of addressing the multifaceted nature of real-world problems.

Software Architecture Patterns

Introduction

The following is an subset of software architecture patterns, which tend to be referenced when academic discussions around patterns arise. The following are my comments.

Patterns

Application Architecture

Microservice

The microservice pattern comes from domain-driven design where in particular the concept of bounded context came to be the decoupling of services. The post Microservices by Martin Fowler also played a large part in naming this pattern.

Drivers:

  • Loosely coupled with other services – enables a team to work independently the majority of time on their service(s) without being impacted by changes to other services and without affecting other services
  • Independently deployable – enables a team to deploy their service without having to coordinate with other teams
  • Capable of being developed by a small team – essential for high productivity by avoiding the high communication head of large teams
  • The application must be easy to understand and modify
  • You must run multiple instances of the application on multiple machines in order to satisfy scalability and availability requirements
  • You want to take advantage of emerging technologies (frameworks, programming languages, etc)

Problems:

  • Managing dependencies
  • Deployments of the entire system may become complex
  • Developers must implement the inter-service communication mechanism and deal with partial failure
  • Implementing requests that span multiple services is more difficult
  • Testing the interactions between services is more difficult
  • Implementing requests that span multiple services requires careful coordination between the teams

Database

Database per Service

This pattern is basically the natural follow-on to choosing the microservice application architecture pattern, where if a service is to become independant, then it must have it’s own independant data layer.

Drivers:

  • Services must be loosely coupled so that they can be developed, deployed and scaled independently
  • Databases must sometimes be replicated in order to scale
  • Different services have different data storage requirements, like relational database or NoSQL

Problems:

  • How to manage consistency across services
  • Who owns (masters) data?

Messaging

Messaging is a communications pattern which uses asynchronous messaging to replace the synchronous style of request/response used in most REST-style APIs. Most common styles of asynchronous messaging are:

  • Notifications – a sender sends a message a recipient but does not expect a reply. Nor is one sent.
  • Request/asynchronous response – a service sends a request message to a recipient and expects to receive a reply message eventually
  • Publish/subscribe – a service publishes a message to zero or more recipients
  • Publish/asynchronous response – a service publishes a request to one or recipients, some of whom send back a reply

In the following there’s no differentiation between “event”-driven and “data”-driven, as a message will always contain the full message body.

Event Driven with full Messages

Drivers:

  • The complete body of the message must be sent in each event
  • The broker may or may not allow for queries against the messages
  • Extreme loose coupling as the sender is completely decoupled from the receiver

Problems:

  • Subscriber/consumer overload which requires buffering so that events are not lost
  • The broker/mediator must always be available
  • Almost always creates mixed messaging styles with events for downstream and request/response across and upstream which requires carefull planning

Broker

Event driven messaging with a broker implies that all messages are delivered through a central broker but that there’s no processing control flow and messages are delivered using a publish/subscribe pattern.

Drivers:

  • Scalability

Problems:

  • How to handle orchestration? With “normal” synchronous request/response messaging services orchestrate business processes in the order that they are called – with broker driven messaging everybody potentially could get the same message at the same time, so who orchestrates the overall process?

Mediater

A mediater expands on the broker with support for business process workflows usually with support for BPEL.

Drivers:

  • Business processes change often
  • Traceability and governance is important

Problems:

  • Management and governance of platform takes time
  • Requires turnkey solutions from major vendors
  • Not really available as a cloud solution

Event Driven with Notifications

Drivers:

  • You don’t know how many subscribers/consumers of your events there will be, so you must preserve bandwidth
  • The size of the payload in the event is limited
  • A subset of the events doesn’t require the complete body of the message to be sent
  • Subscribers explicitly want to chose which message bodies they want to pull

Problems:

  • Requires a two-step dance where the consumer of an event must issue a request to the central broker and ask for the full body of the message
« Older posts

© 2024 Peter Birkholm-Buch

Theme by Anders NorenUp ↑