Is a question I’ve been getting more and more at job interviews over the past year, and when I say yes, we’ve been using it for almost two years, I see happy faces.
So if you’re a Software Engineering Leader there’s really no (business) reason not to get GitHub Copilot (or any other AI Coding Assistant) for your developers – it will (soon) be a requirement from new hires.
This page is an example standard which explains how all systems in Software Engineering in Digital Products in Carlsberg must use event driven architectures.
This means that each system or service that masters data (Order Service, Customer Service etc.) must raise (create, produce, publish, etc.) an appropriate event to the central event hub using a specific topic every time a pre-defined action (business or just plain CRUD) occurs on data in that system or service. Systems or services which listen to these events (consumers, subscribers etc.) can then, based on metadata in the event, decide if it’s required to go to the system or service API and get the full data which is associated with the event.
This means that every system or service which we build must support the following architecture pattern:
This is the order of events:
The client performs an action through the API on Service A
The change of state on the data in the Service causes the Service to raise an event on the central event hub on a specific topic B
The event hub notifies all the consumers that an event on topic B has occurred
The consumer examines the event and decides if it’s required to call the producer’s API to get the entire data of the event
The consumer calls the producer’s API with a direct link to get the entire data of the event (ideally using a unique identifier so that the data can be found in the storage layer without having to search for it)
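The five steps above can be sketched in a few lines of Python. This is a minimal in-memory sketch under stated assumptions, not an actual implementation of the standard: `EventHub`, `OrderService`, `InvoiceConsumer`, the topic name `order.updated` and the `/orders/<id>` link format are all hypothetical, and a plain method call stands in for the HTTP request to the producer's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """A thin event: metadata only, no full payload."""
    topic: str         # hypothetical topic name, e.g. "order.updated"
    action: str        # the pre-defined action that occurred
    resource_id: str   # unique identifier of the changed data
    resource_url: str  # direct link to the full data on the producer's API


class EventHub:
    """In-memory stand-in for the central event hub."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, event: Event) -> None:
        # Step 3: notify every consumer subscribed to the topic.
        for handler in self._subscribers.get(event.topic, []):
            handler(event)


class OrderService:
    """Producer: masters order data and raises an event on every change."""

    def __init__(self, hub: EventHub) -> None:
        self._hub = hub
        self._orders: Dict[str, dict] = {}

    def update_order(self, order_id: str, data: dict) -> None:
        # Steps 1-2: a client action changes state, so the service raises an event.
        self._orders[order_id] = data
        self._hub.publish(
            Event("order.updated", "updated", order_id, f"/orders/{order_id}")
        )

    def get(self, url: str) -> dict:
        # Stand-in for an HTTP GET: a direct lookup by identifier, no search.
        return self._orders[url.rsplit("/", 1)[-1]]


class InvoiceConsumer:
    """Consumer: inspects event metadata, then pulls full data if needed."""

    def __init__(self, producer: OrderService) -> None:
        self._producer = producer
        self.fetched: List[dict] = []

    def on_event(self, event: Event) -> None:
        # Step 4: decide from the metadata alone whether the data is needed.
        if event.action != "updated":
            return
        # Step 5: follow the direct link to get the entire data.
        self.fetched.append(self._producer.get(event.resource_url))


hub = EventHub()
orders = OrderService(hub)
consumer = InvoiceConsumer(orders)
hub.subscribe("order.updated", consumer.on_event)
orders.update_order("42", {"id": "42", "status": "shipped"})
```

The event itself carries only metadata and a link, so the hub never needs to transport or store the full order.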
When organizations add more people (scale up) it quickly becomes impossible to have everyone sitting in the same room to discuss and agree on matters and share the same approach to doing business.
This is my version of a framework for a Software Engineering Quality Handbook. It gives an overview of how an organization can use a well-structured hierarchy to describe how work is done, which is crucial for ensuring clarity, consistency, and compliance within a team. The handbook is built up of six layers:
Policies: Broad statements that define the organization’s principles and compliance expectations.
Rules: Strict directives that must be followed to ensure specific outcomes in certain situations.
Standards: Mandatory technical and operational requirements that ensure consistency and quality.
Processes: Detailed, step-by-step instructions that specify how to perform specific tasks.
Guidelines: Recommended best practices that guide decision-making but are not mandatory.
Governance and Compliance Controls: A collection of key controls and processes designed to ensure adherence to governance standards and compliance with both internal policies and external regulations.
Each layer of this structure supports the one above it, providing more detail and specificity. Policies and rules set the foundation and create alignment with the goals of Software Engineering, while standards, processes and guidelines provide the specifics on how to achieve those goals effectively.
This structure also facilitates easier updates and management. Policies and standards often require more thorough review and approval processes due to their impact and scope, whereas guidelines and processes can be more dynamic, allowing for quicker adaptations to new technologies or methodologies.
Policies
Definition: Broad, high-level statements of principles, goals, and overall expectations of the organization.
Purpose: To establish core values, company vision, and overarching compliance standards.
Lifecycle: Reviewed annually to ensure alignment with evolving legal, technological, and business conditions.
Compliance: Mandatory; non-compliance can result in significant legal and business risks.
Icon: (scroll). It symbolizes official documents, which aligns well with the formal, foundational nature of policies. It’s often used to represent ancient laws and decrees, making it fitting for the foundational rules and standards within an organization.
Example: A policy might state that all software developed must comply with GDPR and other relevant data protection regulations.
Rules
Definition: Explicit, often granular directives that are compulsory and usually narrow in scope.
Purpose: To ensure specific outcomes or behaviors in particular scenarios.
Lifecycle: Reviewed frequently (e.g., annually) to refine and ensure they address current challenges effectively and are adhered to.
Compliance: Strictly mandatory; non-negotiable and must be followed exactly as prescribed.
Icon: (scales). It represents justice, balance, and fairness, aligning with the concept of rules ensuring specific outcomes and behaviors are maintained in a structured and equitable manner within an organization.
Example: A rule might state that commit messages must include a ticket number from the issue tracker.
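A rule like this is easy to check automatically. The sketch below assumes a Jira-style ticket format (an uppercase project key, a hyphen and a number, such as `ABC-123`); the pattern and the function name `has_ticket_number` are illustrative assumptions, not part of the handbook.

```python
import re

# Assumed ticket format: uppercase project key, hyphen, number (e.g. "ABC-123").
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def has_ticket_number(commit_message: str) -> bool:
    """Return True if the message references at least one ticket."""
    return TICKET_PATTERN.search(commit_message) is not None
```

A check like this could run in a commit hook or a CI step. Note that the simple pattern also matches strings such as `UTF-8`, so a real check would probably restrict the project keys to a known list.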
Standards
Definition: Specific mandatory requirements for how certain policies are to be implemented.
Purpose: To ensure consistency and quality across all projects by defining technical and operational criteria that must be met.
Lifecycle: Updated biennially or as needed to reflect new industry practices and technological advancements.
Compliance: Mandatory; essential for maintaining quality and uniformity in outputs.
Icon: (ruler). It symbolizes measurement, precision, and consistency, which align closely with the idea of standards setting specific requirements and guidelines to ensure quality and uniformity across projects.
Example: A standard might specify that all code must undergo peer review, or that systems must adhere to a particular standard such as ISO/IEC 27001 for information security management.
Processes
Definition: Detailed, step-by-step instructions that must be followed in specific situations.
Purpose: To ensure activities are performed consistently and effectively, especially for complex or critical tasks.
Lifecycle: Regularly tested and updated, ideally after major project milestones or annually, to adapt to process improvements and feedback.
Compliance: Mandatory where specified; critical for ensuring consistency and reliability of specific operations.
Icon: (hammer and wrench). It symbolizes tools and construction, fitting for the concept of processes as they provide detailed, step-by-step instructions necessary to construct or execute specific tasks systematically and efficiently.
Example: A process might outline the steps for a release process, including code freezes, testing protocols, and deployment checks.
Guidelines
Definition: Recommended approaches that are not mandatory but are suggested as best practices.
Purpose: To guide developers in their decision-making processes by providing options that align with best practices.
Lifecycle: Evaluated and possibly revised every two to three years, or more frequently to incorporate innovative techniques and tools.
Compliance: Optional; best practice recommendations that are advisable but not required.
Icon: (compass). It symbolizes guidance, direction, and navigation, which aligns well with the purpose of guidelines to provide recommended approaches and best practices that help steer decisions in software development.
Example: Guidelines might suggest using certain frameworks or libraries that enhance productivity and maintainability but are not strictly required.
Governance and Compliance Controls
Definition: A collection of key controls and processes designed to ensure adherence to governance standards and compliance with both internal policies and external regulations.
Purpose: To track compliance with the mandatory sections of the handbook, including policies, rules, and standards. This section ensures that all critical governance measures are documented and enforced, providing oversight on adherence to the core practices that uphold the quality and security of our software development process.
Lifecycle: Controls are reviewed quarterly or following significant changes in regulations, technology, or business needs to ensure they remain effective and relevant.
Compliance: Mandatory for all teams. Any deviation or non-compliance may result in audits, corrective actions, or further review, ensuring alignment with organizational standards and legal requirements.
Icon: (shield). It symbolizes protection and security, representing the safeguarding of our software engineering practices through strong governance and compliance.
Example: A control might track instances where code merges bypass branch protection, ensuring that changes still follow the correct peer review process to maintain code integrity.
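As a rough sketch of such a control, the snippet below filters a list of merge records for merges that bypassed branch protection without the required approvals. The `Merge` record and its fields are hypothetical; in practice the data would have to come from the source control platform's audit log or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Merge:
    """Hypothetical merge record, e.g. exported from an audit log."""
    repo: str
    pull_request: int
    approvals: int
    bypassed_protection: bool


def find_violations(merges: List[Merge], required_approvals: int = 1) -> List[Merge]:
    """Return merges that bypassed branch protection without enough approvals."""
    return [
        m for m in merges
        if m.bypassed_protection and m.approvals < required_approvals
    ]


merges = [
    Merge("orders", 101, approvals=2, bypassed_protection=False),
    Merge("orders", 102, approvals=0, bypassed_protection=True),
    Merge("customers", 55, approvals=1, bypassed_protection=True),
]
violations = find_violations(merges)
```

Only the second merge is reported: it bypassed protection and had no review, whereas the third bypassed protection but was still peer reviewed.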
Recently GitHub released the GitHub Copilot Metrics API which provides customers the ability to view how Copilot is used and as usual someone created an Open Source tool to view the data: github-copilot-resources/copilot-metrics-viewer.
So let’s take a look at the usage of Copilot in Software Engineering in Carlsberg from end of May to end of June 2024.
I’m focusing on the following three metrics:
Total Suggestions
Total Lines Suggested
Acceptance Rate
I think they are useful for understanding how effective Copilot is, and I would like to get closer to an actual understanding of the usefulness of Copilot rather than the broad statement offered by both GitHub and our own developers that it saves 50% of their time.
The missing data in the charts is due to an error in the GitHub data pipeline at the time of writing and data will be made available at a later stage.
The low usage in the middle of June is due to some public holidays with lots of people taking time off.
Total Suggestions
Total Lines Suggested: Showcases the total number of lines of code suggested by GitHub Copilot. This gives an idea of the volume of code generation and assistance provided.
Total Lines Suggested
Total Lines Accepted: The total lines of code accepted by users (full acceptances), offering insights into how much of the suggested code is actually being incorporated into the codebase.
Acceptance Rate
Acceptance Rate: This metric represents the ratio of accepted lines to the total lines suggested by GitHub Copilot. This rate is an indicator of the relevance and usefulness of Copilot’s suggestions.
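As a worked example of that ratio, the sketch below computes the acceptance rate over a period from daily counts. The daily numbers are made up for illustration and are not taken from the charts; the function names are also assumptions.

```python
from typing import List, Tuple


def acceptance_rate(lines_accepted: int, lines_suggested: int) -> float:
    """Ratio of accepted lines to suggested lines (0.0 if nothing was suggested)."""
    if lines_suggested == 0:
        return 0.0
    return lines_accepted / lines_suggested


def period_acceptance_rate(days: List[Tuple[int, int]]) -> float:
    """Aggregate rate over (lines_suggested, lines_accepted) pairs per day.

    Totals are summed first, so days with little usage do not skew the result
    the way an average of daily rates would.
    """
    total_suggested = sum(suggested for suggested, _ in days)
    total_accepted = sum(accepted for _, accepted in days)
    return acceptance_rate(total_accepted, total_suggested)


# Illustrative numbers only: (lines suggested, lines accepted) per day.
sample_days = [(120, 30), (80, 10), (0, 0)]
rate = period_acceptance_rate(sample_days)
```

With these sample numbers, 40 of 200 suggested lines were accepted, which gives a rate of 0.2.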
Conclusion
The overall acceptance rate is about 20% which resonates with my experience as Copilot tends to either slightly miss the objective and/or be verbose so that you have to trim/change a lot of code. So if Copilot suggests 100 lines of code you end up accepting 20.
Does this then align with the statements from developers in Software Engineering and GitHub which claim that you save 50% of time using Copilot?
Clearly reviewing and changing code is faster than writing, so even if you end up only using 20% of the suggested code, you will save time.
Unfortunately we don’t track actual time to complete tasks in Jira, so we don’t have hard data to prove the claim.
This is an attempt at creating some high-level patterns for Artificial Intelligence (AI) solutions in order to be able to more easily choose a pattern based on type and problem area.
Chatbots with LLMs: Automate interactions and provide instant, contextually relevant responses, enhancing customer service, information retrieval, and user engagement across various domains like e-commerce and healthcare.
Chaining LLMs: Link multiple LLMs in sequence, leveraging their specialized capabilities for more nuanced and accurate solutions, enabling sophisticated workflows where each model performs tasks it excels at.
State machines and directed graphs: This approach introduces cycles, meaning the system can loop back and reconsider previous steps, which allows for more complex decision-making and adaptive behaviors. The state machine can maintain a state based on previous interactions, which can influence future decisions and actions.
Orchestrating LLMs: Simplify the integration and management of multiple LLMs to work in harmony, improving the development, performance, and scalability of AI-driven applications by leveraging the strengths of diverse models.
It would be negligent not to mention RAG here, although it is more of a feature than a solution pattern:
Retrieval-Augmented Generation (RAG): RAG combines the power of LLMs with a retrieval mechanism to enhance response accuracy and relevance. By fetching information from a database or collection of documents before generating a response, RAG models can provide answers that are more detailed and contextually appropriate, drawing from a wide range of sources. This approach significantly improves performance on tasks requiring specific knowledge or factual information, making RAG models particularly useful for applications like question answering and content creation.
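The retrieve-then-generate flow can be sketched as follows. This is a toy illustration under stated assumptions: retrieval is a simple word-overlap score rather than the embedding search a real system would use, the documents are invented, and the final LLM call is left out, so the sketch stops at the prompt that would be sent to the model.

```python
import re
from typing import List, Set

# Invented example documents standing in for a document collection.
DOCUMENTS = [
    "Policies are reviewed annually.",
    "Guidelines are optional best practices.",
    "Standards are updated every two years.",
]


def tokens(text: str) -> Set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the question."""
    question_words = tokens(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokens(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context: List[str]) -> str:
    """Combine the retrieved context and the question into one prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


question = "How often are policies reviewed?"
context = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, context)  # this is what would go to the LLM
```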
Graphs and orchestration are also commonly referred to as “Agentic” architectures.
Chatbots with LLMs
Bots and chat-based interfaces powered by Large Language Models (LLMs) address a wide array of problem areas by automating interactions and processing natural language inputs to provide instant, contextually relevant responses.
These AI-driven solutions revolutionize customer service, information retrieval, and interactive experiences by enabling scalable, 24/7 availability without the need for human intervention in every instance.
They excel in understanding and generating human-like text, making them ideal for answering queries, offering recommendations, facilitating transactions, and supporting users in navigating complex information landscapes.
Furthermore, they significantly enhance user engagement by providing personalized interactions, thereby improving satisfaction and efficiency in areas such as e-commerce, education, healthcare, and beyond. By harnessing the capabilities of LLMs, bots and chat interfaces can decode intricate user intents, engage in meaningful dialogues, and automate tasks that traditionally required human intelligence, thus solving key challenges in accessibility, scalability, and automation in digital services.
Chaining LLMs
Chaining LLMs involves linking multiple LLMs in sequence to process information or solve problems in a stepwise manner, where the output of one model becomes the input for the next. This technique utilizes the specialized capabilities of different LLMs to achieve more complex, nuanced, and accurate solutions than could be provided by any single LLM.
Through this approach, developers can create advanced workflows in which each model is tasked with a specific function it excels at, ranging from understanding context to generating content or refining answers. This method significantly enhances the effectiveness and efficiency of AI systems, allowing them to address a wider variety of tasks with greater precision and contextual relevance. Chaining LLMs thus represents a strategic approach to leveraging the complementary strengths of various models, paving the way for more intelligent, adaptable, and capable AI-driven solutions.
Chaining LLMs is particularly effective for solving problems that benefit from a multi-step approach, where each step might require a different kind of processing or expertise. Here are some examples of problems typically solved using chaining:
Complex Query Resolution: Simplifying and addressing multifaceted queries through a stepwise refinement process.
Content Creation and Refinement: Generating drafts and then improving them through editing, summarization, or styling in successive steps.
Decision Support Systems: Deriving insights and suggesting actions through a sequential analysis and decision-making process.
Educational Tutoring and Adaptive Learning: Providing personalized educational feedback and instruction based on initial assessments.
These examples highlight the versatility of chaining LLMs, enabling solutions that are not only more sophisticated and tailored but also capable of handling tasks that require depth, precision, and a layered understanding of context.
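The core mechanic of chaining, where the output of one model becomes the input of the next, can be sketched like this. The three "models" are plain stub functions standing in for real LLM calls, and all names (`chain`, `draft`, `summarise`, `restyle`) are hypothetical.

```python
from typing import Callable, List

# A "model" here is anything that maps text to text; real LLM calls would fit too.
Model = Callable[[str], str]


def chain(models: List[Model]) -> Model:
    """Compose models so each one receives the previous model's output."""
    def run(text: str) -> str:
        for model in models:
            text = model(text)
        return text
    return run


# Stubs standing in for specialised LLMs.
def draft(topic: str) -> str:
    return f"Draft about {topic}. It has extra detail."


def summarise(text: str) -> str:
    # Keep only the first sentence.
    return text.split(".")[0] + "."


def restyle(text: str) -> str:
    return text.upper()


pipeline = chain([draft, summarise, restyle])
result = pipeline("event hubs")
```

This mirrors the content creation and refinement example: one step generates a draft and later steps shorten and restyle it.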
Directed Graphs
A state machine, which can be drawn as a directed graph, is an abstract machine that can be in exactly one of a finite number of states at any given time. In the context of LLMs and LangChain, a state machine would manage the flow of interactions with the LLM, keeping track of the context and state of conversations or processes.
LangGraph is a library from the LangChain ecosystem for building stateful, multi-step LLM applications as graphs, where nodes represent steps (such as an LLM call or a tool invocation) and edges define which step runs next. Unlike a simple chain, the graph may contain cycles, so a workflow can loop back to an earlier step, and a shared state is passed between nodes and updated along the way. Note that this is a workflow graph, not a knowledge graph of entities and relationships.
The primary value of LangGraph lies in making that control flow explicit: state persists across steps, and conditional edges decide at runtime whether to continue, retry or stop. This makes it particularly well-suited for agent-style applications that need to reconsider previous steps based on intermediate results.
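The state machine idea, with state carried between steps and a cycle that loops back to an earlier step, can be sketched in plain Python without any framework. The node names, the `attempts` counter and the "accept on the third attempt" rule are assumptions made purely for illustration; in a real system both nodes would call an LLM.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Node = Callable[[State], str]  # a node updates the state and names the next node


def generate(state: State) -> str:
    # Stub for an LLM call that produces a new answer on every visit.
    state["attempts"] += 1
    state["answer"] = f"answer v{state['attempts']}"
    return "review"


def review(state: State) -> str:
    # Stub for a reviewing step: loop back until the third attempt.
    return "done" if state["attempts"] >= 3 else "generate"


def run(nodes: Dict[str, Node], start: str, state: State, max_steps: int = 20) -> State:
    """Walk the graph from `start` until "done", with a guard against endless loops."""
    current = start
    for _ in range(max_steps):
        if current == "done":
            return state
        current = nodes[current](state)
    raise RuntimeError("step limit reached before the graph finished")


final_state = run({"generate": generate, "review": review}, "generate", {"attempts": 0})
```

The edge from `review` back to `generate` is the cycle, and the step limit is the kind of safeguard any looping workflow needs.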
Orchestrating LLMs
A framework for orchestrating LLMs is aimed at tackling the intricate challenges of integrating and managing multiple LLMs to work in harmony. Such a framework simplifies the process of combining the capabilities of diverse LLMs, enabling developers to construct more complex and efficient AI-driven solutions. It offers tools and methodologies for seamless integration, enhancing the development process, and allowing for the creation of applications that leverage the strengths of various LLMs. This not only streamlines the development of sophisticated applications but also boosts their performance and scalability, facilitating the customization of AI solutions to meet specific needs and contexts.
Orchestrating LLMs involves coordinating multiple models to work together efficiently, often in parallel or in a dynamic sequence, to tackle complex tasks. This approach is particularly useful for problems that benefit from the combined capabilities of different LLMs, each bringing its unique strength to the solution. Here are some examples of problems typically solved using orchestration:
Multi-domain Knowledge Integration: Coordinating specialized LLMs to offer solutions that require expertise across various fields.
Personalized User Experiences: Dynamically combining LLM outputs to customize interactions according to user data.
Complex Workflow Automation: Utilizing different LLMs for distinct tasks within a broader workflow, optimizing for efficiency and effectiveness.
Advanced Customer Support Systems: Integrating various LLMs to understand, process, and respond to customer inquiries in a nuanced and effective manner.
Orchestration enables the leveraging of multiple LLMs’ strengths in a coordinated manner, offering solutions that are more versatile, scalable, and capable of addressing the multifaceted nature of real-world problems.
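A minimal sketch of the multi-domain case: an orchestrator sends the same question to several specialised models in parallel and collects their answers. The specialist functions are stubs standing in for real LLM calls, and the names `SPECIALISTS` and `orchestrate` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Model = Callable[[str], str]


# Stubs standing in for domain-specific LLMs.
def legal_model(question: str) -> str:
    return f"[legal] {question}"


def finance_model(question: str) -> str:
    return f"[finance] {question}"


SPECIALISTS: Dict[str, Model] = {"legal": legal_model, "finance": finance_model}


def orchestrate(question: str, domains: List[str]) -> Dict[str, str]:
    """Run the selected specialists in parallel and gather answers per domain."""
    with ThreadPoolExecutor() as pool:
        futures = {d: pool.submit(SPECIALISTS[d], question) for d in domains}
        return {d: future.result() for d, future in futures.items()}


answers = orchestrate("Can we invoice early?", ["legal", "finance"])
```

Unlike a chain, the models here do not feed each other; the orchestrator decides which models run and how their outputs are combined.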
The following is a subset of software architecture patterns which tend to be referenced when academic discussions around patterns arise, along with my comments.
Patterns
Application Architecture
Microservice
The microservice pattern comes from domain-driven design, where the concept of a bounded context in particular became the basis for decoupling services. The post Microservices by Martin Fowler also played a large part in naming this pattern.
Drivers:
Loosely coupled with other services – enables a team to work independently the majority of time on their service(s) without being impacted by changes to other services and without affecting other services
Independently deployable – enables a team to deploy their service without having to coordinate with other teams
Capable of being developed by a small team – essential for high productivity by avoiding the high communication overhead of large teams
The application must be easy to understand and modify
You must run multiple instances of the application on multiple machines in order to satisfy scalability and availability requirements
You want to take advantage of emerging technologies (frameworks, programming languages, etc)
Problems:
Managing dependencies
Deployments of the entire system may become complex
Developers must implement the inter-service communication mechanism and deal with partial failure
Implementing requests that span multiple services is more difficult
Testing the interactions between services is more difficult
Implementing requests that span multiple services requires careful coordination between the teams
Database
Database per Service
This pattern is basically the natural follow-on to choosing the microservice application architecture pattern: if a service is to become independent, then it must have its own independent data layer.
Drivers:
Services must be loosely coupled so that they can be developed, deployed and scaled independently
Databases must sometimes be replicated in order to scale
Different services have different data storage requirements, like relational database or NoSQL
Problems:
How to manage consistency across services
Who owns (masters) data?
Messaging
Messaging is a communications pattern which uses asynchronous messaging to replace the synchronous style of request/response used in most REST-style APIs. The most common styles of asynchronous messaging are:
Notifications – a sender sends a message to a recipient but does not expect a reply, nor is one sent.
Request/asynchronous response – a service sends a request message to a recipient and expects to receive a reply message eventually
Publish/subscribe – a service publishes a message to zero or more recipients
Publish/asynchronous response – a service publishes a request to one or more recipients, some of whom send back a reply
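Of the styles above, request/asynchronous response is the least obvious to implement, because the sender has to match a later reply to its earlier request. A common way to do that is a correlation identifier, sketched below with two in-memory queues standing in for a real message broker. The queue names, message fields and functions are assumptions for illustration.

```python
import queue
import uuid

# In-memory stand-ins for a request channel and a reply channel.
request_queue: "queue.Queue[dict]" = queue.Queue()
reply_queue: "queue.Queue[dict]" = queue.Queue()


def send_request(payload: str) -> str:
    """Send a request without waiting; return the id used to match the reply."""
    correlation_id = str(uuid.uuid4())
    request_queue.put({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def process_one_request() -> None:
    """Recipient: handle one request and send a reply with the same id."""
    message = request_queue.get_nowait()
    reply_queue.put({
        "correlation_id": message["correlation_id"],
        "payload": message["payload"].upper(),  # stub for real processing
    })


cid = send_request("price for order 42")
# The sender is free to do other work here; the reply arrives eventually.
process_one_request()
reply = reply_queue.get_nowait()
```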
In the following there’s no differentiation between “event”-driven and “data”-driven, as a message will always contain the full message body.
Event Driven with full Messages
Drivers:
The complete body of the message must be sent in each event
The broker may or may not allow for queries against the messages
Extreme loose coupling as the sender is completely decoupled from the receiver
Problems:
Subscriber/consumer overload which requires buffering so that events are not lost
The broker/mediator must always be available
Almost always creates mixed messaging styles, with events for downstream and request/response across and upstream, which requires careful planning
Broker
Event driven messaging with a broker implies that all messages are delivered through a central broker but that there’s no processing control flow and messages are delivered using a publish/subscribe pattern.
Drivers:
Scalability
Problems:
How to handle orchestration? With “normal” synchronous request/response messaging services orchestrate business processes in the order that they are called – with broker driven messaging everybody potentially could get the same message at the same time, so who orchestrates the overall process?
Mediator
A mediator expands on the broker with support for business process workflows, usually with support for BPEL.
Drivers:
Business processes change often
Traceability and governance are important
Problems:
Management and governance of platform takes time
Requires turnkey solutions from major vendors
Not really available as a cloud solution
Event Driven with Notifications
Drivers:
You don’t know how many subscribers/consumers of your events there will be, so you must preserve bandwidth
The size of the payload in the event is limited
A subset of the events doesn’t require the complete body of the message to be sent
Subscribers explicitly want to choose which message bodies they want to pull
Problems:
Requires a two-step dance where the consumer of an event must issue a request to the central broker and ask for the full body of the message