Stuff about Software Engineering

Category: AI

AI for Data (Not Data and AI)

Cold Open

Most companies get it backwards.

They say “Data and AI,” as if AI is dessert—something you get to enjoy only after you’ve finished your vegetables. And by vegetables, they mean years of data modeling, integration work, and master‑data management. AI ends up bolted onto the side of a data office that’s already overwhelmed.

That mindset isn’t just outdated—it’s actively getting in the way.

It’s time to flip the script. It’s not Data and AI. It’s AI for Data.

AI as a Data Appendage: The Legacy View

In most org charts, AI still reports to the head of data. That tells you everything: AI is perceived as a tool to be used on top of clean data. The assumption is that AI becomes useful only after you’ve reached some mythical level of data maturity.

So what happens? You wait. You delay. You burn millions building taxonomies and canonical models that never quite deliver. When AI finally shows up, it generates dashboards or slide‑deck summaries. Waste of potential.

What If AI Is Your Integration Layer?

Here’s the mental flip: AI isn’t just a consumer of data—it’s a synthesizer. A translator. An integrator – an Enabler!

Instead of cleaning, mapping, and modeling everything up front, what if you simply exposed your data—as is—and let the AI figure it out?

That’s not fantasy. Today, you can feed an AI messy order tables, half‑finished invoice exports, inconsistent SKU lists—and it still works out the joins. Sales and finance data follow patterns the model has seen a million times.

The magic isn’t that AI understands perfect data. The magic is that it doesn’t need to.

MCP: OData for Agents

Remember OData? It promised introspectable, queryable APIs—you could ask the endpoint what it supported. Now meet MCP (Model Context Protocol). Think OData, but for AI agents.

With MCP, an agent can introspect a tool, learn what actions exist, what inputs it needs, what outputs to expect. No glue code. No brittle integrations. You expose a capability, and the AI takes it from there.

OData made APIs discoverable. MCP makes tools discoverable to AIs.

Expose your data with just enough structure, and let the agent reason. No mapping tables. No MDM. Just AI doing what it’s good at: figuring things out.
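
To make the discovery idea concrete, here is a minimal in-memory sketch. It does not use the MCP SDK or any transport. It only mirrors the general shape of what an MCP server advertises for each tool: a name, a description, and a JSON Schema for the inputs. `ToolRegistry`, `find_invoices`, and the sample rows are hypothetical names invented for illustration.

```typescript
// Hypothetical sketch of tool discovery; this is NOT the MCP SDK.
// The descriptor fields (name, description, inputSchema) follow the general
// shape an MCP server uses when it lists its tools.
interface ToolDescriptor {
  name: string;
  description: string;
  // JSON Schema describing the arguments the tool accepts.
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

class ToolRegistry {
  private tools = new Map<string, { descriptor: ToolDescriptor; handler: ToolHandler }>();

  register(descriptor: ToolDescriptor, handler: ToolHandler): void {
    this.tools.set(descriptor.name, { descriptor, handler });
  }

  // What an agent sees when it asks the server which tools exist.
  listTools(): ToolDescriptor[] {
    return Array.from(this.tools.values()).map((t) => t.descriptor);
  }

  // Checks that required arguments are present, then dispatches to the handler.
  callTool(name: string, args: Record<string, unknown>): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    for (const key of tool.descriptor.inputSchema.required ?? []) {
      if (!(key in args)) throw new Error(`Missing argument: ${key}`);
    }
    return tool.handler(args);
  }
}

// Made-up rows, deliberately inconsistent, exposed as-is.
const invoiceRows = [
  { id: "INV-1", customer: "ACME", amount: 120 },
  { id: "INV-2", customer: "acme ", amount: 80 },
];

const registry = new ToolRegistry();
registry.register(
  {
    name: "find_invoices",
    description: "Return invoices for a customer, ignoring case and surrounding whitespace.",
    inputSchema: {
      type: "object",
      properties: { customer: { type: "string", description: "Customer name, any casing" } },
      required: ["customer"],
    },
  },
  (args) => {
    const wanted = String(args.customer).trim().toLowerCase();
    return invoiceRows.filter((row) => row.customer.trim().toLowerCase() === wanted);
  },
);
```

An agent would first call `registry.listTools()` to learn that `find_invoices` exists and needs a `customer` string, then call `registry.callTool("find_invoices", { customer: "Acme" })`. Nothing about invoices is hard-coded on the agent side.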

Why It Works in Science—And Why It’ll Work in Business

Need proof? Look at biology.

Scientific data is built on shared, Latin‑based taxonomies. Tools like Claude or ChatGPT navigate these datasets without manual schema work. At Carlsberg we’ve shown an AI connecting yeast strains ➜ genes ➜ flavor profiles in minutes.

Business data is easier. You don’t need to teach AI what an invoice is. Or a GL account. These concepts are textbook. Give the AI access and it infers relationships. If it can handle yeast genomics, it can handle your finance tables.

Stop treating AI like glass. It’s ready.

The Dream: MCP‑Compliant OData Servers

Imagine every system—ERP, CRM, LIMS, SharePoint—exposing itself via an AI‑readable surface. No ETLs, no integration middleware, no months of project time.

Combine OData’s self‑describing endpoints with MCP’s agent capabilities. You don’t write connectors. You don’t centralize everything first. The AI layer becomes the system‑of‑systems—a perpetual integrator, analyst, translator.

Integration disappears. Master data becomes a footnote.
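
As a rough sketch of what combining the two could look like, the snippet below derives an agent-facing query tool from an OData entity set. Only the `$filter`, `$select`, and `$top` query options are real OData. The service root, the `Invoices` entity set, and the `toolFromEntitySet` helper are assumptions made up for this example, and a real server would read the property list from the service's `$metadata` document.

```typescript
// Hypothetical sketch: turn OData entity-set metadata into a tool an agent can call.
interface EntitySetInfo {
  name: string;          // entity set name, e.g. "Invoices"
  properties: string[];  // property names advertised by the service metadata
}

interface QueryArgs {
  filter?: string;       // OData $filter expression, e.g. "Amount gt 100"
  select?: string[];     // subset of properties to return
  top?: number;          // maximum number of rows
}

interface QueryTool {
  name: string;
  description: string;
  buildUrl: (args: QueryArgs) => string;
}

function toolFromEntitySet(serviceRoot: string, entitySet: EntitySetInfo): QueryTool {
  return {
    name: `query_${entitySet.name.toLowerCase()}`,
    description: `Query ${entitySet.name}. Available properties: ${entitySet.properties.join(", ")}.`,
    buildUrl: (args) => {
      const params: string[] = [];
      if (args.filter) params.push(`$filter=${encodeURIComponent(args.filter)}`);
      if (args.select && args.select.length > 0) {
        // Reject properties the metadata does not advertise.
        const unknown = args.select.filter((p) => entitySet.properties.indexOf(p) === -1);
        if (unknown.length > 0) throw new Error(`Unknown properties: ${unknown.join(", ")}`);
        params.push(`$select=${args.select.join(",")}`);
      }
      if (args.top !== undefined) params.push(`$top=${args.top}`);
      const query = params.length > 0 ? `?${params.join("&")}` : "";
      return `${serviceRoot}/${entitySet.name}${query}`;
    },
  };
}

// Made-up service root and entity set, for illustration only.
const invoiceTool = toolFromEntitySet("https://erp.example.com/odata", {
  name: "Invoices",
  properties: ["Id", "Customer", "Amount"],
});

const exampleUrl = invoiceTool.buildUrl({
  filter: "Amount gt 100",
  select: ["Id", "Amount"],
  top: 5,
});
```

The point of the design is that the description and the validation both come from the metadata, so adding a new entity set means adding data, not writing a new connector.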

When Do You Still Need Clean Data?

Let’s address the elephant in the room: there are still scenarios where data quality matters deeply.

Regulatory reporting. Financial reconciliation. Mission-critical operations where a mistake could be costly. In these domains, AI is a complement to—not a replacement for—rigorous data governance.

But here’s the key insight: you can pursue both paths simultaneously. Critical systems maintain their rigor, while the vast majority of your data landscape becomes accessible through AI-powered approaches.

AI for Data: The Flip That Changes Everything

You don’t need perfect data to start using AI. That’s Data and AI thinking.

AI for Data starts with intelligence and lets structure emerge. Let your AI discover, join, and reason across your real‑world mess—not just your sanitized warehouse.

It’s a shift from enforcing models to exposing capabilities. From building integrations to unleashing agents. From waiting to acting while you learn.

If your organization is still waiting to “get the data right,” here’s your wake‑up call: you’re waiting for something AI no longer needs.

AI is ready. Your data is ready enough.

The only question left: Are you ready to flip the model?

Four Categories of AI Solutions

Introduction

When driving value from generative AI (GenAI), it’s important to choose the right approach to get a return on investment. This page attempts to explain the possible approaches and the resources they require.

Takers, Shapers and Makers

There seem to be three major categories of GenAI adopters according to McKinsey and Gartner:

| McKinsey | Gartner | Description |
|---|---|---|
| Takers | Quick Wins | Focus on utilizing existing GenAI tools and models for productivity improvements with minimal customization. These initiatives typically have short time to value and are task-specific, aiming for immediate efficiency gains in routine tasks. |
| Shapers | Differentiating Use Cases | Engage in integrating GenAI tools with proprietary data or adapting them for specific applications. These initiatives aim to achieve competitive advantages, involving medium time to value with higher costs and risks than quick wins. They leverage GenAI to extend current processes and create unique value propositions. |
| Makers | Transformative Initiatives | Concentrate on developing new GenAI models or tools for specialized applications, with the potential to transform business models and markets. These are the most ambitious initiatives, characterized by high cost, complexity, and risk, and a long time to value. They aim for strategic benefits that may be difficult to quantify initially. |

TCO/ROI

The Total Cost of Ownership (TCO) and Return on Investment (ROI) for GenAI adoption across takers, shapers, and makers categories involve several considerations, including hidden costs, strategic implications, and potential benefits.

Gartner offers insights on measuring GenAI ROI, advocating for a business case approach that simulates potential cost and value realization across GenAI activities. This approach categorizes investments into quick wins, differentiating use cases, and transformational initiatives. Quick wins focus on immediate productivity improvements with short time to value, differentiating use cases aim at competitive advantage with medium time to value, and transformative initiatives have the potential to upend business models with longer time to value but higher costs and complexity. The guide emphasizes the importance of balancing financial returns with strategic benefits, which might be difficult to quantify initially.

Source: https://www.gartner.com/en/articles/take-this-view-to-assess-roi-for-generative-ai.
Red box is added by me, see conclusion below.

Builders

I’m introducing an extra “Builders” category into the GenAI adoption landscape. Beyond merely adopting or adapting, Builders take a step further by crafting bespoke extensions and plugins for GenAI platforms. This initiative is driven by the ambition to tackle intricate, multi-step workflows that typically demand considerable human intervention. The essence of being a Builder lies in the ability not just to work with GenAI but to enhance its core capabilities, enabling solutions that seamlessly bridge various systems and processes. This approach demands a blend of creativity, technical prowess, and a deep understanding of both the technology and the problem domain.

| Category | Description | Required People Resources/Skills | Tools |
|---|---|---|---|
| Takers | Utilize existing GenAI tools for productivity improvements with minimal customization. Aimed at immediate efficiency gains in routine tasks with short time to value. | Basic understanding of AI/ML concepts; Skills in integrating and configuring APIs; Ability to adapt third-party GenAI tools to existing workflows | Microsoft Copilot; Microsoft Copilot Plugins; Enterprise “Chat”-GPTs |
| Shapers | Integrate GenAI tools with proprietary data or adapt them for specific applications to achieve competitive advantages, involving medium time to value with higher costs and risks. | Low/No-code developers; Domain experts for data interpretation; Project managers with a technical background | Retrieval Augmented Generation (RAG); Microsoft Copilot Studio; Microsoft Azure AI Studio |
| Builders | Develop custom solutions or extensions to GenAI platforms to solve complex, multi-step processes that usually require significant human effort. | Advanced programming skills in relevant languages; Data scientists for model tuning; Experience with GenAI frameworks; Systems integration expertise; Creative problem-solving abilities | Microsoft Copilot Extensions; Microsoft PromptFlow; LangChain; LangGraph; LlamaIndex; AutoGen; CrewAI; (OpenAI Swarm); LLM Function Calling; LLM Routing; LLM Threat Modelling; LLM Security |
| Makers | Develop new GenAI models or tools for specialized applications with the potential to transform business models and markets. Characterized by high cost, complexity, and risk, with a long time to value. | Expertise in deep learning and neural networks; Experience in building and training large-scale AI models; Strong research and development background; Ability to work with high-performance computing resources | LLM Models; LLM Frameworks; LLM Fine-Tuning; (LLM Creation and Training) |

The “Builders” category fills the gap between “Shapers,” who mainly adapt existing models for their unique needs, and “Makers,” who create new GenAI models from scratch. Builders leverage powerful frameworks and platforms to create bespoke solutions that automate complex workflows, potentially revolutionizing how businesses approach process automation and efficiency. This distinction underscores the evolving landscape of GenAI adoption, highlighting the increasing sophistication and customization capabilities available to organizations.

Conclusion

The red box on the image above indicates that solutions made in the Takers and lower Shapers category are likely to be overtaken by standard solutions from vendors and the plethora of SaaS AI offerings appearing on a daily basis. Caution should be used when choosing to invest in solutions in this area unless quick wins are important.

Clearly it’s important to have a strategic, well-planned approach to integrating GenAI with emphasis on organizational readiness, skill development, and a focus on applications that offer a competitive advantage – otherwise GenAI just becomes a technology looking for a problem like Blockchain.

ChatGPT Vision could be used to destroy incumbent image recognition solutions in CPG

I finally got access to ChatGPT Vision and I uploaded a picture of some beers I had at home:

This is the response from ChatGPT:

Nice selection of beers you got there! I see Perlenbacher, which is often associated with Lidl supermarkets, and of course, the iconic Carlsberg. Carlsberg, founded in 1847, is one of the most well-known beer brands globally and is based out of Copenhagen, Denmark. “Birra Messina” seems to be a Sicilian beer. Each of these beers brings a different taste profile to the table. Enjoy responsibly! Would you like to know more about any specific beer?

I then prompted this using TypeChat:

Please attempt to recognize the cans in the picture and fill out the following response:
interface Response {
  data: Array<{ name: string, brand: string, count: number }>;
}

And I got this from ChatGPT:

Sure, here’s the response based on the image:

interface Response {
  data: Array<{ name: string, brand: string, count: number }>;
}

const beerData: Response = {
  data: [
    { name: "Premium Pilsner", brand: "Perlenbacher", count: 2 },
    { name: "Pilsner", brand: "Carlsberg", count: 2 },
    { name: "Classic", brand: "Birra Messina", count: 1 }
  ]
};

This is based on the visible labels in the image.

No training, no nothing. I know the internet is full of examples of people having ChatGPT turn Figma drawings into websites and apps, but this is well outside ChatGPT’s comfort zone. I have previous experience trying to teach models image recognition of packaged goods, and it’s not that easy.

Solutions based on this could destroy the incumbent vendors in image recognition in the CPG market.
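
To build anything on top of an answer like the one above, the response has to be checked at runtime, because TypeScript types do not exist once the code runs. The following is a minimal hand-written sketch of that check. It is not TypeChat’s API; `BeerResponse`, `isBeerResponse`, and `parseBeerResponse` are hypothetical names, and it assumes the model has been asked to return plain JSON rather than a TypeScript snippet.

```typescript
// Mirrors the interface from the prompt above, renamed to avoid clashing
// with the global Response type that exists in browser and Node typings.
interface BeerResponse {
  data: Array<{ name: string; brand: string; count: number }>;
}

// Runtime type guard: verifies the parsed value really has the expected shape.
function isBeerResponse(value: unknown): value is BeerResponse {
  if (typeof value !== "object" || value === null) return false;
  const data = (value as { data?: unknown }).data;
  if (!Array.isArray(data)) return false;
  return data.every((item) => {
    if (typeof item !== "object" || item === null) return false;
    const row = item as Record<string, unknown>;
    return (
      typeof row.name === "string" &&
      typeof row.brand === "string" &&
      typeof row.count === "number" &&
      Number.isInteger(row.count) &&
      row.count >= 0
    );
  });
}

// Parses raw model text; returns typed data, or null when it is not valid.
function parseBeerResponse(raw: string): BeerResponse | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isBeerResponse(parsed) ? parsed : null;
  } catch (_err) {
    return null;
  }
}

// Values taken from the ChatGPT answer quoted above, re-encoded as JSON.
const sampleReply = JSON.stringify({
  data: [
    { name: "Premium Pilsner", brand: "Perlenbacher", count: 2 },
    { name: "Pilsner", brand: "Carlsberg", count: 2 },
    { name: "Classic", brand: "Birra Messina", count: 1 },
  ],
});

const beerResult = parseBeerResponse(sampleReply);
const totalCans = beerResult ? beerResult.data.reduce((sum, row) => sum + row.count, 0) : 0;
```

A caller that gets `null` back can re-prompt the model with the validation failure, which is roughly the repair loop that typed-prompt libraries automate.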

Clippy talking to Clippy talking to Clippy…

I listened to a podcast interview with Christian Lanng (Christian Lanng: “How Being a Founder Almost Killed Me” | E1065) about his next thing with Beyondwork and them trying to find out what that means.

Christian says that he doesn’t want a future where a Clippy has to talk to a Clippy which talks to another Clippy and so on. In this context Clippy is some sort of an “AI” which can provide some answer.

I don’t know if that is true or not or if it’s the future we want or not – but right now, that seems to be where we’re headed.

  • LangChain is about chaining prompts and answers together so that we can get better answers
  • Microsoft Copilot Plugins is about enabling Microsoft Copilot to extend your prompts to other systems and consolidate the answers and perform tasks on your behalf
  • Microsoft TypeChat is about providing types and structure to the data we want to pass between systems to make interoperability even easier and more precise

I’ve spent the first 20 years of my life as a developer and the last 20 or so as an “architect” in many different contexts and it almost always boils down to solving integration between business domains and figuring out how data can be accessed, translated and acted upon across the different domains either in real time or before some date and time.

I totally agree with the sentiment that most, if not all, business systems are really just a database with a UI on top and that actual business value comes from combining the data across all systems and extracting answers from it. However, as long as the “business” continues to lock data in silos and refuses to adapt business processes to standard systems and demands customization of systems to accommodate the last 20% of functionality in a way that almost makes integration impossible – then I think the Clippies are going to be around for some time to come.

I think that in the short to medium term, plugins for Microsoft Copilot (and similar) will prove valuable to most businesses. Longer term we still need to fix copyright (AIs will be stopped from scraping), privacy (businesses are scared of leaks) and authentication/authorization (who are you and what do you have access to across boundaries). For instance, how can we safely solve problems where data is scattered across private and public domains?

It’s going to be interesting and some new rules/processes are going to have to be worked out.

© 2025 Peter Birkholm-Buch
