The Infra Play #113: State of Enterprise AI adoption
Welcome to the first official edition of the newsletter under The Infra Play.
The Infra Play originates from my personal observations as an insider in cloud infrastructure software. I took very seriously the idea that everything around us depends on cloud infrastructure software and then spent years digging further.
Over the same period, we experienced massive change in the industry, driven by AI and innovation in both hardware and software. Almost every playbook you can think of from the last 15 years doesn't work anymore. How we build software is different. How we sell is different. How we think of human capital is different. How much liquidity is available in the industry is different.
So what is The Infra Play? It's where I bring together all my work on cloud infrastructure software in a single place. The articles I write, the videos I create, and the database application that I built. I look at things through multiple mental models, rooted in the core idea that the most important skills in the industry are building and selling.
Why The Infra Play? It's a reflection of you. Almost everyone reading this is either an industry operator, an investor, a founder, or a member of the go-to-market team in a tech company. All of us have made a directional bet, a play, on cloud infrastructure software. My goal is to give you actionable insights, alpha if you will, so you can make the most out of this directional bet.
Today we will take a look at 3 recent studies on the business outcomes of Enterprise AI adoption: one bearish (the now-famous "95% of AI doesn't add any value" MIT study), one neutral, and one bullish (GCP shilling AI agents).
While it's difficult to glean actionable insights from these types of studies, they can help build a big picture narrative on why your specific product is making a difference, as well as help you take directional bets on where the value is accruing across the stack.
The key takeaways
For tech sales and industry operators: The good news is that AI adoption is clearly progressing, with most companies already paying customers for regular LLMs. The new challenge is defining and implementing custom workflows, most of which are tied to AI agents. The biggest trap in executing these custom implementations is failing to meet your customers where they are, i.e., in their business context. This is leading to a situation where most companies wait for their existing vendors to implement AI features rather than buy from unproven early-stage companies. At a time when the majority of technical talent is concentrated in startups, this creates an obvious friction point. This is where deep domain understanding becomes critical. If you're selling, your credibility comes from having actually done the job you're automating; there is a clear premium on former practitioners who understand the workflow pain, not AI engineers with a better wrapper. If you're buying or building internally, the priority is integrating AI into existing tools where adoption is already high (your CRM, your codebase, your security stack) rather than asking employees to switch to standalone AI tools that require behavior change.
For investors and founders: One of the roughest realizations for senior leaders in the largest companies operating today is that treating software development as a cost center has left them severely underprepared for an economic reality where competitive advantage is tightly tied to the ability to execute in the world of bits. While Enterprise AI adoption metrics can be read as a direct reflection of product traction, in reality they are almost entirely a reflection of the state of play inside the buyers. The value proposition of agentic workflows is obvious, yet most companies have not been able to build workflows specific to their unique business processes. The alpha sits somewhere in the middle, between the technical talent that can execute and iterate quickly and the industry leaders who have actually done the job we are trying to convert into an agentic workflow.
Show me the adoption curve
Despite $30–40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return. The outcomes are so starkly divided across both buyers (enterprises, mid-market, SMBs) and builders (startups, vendors, consultancies) that we call it the GenAI Divide.
Just 5% of integrated AI pilots are extracting millions in value, while the vast majority remain stuck with no measurable P&L impact. This divide does not seem to be driven by model quality or regulation, but seems to be determined by approach.
Tools like ChatGPT and Copilot are widely adopted. Over 80 percent of organizations have explored or piloted them, and nearly 40 percent report deployment. But these tools primarily enhance individual productivity, not P&L performance. Meanwhile, enterprise-grade systems, custom or vendor-sold, are being quietly rejected. Sixty percent of organizations evaluated such tools, but only 20 percent reached pilot stage and just 5 percent reached production. Most fail due to brittle workflows, lack of contextual learning, and misalignment with day-to-day operations.
From our interviews, surveys, and analysis of 300 public implementations, four patterns emerged that define the GenAI Divide:
• Limited disruption: Only 2 of 8 major sectors show meaningful structural change
• Enterprise paradox: Big firms lead in pilot volume but lag in scale-up
• Investment bias: Budgets favor visible, top-line functions over high-ROI back office
• Implementation advantage: External partnerships see twice the success rate of internal builds
The core barrier to scaling is not infrastructure, regulation, or talent. It is learning. Most GenAI systems do not retain feedback, adapt to context, or improve over time.
The MIT study's dataset is based on a systematic review of over 300 publicly disclosed AI initiatives, structured interviews with representatives from 52 organizations, and survey responses from 153 senior leaders collected across four major industry conferences.
Source: The GenAI Divide
The primary challenge with the report is that it takes a very broad view across industry segments without calibrating for the effect that transforming certain parts of the economy would have on the rest.
Source: The GenAI Divide
The shift in technology has been obvious and has attracted the largest wave of investor attention the sector has ever seen. The downstream effects of the massive capex investment in hardware have been substantial, both in demand for raw materials and in energy-market dynamics. The restructuring of employment, in tech and professional services alike, also has significant downstream effects across all sectors.
Source: The GenAI Divide
The "95% failure rate" sound bite basically refers only to what they ranked as niche implementations for specific business processes, and those were mostly driven by attempts at homegrown solutions. So we are playing a bit of a wordcel game here, pretending that the massive and quick adoption of LLMs is less relevant than "copilot for vending machines" failing a poorly run pilot.
Five Myths About GenAI in the Enterprise
1. AI Will Replace Most Jobs in the Next Few Years → Research found limited layoffs from GenAI, and only in industries that are already affected significantly by AI. There is no consensus among executives as to hiring levels over the next 3-5 years.
2. Generative AI is Transforming Business → Adoption is high, but transformation is rare. Only 5% of enterprises have AI tools integrated in workflows at scale and 7 of 9 sectors show no real structural change.
3. Enterprises are slow in adopting new tech → Enterprises are extremely eager to adopt AI and 90% have seriously explored buying an AI solution.
4. The biggest thing holding back AI is model quality, legal, data, risk → What's really holding it back is that most AI tools don't learn and don’t integrate well into workflows.
5. The best enterprises are building their own tools → Internal builds fail twice as often.
These conclusions are mostly directionally correct, although not always for the reasons explained here.
On job replacements, structurally speaking, white collar work is the most at risk in the short term because, well, we are building intelligence that significantly outpaces the median performer. For companies with a lot of workers doing things in the physical world, where machines would need to replace not only their knowledge but also the ability to manipulate objects in real life, it's not yet obvious what replacements would look like. For white collar work, most senior leaders have done their jobs through scaffolding, abstractions, and delegation for close to two decades now. In a field as fast-paced and dynamic as new model training, even if they pay attention to progress, they are likely not testing it in enough real-world scenarios to assess how they can start augmenting and ultimately replacing employees in different parts of the workflow. This is before we even touch topics like having an unorthodox opinion and vision for new ways of organizing and driving their teams.
On transformation and building your own tools, this is mostly driven by a misunderstanding of the short-term Enterprise opportunity, which is about augmenting the existing applications companies already rely on. To do that, they need vision and technical talent, which most don't have because they've been underpaying and outsourcing their development teams at the first possible opportunity. The reason tech companies are leading the way here is not just willingness to adopt but simply having actual talent that can execute an innovation project from first principles.
On continual learning, it's arguably the biggest practical bottleneck today. Context windows are not big enough to just "dump everything in the chatbot and keep feeding it information" so that every random org can suddenly have a magical worker. What most misunderstand is the practice of building 6-12 months ahead of model evolution, so that functionality that doesn't work today "suddenly clicks" as models improve. Coding experienced this arc, then financial analysis, then video and image generation.
Source: The GenAI Divide
A corporate lawyer at a mid-sized firm exemplified this dynamic. Her organization invested $50,000 in a specialized contract analysis tool, yet she consistently defaulted to ChatGPT for drafting work:
"Our purchased AI tool provided rigid summaries with limited customization options. With ChatGPT, I can guide the conversation and iterate until I get exactly what I need. The fundamental quality difference is noticeable, ChatGPT consistently produces better outputs, even though our vendor claims to use the same underlying technology."
This pattern suggests that a $20-per-month general-purpose tool often outperforms bespoke enterprise systems costing orders of magnitude more, at least in terms of immediate usability and user satisfaction. This paradox exemplifies why most organizations remain on the wrong side of the GenAI Divide.
This example is another good indication of the poor understanding across all parties. First, the company invests in "specialized tooling" without a real benchmarking process to determine whether it's just a poorly designed API wrapper or offers a meaningful differential, at least in UX. Then the users not only stop using it but allegedly switch to the basic paid tier of ChatGPT, i.e., at the time 4o with no reasoning capabilities. So the users are not only unexposed to actual top-tier models, but have workflows simplistic enough that basic prompting achieves better outcomes. The startup (or incumbent implementing this functionality) fails to deliver even basic improvements that would've been obvious through user interviews and is surprised when the customer churns. The story then makes it into a "prestigious study", feeding a narrative for the laggards.
Source: The GenAI Divide
These arguments are typical for any emerging category but are helpful in evaluating from first principles whether your company should exist. One of the biggest weaknesses of most newly launched companies making a big bet on shaping their product around AI is that the founders usually have a strong grasp of how to integrate models but a poor understanding of the industry they are supposed to be disrupting. One look at the AI SOC analyst market makes the point: more than 50 new companies are trying to offer a security agent, and more than half of them have no background in security, or in other parts of the stack, to begin with. So now they are supposed to build trust with one of the most closed-off and difficult technical audiences to sell to, having never done the job or even been in the industry before. Hint: almost all of them will go to zero, and the problem isn't AI performance.
Source: State of Intelligent Automation: Generative AI Confessions
Now let's take a look at the neutral report, State of Intelligent Automation: Generative AI Confessions by ABBYY. It covers similar themes but with far less provocative bear takes. The trick, of course, is that the survey focuses on IT leaders only, which immediately changes the dynamics in terms of adoption visibility.
Source: State of Intelligent Automation: Generative AI Confessions
Both regular LLM subscriptions and more customized tools are being adopted at scale. Interestingly enough, adoption across EMEA and APAC is strong as well, despite the less favorable regulatory environment.
Source: State of Intelligent Automation: Generative AI Confessions
There is a significant difference between adopting AI as part of the core operational strategy, as a management push on specific use cases, or as bottom-up adoption by individual employees. The odds of success are clearly higher when it's part of the core company strategy and has executive support.
Source: State of Intelligent Automation: Generative AI Confessions
In fact, if a CTO/CIO is driving the initiative, the org is likely to have a 90%+ adoption rate.
Source: State of Intelligent Automation: Generative AI Confessions
Enterprise adoption is a reflection of what large companies do, which is basically scaling certain workflows with leverage. Reporting, customer support, and paper pushing remain a key part of the day-to-day experience, which is why a big focus of these implementations is making all of that faster, easier, and cheaper.
Source: State of Intelligent Automation: Generative AI Confessions
Still, if you scratch the surface, the killer use case for LLMs today is obvious even to the most laggard of organizations: generating and securing code.
Which brings us to AI agents and The ROI of AI 2025 by GCP. Let's preface the report with the fact that GCP is currently the best performing hyperscaler in terms of helping companies build agentic workflows and is obviously biased.
As the AI hype settles, the conversation has shifted to value. Leaders are no longer asking if they should use AI, but how they can scale proven use cases and build sophisticated AI agents for business value. Our latest research confirms this fundamental change in business mindset. We have seen AI evolve from predictive to generative.
Now, we’re in the agentic era, where AI agents can independently execute tasks and make decisions—under human guidance and guardrails. At Google, we think of AI agents as systems that combine the intelligence of advanced AI models with access to tools, so they can take actions on your behalf and under your control. And while this technology is already helping people get more done, many companies are still in the early phases of agentic maturity.
Companies that were quick to adopt AI agents are seeing real returns. They’re using agents to improve customer experiences, free up employees for smarter work, and give departments like marketing, IT, and HR a productivity boost. This ROI helps justify bigger investments and get leadership on board for a broader AI scaling strategy.
Source: The ROI of AI 2025
If a company has started using LLMs, it's likely it has at least tested or evaluated an agentic workflow for automation. I think this is normal value chain discovery, as users shift to a mental model of wanting to explore additional capabilities once they understand the fundamentals of their existing usage. These figures are based on 3,446 leaders (75% in C-suite) in companies with at least $10M in revenue.
Source: The ROI of AI 2025
“AI agents are applicable across a wide variety of use cases, and I believe every business has workflows where agentic AI can deliver meaningful value. It accelerates existing processes, driving measurable business impact.”
Fiona Tan, CTO at Wayfair
The challenge with agents is that most companies struggle to define what they actually expect an agent to do. Let's take one of the definitions that a lot of technical insiders are using, as articulated by Simon Willison:
Tools in a loop to achieve a goal
An LLM agent runs tools in a loop to achieve a goal. Let’s break that down.
The “tools in a loop” definition has been popular for a while—Anthropic in particular have settled on that one. This is the pattern baked into many LLM APIs as tools or function calls—the LLM is given the ability to request actions to be executed by its harness, and the outcome of those tools is fed back into the model so it can continue to reason through and solve the given problem.
“To achieve a goal” reflects that these are not infinite loops—there is a stopping condition.
I debated whether to specify “... a goal set by a user”. I decided that’s not a necessary part of this definition: we already have sub-agent patterns where another LLM sets the goal (see Claude Code and Claude Research).
There remains an almost unlimited set of alternative definitions: if you talk to people outside of the technical field of building with LLMs you’re still likely to encounter travel agent analogies or employee replacements or excitable use of the word “autonomous”. In those contexts it’s important to clarify the definition they are using in order to have a productive conversation.
But from now on, if a technical implementer tells me they are building an “agent” I’m going to assume they mean they are wiring up tools to an LLM in order to achieve goals using those tools in a bounded loop.
Some people might insist that agents have a memory. The “tools in a loop” model has a fundamental form of memory baked in: those tool calls are constructed as part of a conversation with the model, and the previous steps in that conversation provide short-term memory that’s essential for achieving the current specified goal.
If you want long-term memory the most promising way to implement it is with an extra set of tools!
Agents as human replacements is my least favorite definition
If you talk to non-technical business folk you may encounter a depressingly common alternative definition: agents as replacements for human staff. This often takes the form of “customer support agents”, but you’ll also see cases where people assume that there should be marketing agents, sales agents, accounting agents and more.
If someone surveys Fortune 500s about their “agent strategy” there’s a good chance that’s what is being implied. Good luck getting a clear, distinct answer from them to the question “what is an agent?” though!
This category of agent remains science fiction. If your agent strategy is to replace your human staff with some fuzzily defined AI system (most likely a system prompt and a collection of tools under the hood) you’re going to end up sorely disappointed.
That’s because there’s one key feature that remains unique to human staff: accountability. A human can take responsibility for their actions and learn from their mistakes. Putting an AI agent on a performance improvement plan makes no sense at all!
Amusingly enough, humans also have agency. They can form their own goals and intentions and act autonomously to achieve them—while taking accountability for those decisions. Despite the name, AI agents can do nothing of the sort.
So essentially we want the LLM to have access to tools beyond predicting tokens and to perform assigned activities as expected. Take the bluntest example, a customer chatbot for a travel agency: if the bot is asked "what's a great place to travel with children" and responds based on its training data, that's just ChatGPT in a different interface. If it looks up a database with your travel history and offers to make a booking on airline and hotel websites for you, that's agentic.
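To make that concrete, here is a minimal sketch of "tools in a loop" in Python, assuming the OpenAI SDK's function-calling interface. The travel tools, the stub data, and the model name are hypothetical stand-ins for illustration, not anything taken from the reports above.

import json
from openai import OpenAI

client = OpenAI()

def lookup_travel_history(user_id: str) -> str:
    # Stub: a real implementation would query your CRM or bookings database.
    return "2023: Lisbon (family trip); 2024: Tokyo (solo)"

def search_flights(origin: str, destination: str, date: str) -> str:
    # Stub: a real implementation would call an airline inventory API.
    return f"Found 3 flights {origin} -> {destination} on {date}, from $420"

TOOL_IMPLS = {
    "lookup_travel_history": lookup_travel_history,
    "search_flights": search_flights,
}

TOOL_SPECS = [
    {"type": "function", "function": {
        "name": "lookup_travel_history",
        "description": "Past trips for a given user",
        "parameters": {"type": "object",
                       "properties": {"user_id": {"type": "string"}},
                       "required": ["user_id"]}}},
    {"type": "function", "function": {
        "name": "search_flights",
        "description": "Search airline inventory for a route and date",
        "parameters": {"type": "object",
                       "properties": {"origin": {"type": "string"},
                                      "destination": {"type": "string"},
                                      "date": {"type": "string"}},
                       "required": ["origin", "destination", "date"]}}},
]

def run_agent(goal: str, max_steps: int = 10) -> str:
    # The conversation itself is the short-term memory described above.
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):  # bounded loop: there is a stopping condition
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOL_SPECS)
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content  # no more tool requests: the goal is reached
        messages.append(msg)
        for call in msg.tool_calls:  # the harness executes each requested tool
            args = json.loads(call.function.arguments)
            result = TOOL_IMPLS[call.function.name](**args)
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": result})
    return "Stopped: step budget exhausted before the goal was reached."

print(run_agent("Plan a child-friendly trip for user 42 departing 2025-07-01"))

Answering the children question from training data alone never touches this loop; it's the tool calls, the harness executing them, and the bounded loop with a stopping condition that make the system agentic in the sense defined above.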
Source: The ROI of AI 2025
The value proposition of agentic behavior is so self-evident that it's being adopted across all geographies and industries.
Source: The ROI of AI 2025
Do you remember that bit about accountability? Not surprisingly, adoption of such tools is correlated with the potential outcomes if something goes wrong. Highly sensitive roles are testing and adopting agentic workflows at a significantly slower rate, although that will change over time as outcomes improve and more decision makers realize that logging every action in a traceable manner offers more accountability than you get out of humans. Unironically, the public-ledger concept that blockchains introduced will come back here in a simpler but obvious implementation as agents are deemed more trustworthy over time.
Source: The ROI of AI 2025
The obvious one to highlight here is cybersecurity. The industry is changing rapidly and a lot of the AI spend is going into cybersecurity use cases. Whether this leads to underreporting of the expansion of cybersecurity (or hides AI adoption dollars) is unclear, but not unlikely. If we double click:
Source: The ROI of AI 2025
Similar to coding use cases, cybersecurity is a logical next step in the value chain because most security problems tie back to code, APIs, and logs, all things that LLM agents can parse significantly faster than humans.
Source: The ROI of AI 2025
“You have to look at ROI as not just size of return but also speed of return. AI initiatives are sizable investments that are not commodities yet, so we have to look at where hyper-automation and scaling with AI is actually generating a return first. How fast is your investment coming back to the organization and what capabilities are you investing in now that will scale up and create more efficiencies or business transformation down the road?”
Cristina Nitulescu, Head of Digital Transformation and IT, Bayer Consumer Health
Whether a company is an early adopter of AI agents is a strong indication of both its AI maturity and its budget allocation.
Source: The ROI of AI 2025
While most of the business objectives for AI adoption remain similar, it's only this year that specifically exploring and scaling agentic deployments has become a board-level topic. In fact, 77% of companies are increasing their AI spend even as the cost per token has dropped, and 58% are allocating net new budgets.
Source: The ROI of AI 2025
To close this off, while the report is trying to highlight the quick rise of interest and focus on AI agents, it’s important to note that for senior leaders, the primary concerns remain “are we doing the right things with AI” and “how can we improve our data”. These are often challenges with process and people, less so with systems.