More Code Is Not More Value: Why AI Is an Operating Model Test

What IBM’s CEO study, CircleCI’s delivery data, and DORA’s ROI model reveal about AI, flow, and measurable business value


A note before you read

This is a longer article because the AI conversation deserves more than another short take about coding speed, productivity, or tool adoption.

The real issue is larger: whether organizations can turn AI-driven activity into safe, reliable, outcome-producing work.

Here is the argument I will make, one increasingly echoed across technology, engineering, and AI adoption communities:

AI has moved beyond a tool conversation. It is becoming an operating model test.
Productivity gains matter only when leaders connect them to customer value, business outcomes, quality, resilience, or learning.
More code is not the same as more value. AI is bringing back the old lesson that lines of code were never a meaningful measure of progress.
IBM’s 2026 CEO Study shows the C-suite and operating-model challenge: decision rights, workflow redesign, accountability, and cross-functional leadership.
CircleCI’s 2026 State of Software Delivery report shows the engineering reality: AI is increasing development activity, while many teams are struggling to validate, integrate, and recover at scale.
DORA’s ROI report provides the value-realization bridge: AI impact should not be judged only by tool usage, code volume, or localized productivity. Leaders already have many of the system metrics needed to see whether AI is improving delivery, stability, developer experience, user experience, and business value.
The work still needs to begin with an anticipated outcome, whether a human writes the code, AI assists, or an agent generates the change.
AI is a multiplier. It accelerates value, learning, and delivery in strong systems. It also accelerates defects, risk, and fragility in weak systems.

Engineers remain accountable for the code they accept and ship. Leaders remain accountable for the systems, incentives, guardrails, and production outcomes they create.

If you lead technology, product, transformation, or AI adoption, this matters because AI can make your organization look busier before it makes your organization better.

The goal is not simply more activity. The goal is Flow, Realization, and measurable business value.

AI has become a system conversation

The IBM Institute for Business Value recently released its 2026 CEO Study, Rewiring the C-suite: The Fast Track to 2030. The report reinforces something many technology and business leaders are starting to feel: AI can no longer be treated as a tool conversation that gets delegated to the technology organization. (IBM 2026 CEO Study, p. 1)

AI is becoming a structural shift in how organizations think, decide, operate, and compete.

That matters because many organizations are still using the same playbook from earlier technology adoption cycles. They form committees, approve tools, launch pilots, ask teams to find use cases, track activity, and celebrate productivity gains.

Those steps can create motion. Transformation requires something deeper.

The harder question is whether AI is improving how work moves, how decisions are made, how teams collaborate, how value is measured, and how business outcomes are realized.

DORA reaches the same conclusion from the software delivery side. Its ROI report describes AI as an amplifier: it magnifies the strengths of high-performing organizations and the dysfunctions of struggling ones. The greatest returns come from the system around the tool: internal platforms, workflow clarity, and aligned teams.

That is the deeper message I took from IBM’s report, and it aligns closely with the philosophy behind my upcoming book, Profitable Engineering: Transforming Technology Teams Into Strategic Business Partners.

AI is a system conversation. The technology matters, and the operating model around it will determine how much value organizations actually realize.

Productivity is only the starting point

IBM makes an important point: AI can free up time, talent, and capital that were previously consumed by friction. That matters. It is also only the beginning. (IBM 2026 CEO Study, p. 3)

The real differentiator is what leaders do with the capacity they free up.

DORA makes a similar point in financial terms: the capacity AI frees up should be reinvested into innovation and value creation, not treated only as a headcount-reduction opportunity. That distinction matters because AI value is realized when reclaimed capacity improves the system, not when leaders only chase short-term savings.

Do leaders reinvest that capacity into better products, faster learning, stronger customer experiences, improved decision-making, or new sources of value? Or does the gain disappear into the same operating system that created the friction in the first place?

This is where many organizations struggle.

They measure whether AI helped someone complete a task faster. They measure whether more code was generated. They measure whether a process step has become more efficient. Those are useful early signals, but they leave the bigger executive question unanswered:

What business outcome improved because of that productivity gain?

That is where Flow and Realization become essential.

Flow helps leaders understand how work moves from idea to production, where friction appears, where decisions slow down, and where handoffs create waste. Realization helps leaders close the loop between work delivered and business impact achieved.

AI can improve productivity in a task or process. Profitable engineering requires understanding whether that improvement helped the organization realize something meaningful: revenue, retention, margin, risk reduction, customer trust, operational resilience, or strategic learning.

More code can become more inventory

The software industry has learned this lesson before.

Lines of code were never a good measure of software productivity. More code did not automatically mean more value. Sometimes it meant greater complexity, more maintenance, more defects, and higher costs.

AI is bringing that lesson back in a new form.

DORA makes the warning more direct. Because AI can increase the volume of generated code, leaders have to remember that code is often a liability before it becomes value. More code without proper oversight can increase verification overhead, technical debt, and long-term operating costs.

CircleCI’s 2026 State of Software Delivery report analyzed more than 28 million workflows and found that average throughput increased 59% year over year. AI-assisted development and agent-driven workflows are clearly increasing development activity. (CircleCI 2026 State of Software Delivery, pp. 2, 4, 26)

But leaders should be careful before declaring victory.

CircleCI also found that the gains are uneven. The top 5% of teams nearly doubled throughput, while the median team improved only 4%, and the bottom 25% saw no measurable increase. (CircleCI 2026 State of Software Delivery, pp. 4-6)

More importantly, increased activity does not always become a shipped change. The median team saw feature branch throughput increase by 15.2%, while main branch throughput declined by 6.8%. Even teams in the top 10% saw nearly 50% more feature branch activity while main branch activity was essentially flat. (CircleCI 2026 State of Software Delivery, p. 8)

That should get every technology leader’s attention.

Feature branch activity can make the organization feel faster. Main branch movement shows whether the delivery system is actually absorbing the work.

If AI creates more activity in branches while mainline delivery slows or stalls, the organization has not improved flow. It has increased inventory.

More generated code, pull requests, workflow runs, and branch activity can create the appearance of progress. But if the work is not validated, integrated, shipped, operated, measured, and connected to a meaningful outcome, it is not value. It is inventory.
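One way to make that signal concrete is to compare how fast each branch type is growing. The sketch below is a minimal illustration, not something taken from the CircleCI report: the `BranchTrend` type, the `absorption_gap` function, and the classification labels are my own assumptions. The only sourced inputs are the median-team figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class BranchTrend:
    """Year-over-year throughput change, in percent, for one team."""
    feature_change_pct: float  # change in feature branch throughput
    main_change_pct: float     # change in main branch throughput


def absorption_gap(trend: BranchTrend) -> float:
    """Percentage points by which feature branch growth outpaces main branch growth."""
    return round(trend.feature_change_pct - trend.main_change_pct, 1)


def classify(trend: BranchTrend) -> str:
    """Hypothetical labels: is the extra activity reaching main, or piling up?"""
    if trend.feature_change_pct > 0 and trend.main_change_pct <= 0:
        return "inventory building"
    if trend.main_change_pct > 0:
        return "work reaching main"
    return "activity flat or declining"


# Median-team figures cited from the CircleCI report.
median_team = BranchTrend(feature_change_pct=15.2, main_change_pct=-6.8)
print(absorption_gap(median_team), classify(median_team))
```

For the median team, the gap is 22 percentage points and the label is "inventory building". A real dashboard would need thresholds tuned to the team's own history, but even a crude comparison like this keeps attention on what reaches main rather than on what gets started.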

This is where AI productivity can become a facade.

The better question is whether the organization can safely absorb the additional change. Can it review it? Test it? Integrate it? Deploy it? Observe it in production? Recover quickly when something breaks? Connect the work back to customer or business impact?

CircleCI reaches a similar conclusion: success in the AI era is no longer determined by how fast code can be written. The decisive factor is whether teams can validate, integrate, and recover at scale. (CircleCI 2026 State of Software Delivery, p. 2)

That is Flow and Realization in practical terms.

Flow asks whether work can move through the system. Realization asks whether that work produced the outcome the organization intended. AI can increase the amount of work entering the system, but leaders still need to know whether that work reaches production safely and creates value once there.

The outcome still comes first

The tool does not change the discipline.

Whether code is written by a person, assisted by AI, refactored by AI, or generated by an agent, the work should still begin with an anticipated outcome.

What customer problem are we trying to solve? What business result are we trying to improve? What risk are we trying to reduce? What learning are we trying to create?

Without that clarity, AI can make the organization feel more productive while hiding the fact that nothing meaningful has changed.

The metrics that matter do not disappear because AI enters the workflow.

We still need to understand throughput, flow time, work-item aging, deployment health, quality, reliability, time spent on production code, and the distribution of work across features, defects, risk, debt, support, and enablement. We still need to know whether changes are improving customer value, business outcomes, time-to-market, efficiency, or quality.

Leaders should not let the pressure to show AI adoption, AI productivity, or AI usage cloud the fundamentals:

Did the work move? Did it reach production safely? Did quality improve or decline? Did customers benefit? Did the business outcome change? Did we learn anything that will improve the next decision?

AI can accelerate work. Leadership is still responsible for connecting that work to outcomes.

AI will expose your operating model

One of the strongest themes in IBM’s report is that AI-first organizations are redesigning decision rights, workflows, leadership accountability, and cross-functional collaboration. (IBM 2026 CEO Study, pp. 15-19)

That is the right framing.

AI will expose unclear ownership. It will make slow decision systems more visible. It will accelerate whatever alignment, or misalignment, already exists between strategy and execution.

This is the framing I keep coming back to:

AI is a multiplier. It accelerates value, learning, and delivery in strong systems. It also accelerates defects, risks, and fragility in weak ones.

This concern is now showing up from multiple angles. IBM is raising it at the CEO and operating-model level. DORA frames AI as an amplifier of the system it enters. CircleCI’s delivery data shows the same pattern at the engineering-system level. Teams with stronger delivery systems are converting AI acceleration into deployable change, while many others are seeing more activity upstream and more friction downstream. (CircleCI 2026 State of Software Delivery, pp. 8-9)

It is also the pattern I explored in my own article, AI Is a Multiplier: AI value depends on the maturity of the system it enters. Faster code generation only matters if the organization can safely absorb, validate, deploy, operate, measure, and learn from that acceleration.

In organizations with clear strategy, strong DevOps foundations, healthy product operating models, visible workflows, accountable leadership, and disciplined outcome review, AI can accelerate learning and delivery.

In organizations with unclear priorities, disconnected incentives, fragmented data, brittle delivery systems, and leadership misalignment, AI may simply help teams move faster in the wrong direction.

That becomes an operating model problem long before it becomes a technology problem.

Decision rights and workflow design matter more now

IBM emphasizes the need to rethink the C-suite for speed and clarity. The report recommends redesigning decision rights before changing the org chart, clarifying who owns which outcomes, and distributing authority closer to the problem while maintaining clear guardrails. (IBM 2026 CEO Study, pp. 18-19)

That point deserves attention.

Many organizations say they want speed while their leadership systems remain built for permission, escalation, and consensus. Teams wait. Leaders debate. Work queues grow. Priorities shift. People ask for empowerment while the decision architecture remains unchanged.

AI will increase the pressure on that model.

When information moves faster, decision latency becomes more expensive. When AI agents can surface insights, summarize patterns, and recommend actions, leaders need to clarify who can act, where human judgment is required, what guardrails apply, and how accountability is reviewed after the decision.

Without that clarity, AI may create more noise than speed.

This connects directly to one of the central ideas in Profitable Engineering: teams often improve flow faster than the leadership system around them.

Engineering teams can modernize delivery practices, improve automation, deploy more frequently, reduce lead time, and adopt AI tools. Those gains eventually stall when the leadership system cannot make clear decisions, connect work to outcomes, and remove cross-functional friction.

IBM also makes another important point: leaders should redesign workflows before redesigning jobs. (IBM 2026 CEO Study, p. 37)

Too many AI conversations start with the wrong question: How many jobs will AI replace?

A better question is: How does work move today, and where should AI change the flow of that work?

Before leaders redesign roles, they need to understand the system. Where does work begin? Who decides what matters? Where does information get lost? Which handoffs slow progress? Which decisions are repeatable? Which decisions require judgment? Which controls are necessary? Which steps are legacy friction disguised as governance?

This is where value stream thinking becomes practical.

Map the work. Understand the current state. Identify decision points. Clarify ownership. Decide where AI can safely execute, where humans need to remain in judgment, and where escalation rules must be explicit.

Then redesign roles to align with the new flow of work.

AI adoption should be tied to how value actually flows through the organization, rather than being treated as a disconnected collection of use cases.

Accountability cannot be delegated to the tool

IBM frames one of its core plays as orchestrating intelligence, human and artificial. That phrase matters because the future will require better collaboration between people, systems, data, and AI. (IBM 2026 CEO Study, pp. 32-37)

Humans will still provide judgment, context, ethics, creativity, understanding of relationships, and strategic sense-making. AI will increasingly handle pattern recognition, summarization, prediction, repeatable decisions, monitoring, and operational execution at a scale humans cannot match manually.

The leadership challenge is designing the handoffs between the two.

Leaders need to define where AI recommends, where AI decides, where humans approve, where humans audit, and where AI is intentionally kept out of the loop. They also need a way for the system to learn from exceptions, because the exception path often reveals the real complexity of the work.

Those choices are leadership decisions. Treating them as technical details leaves too much to chance.

I was reminded of this in a conversation this week with a former manager from my team. We were talking about AI, and I reminded him that, at least today, accountability still sits with people.

Engineers are responsible for the code they submit, regardless of whether it was written by hand, assisted by AI, refactored by AI, or generated from a prompt. If that code creates bugs, security risks, performance problems, data leaks, resilience issues, or customer impact, the team cannot simply point back to the tool and say AI did it.

That accountability also needs to extend upward.

CTOs, technology executives, and senior leaders who push an AI-first agenda focused primarily on productivity, volume, or replacing engineering effort need to understand the pressures they are creating. Moving faster may help the organization compete, but the way leaders push AI adoption can also increase risk when speed is separated from quality, resilience, security, and production accountability.

CircleCI makes that risk visible in delivery data. Main branch success rates fell to 70.8%, the lowest level in more than five years, meaning attempts to merge changes into production codebases now fail about 30% of the time. CircleCI compares that to a recommended 90% benchmark. (CircleCI 2026 State of Software Delivery, p. 12)

When the system is under that kind of pressure, leaders cannot treat AI-assisted development as only a productivity story. It is also a story of quality, resilience, security, and accountability.

A healthier model integrates both levels of accountability. Engineers remain accountable for the code they accept and ship. Leaders remain accountable for the systems, incentives, guardrails, culture, and production outcomes created inside their organizations.

That allows a company to move forward with AI while avoiding the trap of pressuring teams to produce more without understanding what happens when that work reaches production.

Complexity can erase the productivity gain

CircleCI has another finding leaders should take seriously: AI-assisted code is becoming harder to recover from when things break.

The typical team now takes 72 minutes to get back to green after a failed build, a 13% increase from the prior year. On feature branches, recovery time rose 25%, from 64 minutes to nearly 80 minutes. (CircleCI 2026 State of Software Delivery, p. 11)

That matters because every delay compounds.

When validation breaks down, the organization pays for it in engineering time, blocked deployments, missed deadlines, burnout, higher cost, and frustrated customers. CircleCI gives a practical example: for a team pushing five changes to the main branch per day, moving from a 90% success rate to 70% can add 250 hours of debugging and blocked deployment time per year. At 500 changes per day, the report estimates the loss can equal 12 full-time engineers. (CircleCI 2026 State of Software Delivery, p. 13)
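The arithmetic behind those figures is easy to reproduce. The sketch below is my own back-of-the-envelope reconstruction, not CircleCI's published model. It assumes 250 working days per year, roughly one hour lost per additional failed run, and 2,080 working hours per full-time engineer. Under those assumptions, the results line up with the report's estimates.

```python
WORKING_DAYS_PER_YEAR = 250    # assumption, not from the report
HOURS_LOST_PER_FAILURE = 1.0   # assumption: debugging plus blocked deployment time
HOURS_PER_FTE_YEAR = 2080      # assumption: 40 hours x 52 weeks


def extra_hours_per_year(changes_per_day: int,
                         baseline_success_pct: int,
                         current_success_pct: int) -> float:
    """Additional hours lost per year when the main branch success rate drops."""
    extra_failure_rate = (baseline_success_pct - current_success_pct) / 100
    extra_failures_per_day = changes_per_day * extra_failure_rate
    return extra_failures_per_day * WORKING_DAYS_PER_YEAR * HOURS_LOST_PER_FAILURE


small_team_hours = extra_hours_per_year(5, 90, 70)
large_org_hours = extra_hours_per_year(500, 90, 70)
large_org_ftes = large_org_hours / HOURS_PER_FTE_YEAR

print(f"{small_team_hours:,.0f} hours per year at 5 changes per day")
print(f"{large_org_hours:,.0f} hours per year at 500 changes per day, "
      f"about {large_org_ftes:.1f} full-time engineers")
```

With these inputs, the small team loses 250 hours per year and the larger organization loses 25,000 hours, or about 12 full-time engineers. Changing the hours lost per failure to match your own recovery data is the quickest way to see what a falling success rate costs your teams.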

This is where AI productivity can become deeply misleading.

DORA describes this hidden cost as part of the AI adoption J-Curve: the learning curve, the verification tax, and pipeline adaptation. Teams may save time generating code, but that gain can be absorbed by review burden, testing pressure, integration constraints, change approval, and recovery work.

A leader may see more generated code and assume the organization is getting faster. The teams may be absorbing the cost downstream through increased review pressure, more debugging, more failed builds, longer recovery times, and greater operational risk.

That is not productivity. That is productivity debt.

We already have many of the metrics we need

IBM suggests treating AI usage as a key operating metric. I agree with the direction, with one important caution. (IBM 2026 CEO Study, p. 37)

AI usage and AI value are different measures.

This is where Goodhart’s Law matters. When a measure becomes a target, it tends to stop being a good measure.

If leaders only measure how often people use AI, they may unintentionally reward shallow adoption. Teams may use AI because they are expected to, rather than because it improves the work. Leaders may celebrate usage dashboards while the business impact remains unclear.

The better measurement model connects AI usage to the system-level measures that many technology organizations already have.

We do not need to invent an entirely new measurement universe to understand whether AI is helping. DORA’s ROI report makes the starting point clear: leaders need a baseline. What did delivery, quality, recovery, flow, developer experience, and user experience look like before AI entered the system?

DORA metrics already help leaders see whether deployment frequency, lead time for changes, change failure rate, recovery time, throughput, and instability are improving or worsening. Flow metrics add another layer by showing flow time, flow efficiency, flow velocity, work-item aging, and flow distribution across features, defects, risk, debt, support, and enablement. Developer experience feedback adds the qualitative signal: whether AI is making work easier, safer, clearer, faster, or simply adding pressure and noise.
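To make the baseline idea concrete, here is a minimal sketch of how three of those measures could be computed from deployment records and compared before and after AI adoption. The record shape, field names, and sample data are hypothetical and exist only for illustration. Real teams would pull this from their own delivery tooling, and DORA's own definitions should take precedence over this simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production
    failed: bool            # whether it caused a failure needing remediation


def summarize(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Deployment frequency, median lead time, and change failure rate for one period."""
    if not deployments:
        raise ValueError("need at least one deployment to summarize")
    lead_hours = [(d.deployed_at - d.committed_at).total_seconds() / 3600
                  for d in deployments]
    return {
        "deploys_per_week": len(deployments) * 7 / period_days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": sum(d.failed for d in deployments) / len(deployments),
    }


def compare(baseline: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Change in each measure relative to the pre-AI baseline."""
    return {name: current[name] - baseline[name] for name in baseline}


def make(day: int, lead_hours: int, failed: bool) -> Deployment:
    """Build an invented sample record; the dates carry no meaning."""
    start = datetime(2026, 1, 5) + timedelta(days=day)
    return Deployment(start, start + timedelta(hours=lead_hours), failed)


# Invented two-week samples: before and after AI adoption.
before = summarize([make(0, 24, False), make(2, 24, False),
                    make(5, 48, True), make(9, 48, False)], period_days=14)
after = summarize([make(0, 12, False), make(1, 12, True), make(3, 24, False),
                   make(6, 24, True), make(8, 36, False), make(11, 36, True)],
                  period_days=14)

for name, delta in compare(before, after).items():
    print(f"{name}: {delta:+.2f}")
```

In this invented sample, deployments per week rise and median lead time falls, but the change failure rate doubles from 25% to 50%. That is the pattern a usage dashboard would miss: the team looks faster while the system is becoming less stable.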

AI-specific usage metrics still matter, but they should be treated as adoption signals, not proof of value. AI becomes meaningful when it improves decision latency, cycle time, rework, quality, customer trust, forecast accuracy, learning speed, delivery health, or the organization’s ability to realize the expected outcome.

Those are the signals that help leaders separate real improvement from AI theater.

The real AI gap may be organizational design

One of the most important points in IBM’s report is that the gap between AI capability and AI deployment is often more of an organizational design problem than a skills problem. (IBM 2026 CEO Study, p. 36)

That aligns with what many leaders are seeing in practice.

Employees may be interested. Tools may be available. Pilots may exist. Training may have happened. Adoption still stalls when the work is never redesigned, incentives never change, approval paths remain unclear, data remains fragmented, or leaders never make AI part of the operating rhythm.

That is why AI literacy matters, but literacy alone will only get an organization so far.

Training people on AI without changing how work moves is like teaching everyone to drive faster while leaving the roads congested, the signs unclear, and the destination debated.

The system still matters.

DORA’s organizational foundation points in the same direction: build trust in AI, treat internal developer platforms as products, cultivate AI-accessible data, anchor AI velocity in user value, and deploy automated guardrails for verification.

CircleCI also points to what I think of as the messy middle. Smaller companies and the largest enterprises performed better on main branch throughput and recovery time, while mid-sized companies struggled the most. Companies with 21 to 50 employees had the lowest throughput and the longest recovery times, at nearly three hours. The report suggests these companies may have outgrown the simplicity of small teams without yet building the systems and practices needed to operate at a larger scale. (CircleCI 2026 State of Software Delivery, p. 16)

That is a leadership lesson, not just an engineering metric.

Growth creates complexity. AI adds speed. Without the right operating system, the combination can overwhelm the organization.

Why this matters for Profitable Engineering

The philosophy behind Profitable Engineering is that technology teams become strategic partners when leaders can connect the movement of work to measurable business outcomes.

IBM reinforces that idea from the CEO and operating-model level. (IBM 2026 CEO Study, pp. 6-7)

DORA reinforces it from the value-realization side: AI adoption only becomes meaningful when capabilities improve delivery performance, developer experience, user experience, cost efficiency, and eventually business growth.

CircleCI reinforces it from the software delivery side. More code is being created. More activity is entering the system. The highest-performing teams are converting that acceleration into deployable change. Many others are discovering that validation, integration, recovery, and production readiness are now the constraints.

AI will accelerate parts of the value stream. It will change how people work. It will shift decision-making. It will create new sources of productivity. It may eventually reshape business models.

The organizations that benefit most will understand how to connect AI to flow, leadership clarity, decision rights, product strategy, customer value, and realization.

They will clarify the outcome they are trying to improve before rushing into automation. They will study where the work slows down today and decide which decisions can safely move closer to the work. They will define where AI should execute, where it should recommend, and where it should stay out of the loop. They will assign ownership for the business result and review whether the expected value was actually realized.

When value does not materialize, they will have the discipline to learn, adjust, or stop.

Those are profitable engineering questions.

The leadership takeaway

AI is raising the standard for leadership.

Sponsoring tools, approving pilots, and asking teams to “go use AI” may create activity. Lasting value requires leaders to redesign the system around the speed, intelligence, and risk profile AI introduces.

That means clearer decision rights, more visible workflows, stronger cross-functional accountability, better measurement discipline, more intentional reinvestment of productivity gains, and a deeper connection between delivery and realized value.

It also means resisting the easiest story.

The easy story is that AI makes teams faster because more code gets created. The harder truth is that more code can become more inventory, more risk, more debugging, and more operational drag when the delivery system cannot absorb it.

The companies that win will treat AI as a forcing function to improve how their businesses operate.

That is the moment we are entering.

AI is becoming a test of operating models.

For many organizations, the test has already begun.

Profitable Engineering is coming soon. AI is one chapter in the book, but the larger message runs through every chapter: technology only becomes strategic when leaders can connect Flow, Realization, leadership maturity, and modern delivery practices to measurable business value.

Poking holes

I invite your perspective. What do you think?

Let’s talk: phil.clark@rethinkyourunderstanding.com

References

IBM Institute for Business Value, Rewiring the C-suite: The fast track to 2030, 2026 CEO Study
Official report page: https://www.ibm.com/thought-leadership/institute-business-value/en-us/report/2026-ceo
CircleCI, 2026 State of Software Delivery
Official report page: https://circleci.com/resources/2026-state-of-software-delivery/
Related summary article: https://circleci.com/blog/five-takeaways-2026-software-delivery-report/
DORA / Google Cloud, The ROI of AI-assisted Software Development, 2026
Official report page: https://dora.dev/ai/roi/report/
ROI calculator: https://dora.dev/ai/roi/calculator/
Latest version check: https://dora.dev/vc/airoi/?v=2026.1
Phil Clark / Rethink Your Understanding, AI Is a Multiplier
Article: https://rethinkyourunderstanding.com/ai-is-a-multiplier/