Your coding agent got cheaper this week. Your team did not get faster.

Coding-agent vendors cut model prices sharply this week, which can lift the harness ROI number on a board slide without the team shipping anything more.
On May 18 Cursor priced its new in-house coding model at fifty cents per million input tokens, and a fresh CloudBees survey of more than 200 enterprise leaders found only 31 percent of AI spend can be tied to a business outcome. Put those together and that number can rise next quarter with nothing more behind it than a vendor price cut. The fix is one habit: read the ratio as two numbers, cost and output, and check which one actually moved.
The setup
On May 18, Cursor shipped Composer 2.5, its own coding model, and priced the standard tier at fifty cents per million input tokens and two dollars fifty per million output tokens. TestingCatalog described it, in the vendor’s own framing, as “exceptionally intelligent and up to 10x more efficient than similarly capable models.” The same week, Google’s Antigravity was sitting in free public preview with full model access and no usage caps.
Strip away the benchmark noise and one thing happened. The price of running a coding agent dropped, across the category, because the vendors cut it. No customer did anything. No engineering team changed how it works.
Here is what that quietly sets up. A quarter from now, a lot of executives are going to open a harness ROI slide and see a better number than last quarter. And a good share of them are going to read that number as their team getting faster. Most of the time, it will not be.
What teams actually built
I have watched a fair number of companies build a harness ROI number over the past year, and they almost all build it the same way. Output over cost. The value the coding agents produced, divided by what the agents cost to run.
The cost half is easy. It is a bill. Token spend, seat licenses, the line item finance can pull in an afternoon. The output half is hard, and this is where it gets honest. Almost nobody has a clean measure of what the agents actually delivered. So the numerator quietly becomes an estimate, or a survey result, or a vendor case study, or a number a VP of Engineering felt comfortable saying out loud.
That is not a moral failing. It is genuinely hard to measure. But it is widespread, and now it is documented. CloudBees released its State of Code Abundance 2026 report on May 19, a survey of more than 200 enterprise technology leaders, and the numbers are blunt.
"36% of organizations track AI spend without measuring ROI or don't measure ROI at all."
So the typical harness ROI number is a soft numerator sitting on top of a precise denominator. A hopeful guess, divided by a clean bill. That arrangement held up fine for a while, for a boring reason. When cost was roughly flat, the soft numerator could not move the ratio much on its own. The number was fuzzy but stable.
This week, the cost stopped being flat.
Where it breaks
Here are the mechanics, and it is worth being slow about them, because the slide will not be.
A ratio has two ways to go up. The numerator rises, or the denominator falls. They look identical on a chart. They are not remotely the same thing.
When a vendor cuts the model price by half and a team’s verified output stays exactly where it was, the harness ROI number doubles, assuming usage holds steady and the model bill is the whole denominator. The slide reads “2x.” The headline writes itself. And nobody shipped twice as much. The team did the same work, at the same pace, with the same quality. A vendor in another city changed a price, and the chart in the room now says the engineering org improved.
A harness ROI number that rises because the model got cheaper is a vendor price cut wearing the costume of a productivity gain. The cost side moved. The team did not.
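The arithmetic is small enough to run. Here is a minimal sketch, assuming the ratio is simply output value over agent cost; the function name `harness_roi` and every dollar figure are made up for illustration and come from no survey or vendor.

```python
def harness_roi(verified_output: float, agent_cost: float) -> float:
    """Output over cost: the ratio as described above (hypothetical helper)."""
    if agent_cost <= 0:
        raise ValueError("agent_cost must be positive")
    return verified_output / agent_cost


# Illustrative figures only.
output_value = 300_000.0       # estimated value of merged work, identical in both quarters
cost_before = 100_000.0        # agent bill before the price cut
cost_after = cost_before / 2   # vendor halves the price, usage unchanged

roi_before = harness_roi(output_value, cost_before)
roi_after = harness_roi(output_value, cost_after)

print(f"ROI before the cut: {roi_before:.1f}x")
print(f"ROI after the cut:  {roi_after:.1f}x")
print(f"Change in output:   {output_value - output_value:.0f}")
```

The ratio goes from 3.0x to 6.0x while the output line reads zero change, which is the whole problem in three lines of output.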
The teams that handled this week well were the ones who had already split the ratio in two. They could walk into the room and say: cost per engineer fell roughly 40 percent, verified output is flat, so this is a procurement win and not a delivery win, and here is each number on its own line. That is a completely fine thing to report. Saving real money on tooling is good news. It is just a different kind of good news than “we got faster,” and the board deserves to know which one it is buying.
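One way to make that split routine is to attribute every change in the ratio to its two parts. The sketch below is an assumption about how a team might do it, not a method from any of the sources: it uses the identity log(ROI change) = log(output change) minus log(cost change), so the two effects add up exactly. The names `Quarter` and `split_roi_change`, the 2 percent tolerance, and the sample figures are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Quarter:
    """One quarter's two numbers, kept separate (hypothetical structure)."""
    cost_per_engineer: float  # agent spend per engineer, in dollars
    verified_output: float    # verified, merged output per engineer, in the team's own unit


def split_roi_change(prev: Quarter, curr: Quarter, tolerance: float = 0.02) -> dict:
    """Attribute a change in output/cost to an output effect and a cost effect.

    Both effects are log ratios; a positive value means that side improved.
    """
    output_effect = math.log(curr.verified_output / prev.verified_output)
    cost_effect = -math.log(curr.cost_per_engineer / prev.cost_per_engineer)

    output_up = output_effect > tolerance
    cost_down = cost_effect > tolerance
    if output_up and cost_down:
        what_improved = "cost and output"
    elif output_up:
        what_improved = "output only"
    elif cost_down:
        what_improved = "cost only"
    else:
        what_improved = "neither"

    return {
        "roi_multiple": math.exp(output_effect + cost_effect),
        "output_effect": output_effect,
        "cost_effect": cost_effect,
        "what_improved": what_improved,
    }


# Illustrative: cost per engineer falls 40 percent, verified output is flat.
report = split_roi_change(Quarter(1000.0, 50.0), Quarter(600.0, 50.0))
print(f"ROI multiple:  {report['roi_multiple']:.2f}x")
print(f"What improved: {report['what_improved']}")
```

With these inputs the headline ratio improves by about two thirds, and the report says "cost only", which is the procurement win stated on its own line.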
There is a second trap underneath the first. While everyone watches the model bill fall, the costs sitting next to it are going the other way. The same CloudBees survey found 54 percent of leaders reporting a significant rise in CI/CD infrastructure spend, and 70 percent now saying test-suite maintenance is a bigger burden than writing the code in the first place. The cheap part got cheaper. The expensive part, the reviewing and the verifying and the keeping-it-from-breaking, did not. WinBuzzer read Cursor’s move plainly: “cheaper standard pricing and stronger promised gains on long tasks are central to Composer 2.5’s effort to defend that position.” The vendors are competing on the one number that is easiest to see. The numbers that are hard to see are still yours to manage.
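To see how much the second trap changes the picture, compare a ratio built on the model bill alone with one built on everything the agents cause you to spend. This is a rough sketch with invented numbers: the cost categories and dollar amounts below are assumptions for illustration, and the survey reports how many leaders saw these costs rise, not by how much.

```python
from typing import Mapping


def roi(verified_output: float, costs: Mapping[str, float]) -> float:
    """Output over the sum of whichever cost lines are passed in (hypothetical helper)."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    return verified_output / total


# Illustrative figures only: output is flat, the model bill halves,
# and the adjacent costs drift upward.
output_value = 600_000.0
costs_before = {"model_tokens": 100_000.0, "ci_cd": 80_000.0, "review_and_tests": 120_000.0}
costs_after = {"model_tokens": 50_000.0, "ci_cd": 100_000.0, "review_and_tests": 130_000.0}

model_only_before = roi(output_value, {"model_tokens": costs_before["model_tokens"]})
model_only_after = roi(output_value, {"model_tokens": costs_after["model_tokens"]})
loaded_before = roi(output_value, costs_before)
loaded_after = roi(output_value, costs_after)

print(f"Model-only ROI change:   {model_only_after / model_only_before:.2f}x")
print(f"Fully loaded ROI change: {loaded_after / loaded_before:.2f}x")
```

Under these assumptions the model-only ratio doubles while the fully loaded one barely moves, because the falling line was never the largest one.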
The pattern
Strip it back and the pattern is simple. A harness ROI number is only honest when both halves of it can be seen moving on their own. One number is not a ratio. It is a mood.
A falling denominator under a flat numerator is a vendor price cut. It is not your team getting better, and the board should be told which one it is.
Three things make the number trustworthy again, and none of them take a quarter to set up.
First, decompose the ratio every time it appears. One line for cost per engineer, one line for verified, merged output, dated and owned. If the improvement sits entirely in the cost line, say so out loud. It is still a win, just not the win most people assume.
Second, set a token quota now, before the bill gets strange. CloudBees found only 27 percent of organizations have hard limits on token usage and only 45 percent call their AI spend predictable quarter to quarter. Cheaper tokens have a habit of being used far more freely, so a lower per-token price and a higher total bill tend to arrive together.
Third, watch the cost that did not fall. The model is the cheap part now. The review queue, the test suite, the CI bill: that is where the next surprise lives, so it gets its own line.
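The second habit is the one most easily turned into a check. Below is a minimal sketch of a hard token limit and of how a lower price can still produce a higher bill. The fifty-cent price is the Composer 2.5 input price quoted earlier; the one-dollar "before" price, the usage volumes, the quota size, the 80 percent warning threshold, and the function names are all assumptions for illustration.

```python
def monthly_bill(tokens_millions: float, price_per_million: float) -> float:
    """Token spend for a month, in dollars (hypothetical helper)."""
    return tokens_millions * price_per_million


def check_quota(used_millions: float, quota_millions: float, warn_at: float = 0.8) -> str:
    """Return 'blocked' at or over the hard limit, 'warn' near it, otherwise 'ok'."""
    if used_millions >= quota_millions:
        return "blocked"
    if used_millions >= warn_at * quota_millions:
        return "warn"
    return "ok"


# Illustrative: price per million input tokens halves, usage more than doubles.
bill_before = monthly_bill(tokens_millions=400.0, price_per_million=1.00)
bill_after = monthly_bill(tokens_millions=1000.0, price_per_million=0.50)

print(f"Bill before: ${bill_before:.2f}")
print(f"Bill after:  ${bill_after:.2f}")
print(f"Quota status at 1000M tokens against an 800M limit: {check_quota(1000.0, 800.0)}")
```

In this example the per-token price falls by half and the bill still rises by a quarter, and only the hard limit would have flagged it before the invoice did.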
The deeper reason this matters is that the vendors are not going to stop. Model prices will keep falling for several quarters because a handful of companies are racing each other down. So an executive who never separates price from productivity will report a rising harness ROI quarter after quarter, feel fine about it every time, and never learn whether the engineering org actually got better. The chart goes up. The knowledge does not.
What I’d tell you over coffee
Cheaper coding agents are good news. Real money, saved, and nobody should apologize for being happy about it. The only thing worth guarding against is letting that saving get quietly filed under “productivity,” because those are two different stories and a board is entitled to know which one it is hearing.
So when the next harness ROI slide lands, ask it one question: did the top number move, or just the bottom one? If the team can answer cleanly, you have a real instrument. If the room goes quiet, that quiet is the actual finding, and it is a far more useful thing to walk out with than a chart that went up on its own.
Sources
- 81% of Enterprise Technology Leaders Report Production Failures from AI-Generated Code, New Research Shows (State of Code Abundance 2026) - GlobeNewswire, 2026-05-19
- Introducing Composer 2.5 - Cursor, 2026-05-18
- Cursor Releases Composer 2.5, Saying It's Better at Sustained Coding Jobs - WinBuzzer, 2026-05-18
- Cursor released Composer 2.5 with up to 10x cost efficiency - TestingCatalog, 2026-05-18