Is an AI sentiment score on an earnings call transcript still telling us anything real?

2026-06-12


The sentiment score on an earnings call transcript used to read management's confidence. Now that investor-relations teams pre-score their own scripts against the same models, the number measures how well a company writes for the machine, and Oracle's June beat-then-drop shows where the real signal went.

TLDR

Pointing a language model at an earnings call transcript used to give an honest read on management's confidence. Now that investor-relations teams pre-score their own scripts against the same models, the sentiment number increasingly measures how well a company writes for the machine, not how the business is doing. The live tell is Oracle, which beat its quarter on June 11 and still fell, because the one line it could not rewrite was the capital-spending math.

We have all watched a sentiment score flash green on a transcript and felt a small, lazy reassurance. The model read the call, the tone was confident, the number says buy. The pitch behind it is everywhere: a language model has no ego and no relationship with the chief executive, so it reads the words straight. The numbers can be massaged, the argument goes, but the texture of how management talks (the hedging, the defensiveness, the specificity) leaks the truth. So score the words. Hedge funds now run the major sentiment engines (RavenPack’s RavenBERT, AlphaSense, the FinBERT family) across every call within seconds, precisely because they believe the words give an unguarded read that the income statement does not.

It used to be true, and that is the problem

The belief earned its credibility honestly. For most of the past decade, tone genuinely predicted post-call returns, because management spoke naturally and the model caught things they were not guarding. The clean recent illustration is what the coverage this spring started calling the sentiment trap. AppLovin reported profits that beat expectations and then fell 19.7% in a single session after the chief executive’s defensive pushback in the question-and-answer section got flagged by trading models as friction. That was not the numbers. That was the texture.

45 to 50% of companies that beat consensus earnings estimates still traded lower the next day during the 2026 season, per Alexandria Technology's data on the "sentiment trap".

When roughly half of all earnings beats sell off the following day, something real is being priced, and for a while the something was language the company had not thought to manage. The edge was genuine. It was also, like every genuine edge, visible to the people on the other side of it.

What Oracle’s tape said that its tone could not

Here is what changed. The company now knows the model is reading. Investor-relations teams have started pre-scoring their own scripts against the same engines the funds use, rehearsing against what one platform calls a digital twin of their toughest critics, and uploading draft remarks to be analysed against competitor transcripts before the microphone ever turns on. Goldman Sachs screened the S&P 500 and found 90% of companies now use qualitative rather than quantitative language about AI on their calls, which is itself a managed choice. The result is a contrast you can see across a single season: Alphabet disclosed hard numbers, cloud revenue up 63% to over $20 billion, and the stock rose, while Meta deflected a question about AI returns as “a very technical question” and fell.

"Meta's stock dropped 6% on a quarter with revenue up 33% and profits up 61%."

Analysis of the Q1 2026 earnings season, May 2026

Then came the cleanest test of all. On June 11, Oracle beat its fiscal fourth quarter, with sales up 21% and per-share earnings up 20%, and the stock still closed down 8.53% at $184.10, having opened down 11.12%. The prepared remarks were surely polished and surely scored well. None of it mattered, because the line management could not soften was the arithmetic: $70 billion of planned 2027 capital spending against $32 billion of cash from operations this year, a gap of about $38 billion, plugged with a $40 billion financing plan. Adobe walked into the same window down nearly 30% on the year, with investors demanding hard recurring-revenue evidence rather than AI capability claims. The tone layer is now table stakes. The move came from the math.
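The Oracle arithmetic above is simple enough to write down. The sketch below is only an illustration of that back-of-envelope check, using the three figures quoted in this piece; the function name `funding_gap` and its parameters are our own invention, and setting 2027 spending against this year's operating cash is the same rough comparison the paragraph makes, not a proper cash-flow forecast.

```python
def funding_gap(planned_capex, operating_cash_flow, external_financing=0.0):
    """Return (gap before financing, gap left after financing), in $ billions.

    Hypothetical helper for illustration. A negative second value means the
    financing plan more than covers the shortfall.
    """
    gap = planned_capex - operating_cash_flow
    remaining = gap - external_financing
    return gap, remaining


# Figures as quoted in the article, in $ billions.
capex, cash_from_ops, financing = 70, 32, 40

gap, remaining = funding_gap(capex, cash_from_ops, financing)
self_funded_share = cash_from_ops / capex  # fraction of capex covered internally

print(f"Gap before financing: ${gap}bn")
print(f"Gap after financing:  ${remaining}bn")
print(f"Share of capex covered by operating cash: {self_funded_share:.0%}")
```

On these numbers the plan closes on paper, but only because more than half of the spending is borrowed, and no amount of scripted tone changes that.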

Goodhart arrives at the earnings call

This is just Goodhart’s law finding the conference call. When a measure becomes a target, it stops being a good measure. The sentiment score became a target the moment investor-relations teams could see it, and a score everyone is optimising toward converges until it no longer separates one company from another. So the number now partly measures how fluent a company is at writing for the machine, which is not the same thing as how the business is doing, and is occasionally the opposite.

The edge in earnings-call language did not disappear. It migrated to the sentences management cannot script.

The residual signal moved to where the media training runs out: the guidance numbers, the capital-spending line, the financing slide, and the one answer in the question-and-answer section where the finance chief hedges on the specific metric that matters. Oracle’s tone was clean. Its tape was not, and the tape won.

Key Insight

A green sentiment score is no longer an independent read. It is a signal everyone games, so the honest information has drained out of the scored tone and into the substance the tone cannot dress up.

What we actually do with a sentiment score now

We treat the sentiment number as table stakes, not as a separate vote. When we read a call, we weight the parts that cannot be rehearsed: whether the guidance math actually funds itself, where the financing comes from, and the single question where the answer got vaguer than the rest of the call. Three quarters of rising hedging on one metric tell us more than any one quarter’s tone reading. Oracle beating and falling on the same morning is the reminder that the words were managed and the numbers were not.
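One way to make "three quarters of rising hedging on one metric" concrete is to count hedge terms in the answer to the same question each quarter and check whether the rate keeps climbing. What follows is a rough sketch under loud assumptions: the `HEDGE_TERMS` list is a made-up lexicon rather than any vendor's model, the helper names are hypothetical, and the three sample answers are invented placeholder text, not quotes from any real call.

```python
import re

# Hypothetical hedge lexicon, chosen for illustration only.
HEDGE_TERMS = ("approximately", "roughly", "we believe", "we expect",
               "may", "might", "could", "subject to")


def hedge_rate(answer):
    """Hedge terms per 100 words in a single answer."""
    text = answer.lower()
    words = re.findall(r"[a-z']+", text)
    if not words:
        return 0.0
    hits = sum(len(re.findall(r"\b" + re.escape(term) + r"\b", text))
               for term in HEDGE_TERMS)
    return 100.0 * hits / len(words)


def is_rising(rates, quarters=3):
    """True if the last `quarters` readings are strictly increasing."""
    recent = rates[-quarters:]
    return len(recent) == quarters and all(a < b for a, b in zip(recent, recent[1:]))


# Invented placeholder answers on one metric, oldest quarter first.
answers = [
    "Capital spending will be funded from operating cash flow.",
    "We expect capital spending will be funded mostly from operating cash flow.",
    "We believe capital spending could be funded from cash flow, subject to "
    "market conditions, and might need roughly some financing.",
]

rates = [hedge_rate(a) for a in answers]
print([round(r, 1) for r in rates])
print("Hedging rising for three quarters:", is_rising(rates))
```

A word count this crude will miss plenty, and a well-coached finance chief can hedge without using any listed term. The design choice is the comparison: one metric tracked across several quarters, instead of one quarter's overall tone.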

The unsettling part is not that companies write for the model. It is that the better they get at it, the more the only honest signal left is the one thing nobody can rewrite in rehearsal, and I keep wondering how long it takes the models to start scoring the silences instead.

This is editorial analysis, not investment advice. Cerevisor does not hold or recommend the named positions, and information here can become stale within hours of publication.

Sources

Stock Market Today, June 11: Oracle Falls After AI Spending Guidance Sparks Cash Flow Concerns - The Motley Fool, 2026-06-11
How Will Adobe Stock React To Its Upcoming Earnings? - Trefis, 2026-06-10
The 'Sentiment Trap': Why Beating Earnings Is No Longer Enough in the Age of AI Sentiment Analysis - FinancialContent, 2026-03-13
The Earnings Call Gap: What Q1 2026 Just Told Us About AI ROI - Thorsten Meyer, 2026-05-01
The Future of IR: Latest Trends in AI for Investor Relations (2026) - Q4 Inc., 2026-05-05
Top 10 Best AI Tools for Investor Relations (IR) - ChatFin
