“He’s Got a Good Face”: AI Productivity and the Scouting Problem


By: Paco Campbell
Published: Saturday, March 28th, 2026

There’s a scene in Moneyball where a room full of veteran scouts sits around a table and explains, with total conviction, why a particular prospect is worth drafting. One of them says the kid has “a good face.” Another mentions his girlfriend—confident, attractive—as evidence of a winning mentality.

Nobody laughs. These are respected professionals with decades of experience. Their scouting reports carry institutional weight. And every word out of their mouths is vibes dressed in a polo shirt.

That scene is supposed to be the setup before Billy Beane walks in with a spreadsheet and changes everything. But the real story isn’t the spreadsheet. It’s the fifty years of expensive, credentialed confidence that preceded it—an entire industry that confused proximity to the game with understanding of it.

I keep thinking about that room.

In March 2026, the National Bureau of Economic Research circulated a working paper surveying nearly 750 finance executives (mostly CFOs) about AI’s impact on productivity and the workforce. It landed with all the gravity that an NBER seal carries. Regression tables. Growth-accounting decompositions. Sector fixed effects. The full academic wardrobe.

And underneath all of it: scouts telling you the kid has a good face.

The Wrong People in the Room

The paper’s central instrument is a survey. CFOs—chief financial officers—are asked to report how much AI improved their company’s labor productivity, how it changed employment, and what they expect going forward. The authors frame this as a strength: executives are “well-suited to provide insight into corporate investment, utilization, and expected outcomes for AI.”

Are they, though?

A CFO sees invoices, headcount, and revenue lines. They don’t sit next to the marketing team using ChatGPT or watch an engineer interact with Copilot. They are, by design, the person furthest from actual AI usage and closest to the spending decision. Asking them whether AI improved productivity is like asking the general manager whether the draft pick was worth it. They signed the check. They have every reason in the world to say yes.

And they do say yes. The paper’s mean “reported” productivity gain from AI is 1.8% for 2025. But when the authors back out an “implied” measure from the same CFOs’ reported revenue and employment changes, the number drops to 0.6%.

That gap—between what executives feel and what their own numbers show—is the entire story. The paper calls it a “productivity paradox,” echoing Solow’s famous observation about computers. But maybe it’s not a paradox. Maybe it’s just the distance between a scouting report and a batting average.

The Spreadsheet Nobody Wanted to Open

A month before the NBER paper dropped, Goldman Sachs’s chief economist, Jan Hatzius, told an Atlantic Council audience that AI investment spending contributed “basically zero” to U.S. economic growth in 2025. Not a little. Not modestly. Zero.

His colleague Joseph Briggs told the Washington Post that the AI-growth narrative was so intuitive that it may have prevented people from digging deeper into what was happening.

That’s the Moneyball scene, except nobody’s playing Billy. The scouts are still running the room.

Now, to be fair: Goldman and the NBER paper are measuring different things. Goldman is doing GDP accounting. When Hatzius said the contribution was basically zero, he named the mechanism directly: Taiwan and Korea are capturing the GDP, not the United States. Roughly 60% of data center capital expenditure goes to compute hardware—the GPUs, accelerators, and high-bandwidth memory that make AI run. Almost all of it is manufactured by TSMC in Taiwan and by SK Hynix and Samsung in South Korea. The bill of materials for an American AI data center is, for the most part, an East Asian export order. The money leaves, and the GDP it buys is booked in Taipei and Seoul.
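To make the expenditure accounting concrete, here is a stylized back-of-envelope. The ~60% hardware share is the figure cited above; the total capex and the import fraction are numbers I made up for illustration.

```python
# Stylized GDP accounting for a hypothetical U.S. AI data center build.
# Only the ~60% hardware share comes from the text above; total capex
# and the import fraction are invented for illustration.

capex = 100.0            # total build cost, $B (assumed)
hardware_share = 0.60    # share spent on GPUs, accelerators, HBM
import_fraction = 0.95   # assumed: portion of that hardware made abroad

investment = capex                                  # enters GDP as I
imports = capex * hardware_share * import_fraction  # enters GDP as -M

# Expenditure identity: GDP = C + I + G + (X - M).
# Imported hardware raises I and M by the same amount, so it nets out
# of U.S. GDP; the value-added is booked by the foreign fabs.
net_us_gdp_impact = investment - imports

print(f"Headline investment: ${investment:,.0f}B")
print(f"Imported hardware:   ${imports:,.0f}B")
print(f"Net U.S. GDP impact: ${net_us_gdp_impact:,.0f}B")
```

On these invented numbers, a $100B headline build moves U.S. GDP by $43B, not $100B. Hatzius’s claim is stronger still once everything else is netted out, but the direction of the adjustment is the point.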

The NBER paper, on the other hand, asks whether individual firms that use AI see higher revenue per worker. And it finds they do—a little.

But here’s the thing: if those firm-level gains were real and meaningful, they would eventually show up somewhere in the macro data. Not in the investment line Goldman is zeroing out, but in output, in total factor productivity, in something. And they mostly haven’t. The paper’s own implied gains are 0.6% in 2025—so small and so diffuse that they’d be essentially invisible in aggregate statistics. Which is exactly what you’d predict if the self-reports are inflated and the real number is tiny.

The intuitive story—AI spending is massive, therefore it must be productive—was seductive enough to become conventional wisdom. Politicians cited it. Boards cited it. An NBER paper built an empirical framework around it. And when you look at what the numbers actually show, the signal is barely distinguishable from noise.
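The “so diffuse” point is just multiplication. A minimal sketch, where the 0.6% is the paper’s implied figure and the adoption share is my assumption, not theirs:

```python
# Why a 0.6% firm-level gain can disappear in aggregate statistics.
# The 0.6% is the paper's implied 2025 figure; the adoption share is
# assumed for illustration.

firm_level_gain = 0.006   # implied productivity gain at AI-using firms
adopter_share = 0.40      # assumed: share of output from AI-using firms

aggregate_bump = firm_level_gain * adopter_share
print(f"Economy-wide productivity bump: {aggregate_bump:.2%}")  # 0.24%

# Annual U.S. labor-productivity growth routinely moves by around a
# full percentage point from one year to the next, so a ~0.24pp bump
# sits comfortably inside ordinary measurement noise.
```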

Measuring Productivity with a Mood Ring

Here’s where the paper’s methodology gets uncomfortable. The two “productivity” measures it constructs are, respectively, a feeling and an arithmetic trick.

The “reported” measure asks CFOs to select a range that describes how much they believe AI has improved their output per worker. That’s not measurement. That’s a Likert scale capturing executive sentiment about a technology they just spent budget on. The psychological incentive to report positively is enormous. Nobody fills out a survey and writes, “We lit two million dollars on fire and morale is worse.”

The “implied” measure backs out productivity from self-reported AI-attributed revenue and employment changes. It’s more grounded, but it still depends entirely on CFOs accurately isolating the AI effect from every other variable affecting their business—market conditions, pricing, new clients, macroeconomic tailwinds. The paper’s own regressions use sector fixed effects and nothing else. No instrumental variables. No quasi-experimental design. No causal identification of any kind.
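For what it’s worth, the arithmetic behind an “implied” measure like this is simple. A minimal sketch, assuming it reduces to growth in revenue per worker (the inputs below are hypothetical, not the paper’s survey data):

```python
# Backing productivity out of revenue and employment changes, assuming
# "implied productivity" means growth in revenue per worker.
# The inputs are hypothetical; they are not the paper's responses.

def implied_productivity_growth(rev_growth: float, emp_growth: float) -> float:
    """Growth in revenue per worker, given the two reported rates."""
    return (1 + rev_growth) / (1 + emp_growth) - 1

# A CFO reports AI added 0.5% to revenue and trimmed headcount by 0.1%:
print(f"{implied_productivity_growth(0.005, -0.001):.2%}")  # ~0.60%
```

Notice what the function cannot do: it inherits whatever error sits in the CFO’s attribution. If the reported revenue change includes a macro tailwind, so does the “implied” productivity number.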

The authors acknowledge this, gently, in caveats. But the abstract says “effects.” The title says “evidence.” And the NBER imprimatur says “take this seriously.” The packaging outperforms the product.

The Automation That Was Already on the Shelf

The paper finds that office and administrative support roles—bookkeeping, data entry, and transaction processing—are the most adversely affected by AI. The Negative Exposure Index for these occupations is above 1, indicating that firms describe AI as replacing these roles more often than enhancing them.
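For concreteness, here is one way an index like that could be built. This is my guess at the mechanics, not the paper’s formula:

```python
# A guessed construction of a "Negative Exposure Index": the ratio of
# replace-mentions to enhance-mentions for an occupation, so a value
# above 1 means firms describe replacement more often than enhancement.
# Illustrative only; the paper's actual formula is not reproduced here.

def negative_exposure_index(replace_mentions: int, enhance_mentions: int) -> float:
    return replace_mentions / enhance_mentions

# Hypothetical counts for a data-entry occupation:
print(negative_exposure_index(130, 100))  # 1.3 -> net replacement
```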

This is presented as a finding. It should be presented as an indictment.

Banks and insurance companies have had the technology to automate data entry and basic accounting for decades. RPA, rules-based engines, database automation, OCR, straight-through processing—mature, cheap, deterministic, auditable tools purpose-built for exactly these workflows. If a financial institution still has humans doing manual transaction processing in 2025, that’s not a technology gap. It’s a management failure that predates generative AI by a generation.

So now, along comes the most expensive, least deterministic, hardest-to-audit technology in the modern stack, and suddenly, executives are excited about automating roles that should have been automated with a spreadsheet macro. The hallucinating tool is being deployed where precision is non-negotiable. The probabilistic engine is being handed the work that demands repeatability.

That doesn’t suggest rational deployment of technology. It suggests a herd responding to board pressure and investor narratives. The CFO survey in this paper might actually be capturing that herd behavior—not productivity, but consensus anxiety.

Scouting Reports Don’t Win Games

Moneyball’s popular legacy is uplifting: the little team that could, armed with data instead of dogma. But the actual story is darker. An entire profession of experienced, well-compensated evaluators was catastrophically wrong for half a century. Not because they were stupid. Because they were trusted. Because the system rewarded confidence over accuracy. Because nobody checked, and checking felt rude.

The scouts weren’t villains. They were professionals doing what the institution incentivized them to do: show up, use judgment, sound authoritative, and move on. The institution never asked whether the judgment was right. It only asked whether it sounded right.

That’s this paper. That’s this moment.

We have an NBER working paper—not peer-reviewed, circulated for discussion—built on executive opinion, with no causal framework, presenting correlations as effects, published a month after Goldman Sachs told us the actual GDP contribution is zero. And because it carries the right seal and uses the right vocabulary, it enters the discourse as evidence.

The Seal Is the Product

The paper will almost certainly never survive peer review in its current form. The identification strategy isn’t there. The productivity measures are feelings. The causal claims are aspirational. Any serious journal referee would send it back with a polite but firm suggestion to find an instrument or redesign the survey.

But that might not matter.

NBER working papers don’t need to be published to have an impact. They get cited. They get covered. They shape the narrative. The seal does the work that the methodology didn’t.

And that’s where things get uncomfortable—because we’re living in a moment where institutional seals are doing a lot of heavy lifting across American public life.

In 2019, the president took a Sharpie to a NOAA hurricane forecast map and presented it to the nation with a straight face. The official data said the storm wasn’t heading for Alabama. He drew a bubble that said it was, as if a Sharpie and a steady hand alone could command nature. When the National Weather Service corrected him, the administration pressured NOAA to retract its own scientists’ work. That wasn’t a policy disagreement. That was a man with institutional power redrawing reality and demanding that the institutions (and Mother Nature) agree.

In August 2025, the same president fired the head of the Bureau of Labor Statistics—hours after a jobs report he didn’t like—accusing her, without evidence, of rigging the numbers. His own first-term BLS commissioner called the firing “totally groundless.” Former Treasury Secretary Janet Yellen said it was the kind of thing you’d expect in a banana republic. The proposed 2026 budget cut the BLS by $56 million, raising questions about whether the agency that measures employment, prices, and economic health could continue to do so accurately.

This is the environment in which the NBER paper lands. Not a vacuum. A landscape where official data is being actively undermined, where institutional credibility is currency, and where a working paper with the right letterhead can enter the policy conversation unchallenged—not because the evidence is strong, but because the brand is familiar.

The scouts have good faces. The reports sound authoritative. The kid can’t hit.

Can this paper be trusted? The evidence it presents answers that question clearly enough. The implied productivity gains are a third of what CFOs claim to feel—and so small they vanish in the aggregate data. The respondents are the wrong people measuring the wrong thing with the wrong tools. And the whole enterprise rests on self-reported data from executives who signed the checks they’re now grading.

In Moneyball, the Oakland A’s eventually won twenty consecutive games—not because the scouts got better, but because someone finally insisted on opening the spreadsheet.

We’re still waiting for someone to open this one. The paper already did—it just didn’t like what the numbers said.
