The AI Agent Threat Is Real. And Mostly Self-Inflicted.
By: Paco Campbell
Published: Thursday, March 12th, 2026
The industry is not wrong to argue that large language model (LLM)-based agentic AI poses novel threats worth exploring. Semantic privilege escalation, where an agent gathers permissions that are individually innocuous but together form toxic combinations. Instruction-data boundary collapse, where an agent cannot reliably distinguish between a command and the content it’s processing. Memory poisoning. Phew. The taxonomy alone is exhausting. But real. These (and others) deserve serious research.
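To make the first of those concrete: each permission below is defensible on its own, and the pairing is the vulnerability. A minimal sketch of a toxic-combination check, with made-up scope names rather than any vendor’s actual permission model:

```python
# Hypothetical scope names and toxic pairs; illustrative only.
TOXIC_COMBINATIONS = [
    # Read source code + open network = an exfiltration path.
    {"read:source_code", "network:outbound"},
    # Read customer records + send email = same story.
    {"read:customer_records", "send:email"},
    # Edit CI config + approve pull requests = self-dealing.
    {"write:ci_config", "approve:pull_requests"},
]

def toxic_subsets(granted: set[str]) -> list[set[str]]:
    """Return every toxic combination fully contained in a grant."""
    return [combo for combo in TOXIC_COMBINATIONS if combo <= granted]

agent_scopes = {"read:source_code", "network:outbound", "read:calendar"}
for combo in toxic_subsets(agent_scopes):
    print(f"BLOCK GRANT: toxic combination {sorted(combo)}")
```

The point of the sketch is that the check is trivial once someone writes the combinations down. Most organizations simply never do.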
The industry, however, is threat-modeling a Ferrari on roads that haven’t been terraformed or paved.
Let’s zoom out.
An AI agent does not walk into your enterprise. It does not knock on the door. It does not demand access to your systems, customer data, or compliance workflows. A human let it in. Through ignorance or through malice. Because a hype-fueled executive was peddling AI, or because someone wanted to look cool.
Whatever the reason, the root cause is not mystical. That human has a name. A spouse. A pet, perhaps. We know who they are.
It is the failure of policies, controls, guidelines, procedures, awareness, and training that allowed one person to set up an agent with broad permissions it can expand at any time. If one person can break the company by spinning up an agent, the problem isn’t AI. It’s your controls. If an agent finds secrets in code and uses them, the problem isn’t AI. It’s your controls.
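The missing control is not exotic. Here is a sketch of the boring version, assuming hypothetical scope names and a two-person rule, so that no single requester can approve their own broad grant:

```python
from dataclasses import dataclass

# Scopes that should never be self-service. Illustrative names only.
BROAD_SCOPES = {"admin:*", "write:production", "scopes:self_expand"}

@dataclass
class Grant:
    agent_id: str
    scopes: set
    requested_by: str
    approved_by: str = ""  # a second, different human

def authorize(grant: Grant) -> bool:
    """Two-person rule: broad grants need an approver who is not the requester."""
    if not grant.scopes & BROAD_SCOPES:
        return True
    return bool(grant.approved_by) and grant.approved_by != grant.requested_by

print(authorize(Grant("agent-7", {"write:production"}, "alice")))         # False
print(authorize(Grant("agent-7", {"write:production"}, "alice", "bob")))  # True
```

Every line of that is decades-old access-control practice. None of it is AI-specific.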
Here’s what people keep avoiding. LLM-based agents are probabilistic systems. That makes them fundamentally unsafe as autonomous decision engines in environments that require deterministic behavior. That single sentence eliminates most enterprise agentic use cases from the conversation before we even get to threat modeling them.
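If that feels abstract, here is a toy contrast, with an invented probability distribution standing in for a sampled model. Run the same input a thousand times: one engine returns one answer, the other returns a distribution:

```python
import random

def sampled_decision(case: str) -> str:
    # Stand-in for temperature > 0 sampling; the weights are invented.
    return random.choices(["approve", "deny"], weights=[0.85, 0.15])[0]

def rule_based_decision(case: str) -> str:
    return "approve" if case == "refund_under_50" else "deny"

case = "refund_under_50"
print({sampled_decision(case) for _ in range(1000)})     # almost always {'approve', 'deny'}
print({rule_based_decision(case) for _ in range(1000)})  # always {'approve'}
```

No threat actor required. The variance is the product working as designed.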
What should most enterprise AI be doing? Summarization. Drafting. Research synthesis. Helping humans think faster and write better. This century’s Encarta, Wikipedia, and Google — fused, corrected, augmented, and on an Adderall binge. Genuinely useful. Transformative for productivity. And none of it requires anyone’s credentials to write anything consequential.
And interestingly, the data suggests that’s exactly how these systems are actually being used.
Anthropic’s own researchers just published an analysis of real-world AI interactions and found that, while their models are theoretically capable of handling 94% of tasks in computer and math occupations, observed usage covers only 33% of them. Think about that gap. The builders themselves — the ones with every financial incentive to tell you AI can do everything — are showing us the distance between what they THINK this technology might do and what people actually use it for.
And what people actually use it for looks a lot like augmentation — writing faster, thinking through problems, synthesizing information. Not autonomy. Not decisions. Not agents with keys to production systems.
Which makes the current enterprise obsession with autonomous agents even stranger. We have data showing people naturally reach for these tools to go faster as humans — and the industry’s response is to try to remove the human entirely.
We have a technology that excels at accelerating human cognition — and instead we’re trying to hand it the keys to production systems that require deterministic behavior.
It’s like discovering calculators and immediately trying to automate corporate treasury decisions.
Before deploying any LLM-based system, three questions should be mandatory. First, is this problem already solved without LLMs? If yes, use that solution. Second, does this introduce LLM-specific failure modes, such as non-determinism, prompt injection, or hallucinations, into a context that cannot tolerate them? If yes, stop. Third, does this require transparent, auditable judgment in consequential decisions? If yes, a probabilistic system that cannot explain its reasoning in auditable terms has no business being in that workflow.
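Those three questions are mechanical enough to encode. A sketch of the gate as a checklist, with field names that are mine, not any framework’s:

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    solved_without_llm: bool        # Q1: does a non-LLM solution already exist?
    tolerates_llm_failures: bool    # Q2: can it absorb non-determinism, injection, hallucination?
    needs_auditable_judgment: bool  # Q3: consequential decisions that must be explainable?

def llm_deployment_gate(uc: UseCase) -> tuple[bool, str]:
    if uc.solved_without_llm:
        return False, "Q1: already solved without LLMs. Use that solution."
    if not uc.tolerates_llm_failures:
        return False, "Q2: context cannot tolerate LLM-specific failure modes."
    if uc.needs_auditable_judgment:
        return False, "Q3: requires transparent, auditable judgment."
    return True, "Passes all three questions."

# A drafting assistant passes; an autonomous production agent does not.
print(llm_deployment_gate(UseCase(False, True, False)))
print(llm_deployment_gate(UseCase(False, False, True)))
```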
If your use case fails any one of these, you don’t have an AI opportunity. You have a solution looking for a problem, dressed in a press release.
The problem is that we skipped ALL of that conversation. Vendors skipped it on purpose, because productivity tooling doesn’t sell nine-figure enterprise contracts. Executives skipped it because autonomous agents sound more impressive than a really good drafting assistant. And so now we’re writing sophisticated threat models for deployment patterns that shouldn’t exist yet.
Now let’s talk about why this keeps happening.
Humans associate intelligence with language. If something speaks to us fluently, confidently, and articulately, we are wired to treat it as smart. We evolved that way. A well-spoken person in a meeting gets deferred to even when they’re wrong. We’ve never had a system that can produce unlimited, infinitely patient, contextually perfect articulation on demand — with zero understanding behind it. No judgment. No stake in the outcome.
LLMs are the ultimate fluency machine. They don’t think. They pattern-match and predict. They draw on their training (which, for most models, we are not privy to) to put together plausible-sounding things. They sound like they think, and we are poorly equipped to override that instinct at scale.
A study published in Nature Human Behaviour puts a number on it. Humans defer to AI recommendations at nearly three times the rate they defer to other humans. Which means human-in-the-loop — our beloved compensating control — is already undermined by the technology’s own persuasive fluency. We built an oversight mechanism, and the thing being overseen is “neurologically” (I hate that I just anthropomorphized the thing) better at defeating it than we are at maintaining it.
And then there’s George Kurtz.
The CEO of CrowdStrike — the world’s most prominent cybersecurity company, whose entire brand is that expert humans protect critical systems — decided the best response to a market selloff triggered by the release of an Anthropic security tool was to go straight to the source. His words. He opened Claude, confirmed it knew it was talking to George Kurtz, CEO of CrowdStrike (presumably, based on their conversation), and asked it to build a tool to replace his own company.
Claude responded: “I appreciate the ambition, George.”
Then politely explained that CrowdStrike is a massive platform built by thousands of engineers over a decade-plus, with kernel-level endpoint monitoring, trillions of security events, and automated incident response. Not something you can replicate with a script.
He posted that. As his defense. To the internet.
Let that sit.
Every human in that man’s orbit said yes. The chatbot said “bless your heart” in five words. And he published the chatbot’s polite refusal as a victory lap — apparently not noticing that Claude had just articulated his own company’s moat more clearly than his last three earnings calls.
That’s not intelligence augmentation. That’s expensive sycophancy with good grammar. And a securities lawyer somewhere just got a twitch in their eye.
The technical risks are real. But the dominant failure mode isn’t adversarial AI. It’s organizations mistaking fluent text generation for trustworthy judgment.
There has to be governance. Defined by people who genuinely understand what these systems actually are — retrieval and synthesis engines with a confidence dial stuck on eleven. Not oracles. Not decision-makers. Not strategic advisors to publicly traded companies in a market crisis.
Tone at the top matters most. And right now the tone is: move fast, deploy broadly (regardless of whether it makes sense), ask the chatbot if we’re okay, and post the answer.
That’s not an AI strategy. That’s a liability incubator.
P.S. McKinsey not knowing about SQL injection in 2026 is… Yikes. Keep pushing hype, y’all!