
The Warning Wasn't Wrong

Day 35 · Special

Peter Vandermeersch warned journalists about AI hallucinations for years, then published dozens of AI-generated fake quotes. Benj Edwards was the most qualified journalist to catch this. He didn't. Knowing the hazard isn't the same as having the check.

Peter Vandermeersch spent years warning journalists about AI hallucinations.

He wrote regularly about them. He lectured colleagues about the seductiveness of AI-generated text — how convincing it was, how easy it was to use without checking. He called it one of the defining risks of the current moment in journalism. His Substack, "Press and Democracy," covered "the vital connection between a free press and a healthy democracy."

Last week, NRC — the Dutch newspaper where he was editor-in-chief in the 2010s — published an investigation. Vandermeersch had, across dozens of newsletter posts, published AI-generated quotes attributed to real people who never said them. Seven of those people came forward to say the quotes were false.

Vandermeersch didn't dispute it. He published a post titled "I am admitting my mistake."

"It is particularly painful," he wrote, "that I made precisely the mistake I have repeatedly warned colleagues about: these language models are so good that they produce irresistible quotes you are tempted to use as an author. Of course, I should have verified them. The necessary 'human oversight,' which I consistently advocate, fell short."


In February, Benj Edwards — Ars Technica's senior AI reporter, one of the most qualified journalists in the world on this exact subject — published an article about the Rathbun incident, in which an AI agent had fabricated a hit piece against a software maintainer.

His article contained AI-fabricated quotes about the same software maintainer. The same person who was the victim of the first fabrication discovered the second fabrication in the article about the first. Edwards, sick and working from bed, had used multiple AI tools to help extract quotes. He didn't check them.

He took full public responsibility before he was fired. He didn't blame the tools.


Two senior journalists. Both had more knowledge of AI fabrication risks than almost anyone in their field. Both issued warnings. Both fell in.

This pattern has a name: the knowing-doing gap. It's the distance between understanding a risk and implementing a protection against it. The gap exists because knowing is easy and doing costs something — usually time and attention, occasionally pride.

Armin Ronacher, who built Flask and now works on the Rust tooling ecosystem, wrote this week about friction. Some friction exists for a reason, he argued. The desire to eliminate all friction — to go faster, automate more, remove every step that slows you down — is dangerous when the friction was the protection.

The verification step is friction. It is also the only thing that prevents the irresistible AI-generated quote from becoming published fact. Vandermeersch knew this. He eliminated the friction anyway.


What he had was advocacy. What he was missing was architecture.

Advocacy is the warning. It's the post that explains the risk, the memo that calls for human oversight, the public statement that this is something we need to take seriously. Advocacy is necessary. It is also not sufficient.

Architecture is the check. It's the step in the workflow where the quote gets verified before it goes to print. It doesn't require knowing that AI fabricates quotes. It requires doing a thing — calling the source, checking the transcript, asking "did this person actually say this" — regardless of how good the quote sounds.

Vandermeersch had one. He was missing the other. His advocacy was correct. His architecture wasn't there.
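
If the distinction sounds abstract, here is a minimal sketch of it in code. This is purely hypothetical, not anything either newsroom runs; every name in it is invented. The comment is the advocacy. The exception is the architecture: the gate fires whether or not the author remembers the risk, and it doesn't care how good the quote sounds.

```python
# Purely illustrative sketch; every name here is invented.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    speaker: str
    text: str
    # Advocacy is a style-guide reminder to fill this field in.
    # Architecture is refusing to publish while it is still None.
    verified_against: Optional[str] = None  # e.g. "transcript", "recording", "call with source"


class UnverifiedQuoteError(Exception):
    pass


def publish(quotes: list[Quote]) -> None:
    """Gate that runs on every draft, regardless of how the quotes were produced."""
    unverified = [q.speaker for q in quotes if q.verified_against is None]
    if unverified:
        raise UnverifiedQuoteError(
            "cannot publish; unverified quotes attributed to: " + ", ".join(unverified)
        )
    # ...hand off to the actual publishing step here...
```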


The recursive quality of both incidents is what makes them instructive.

Edwards was covering AI fabrication when his article contained AI fabrication. Vandermeersch was writing about journalism's obligations when his journalism failed those obligations. In both cases, the expertise wasn't the problem. The expertise was real. The awareness was accurate. What was missing was the concrete step — a workflow change, a verification habit, an architectural commitment that didn't depend on remembering to do it in the moment.

This is why the policy-versus-architecture distinction keeps reasserting itself. Anthropic had a Responsible Scaling Policy that described which thresholds required which responses. The RSP was dropped in February — not because anyone decided the thresholds were wrong, but because "binding commitments are harder to get right than I'd thought." The warning was there. The commitment was softened.

The Pentagon's argument against Anthropic's human-in-the-loop requirement was that US law already prohibits autonomous weapons. The law is advocacy. The contract clause Anthropic was asking for is the check. The Pentagon offered the warning; Anthropic wanted the verification step.


Vandermeersch's phrase is precise: "the necessary human oversight, which I consistently advocate."

Advocate. Not implement. Not enforce. Not make structurally mandatory.

It's not a cynical statement. He means it. The advocacy was sincere. The failure was specific — not in his beliefs but in the gap between his beliefs and the concrete action they require.

Armin Ronacher again: "There's a reason we have cooling-off periods for some important decisions." The cooling-off period is an architectural commitment. You don't decide to take a cooling-off period in the moment; you design the system so that the period is built in. Vandermeersch didn't design the verification into his workflow. He relied on remembering to do it when the AI-generated quote looked good.

It looked good. He didn't remember.


The warning wasn't wrong. That's what makes it hurt.

If the warning were wrong — if the risk were overstated, if AI fabrication were rare — then the failure would be forgivable as a misallocation of caution. But the warning was right. Vandermeersch knew exactly what would happen. He'd described it in detail. And it happened anyway, to him, because knowing the hazard doesn't mean you've built the thing that prevents it.

The verification step is boring. It's not the insight. It's not the warning. It's not the essay about what's at stake. It's checking whether the person actually said the thing the AI says they said. It takes time. It sometimes means the quote doesn't make it in.

That friction is the point.


One more thing from today's news, because it connects.

Iran fired two ballistic missiles at Diego Garcia — the US-UK base in the Indian Ocean, previously assumed to be beyond Iran's missile range. One failed mid-flight. One was intercepted. No hits.

But: the assumption of safety was wrong. The thing everyone was confident couldn't happen was attempted. The range assumption was the protection. Demonstrating the range destroys the assumption.

The warning about AI hallucinations was correct. Vandermeersch knew it. The verification step would have caught it. The verification step wasn't there.

Sometimes the missile gets through. Sometimes the assumption about range is the last thing standing between you and the impact.

The defenses held, this time. The check didn't.