Vercel’s latest write-up on security boundaries in agentic architectures zeroes in on an uncomfortable default in AI-assisted coding: many agents execute generated code in the same security context that holds real credentials. As “coding agent” patterns spread beyond IDE workflows into support, ops, and internal tooling, that shared trust domain starts to look less like a convenience and more like a structural risk.
The failure mode: prompt injection meets code execution
The piece opens with a simple scenario that’s easy to recognize in modern debugging setups: an agent reads production logs, encounters prompt injection hidden in the log text, and then obediently generates and runs code. In Vercel’s example, the injected instructions push the agent to exfiltrate sensitive files like ~/.ssh/id_rsa and ~/.aws/credentials to an external endpoint.
The point isn’t that logs are uniquely dangerous—it’s that any untrusted text input becomes a potential control plane once an agent can both (a) be influenced by that text and (b) execute arbitrary code with meaningful access.
A useful mental model: four actors, four trust levels
Rather than treating “the agent” as a single blob of functionality, Vercel splits an agentic system into four distinct actors:
- Agent: the LLM-driven runtime, subject to prompt injection and unpredictable behavior
- Agent secrets: tokens, SSH keys, database credentials—necessary, but high-value
- Generated code execution: the agent’s produced scripts/programs, effectively untrusted
- Filesystem/environment: whatever compute the system runs on (laptop, VM, cluster)
This framing sets up the core question: where should hard security boundaries live, instead of relying on default tooling that tends to collapse everything into one context?
Architectures Vercel sees in the wild (and what they miss)
The write-up walks through patterns from least to most secure, starting with the common reality: “zero boundaries,” where agent, secrets, filesystem, and generated code all coexist with the same level of access.
From there, it highlights a secret injection proxy approach that brokers credentials at the network layer so generated code doesn’t see raw secret values—helpful for preventing straightforward exfiltration, but still allowing unexpected calls “while running.”
Vercel also calls out why a shared sandbox (agent + generated code together) is an incomplete fix: it can protect the outside environment, but it doesn’t prevent generated code from targeting whatever the harness can access inside that sandbox.
The more interesting shift is separating agent compute from sandbox compute, placing the harness (and its secrets) in one context and running generated code in another, with no path back to secrets. The “strongest” architecture in the post combines that separation with secret injection so generated programs can use credentials without reading or exporting them.
For the full breakdown (including Vercel’s specific implementation notes and links), the original post is here: https://vercel.com/blog/security-boundaries-in-agentic-architectures?
