Cursor has introduced cross-platform agent sandboxing, an effort to bring its security features in line with the competition.
The pitch is simple and developer-centric: run agents freely inside a controlled environment, and only prompt for approval when they need to step outside that boundary—most commonly for network access. Cursor says this approach cuts “approval fatigue,” and that sandboxed agents stop 40% less often than unsandboxed ones, translating into fewer interruptions and less manual review.
Much like the sandboxes in Claude Code and Codex, Cursor's sandbox exposes a uniform API built on each OS’s available primitives.
macOS: Seatbelt via sandbox-exec, with dynamic policies
On macOS, Cursor evaluated App Sandbox, containers, virtual machines, and Seatbelt. App Sandbox was ruled out due to the requirement to sign every binary an agent might execute (and the risk of letting agent-created binaries inherit that trust). Containers would constrain execution to Linux binaries, and VMs carried unacceptable startup and memory overhead.
Cursor landed on Seatbelt, accessed via sandbox-exec. Seatbelt was introduced in 2007 and deprecated in 2016, but Cursor notes it’s still used by third-party apps like Chrome. The key capability: run a command under a sandbox profile that constrains an entire subprocess tree, with fine-grained permissions restricting syscalls and file reads/writes.
Notably, Cursor generates the policy at runtime based on workspace-level/admin-level settings and the user’s .cursorignore, including deny rules for sensitive project files and directories.
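To make the runtime policy idea concrete, here is a minimal sketch of generating a Seatbelt profile string from a workspace path and a list of ignored paths. It is not Cursor's actual policy: the function names, the split into denied_dirs and denied_files, and the specific rules are assumptions for illustration, and the profile language itself is not officially documented by Apple, so the operation names follow commonly seen profiles.

```python
from pathlib import PurePosixPath


def sbpl_quote(path: str) -> str:
    """Escape a path for use inside a profile string literal."""
    return '"' + path.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_profile(workspace, denied_dirs, denied_files, allow_network=False):
    """Build a Seatbelt profile string (hypothetical rule set, for illustration).

    workspace     absolute path the agent may write to
    denied_dirs   workspace-relative directories to hide (e.g. from .cursorignore)
    denied_files  workspace-relative files to hide
    """
    root = PurePosixPath(workspace)
    lines = [
        "(version 1)",
        "(allow default)",
        # Assumption: block writes everywhere, then re-allow the workspace.
        "(deny file-write*)",
        f"(allow file-write* (subpath {sbpl_quote(str(root))}))",
    ]
    if not allow_network:
        lines.append("(deny network*)")
    # Deny rules for ignored paths come last so they override the allows above.
    for rel in denied_dirs:
        target = sbpl_quote(str(root / rel))
        lines.append(f"(deny file-read* file-write* (subpath {target}))")
    for rel in denied_files:
        target = sbpl_quote(str(root / rel))
        lines.append(f"(deny file-read* file-write* (literal {target}))")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_profile("/work/app", ["secrets"], [".env"]))
```

On macOS, a profile string like this can be passed to sandbox-exec with the -p option, followed by the command to run. A real policy would also need to allow writes to temporary directories and device files that ordinary tools expect.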
Linux: seccomp + Landlock, plus overlay tricks for .cursorignore
Linux provides strong primitives (seccomp for blocking unsafe syscalls and Landlock for filesystem access control), but Cursor points out that composing them into a developer-friendly sandbox is left to userspace. Existing projects didn’t meet its needs, particularly around honoring .cursorignore.
Cursor’s approach uses seccomp to block syscalls and Landlock to enforce filesystem restrictions so ignored files become inaccessible. To make that work, Cursor maps workspaces into an overlay filesystem, then overwrites ignored files with Landlocked copies that can’t be read or modified.
The tradeoff: scanning and remounting ignored files is described as the slowest part, and Cursor notes that Linux doesn’t make it easy to lazily filter filesystem operations because file paths aren’t easily available in a seccomp-bpf context.
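The scanning step can be pictured as an up-front walk of the workspace that collects every path to be masked before the sandbox starts. The sketch below is an assumption about the general shape of such a scan, not Cursor's implementation: find_ignored and matches are hypothetical names, and the matching uses Python's fnmatch, which is a simplification of real ignore-file semantics.

```python
import fnmatch
import os


def matches(rel_path, patterns):
    """True if the relative path or its final component matches any pattern."""
    name = rel_path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatch(rel_path, p) or fnmatch.fnmatch(name, p) for p in patterns
    )


def find_ignored(workspace, patterns):
    """Return sorted workspace-relative paths that would need to be masked.

    Ignored directories are reported once (with a trailing slash) and not
    descended into, since masking the directory covers its contents.
    """
    hits = []
    for root, dirs, files in os.walk(workspace):
        rel_root = os.path.relpath(root, workspace)

        def rel(name):
            joined = os.path.normpath(os.path.join(rel_root, name))
            return joined.replace(os.sep, "/")

        kept = []
        for d in dirs:
            if matches(rel(d), patterns):
                hits.append(rel(d) + "/")
            else:
                kept.append(d)
        dirs[:] = kept  # prune ignored directories from the walk

        for f in files:
            if matches(rel(f), patterns):
                hits.append(rel(f))
    return sorted(hits)
```

Because the cost grows with the number of files in the workspace, a full walk like this is consistent with the article's point that the scan is the slow part and that lazy, per-access filtering would be preferable if the kernel interfaces made it practical.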
Windows: WSL2 today, native primitives later
On Windows, Cursor runs its Linux sandbox inside WSL2. The company describes an equivalent native Windows sandbox as significantly harder, since existing primitives are often browser-oriented and don’t fit general-purpose developer tools. Cursor says it’s working with Microsoft so the necessary primitives become available.
Making agents “sandbox-aware”
A sandbox only helps if the agent can reason about it. Cursor describes updates to its agent harness so models better anticipate what will work under constraints:
- Updated Shell tool descriptions to clarify whether commands have filesystem, git, or network access depending on settings, plus how to request elevated permissions.
- Changes to how Shell tool results are rendered to explicitly surface the sandbox constraint behind a failure—and in some cases recommend permission escalation.
Cursor evaluated the changes using an internal benchmark (Cursor Bench) and identified a common failure mode: agents repeatedly retrying the same command without changing permissions. The improved feedback loop reportedly helped agents recover more gracefully, with offline eval performance improving.
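One way to picture the improved feedback loop is a renderer that attaches the sandbox constraint to a failed command's output and flags an unchanged retry. This is a hedged sketch only: ShellResult, its blocked_by field, and the message wording are hypothetical and are not taken from Cursor's harness.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShellResult:
    command: str
    exit_code: int
    stderr: str = ""
    # Hypothetical field: which sandbox restriction caused the failure,
    # e.g. "network", "filesystem", or "git"; None if not sandbox-related.
    blocked_by: Optional[str] = None


def render_result(result: ShellResult, blocked_history: List[str]) -> str:
    """Render a tool result for the model, surfacing sandbox constraints.

    blocked_history holds commands already blocked by the sandbox in this
    session; it is updated in place when the current command is blocked.
    """
    lines = [f"$ {result.command}", f"exit code: {result.exit_code}"]
    if result.stderr.strip():
        lines.append(f"stderr: {result.stderr.strip()}")

    if result.exit_code != 0 and result.blocked_by:
        lines.append(f"sandbox: blocked by the {result.blocked_by} restriction")
        lines.append(f"hint: request {result.blocked_by} permission before retrying")
        if result.command in blocked_history:
            # The failure mode described above: same command, same permissions.
            lines.append("note: this exact command was already blocked; "
                         "retrying it unchanged will fail again")
        else:
            blocked_history.append(result.command)
    return "\n".join(lines)
```

Passing the same blocked command through render_result twice with a shared history list produces the extra note on the second call, which gives the model an explicit signal to change permissions rather than retry.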
Rollout and what’s next
Cursor says it rolled sandboxing out over the last three months across macOS, Linux, and Windows, and now sees about one-third of requests on supported platforms running with the sandbox. The post also notes enterprise adoption, including NVIDIA.
Looking forward, Cursor points to “sandbox-native” agents trained on environmental constraints, with the idea that these models could be given more freedom to write scripts and programs directly rather than relying as heavily on tool-calling.
Original source: https://cursor.com/blog/agent-sandboxing
