I am still on a quest to stay out of the loop with coding agents, to reach warp speed yoloness. So I am obsessing over sandboxes.

Can I put an agent in a box, give it a task and go to sleep? There are tons of solutions right now, but it’s hard to tell which is the right approach.

The reason is that sandboxing agents isn’t one problem but at least two. A local sandbox on a developer machine and a remote multi-tenant sandbox serve different threat models, require different controls, and fail in different ways. Treating them as the same leads to the wrong tradeoffs.

(For the underlying risk model this builds on, see How I Think About Agentic Risks.)

Two sandboxes, two threat models

Local sandboxes

A local sandbox constrains an agent running on a developer machine. The primary risks are:

The attacker model here is not a hostile tenant but a confused deputy: an agent steered off course by prompt injection, poisoned context from an MCP server, or plain hallucination. The agent has no malicious intent. It just can’t distinguish trusted instructions from untrusted input, and it has access to everything on your machine.

Remote sandboxes (multi-tenant)

A remote sandbox runs many workloads on shared infrastructure. The risks expand:

Here we must assume adversarial workloads. The isolation boundary is foundational: if it fails, the blast radius is platform-wide.

Local sandboxes: policy-centric controls

A shortcut that’s held up: local sandboxing is primarily a policy problem. You’re not defending against escape attempts. You’re constraining a well-intentioned but unreliable agent on a machine full of valuable stuff.

The relevant risk amplifiers are capabilities (what tools the agent can invoke), data access (what secrets and context it can see), and untrusted input (prompt injection, poisoned data). These are the knobs we can try to turn when configuring the agent.

The most effective controls:

Every one of these controls has a friction cost: a sandbox that’s too annoying gets disabled, which is worse than no sandbox because it creates a false sense of security. Local sandboxes must be low-friction by default, with the option to tighten, not the other way around.
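To make the three knobs concrete, here is a minimal sketch of a local policy check. Everything in it is hypothetical: the tool names, the denied paths, the `tainted` flag and the `decide` function are my own illustration, not the API of any particular agent or sandbox. The point is the shape: hard-deny a short list of secrets, and fall back to asking the user rather than blocking, so the default stays low-friction.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class LocalPolicy:
    # Capabilities: tools the agent may invoke without asking (hypothetical names).
    allowed_tools: frozenset = frozenset({"read_file", "write_file", "run_tests"})
    # Data access: directories the agent must never touch (hypothetical paths).
    denied_paths: tuple = ("/home/dev/.ssh", "/home/dev/.aws")
    # Untrusted input: once the agent has read untrusted content, ask before acting.
    confirm_after_untrusted: bool = True


def decide(policy, tool, path=None, tainted=False):
    """Return 'allow', 'ask', or 'deny' for one proposed tool call."""
    if path is not None:
        # Collapse '..' segments so the check cannot be sidestepped lexically.
        # Symlinks are NOT resolved here; a real implementation would need to.
        p = PurePosixPath(posixpath.normpath(path))
        for denied in policy.denied_paths:
            d = PurePosixPath(denied)
            if p == d or d in p.parents:
                return "deny"
    if tool not in policy.allowed_tools:
        return "ask"  # unknown capability: escalate to the human, don't silently block
    if tainted and policy.confirm_after_untrusted:
        return "ask"  # context may be poisoned: get confirmation first
    return "allow"


policy = LocalPolicy()
print(decide(policy, "read_file", "/home/dev/project/main.py"))
print(decide(policy, "read_file", "/home/dev/project/../.ssh/id_rsa"))
print(decide(policy, "curl"))
```

Tightening is then a matter of shrinking `allowed_tools` or growing `denied_paths`, without changing the default experience.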

Remote sandboxes: boundary-centric controls

Remote sandboxing is a different problem. We’re not managing a single user’s convenience. We are defending shared infrastructure against workloads we don’t control.

The risk amplifiers that matter most here are isolation boundary quality (the escape surface and blast radius if it fails), egress topology (whether the agent can freely phone home or whether egress is mediated), and platform abuse (CPU/RAM/disk exhaustion, runaway LLM calls, using your infra as a launchpad for scanning, spam, or scraping). Platform abuse deserves explicit attention because it’s the risk that scales with tenancy. One rogue agent is a nuisance; a thousand is a serious incident.

The most effective controls:

Remote sandboxing is less about preventing a single bad tool call and more about ensuring bad behavior cannot become a systemic incident.
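One way to picture "cannot become a systemic incident" is a per-tenant budget enforced outside the workload. The sketch below is an assumption-laden illustration: the class names, the two metered resources (LLM calls and CPU seconds) and the limits are all made up, and a real platform would enforce CPU/RAM/disk at the hypervisor or cgroup level rather than in application code. What it shows is the accounting idea: each tenant burns its own budget, and exhausting it affects nobody else.

```python
from dataclasses import dataclass


@dataclass
class TenantBudget:
    max_llm_calls: int
    max_cpu_seconds: float
    llm_calls: int = 0
    cpu_seconds: float = 0.0

    def charge(self, llm_calls=0, cpu_seconds=0.0):
        """Record usage; return False and record nothing if it would exceed the budget."""
        if (self.llm_calls + llm_calls > self.max_llm_calls
                or self.cpu_seconds + cpu_seconds > self.max_cpu_seconds):
            return False
        self.llm_calls += llm_calls
        self.cpu_seconds += cpu_seconds
        return True


class QuotaEnforcer:
    """Lives in the platform, not in the sandbox, so a workload cannot reset it."""

    def __init__(self, max_llm_calls, max_cpu_seconds):
        self._limits = (max_llm_calls, max_cpu_seconds)
        self._budgets = {}

    def charge(self, tenant_id, llm_calls=0, cpu_seconds=0.0):
        # Each tenant gets an independent budget on first use.
        if tenant_id not in self._budgets:
            self._budgets[tenant_id] = TenantBudget(*self._limits)
        return self._budgets[tenant_id].charge(llm_calls, cpu_seconds)


enforcer = QuotaEnforcer(max_llm_calls=2, max_cpu_seconds=60.0)
for _ in range(3):
    print("tenant-a:", enforcer.charge("tenant-a", llm_calls=1))
print("tenant-b:", enforcer.charge("tenant-b", llm_calls=1))
```

The runaway tenant gets cut off on its third call while the other tenant is unaffected, which is the property that matters when one rogue agent becomes a thousand.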

The control plane becomes the perimeter

Once you adopt “no secrets in the sandbox,” you implicitly create a control plane that sits outside the sandbox boundary:

This is the architectural consequence most teams don’t anticipate. Sandboxing forces you into a broker model whether you planned for one or not. The sandbox becomes constrained and disposable while the control plane becomes durable and high-value. Your security investment shifts accordingly. The control plane is now the thing worth defending, not the sandbox itself.

A good example of this is what Browser Use documents here.

This pattern mirrors what LLM gateways already promise to do for model access: mediate, log, enforce policy, and keep credentials out of the hot path. In an agentic architecture the control plane extends that pattern to tools, storage, and network access. Same principle: put a policy-aware broker between the untrusted component and everything it shouldn’t touch directly.
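Here is a rough sketch of what that broker could look like in code. It is not how Browser Use or any specific gateway implements it; the `Broker` class, the `service.verb` action naming and the stubbed upstream call are assumptions for illustration. The sandbox only ever sends a named action plus parameters, and the broker is the only place where a credential exists.

```python
class Broker:
    """Runs outside the sandbox boundary; the sandbox never sees self._secrets."""

    def __init__(self, secrets, allowed_actions):
        self._secrets = dict(secrets)            # e.g. {"github": "<token>"}
        self._allowed = frozenset(allowed_actions)
        self.audit_log = []

    def handle(self, sandbox_id, action, params):
        allowed = action in self._allowed
        # Log every request, including denied ones.
        self.audit_log.append(
            {"sandbox": sandbox_id, "action": action, "allowed": allowed}
        )
        # Enforce policy before any credential is touched.
        if not allowed:
            return {"ok": False, "error": "action not permitted"}
        service = action.split(".", 1)[0]
        token = self._secrets[service]
        return {"ok": True, "result": self._call_upstream(action, params, token)}

    def _call_upstream(self, action, params, token):
        # Stand-in for a real authenticated API call. The token is consumed here
        # and deliberately left out of the value returned to the sandbox.
        return {"action": action, "repo": params["repo"], "authenticated": bool(token)}


broker = Broker({"github": "s3cret-token"}, {"github.open_pr"})
print(broker.handle("sbx-1", "github.open_pr", {"repo": "demo"}))
print(broker.handle("sbx-1", "github.delete_repo", {"repo": "demo"}))
print(broker.audit_log)
```

Note where the value sits: the sandbox side of this exchange is disposable, while the broker holds the secrets, the policy and the audit trail.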

Sandboxing here is not only isolation but also deciding where authority lives.