I really don't understand why people have all these "lightweight" ways of sandboxing agents. In my view there are two models:
- totally unsandboxed but I supervise it in a tight loop (the window just stays open on a second monitor and it interrupts me every time it needs to call a tool).
- unsupervised in a VM in the cloud where the agent has root. (I give it a task, negotiate a plan, then close the tab and forget about it until I get a PR or a notification that it failed).
I want either full capabilities for the agent (at the cost of needing to supervise for safety) or full independence (at the cost of limited context in a VM). I don't see a productive way to mix and match here, seems you always get the worst of both worlds if you do that.
Maybe the usecase for this particular example is where you are supervising the agent but you're worried that apparently-safe tool calls are actually quietly leaving a secret that's in context? So it's not that it's a 'mixed' usecase but rather it's just increasing safety in the supervised case?
> unsupervised in a VM in the cloud where the agent has root
Why in the cloud and not in a local VM?
I've re-discovered Vagrant and have been using it exactly for this and it's surprisingly effective for my workflows.
https://blog.emilburzo.com/2026/01/running-claude-code-dange...
It's been ages since I used VirtualBox and reading the following didn't make me miss the experience at all:
> Eventually I found this GitHub issue. VirtualBox 7.2.4 shipped with a regression that causes high CPU usage on idle guests.
The list of viable hypervisors for running VMs with 3D acceleration is probably short but I'd hope there are more options these days for running headless VMs. Incus (on Linux hosts) and Lima come to mind and both are alternatives to Vagrant as well.
I totally understand, Vagrant and VirtualBox are quite a blast from the past for me as well. But besides the what-are-the-odds bug, it's been smooth sailing.
> VMs with 3D acceleration
I think we don't even need 3D acceleration since Vagrant is running the VMs headless anyways and just ssh-ing in.
> Incus (on Linux hosts)
That looks interesting, though from a quick search it doesn't seem to have a "Vagrantfile" equivalent (is that correct?), but I guess a good old shell script could replace that, even if imperative can be more annoying than declarative.
And since it seems to have a full-VM mode, docker would also work without exposing the host docker socket.
Thanks for the tip, it looks promising, I need to try it out!
> though from a quick search it doesn't seem to have a "Vagrantfile" equivalent (is that correct?)
It's just YAML config for the VM's resources:
https://linuxcontainers.org/incus/docs/main/howto/instances_...
https://linuxcontainers.org/incus/docs/main/explanation/inst...
And cloud-init for provisioning:
https://gitlab.oit.duke.edu/jnt6/incus-config/-/blob/main/co...
You mentioned "deleting the actual project, since the file sync is two-way", my solution (in agentastic.dev) was to fist copy the code with git-worktree, then share that with the container.
Yeah local is totally fine too just whatever is easiest to set up.
As someone that does this, it's Turtles All The Way Down [1]. Every layer has escapes. I require people to climb up multiple turtles thus breaking most skiddie [2] scripts. Attacks will have to targeted and custom crafted by people that can actually code thus reducing the amount of turds in the swimming pool I must avoid. People should not write apps that make assumptions around accessing sensitive files.
[1] - https://en.wikipedia.org/wiki/Turtles_all_the_way_down
It's turtles all the way down but there is a VERY big gap between VM Isolation Turtle and <a half-arse seccomp policy> turtle. It's a qualitative difference between those two sandboxes.
(If the VM is remote, even more so).
It’s a risk/convenience tradeoff. The biggest threat is Claude accidentally accesses and leaks your ssl keys, or gets prompt-hijacked to do the same. A simple sandbox fixes this.
There are theoretical risks of Claude getting fully owned and going rogue, and doing the iterative malicious work to escape a weaker sandbox, but it seems substantially less likely to me, and therefore perhaps not (currently) worth the extra work.
How does a simple sandbox fix this at all? If Claude has been prompt-hijacked you need a VM to be anywhere near safe.
Prompt-hijacking is unlikely. GP is most likely trying to prevent mistakes, not malicious behavior.
I was using opencode the other day. It took me a while to realize the that the agent couldn't read/write the .env file but didn't realize it. When I pushed it first it was able to create a temp file and copy it over .env AND write and opencode.json file that disables the .env protection and go wild.
Is there a premade VM image or docker container I can just start with for example Google Antigravity, Claude or Kilocode/vscode? Right now I have to install some linux desktop and all the tools needed, a bit of a pain IMO.
I see there are cloud VMs like at kilocode but they are kind if useless IMO. I can only interact with the prompt and not the code base directly. Too many things go wrong and maybe I also want kilo code to run a docker stack for me which it can't in the agent cloud.
I use https://jules.google.
The UI is obviously vibe-coded garbage but the underlying system works. And most of the time you don't have to open the UI after you've set it running you just comment on the Github PR.
This is clearly an unloved "lab" project that Google will most likely kill but to me the underlying product model is obviously the right one.
I assume Microsoft got this model right first with the "assign issue to Copilot" thing and then fumbled it by being Microsoft. So whoever eventually turns this <correct product model> into an <actual product that doesn't suck> should win big IMO.
Locally, I'd use Vagrant with a provisioning script that installs whatever you need on top of one of the prebuilt Vagrant boxes. You can then snapshot that if you want and turn that into a base image for subsequent containers.
> [...] and maybe I also want kilo code to run a docker stack for me which it can't in the agent cloud
Yes! I'm surprised more people do not want this capability. Check out my comment above, I think Vagrant might also be what you want.
fly.io launched something like that recently:
Just got started with Claude Code the other day, using the dev container CLI. It's super easy.
TLDR:
- Ensure that you have installed npm on your machine.
- Install the dev container CLI globally via npm: `npm i -g @devcontainers/cli`
- Clone the Claude Code repo: https://github.com/anthropics/claude-code
- Navigate into the root directory of that repo.
- Run the dev container CLI command to start the container: `devcontainer --workspace-folder . up`
- Run another dev container command to start Claude in the container: `devcontainer exec --workspace-folder . claude`
And there you go! You have a sandboxed environment for Claude to work in. (As sandboxed as Docker is, at least.)
I like this method because you can just manage it like any other Docker container/volumes. When you want to rebuild it, or reset the volume, you just use the appropriate Docker (and the occasional dev container) commands.
I guess whether container isolation is good enough just comes down to the threat you're protecting against:
- confused/misaligned agent: probably good enough (as of Q1 2026...).
- hijacked agent: definitely not good enough.
But also it's kinda weird that we still have high-level interfaces that force you to care this much about the type of virtualization it's giving you. We probably need to be moving more towards stuff like Incus here that treats VMs and system containers basically as variants of the same thing that you can manage at a higher level of abstraction. (I think k8s can be like that too).