Skip to content

OpenClaw Sandbox Mode Explained: off, non-main, all

Burak Sonmez

Burak Sonmez · Software engineer writing about security & ops

May 06, 2026

8 min read––– views

OpenClaw Sandbox Mode Explained: off, non-main, all

OpenClaw's sandbox.mode setting is the single switch that decides how much damage a compromised subagent can do — off means full host access, non-main sandboxes only subagents (the production default), and all sandboxes everything including the main agent. Pick non-main unless you have a specific reason not to: it's the only mode that defends against prompt injection in subagent tasks without breaking your main workflow.

This post is the deep-dive companion to the full hardening guide. The pillar covers the whole hardening checklist; this one focuses on the sandbox, which is the layer that contains damage when something does go wrong upstream. Defaults and JSON paths come from the OpenClaw docs.

The three sandbox modes at a glance

ModeScopeWhen to use
offNothing sandboxedLocal development on a throwaway VM, never production
non-mainSubagents onlyRecommended default for any setup that runs subagents
allMain agent + subagentsHigh-paranoia setups where main agent also handles untrusted input

The setting lives at agents.defaults.sandbox.mode in ~/.openclaw/openclaw.json:

{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main"
      }
    }
  }
}

A few things worth understanding before we go deeper:

  • The main agent is the one orchestrating. It reads your prompts, plans tasks, and decides what to delegate. It runs as the bot user on the host machine — typically openclaw if you followed the non-root setup pattern.
  • Subagents are workers. When the main agent decides "I need to fetch this URL and analyze the page", it spawns a subagent. Subagents are short-lived and handle one well-defined task.
  • The risk profile is different for each. The main agent operates on local context you authored. Subagents operate on whatever they were sent to fetch — web pages, repos, files of unknown provenance. That asymmetry is what makes non-main the right default.

Why non-main is the production default

Almost every realistic prompt-injection vector flows through subagents:

  • A web page contains hidden instructions in white-on-white text. A subagent fetches it for "research" and treats the hidden text as authoritative.
  • A README in a third-party repo contains a hostile instruction block. A code-review subagent reads it and acts on it.
  • An open issue on a tracker has a poisoned comment. A triage subagent processes the thread and gets misled.
  • A document the user uploaded has injection in a comment field. A summarization subagent picks it up.

In every case, the attack surface is external content the subagent was asked to consume. The main agent isn't exposed to this directly — it sees the subagent's output, not the raw poisoned content. Putting the sandbox boundary around subagents puts it exactly where the risk lives.

mode: all adds main-agent isolation on top, but it costs convenience. The main agent loses easy access to your local files, your shell history, your previous chat context stored on disk. Some of that's recoverable through volume mounts and explicit sharing, but you spend a lot of operational complexity for a marginal increase in security against a much rarer threat.

I run non-main on every OpenClaw instance I operate. The only time I'd consider all is if my main agent was also processing untrusted documents (e.g., a customer-facing intake bot) — but that's a different deployment shape than personal-use OpenClaw.

mode: off shouldn't ship to anything you care about. Even on a development machine, it teaches you bad habits — you stop thinking about the subagent boundary. Run sandboxed even in dev so the production setup matches.

workspaceAccess: none / ro / rw and when to use each

Once a subagent is sandboxed, the next decision is how much filesystem the sandbox sees. That's agents.defaults.sandbox.workspaceAccess:

{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main",
        "workspaceAccess": "rw"
      }
    }
  }
}

The three values:

  • none — the sandbox starts with an empty filesystem. The subagent can run code that doesn't need any files (e.g., HTTP requests, calculations, single-shot analysis of in-memory data). Best for fully untrusted tasks: web browsing, scraping, parsing remote content. If the subagent is compromised, there's nothing for it to read or write.
  • ro — the workspace directory is mounted read-only. The subagent can read project files, run linters, generate analysis reports. It cannot modify anything. Best for code review, doc generation, search-and-summarize.
  • rw — the workspace is mounted read-write. The subagent can create files, edit existing ones, run builds. Required for coding tasks: writing new modules, applying refactors, generating tests.

Pick the lowest level that lets the task complete. The defaults set in ~/.openclaw/openclaw.json apply to all subagents unless an individual task overrides them. A task-specific override looks like:

{
  "agents": {
    "tasks": {
      "web-research": {
        "sandbox": { "workspaceAccess": "none" }
      },
      "code-review": {
        "sandbox": { "workspaceAccess": "ro" }
      }
    }
  }
}

Per-task overrides are the right place for stricter rules. The default of rw is a reasonable starting point because most personal-use tasks involve coding — but if you have a task type that handles untrusted input, override it down.

Network and capability isolation inside the sandbox

workspaceAccess controls filesystem. Two more settings control everything else: network and Linux capabilities.

{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main",
        "workspaceAccess": "rw",
        "scope": "session",
        "docker": {
          "readOnlyRoot": false,
          "network": "bridge",
          "user": "1000:1000",
          "capDrop": ["ALL"]
        }
      }
    }
  }
}

The Docker block tunes the container itself:

  • networknone air-gaps the container completely (no internet at all). bridge allows network access for tasks that need npm, git, or external APIs. host shares the host's network stack — strongest performance, weakest isolation, only use if absolutely needed. bridge is the right default; flip to none per-task for fully untrusted work like browsing.
  • user: "1000:1000" — runs the sandbox as a non-root user inside the container. Even without dropping capabilities, a non-root user inside a container is much harder to escalate from than a root one. The specific UID doesn't matter much; just don't leave it as root (UID 0).
  • capDrop: ["ALL"] — strips every Linux capability the container could otherwise have. By default, Docker containers run with about a dozen capabilities enabled (CHOWN, NET_BIND_SERVICE, SETUID, etc.). capDrop: ["ALL"] removes them all. Combined with the non-root user above, the container becomes effectively unprivileged: it cannot mount filesystems, change network interfaces, or escalate privileges. If a subagent gets compromised, this is the wall it hits.
  • readOnlyRoot: false — leaves the container's root filesystem writable. Set to true if your tasks don't need to write outside the workspace mount; it adds an extra layer of immutability. Most personal-use setups leave it false because npm and similar tools need to write to /tmp or system paths.
  • scope: "session" — the container lifecycle. session means each chat session gets a fresh container; tasks within the session share state. task would isolate per task (more secure, more overhead). Session scope is the right default for personal use.

These settings layer with the gateway hardening from the full gateway security guide: network controls what the bot can be reached on; sandbox controls what each subagent can do once it's running.

Verifying the sandbox is actually working

Configuration is half the story. The other half is checking that the runtime actually applies what you wrote.

1. Confirm Docker is the active sandbox engine.

openclaw doctor

doctor reports the configured sandbox mode, the active engine (Docker, Podman, or none), and any misconfigurations. Look for a line like sandbox.engine: docker (running). If doctor reports sandbox.engine: none while mode is set to non-main, the sandbox isn't actually running and your settings are inert.

2. Watch a subagent container come up live.

In one terminal:

docker ps -a --filter 'label=app=openclaw' --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'

In another, ask OpenClaw to do something that spawns a subagent (e.g., "browse this URL and summarize"). A new container should appear in the first terminal, run, and exit when the task completes. If no container appears, the sandbox isn't engaging.

3. Confirm capabilities are actually dropped.

When a subagent container is running:

docker inspect $(docker ps -q --filter 'label=app=openclaw' --latest) \
  --format '{{.HostConfig.CapDrop}} {{.HostConfig.CapAdd}} {{.Config.User}}'

You should see something like [ALL] [] 1000:1000. [ALL] in CapDrop means every capability is dropped; an empty CapAdd means none have been re-added; 1000:1000 is the non-root user. If you see [] for CapDrop or root for User, the runtime config doesn't match your ~/.openclaw/openclaw.json — restart the OpenClaw service and re-check.

If those three checks all return what they should, the sandbox is doing its job: subagents run isolated, with no filesystem they can write outside the workspace mount, no Linux capabilities to escalate with, and a non-root user that can't pretend to be root. That's the wall a compromised subagent hits — and from a hardening standpoint, it's the most important wall in the whole system.