Why I built SITU
The local AI coding agent category treats privacy as a configuration outcome, achieved through user discipline and the correct sequence of flags. For an individual contributor that is workable. For an organization where privacy is a hard requirement, user discipline is not a control. SITU is what I built to close that gap.
How the question arrived
Earlier this year I was advising the management team of a fintech client through their AI tooling decision. Their policy on source code was non-negotiable and predated the current AI cycle: IP-related code and documents must not touch cloud infrastructure under any circumstances. Their developers, predictably, wanted access to agentic coding tools. The brief from the board was straightforward in form and unsolved in substance: deliver the productivity gains, hold the line on IP.
That brief is not unique to one client. I have seen the same requirement come up at other companies a number of times. The board has read the headlines on recent leaks and incidents. Engineering has watched their peers move three months ahead on velocity. Legal and compliance have read enough vendor privacy pages to know the standard cloud answer does not survive contact with the relevant regulation. The decision arrives on the CTO's desk as a binary: trust the cloud offerings — a one-way move that is hard to undo once made — or hold the existing standard of security and privacy.
This is not a novel question. Every domain that has had to handle genuinely sensitive material — payment infrastructure, industrial control, defense networks, hardened operating systems — already worked through it. What the AI coding agent category is still treating as an open problem, the rest of the security profession internalized a long time ago.
What the market is offering, and what it is not
The local-agent ecosystem is unusual in how it frames its own value. Almost every project in the category leads with "local-first" or "private by default" in its README. On inspection, "local" means the model runs on the developer's machine. The agent itself, in most projects, runs as the host user with full network access, full filesystem access, and ambient credentials. The privacy claim depends on settings and soft configuration — a level of trust that is not equivalent to a production-hardened system, and that cannot be tested or audited deterministically.
That is a trust property, not a containment property. Procurement teams routinely conflate the two. Security teams who look carefully do not.
A structured evaluation
I evaluated the category against a single criterion: can the system be hardened at the OS level so that the agent cannot exfiltrate code, regardless of how it is configured or used? None of these projects are unserious, but their policy-based approach will not pass a security audit and does not deliver the confidence of a system built security-first. The honest answers were as follows.
Aider: runs as the host user with full network and filesystem access. The security model is trust in operator policy. That is a weak control: when privacy matters, anything that depends on policy can fail, and a single failure leaks.
Cline: the agent runs inside the IDE process. Network, filesystem, environment variables, SSH keys, and ambient credentials are all reachable. Per-step approval is the only check, a human-in-the-loop policy that is defeated the moment a user clicks through.
OpenHands: the most thoughtfully sandboxed of the comparison set, but its architecture assumes a network path between the agent container and the LLM provider. Because of that networking requirement, the deployment cannot reach a state where exfiltration is structurally impossible.
Goose: has a documented sandbox mode, but it gates the agent's behavior (which tools it may invoke), not the environment the agent runs in. The agent still runs as the host user. A behavior gate can be bypassed from inside the process it governs; isolation imposed on the environment by the kernel cannot.
OpenCode: a fast terminal UI with broad model support. Hardening is entirely a flag system: --allowedTools and --excludedTools constrain which tools the agent may invoke, but the agent itself runs as the host user with full filesystem and network reach. There is no environment isolation and no scoping of which directories the agent can read or modify. The privacy model is the same trust-in-policy posture as Aider: different surface, same weak control.
The pattern is consistent. Privacy in this category is a configuration outcome, not an architectural property. Adopting any of these tools is a risk decision; that risk can be measured and consciously accepted, but it has to be measured first.
The structural error
The category's structural error is locating the privacy boundary at the wrong layer. A network rule enforced inside the agent, even with prompt-injection detection, even with per-step approval, is a request the agent and its tooling may or may not comply with. A network rule enforced by the operating system, in the form of a process confined to a network namespace with no external interface, does not depend on the agent's cooperation. It is auditable from outside the agent and verifiable independently of anything the agent does.
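To make "auditable from outside the agent" concrete, here is a minimal sketch of such a check on Linux. It is not part of SITU; the function names are hypothetical. It relies on the fact that /proc/<pid>/net/dev lists the interfaces visible in that process's network namespace, so an auditor can read it for the agent's PID without asking the agent anything.

```python
from pathlib import Path


def external_interfaces(net_dev_text: str) -> list[str]:
    """Return interface names other than loopback from /proc/<pid>/net/dev text."""
    names = []
    # The first two lines are column headers; later lines look like "name: counters".
    for line in net_dev_text.splitlines()[2:]:
        name, sep, _counters = line.partition(":")
        if sep and name.strip() != "lo":
            names.append(name.strip())
    return names


def agent_is_contained(pid: int) -> bool:
    """True if the process's network namespace holds nothing but loopback."""
    # Hypothetical helper: reads the kernel's view, not the agent's own report.
    text = Path(f"/proc/{pid}/net/dev").read_text()
    return not external_interfaces(text)
```

This only inspects network interfaces. A full audit would also need to look at what is mounted into the environment, for example Unix sockets shared from the host.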
Other security-critical domains worked through this decades ago, usually after a painful incident or two. Payment processors stopped trusting process memory to hold cryptographic keys — keys live in tamper-resistant hardware security modules, because anything reachable from a process turns out, eventually, to be reachable from an attacker. Industrial control networks moved progressively toward air gaps and unidirectional gateways, after years of software-firewalled segregation between IT and OT proved insufficient. Hardened operating systems stopped trusting an untrusted process to confine itself — they impose mandatory access controls at the kernel layer (SELinux, AppArmor), specifically because a process is not in a position to enforce its own boundaries.
The pattern is the same in every case: move the control to a layer where the thing being controlled cannot reach the control. The coding-agent category has not yet applied that pattern to itself.
The kernel does not negotiate. This is not a clever insight; it is the same logic the rest of the security profession applied a long time ago, brought to a category that is still treating it as an open question.
What I built
SITU is the result of moving the privacy boundary from the agent down to the kernel.
The agent runs inside a Podman pod with --network=none as the default mode. The pod has no external network interface: not blocked, not firewalled, absent. Its network namespace contains only loopback, so no process inside the pod can open a connection to anything outside it; there is no interface and no route to carry the traffic. The agent can see only the directory mounted into the pod; it cannot reach the home directory, SSH keys, ambient credentials, or the broader filesystem. The session is ephemeral: the pod is destroyed on exit, with no persisted state.
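The offline launch described above can be sketched as a command line. This is an illustration under stated assumptions, not SITU's actual scripts: it uses a single `podman run` container rather than a pod for brevity, and the image name and the /workspace mount point are hypothetical. The flags `--rm`, `--network=none`, `--volume`, and `--workdir` are standard `podman run` options.

```python
from pathlib import Path


def offline_run_argv(project_dir: str,
                     image: str = "localhost/situ-agent:latest") -> list[str]:
    """Build a podman command for an ephemeral, network-less agent session."""
    workdir = Path(project_dir).resolve()
    return [
        "podman", "run",
        "--rm",                               # container is removed on exit
        "--network=none",                     # no interfaces beyond loopback
        "--volume", f"{workdir}:/workspace",  # the only host directory exposed
        "--workdir", "/workspace",
        image,                                # hypothetical image name
    ]
```

Passing the list to `subprocess.run(offline_run_argv("."), check=True)` would start the session. Building the argv in a pure function keeps the security-relevant flags in one place where a reviewer can read them.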
NETWORK mode, which lets the agent fetch information from the internet, is a separate, explicit switch with its own boundary. There is no intermediate state in which the agent has partial connectivity.
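One way to express "no intermediate state" in code is a closed two-value type that rejects everything else. The sketch below is an assumption about how such a switch could look, not SITU's implementation: the mode strings and the network name `situ-egress` are hypothetical, and the source does not say how NETWORK mode is configured.

```python
from enum import Enum


class Mode(Enum):
    OFFLINE = "offline"
    NETWORK = "network"


def parse_mode(text: str) -> Mode:
    """Accept exactly the two known modes; anything else is an error."""
    try:
        return Mode(text.strip().lower())
    except ValueError:
        raise ValueError(f"unknown mode {text!r}; expected 'offline' or 'network'")


def network_args(mode: Mode) -> list[str]:
    """Map a mode to one container network flag, with no partial setting."""
    if mode is Mode.OFFLINE:
        return ["--network=none"]
    # Hypothetical user-defined network for the explicit NETWORK mode.
    return ["--network=situ-egress"]
```

Because the mode is an enum rather than a set of independent flags, there is no combination of inputs that yields a half-connected session.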
The model runs on local hardware — CPU, NVIDIA CUDA, or Apple Silicon, whatever the developer has. The implementation is a structured framework, written to be audited: the shell layer, the container definitions, and the agent loop are all designed to be read. The whole stack is MIT-licensed, with no plan to put the safety guarantee behind a paywall.
Why publish this now
The regulatory frame in Europe is converging on this question. GDPR is the floor; the AI Act adds specific obligations for high-risk systems; DORA tightens the financial-sector posture; sector regulations in healthcare, defense, and legal practice are already in motion. The build-versus-buy decision for AI coding tools, inside any organization handling data that cannot leave its jurisdiction, has moved from "which vendor's privacy policy is acceptable" to "what is the long-term exposure here, and can today's choice come back to harm the organization later". SITU Agent is a contribution to that movement.
The source is on GitHub. The architectural choices are visible in the shell scripts and container definitions; nothing depends on trusting a description. If your organization is working through the same trade-off, I would like to compare notes.