This blog is about SITU Agent — an open-source local AI coding agent built for teams whose source code cannot leave their own hardware. Posts cover the reasoning and architectural decisions behind the project.
Getting the best out of a local AI coding agent on limited hardware is as much about configuration as it is about hardware. The right inference settings (whether in llama.cpp, Ollama, LM Studio, or vLLM) can mean the difference between 0.3 tok/s and 6 tok/s on the same machine, or between hitting an out-of-memory crash mid-task and running a 14B model comfortably inside 32 GB. This post documents a client engagement with a software company: rolling out SITU Agent as a local AI coding agent for their development team, then systematically tuning llama.cpp parameters against real coding benchmarks to extract maximum speed and output quality from the hardware the developers already had.
Two architectures have emerged for AI agent sandboxing. One gates what the agent is allowed to do. The other removes the capabilities the agent would need to exfiltrate data in the first place. The distinction is not subtle, and only one of them survives a determined attacker or an accidental misconfiguration.
The local AI coding agent category treats privacy as a configuration outcome, achieved through user discipline and the correct sequence of flags. For an individual contributor that is workable. For a team where privacy is a real concern, discipline and flags are not a control. SITU is what I built to close that gap.