First Steps

Configuration

SITU reads its settings from situ.conf, located in the same directory as the launch script (~/.situ/situ.conf). All parameters are optional except MODEL.

The configuration file

Open ~/.situ/situ.conf in any text editor. Each setting is a simple KEY=VALUE line.
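
As a minimal sketch of the format, the file below sets only the required parameter and leaves everything else at its default. It assumes the model file named here (the example filename from the `MODEL` row of the parameter table) has already been placed in ~/.situ/models/.

```conf
# ~/.situ/situ.conf
# MODEL is the only required setting; the filename is illustrative.
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
```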

Parameter	Allowed values	Default	Description
`CONTAINER_ENGINE`	`podman` or `docker`	`podman`	The container engine SITU uses to create and manage containers. Set to `docker` if you are running Docker instead of Podman.
`MODE`	`RESTRICTED` or `NETWORK`	`RESTRICTED`	Controls external network access. `RESTRICTED` creates an internal container network with no external routes; `NETWORK` allows external connections. See Restricted Mode and Network Mode.
`MOUNTPOINT`	Absolute directory path	Current working directory	The host directory mounted into the SITU container as the workspace. Set this to an encrypted volume or a project root to control exactly what the agent can read and write.
`LLAMA_IMAGE`	Container image reference	`ghcr.io/ggml-org/llama.cpp:server` (CPU)	The llama.cpp server image used as the model sidecar. Switch to a CUDA variant (e.g. `ghcr.io/ggml-org/llama.cpp:server-cuda`) to use GPU acceleration. Ignored when `LM_HOST` is set.
`MODEL`	GGUF filename	none — required	Path to the model file relative to the `~/.situ/models/` directory. The file must exist before starting the agent. Example: `gemma-4-E4B-it-Q4_K_M.gguf`.
`CTX_SIZE`	Integer (tokens)	`0`	Context window size passed to the llama.cpp server. `0` automatically uses the model's own training context size, which is the recommended default. Set an explicit value to cap memory use on hardware-constrained machines or to extend context beyond the training limit with RoPE scaling.
`TEMPERATURE`	Float	`0.1`	Sampling temperature passed to the llama.cpp sidecar. Lower values produce more deterministic output; higher values increase creativity. Only applies to the local sidecar — ignored when `LM_HOST` is set.
`MAX_TOKENS`	Integer (tokens)	`16384`	Per-call generation budget for the agent. Caps how many tokens the model may produce in a single LLM call, including any reasoning tokens. Increase for tasks that require very long single-step outputs.
`REASONING`	`true` or `false`	`false`	Enables the model's extended thinking (reasoning) mode. When `true`, the agent requests a reasoning trace before each response, which improves output quality at the cost of additional tokens. The default of `false` disables reasoning and keeps token usage lower.
`REASONING_BUDGET_MAXPERCENT`	Integer (%)	`25`	Caps the server-side thinking budget at this percentage of `MAX_TOKENS`. At the default of 25% with `MAX_TOKENS=16384`, the model may use up to 4096 tokens for reasoning before being guided to produce its answer. Set to `0` to disable thinking at the server level (also forced when `REASONING=false`). Set to `-1` for no limit.
`REASONING_BUDGET_MESSAGE`	String	`Let me now write the solution.`	Message injected by the llama.cpp server immediately before the `</think>` tag when the reasoning budget is exhausted. This guides the model to transition cleanly from reasoning to its answer rather than being cut off mid-thought.
`PARALLEL`	Integer	`1`	Number of parallel request slots in the llama.cpp server. The default of `1` is correct for single-user use and minimises KV cache memory. Increase this only when multiple SITU pods share a single `llamaservice` instance.
`LMS_READY_TIMEOUT`	Integer (seconds)	`180`	How long SITU waits for the llama.cpp server to finish loading the model before aborting. Increase this for very large models on slow storage.
`LM_HOST`	Hostname or IP address	none — pod sidecar is used	Connect to an existing LM server on the network instead of spinning up the built-in llama.cpp sidecar. Requires `MODE=NETWORK`. Useful when a more powerful machine on the local network runs the model.
`LM_PORT`	Integer (port number)	`8080`	Port of the external LM server. Only relevant when `LM_HOST` is set.
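
The two sketches below show how several parameters from the table might be combined. The values are illustrative assumptions rather than recommendations, and the IP address is hypothetical. Comments are kept on their own lines, since this page only describes whole-line comments.

The first variant keeps the built-in sidecar and enables reasoning with a larger budget:

```conf
# Local sidecar with reasoning enabled.
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
MAX_TOKENS=16384
REASONING=true
# 50% of MAX_TOKENS: up to 8192 reasoning tokens per call.
REASONING_BUDGET_MAXPERCENT=50
# Explicit context window to bound memory use (value is illustrative).
CTX_SIZE=32768
```

The second variant points SITU at an LM server elsewhere on the local network:

```conf
# LM_HOST requires MODE=NETWORK.
MODE=NETWORK
# Hypothetical address of the machine running the model.
LM_HOST=192.168.1.50
LM_PORT=8080
# MODEL is kept because it is listed as the only required parameter.
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
```

With `LM_HOST` set, `LLAMA_IMAGE` and `TEMPERATURE` are ignored, so they are left out of the second variant.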

Lines beginning with # are comments and have no effect. The file ships with every parameter commented out and its default shown in the comment, so a setting only takes effect once the leading # is removed from its line.
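
As a hedged illustration of an edited file: the exact layout of the shipped file is an assumption here, but the pattern is that defaults stay commented out while active lines carry no leading #.

```conf
# Defaults left commented out (inactive):
#CONTAINER_ENGINE=podman
#MODE=RESTRICTED
#TEMPERATURE=0.1
# Active settings (leading # removed):
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
LMS_READY_TIMEOUT=300
```

Here `LMS_READY_TIMEOUT=300` raises the model-load wait above the default of 180 seconds; the value is only an example.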

Related

Command Line Parameters: per-session overrides for any situ.conf value.
Restricted Mode: what MODE=RESTRICTED enforces at the kernel level.
Network Mode: what changes when MODE=NETWORK is set.
Installation: download a model and run the first isolated local AI coding session.
