Configuration
SITU reads its settings from situ.conf, located in the same directory as the launch script (~/.situ/situ.conf). All parameters are optional except MODEL.
The configuration file
Open ~/.situ/situ.conf in any text editor. Each setting is a simple KEY=VALUE line.
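As a minimal sketch, the smallest working file sets only MODEL, the one required parameter, and leaves every other setting at its default. The filename below is the example given in the MODEL entry; substitute the GGUF file actually present in ~/.situ/models/.

```ini
# ~/.situ/situ.conf
# Only MODEL is required; every other parameter falls back to its default.
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
```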
| Parameter | Content | Default | Description |
|---|---|---|---|
| CONTAINER_ENGINE | podman or docker | podman | The container engine SITU uses to create and manage containers. Set to docker if you are running Docker instead of Podman. |
| MODE | RESTRICTED or NETWORK | RESTRICTED | Controls external network access. RESTRICTED creates an internal container network with no external routes; NETWORK allows external connections. See Restricted Mode and Network Mode. |
| MOUNTPOINT | Absolute directory path | Current working directory | The host directory mounted into the SITU container as the workspace. Set this to an encrypted volume or a project root to control exactly what the agent can read and write. |
| LLAMA_IMAGE | Container image reference | ghcr.io/ggml-org/llama.cpp:server (CPU) | The llama.cpp server image used as the model sidecar. Switch to a CUDA variant (e.g. ghcr.io/ggml-org/llama.cpp:server-cuda) to use GPU acceleration. Ignored when LM_HOST is set. |
| MODEL | GGUF filename | none (required) | Path to the model file relative to the ~/.situ/models/ directory. The file must exist before starting the agent. Example: gemma-4-E4B-it-Q4_K_M.gguf. |
| CTX_SIZE | Integer (tokens) | 0 | Context window size passed to the llama.cpp server. 0 automatically uses the model's own training context size, which is the recommended default. Set an explicit value to cap memory use on hardware-constrained machines or to extend context beyond the training limit with RoPE scaling. |
| TEMPERATURE | Float | 0.1 | Sampling temperature passed to the llama.cpp sidecar. Lower values produce more deterministic output; higher values increase creativity. Applies only to the local sidecar; ignored when LM_HOST is set. |
| MAX_TOKENS | Integer (tokens) | 16384 | Per-call generation budget for the agent. Caps how many tokens the model may produce in a single LLM call, including any reasoning tokens. Increase for tasks that require very long single-step outputs. |
| REASONING | true or false | false | Enables the model's extended thinking (reasoning) mode. When true, the agent requests a reasoning trace before each response, which improves output quality at the cost of additional tokens. Set to false to disable reasoning and reduce token usage. |
| REASONING_BUDGET_MAXPERCENT | Integer (%) | 25 | Caps the server-side thinking budget at this percentage of MAX_TOKENS. At the default of 25% with MAX_TOKENS=16384, the model may use up to 4096 tokens for reasoning before being guided to produce its answer. Set to 0 to disable thinking at the server level (also forced when REASONING=false). Set to -1 for no limit. |
| REASONING_BUDGET_MESSAGE | String | Let me now write the solution. | Message injected by the llama.cpp server immediately before the `</think>` tag when the reasoning budget is exhausted. This guides the model to transition cleanly from reasoning to its answer rather than being cut off mid-thought. |
| PARALLEL | Integer | 1 | Number of parallel request slots in the llama.cpp server. The default of 1 is correct for single-user use and minimises KV cache memory. Increase this only when multiple SITU pods share a single llamaservice instance. |
| LMS_READY_TIMEOUT | Integer (seconds) | 180 | How long SITU waits for the llama.cpp server to finish loading the model before aborting. Increase this for very large models on slow storage. |
| LM_HOST | Hostname or IP address | none (pod sidecar is used) | Connect to an existing LM server on the network instead of spinning up the built-in llama.cpp sidecar. Requires MODE=NETWORK. Useful when a more powerful machine on the local network runs the model. |
| LM_PORT | Integer (port number) | 8080 | Port of the external LM server. Only relevant when LM_HOST is set. |
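
To illustrate how LM_HOST, LM_PORT and MODE interact, the fragment below sketches a setup that points SITU at an LM server running on another machine. The address 192.168.1.50 is a hypothetical placeholder, not a value from the SITU documentation. MODEL is kept because it is listed as the one required parameter; how its value is interpreted by an external server is not specified here.

```ini
# Use an existing LM server instead of the built-in llama.cpp sidecar.
# LM_HOST requires MODE=NETWORK.
MODE=NETWORK
# Hypothetical address of the machine running the model.
LM_HOST=192.168.1.50
# Port of the external server (8080 is the default).
LM_PORT=8080
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
```

With LM_HOST set, LLAMA_IMAGE and TEMPERATURE are ignored, so there is no need to set them in this configuration.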
Lines beginning with # are comments and have no effect — the file ships with every parameter commented out and its default shown in the comment.
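As an illustration of overriding defaults, and of how REASONING_BUDGET_MAXPERCENT scales with MAX_TOKENS, the sketch below enables reasoning on a smaller generation budget. The values 8192 and 50 are chosen for the arithmetic only and are not recommendations. Comments are placed on their own lines, since only lines beginning with # are documented as comments.

```ini
# Enable extended thinking.
REASONING=true
# Per-call generation budget, including reasoning tokens.
MAX_TOKENS=8192
# Thinking budget: 50% of 8192 = 4096 tokens,
# the same cap the defaults give (25% of 16384 = 4096).
REASONING_BUDGET_MAXPERCENT=50
MODEL=gemma-4-E4B-it-Q4_K_M.gguf
```

Because MAX_TOKENS includes reasoning tokens, this configuration leaves at least 4096 tokens for the answer after the thinking budget is spent.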
Related
- Command Line Parameters: per-session overrides for any situ.conf value.
- Restricted Mode: what MODE=RESTRICTED enforces at the kernel level.
- Network Mode: what changes when MODE=NETWORK is set.
- Installation: download a model and run the first isolated local AI coding session.