Retro terminal CLI for testing and using Mistral Small 4 locally against llama.cpp, built on the official mistralai Python SDK with MCP tools.
The repository is intentionally focused on one workflow:
- Built-in tools: `shell`, `read_file`, `write_file`, `list_dir`, `search_text`
- Optional FireCrawl MCP server via `mcp.json`, using `FIRECRAWL_API_KEY` from your environment
- `/image` and `/doc` attachment commands
- `make sync` to set up the environment
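The `mcp.json` entry expands `FIRECRAWL_API_KEY` from the environment at runtime rather than storing the key in the file. A minimal sketch of that expansion idea using only the standard library; the config shape shown here is hypothetical and may differ from the repo's actual `mcp.json` schema:

```python
import json
import os

def expand_env_in_config(raw: str) -> dict:
    """Expand $VAR / ${VAR} references in a JSON config string before parsing.

    Sketch of the runtime-expansion idea; the real CLI may do this differently.
    """
    return json.loads(os.path.expandvars(raw))

# Hypothetical FireCrawl MCP entry for illustration only.
raw = '{"mcpServers": {"firecrawl": {"env": {"FIRECRAWL_API_KEY": "${FIRECRAWL_API_KEY}"}}}}'
os.environ["FIRECRAWL_API_KEY"] = "fc-demo-key"
cfg = expand_env_in_config(raw)
print(cfg["mcpServers"]["firecrawl"]["env"]["FIRECRAWL_API_KEY"])
```

Keeping the key out of the checked-in config and expanding it at load time means the same `mcp.json` works across machines without edits.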
```
uv run python -m mistral4cli
```

Useful one-shot smoke test:
```
uv run python -m mistral4cli --once "Return only the word ok." --no-stream
```

Inside the REPL:
- `/help` for actionable usage
- `/defaults` to inspect runtime parameters
- `/tools` to inspect loaded tools
- `/run -- ...` to execute a shell command
- `/ls [PATH]` to inspect the tree
- `/find -- ...` to search text in the workspace
- `/edit PATH -- ...` to write text files
- `/image` to pick and analyze images
- `/doc` to pick and analyze documents
- `/reset`, `/system ...`, `/exit`

The local model is expected to be running outside this repo with llama.cpp.
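A slash-command REPL like the one above is typically a small dispatch table mapping command names to handlers. A hypothetical sketch (command names taken from the list above; the handlers and dispatcher are invented for illustration, not the repo's implementation):

```python
from typing import Callable

# Hypothetical command registry; the CLI's real implementation differs.
COMMANDS: dict[str, Callable[[str], str]] = {}

def command(name: str):
    """Decorator registering a handler under a slash-command name."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        COMMANDS[name] = fn
        return fn
    return register

@command("/help")
def cmd_help(args: str) -> str:
    return "commands: " + ", ".join(sorted(COMMANDS))

@command("/reset")
def cmd_reset(args: str) -> str:
    return "conversation cleared"

def dispatch(line: str) -> str:
    """Split '/cmd rest-of-line' and route to the registered handler."""
    name, _, args = line.strip().partition(" ")
    handler = COMMANDS.get(name)
    return handler(args) if handler else f"unknown command: {name}"

print(dispatch("/reset"))
```

The table-driven shape makes `/help` trivial to keep accurate, since it is generated from the registry itself.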
The documented launch profile is:
```
llama-server \
  -hf unsloth/Mistral-Small-4-119B-2603-GGUF:UD-Q5_K_XL \
  --host 0.0.0.0 --port 8080 \
  --jinja --flash-attn off --no-mmap \
  --chat-template-file ./mistral-small-4-reasoning.jinja \
  --ctx-size 262144 \
  -ngl 99 \
  --temp 0.7 --top-p 0.95 --top-k 40 --min-p 0.0 \
  --parallel 1 --ctx-checkpoints 32 --cache-prompt \
  --threads "$(nproc)"
```

Recommended runtime defaults used by the CLI:
- `temperature=0.7`
- `top_p=0.95`
- `prompt_mode=reasoning`
- `max_tokens` unset unless you override it

The repository now includes the exact reasoning template at `mistral-small-4-reasoning.jinja`. In this local setup it is effectively required if you want reasoning enabled by default, because it sets `reasoning_effort=high` in the llama.cpp chat template.
For the detailed local runbook, see
docs/local-mistral-small-4.md.
```
make check
make test
make docs
```

- `make check` runs formatting, lint and type checks.
- `make test` runs the full pytest suite, including local integration tests that require the llama.cpp server.
- `make docs` regenerates the checked-in API reference from public docstrings.
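Regenerating a reference from public docstrings amounts to walking a module's public names and collecting their documentation. A toy sketch of the idea (not the repo's actual generator; the stdlib `json` module stands in as a demo target):

```python
import inspect
import json as example_module  # any importable module works as a demo target

def collect_docs(module) -> dict[str, str]:
    """Map each public function/class name to the first line of its docstring."""
    docs = {}
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private names
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue  # skip submodules, constants, etc.
        doc = inspect.getdoc(obj) or ""
        docs[name] = doc.splitlines()[0] if doc else ""
    return docs

reference = collect_docs(example_module)
print(sorted(reference))
```

Driving the reference from docstrings keeps `docs/reference.md` from drifting out of sync with the code, since `make docs` can simply be re-run after any API change.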
- `src/mistral4cli/` - CLI, session, tools and attachment handling
- `tests/` - unit and integration tests
- `docs/local-mistral-small-4.md` - detailed local deployment notes
- `docs/reference.md` - generated API reference from public docstrings
- `mistral-small-4-reasoning.jinja` - versioned llama.cpp reasoning template
- `mcp.json` - optional FireCrawl MCP config that expands `FIRECRAWL_API_KEY` at runtime

MIT. See LICENSE.
ibitato/Mistral4SmallClient
April 13, 2026