MCP server for document management — search, read, and manage docs via Model Context Protocol
A documentation server that exposes knowledge bases through the Model Context Protocol (MCP), enabling AI agents to search, read, and manage documentation.
DocuMCP gives AI agents structured access to your documentation via MCP tools and prompts. It handles document ingestion, full-text search, and OAuth 2.1 authorization. Written in Go for single-binary deployment with low resource usage.
Structured logging via slog (trace/span ID injection). Optional Sentry/GlitchTip error tracking. See docs/OBSERVABILITY.md for architecture and configuration.

Docker Compose is the fastest way to run DocuMCP. The stack includes the application, PostgreSQL 17, Redis 8, and Traefik v3.4.
Create a .env file:

cat > .env <<EOF
# Database
DB_DATABASE=documcp
DB_USERNAME=documcp
DB_PASSWORD=$(openssl rand -base64 32)
# Redis (see docs/REDIS.md for ACL requirements)
REDIS_ADDR=redis:6379
# Secrets
OAUTH_SESSION_SECRET=$(openssl rand -base64 32)
ENCRYPTION_KEY=$(openssl rand -base64 32)
INTERNAL_API_TOKEN=$(openssl rand -base64 32)
HKDF_SALT=$(openssl rand -base64 16)
# Public URL and TLS
APP_URL=https://documcp.example.com
TRAEFIK_HOST=documcp.example.com
[email protected]
EOF

docker compose up -d

Migrations run automatically on first start.
The application is served at https://documcp.example.com via Traefik (ports 80/443). The app listens on port 8080 internally but is not exposed to the host by default. To run without Traefik, remove the traefik service and add ports: ["8080:8080"] to the app service.

OIDC required for login: The admin panel authenticates users via an OpenID Connect provider. Set the OIDC_* variables in your .env file; see .env.example for the full list.
See docs/OAUTH_CLIENT_GUIDE.md for connecting AI agents and CLI tools.
PostgreSQL (with pg_trgm and unaccent extensions)

go build -o bin/documcp ./cmd/documcp # Build binary
# Cobra subcommands:
go run ./cmd/documcp serve --with-worker # HTTP server + queue workers (dev default)
go run ./cmd/documcp serve # HTTP server only (River insert-only)
go run ./cmd/documcp worker # Queue workers only + health endpoint (:9090)
go run ./cmd/documcp migrate # Run database migrations and exit
go run ./cmd/documcp version # Print version info
go run ./cmd/documcp health # Check readiness (for Docker healthchecks)

go test ./... # All tests
go test -race ./... # With race detection
go test -cover ./... # With coverage
go test -tags integration ./... # Integration tests (needs Docker)

gofmt -w . # Format
goimports -w . # Fix imports
golangci-lint run # Lint (v2.11.4)

cd frontend
npm ci
npm run build # OpenAPI codegen + vue-tsc + Vite build -> web/frontend/dist/
npm run dev # Dev server with HMR
npm run test # Vitest
npm run lint # vue-tsc + ESLint

cmd/documcp/ Entry point (serve, worker, migrate, version, health)
internal/
  app/ App lifecycle (Foundation + ServerApp + WorkerApp)
  auth/oauth/ OAuth 2.1 server (PKCE, device flow, dynamic registration)
  auth/oidc/ OIDC client for user authentication
  client/kiwix/ ZIM archive reader (Kiwix)
  client/git/ Git template repository sync
  config/ Configuration loading (env, YAML)
  cron/ Cron schedule definitions
  crypto/ AES-256-GCM encryption for secrets at rest
  database/ PostgreSQL connection and migrations (goose)
  dto/ Data transfer objects
  extractor/ Text extraction (PDF, DOCX, XLSX, HTML, EPUB, Markdown)
  handler/api/ REST API handlers
  handler/mcp/ MCP tool and prompt handlers
  handler/oauth/ OAuth endpoint handlers
  model/ Domain models
  observability/ Tracing, metrics, structured logging, Sentry
  queue/ River job queue (workers, events, periodic jobs)
  repository/ Data access layer (pgx, handwritten SQL)
  search/ Full-text search (tsvector/tsquery + pg_trgm)
  security/ Path traversal and SSRF guards
  server/ HTTP server setup and routing (chi v5)
  service/ Business logic orchestration
  stringutil/ Shared string utilities
  testutil/ Test helpers and fixtures
frontend/ Vue 3 + TypeScript SPA source (admin panel)
web/frontend/ Embedded SPA (//go:embed dist/)
migrations/ SQL migration files (goose)
docs/contracts/ OpenAPI spec, MCP contract, database schema

The application uses a single Cobra binary with serve, worker, migrate, and health subcommands for independent scaling. A shared Foundation holds dependencies (database pool, Redis client, repositories, search). ServerApp handles HTTP; WorkerApp handles River queue processing. Redis provides distributed rate limiting across server instances and cross-instance SSE event delivery via Pub/Sub. Repositories use pgxpool.Pool directly, services accept repository interfaces, and handlers accept services. Background jobs run via River, a Postgres-native job queue.
| Tool | Description |
|---|---|
| list_documents | List documents with optional filters |
| search_documents | Full-text search across documents |
| read_document | Retrieve document content by UUID |
| create_document | Create a new document |
| update_document | Modify document metadata |
| delete_document | Remove a document |
| unified_search | Cross-source search (documents, ZIM, Git templates) |
| list_zim_archives | List available ZIM archives |
| search_zim | Search within a specific ZIM archive |
| read_zim_article | Retrieve a ZIM article |
| list_git_templates | List available Git templates |
| search_git_templates | Search across template READMEs |
| get_template_structure | View folder tree and variables |
| get_template_file | Retrieve a file with variable substitution |
| get_deployment_guide | Deployment instructions with essential files |
| download_template | Download template as base64-encoded archive |
ZIM and Git template tools are registered conditionally based on whether the corresponding external services are configured.
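Conditional registration of this kind might look like the sketch below. The Features struct and RegisteredTools function are hypothetical and do not use a real MCP SDK; the tool names come from the table above, and treating unified_search as always registered is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Features records which optional external services are configured.
// The field names are assumptions made for this sketch.
type Features struct {
	KiwixConfigured bool
	GitConfigured   bool
}

// RegisteredTools returns the sorted tool names exposed for a configuration.
func RegisteredTools(f Features) []string {
	// Document tools are assumed to be available unconditionally.
	tools := []string{
		"list_documents", "search_documents", "read_document",
		"create_document", "update_document", "delete_document",
		"unified_search",
	}
	if f.KiwixConfigured {
		tools = append(tools, "list_zim_archives", "search_zim", "read_zim_article")
	}
	if f.GitConfigured {
		tools = append(tools,
			"list_git_templates", "search_git_templates", "get_template_structure",
			"get_template_file", "get_deployment_guide", "download_template")
	}
	sort.Strings(tools)
	return tools
}

func main() {
	fmt.Println(len(RegisteredTools(Features{})))
	fmt.Println(len(RegisteredTools(Features{KiwixConfigured: true, GitConfigured: true})))
}
```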
| Variable | Required | Default | Description |
|---|---|---|---|
| APP_URL | No | http://localhost | Public application URL |
| INTERNAL_API_TOKEN | No | -- | Token for internal API endpoints |
| ENCRYPTION_KEY | No | -- | 32-byte key for AES-256-GCM encryption of stored Git tokens |
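As an illustration of what AES-256-GCM encryption with a 32-byte key involves, here is a minimal sketch using only the Go standard library. The Encrypt and Decrypt names and the nonce-prefixed ciphertext layout are assumptions for this example, not necessarily what internal/crypto does.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD and rejects keys that are not 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Encrypt seals plaintext and prepends the random nonce to the output.
func Encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to the nonce slice.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// Decrypt splits off the nonce, then authenticates and decrypts the rest.
func Decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := Encrypt(key, []byte("example-git-token"))
	if err != nil {
		panic(err)
	}
	plain, err := Decrypt(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

GCM authenticates as well as encrypts, so a modified ciphertext fails in Decrypt instead of yielding corrupted plaintext.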
| Variable | Required | Default | Description |
|---|---|---|---|
| SERVER_HOST | No | 0.0.0.0 | Listen address |
| SERVER_PORT | No | 8080 | Listen port |
| TRUSTED_PROXIES | No | -- | CIDR ranges for trusted reverse proxies |
| TLS_ENABLED | No | false | Terminate TLS directly (no reverse proxy needed) |
| TLS_PORT | No | 8443 | HTTPS listen port (SERVER_PORT becomes HTTP→HTTPS redirect) |
| TLS_CERT_FILE | No | -- | PEM certificate path (empty + TLS enabled = self-signed) |
| TLS_KEY_FILE | No | -- | PEM private key path |
| Variable | Required | Default | Description |
|---|---|---|---|
| DB_HOST | Yes | -- | PostgreSQL host |
| DB_PORT | No | 5432 | PostgreSQL port |
| DB_DATABASE | Yes | -- | Database name |
| DB_USERNAME | Yes | -- | Database user |
| DB_PASSWORD | Yes | -- | Database password |
| DB_SSLMODE | No | require | PostgreSQL SSL mode |
| DB_MAX_OPEN_CONNS | No | 25 | Maximum database connections (increase to 40-50 for combined serve+worker mode) |
| DB_PGX_MIN_CONNS | No | 5 | Minimum idle database connections |
| Variable | Required | Default | Description |
|---|---|---|---|
| REDIS_ADDR | Yes | -- | Redis address (host:port) |
| REDIS_USERNAME | No | -- | Redis 6+ ACL username |
| REDIS_PASSWORD | No | -- | Redis password |
| REDIS_DB | No | 0 | Redis database number |
| REDIS_POOL_SIZE | No | 10 | Redis connection pool size |
| REDIS_DIAL_TIMEOUT | No | 5s | Redis connection timeout |
| Variable | Required | Default | Description |
|---|---|---|---|
| OIDC_PROVIDER_URL | No | -- | OpenID Connect provider URL |
| OIDC_CLIENT_ID | No | -- | OIDC client ID |
| OIDC_CLIENT_SECRET | No | -- | OIDC client secret |
| OIDC_REDIRECT_URI | No | -- | OIDC callback URL |
| OAUTH_SESSION_SECRET | Yes | -- | Session secret (min 32 bytes); derives CSRF and token HMAC keys via HKDF |
| Variable | Required | Default | Description |
|---|---|---|---|
| STORAGE_DRIVER | No | local | Blob backend: local (or fs) for filesystem, s3 for any S3-compatible service |
| STORAGE_BASE_PATH | No | ./storage | Filesystem root (always required: workers use it for git clones and extraction scratch, even with s3) |
| STORAGE_MAX_UPLOAD_SIZE | No | 52428800 | Max upload file size in bytes (50 MiB) |
| STORAGE_MAX_EXTRACTED_TEXT | No | 52428800 | Max decompressed text per file in bytes (50 MiB) |
| STORAGE_MAX_ZIP_FILES | No | 100 | Max files in a DOCX/EPUB ZIP archive |
| STORAGE_MAX_SHEETS | No | 100 | Max sheets in an XLSX file |
| STORAGE_S3_ENDPOINT | No† | -- | S3 endpoint URL (empty = AWS default). Required for Garage, SeaweedFS, R2, B2, etc. |
| STORAGE_S3_BUCKET | No† | -- | Target bucket name |
| STORAGE_S3_REGION | No† | -- | AWS region string (us-east-1 is a safe placeholder for Garage/SeaweedFS) |
| STORAGE_S3_ACCESS_KEY_ID | No† | -- | Static access key |
| STORAGE_S3_SECRET_ACCESS_KEY | No† | -- | Static secret key |
| STORAGE_S3_USE_PATH_STYLE | No | true | Force path-style addressing; required for most self-hosted backends |
| STORAGE_S3_FORCE_SSL | No | true | Reject plaintext endpoints at startup |
† Required when STORAGE_DRIVER=s3. The s3 driver speaks the S3 API and works against AWS S3, Cloudflare R2, Backblaze B2, Wasabi, Garage, SeaweedFS, and any other S3-compatible service. Keys use the same {file_type}/{uuid}.{ext} layout as the filesystem driver, so switching backends requires no database migration.
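The shared key layout is what makes the two drivers interchangeable. A minimal sketch, assuming hypothetical helper names ObjectKey and LocalPath and an example file type of "pdf":

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ObjectKey builds a blob key in the {file_type}/{uuid}.{ext} layout.
// A leading dot on ext is tolerated.
func ObjectKey(fileType, uuid, ext string) string {
	return fmt.Sprintf("%s/%s.%s", fileType, uuid, strings.TrimPrefix(ext, "."))
}

// LocalPath maps a key onto the filesystem driver's base path.
func LocalPath(basePath, key string) string {
	return filepath.Join(basePath, filepath.FromSlash(key))
}

func main() {
	key := ObjectKey("pdf", "123e4567-e89b-12d3-a456-426614174000", ".pdf")
	fmt.Println(key)                         // same string is used as the S3 object key
	fmt.Println(LocalPath("./storage", key)) // location under STORAGE_BASE_PATH
}
```

Since the database would store only the key, pointing the same key at a bucket or a directory needs no schema change.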
| Variable | Required | Default | Description |
|---|---|---|---|
| KIWIX_FEDERATED_SEARCH_TIMEOUT | No | 3s | Deadline for Kiwix fan-out during unified search |
| KIWIX_FEDERATED_MAX_ARCHIVES | No | 10 | Max archives to search in parallel |
| KIWIX_FEDERATED_PER_ARCHIVE_LIMIT | No | 3 | Max results per archive |
| Variable | Required | Default | Description |
|---|---|---|---|
| QUEUE_HIGH_WORKERS | No | 10 | River queue concurrency for high-priority jobs |
| QUEUE_DEFAULT_WORKERS | No | 5 | River queue concurrency for default jobs |
| QUEUE_LOW_WORKERS | No | 2 | River queue concurrency for low-priority jobs |
| WORKER_HEALTH_PORT | No | 9090 | Health endpoint port for worker-only mode |
| Variable | Required | Default | Description |
|---|---|---|---|
| OTEL_ENABLED | No | false | Enable OpenTelemetry tracing |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | -- | OTLP HTTP exporter endpoint (e.g., tempo:4318) |
| OTEL_SERVICE_NAME | No | documcp | Service name in traces |
| OTEL_INSECURE | No | false | Use HTTP instead of HTTPS for OTLP exporter |
| OTEL_SAMPLE_RATE | No | 1.0 | Trace sampling rate (0.0--1.0); ignores upstream sampling decisions |
| OTEL_ENVIRONMENT | No | -- | deployment.environment resource attribute |
| SENTRY_DSN | No | -- | Sentry/GlitchTip DSN for error tracking (empty = disabled) |
| SENTRY_SAMPLE_RATE | No | 1.0 | Error sample rate (0.0--1.0) |
See .env.example for the full list of configurable variables with defaults.
DocuMCP is designed to scale horizontally behind a load balancer. Three things have to be true:
1. Set STORAGE_DRIVER=s3 and point it at any S3-compatible service. The filesystem driver is node-local and will not work with more than one replica.
2. A client that calls initialize() on replica A must keep landing on replica A for the rest of its session. The docker-compose file's Traefik labels set a documcp_affinity cookie on the documcp service to handle this automatically. If you deploy behind a different load balancer, enable cookie-based session affinity on the /documcp route (or the whole service). When a replica restarts, its sessions are gone and clients reinitialize on the next request; the StreamableHTTP transport tolerates this.
3. River elects a leader through the river_leader table, and only the leader enqueues periodic jobs. If every replica runs in insert-only serve mode (without --with-worker), no leader is elected and scheduled jobs never fire. Run at least one serve --with-worker or worker replica.

Cross-replica cache invalidation is already handled: admin edits to Kiwix external services publish a message on a dedicated Redis pub/sub channel (documcp:control:cache.kiwix.invalidate) that all replicas subscribe to. Other caches are read-through against Postgres and don't need invalidation.
Per-replica health is reported at /health/ready (which checks Postgres and Redis) and documcp_mcp_active_sessions is exposed as a Prometheus gauge so operators can detect sticky-session hot-spotting -- a large spread between max and min across replicas means one node is holding disproportionately many sessions.
| Document | Description |
|---|---|
| OAuth Client Guide | Connecting AI agents, CLI tools, and Claude.ai |
| Observability | Tracing, metrics, logging, error tracking, Grafana dashboard |
| Prometheus Metrics | Metric listing, PromQL examples, scrape configuration |
| Redis | ACL requirements, client architecture, troubleshooting |
| OpenAPI Spec | REST API specification |
| MCP Contract | MCP tools and prompts schema |
| OAuth Flows | OAuth 2.1 flow diagrams |
c-premus/documcp
March 25, 2026
April 13, 2026
Go