
documcp

GitHub

MCP server for document management — search, read, and manage docs via Model Context Protocol

DocuMCP


A documentation server that exposes knowledge bases through the Model Context Protocol (MCP), enabling AI agents to search, read, and manage documentation.

DocuMCP gives AI agents structured access to your documentation via MCP tools and prompts. It handles document ingestion, full-text search, and OAuth 2.1 authorization. Written in Go for single-binary deployment with low resource usage.

Features

  • MCP Server -- 16 tools and 6 prompts via the official Go MCP SDK. Search, read, create, update, and delete documents. Federated search across documents, ZIM archives, and Git templates in a single query.
  • OAuth 2.1 Authorization Server -- PKCE, device authorization (RFC 8628), dynamic client registration (RFC 7591), and RFC 9728 Protected Resource Metadata for automatic discovery.
  • Document Pipeline -- Upload PDF, DOCX, XLSX, HTML, EPUB, or Markdown. Text is extracted, indexed via PostgreSQL full-text search, and searchable within seconds.
  • External Integrations -- Kiwix ZIM archives (federated article search) and Git template repositories.
  • Background Jobs -- River Postgres-native job queue with 7 worker types, 3 priority queues, and 6 periodic schedules.
  • Admin UI -- Vue 3 + TypeScript SPA for managing documents, users, OAuth clients, external services, and queue status.
  • Observability -- OpenTelemetry tracing with automatic instrumentation for database queries (otelpgx), Redis commands (redisotel), and outbound HTTP (otelhttp). Prometheus metrics (19 collectors covering HTTP, database pool, Redis pool, search, and queue). Structured logging with slog (trace/span ID injection). Optional Sentry/GlitchTip error tracking. See docs/OBSERVABILITY.md for architecture and configuration.
  • OIDC Authentication -- User login via any OpenID Connect provider.

Quick Start

Docker Compose is the fastest way to run DocuMCP. The stack includes the application, PostgreSQL 17, Redis 8, and Traefik v3.4.

  1. Create a .env file:
cat > .env <<EOF
# Database
DB_DATABASE=documcp
DB_USERNAME=documcp
DB_PASSWORD=$(openssl rand -base64 32)

# Redis (see docs/REDIS.md for ACL requirements)
REDIS_ADDR=redis:6379

# Secrets
OAUTH_SESSION_SECRET=$(openssl rand -base64 32)
ENCRYPTION_KEY=$(openssl rand -base64 32)
INTERNAL_API_TOKEN=$(openssl rand -base64 32)
HKDF_SALT=$(openssl rand -base64 16)

# Public URL and TLS
APP_URL=https://documcp.example.com
TRAEFIK_HOST=documcp.example.com
[email protected]
EOF
  2. Start the stack:
docker compose up -d

Migrations run automatically on first start.

  3. The application is available at https://documcp.example.com via Traefik (ports 80/443). The app listens on port 8080 internally but is not exposed to the host by default. To run without Traefik, remove the traefik service and add ports: ["8080:8080"] to the app service.

OIDC required for login: The admin panel authenticates users via an OpenID Connect provider. Set the OIDC_* variables in your .env file -- see .env.example for the full list.

See docs/OAUTH_CLIENT_GUIDE.md for connecting AI agents and CLI tools.
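As a sketch of what that connection can look like, a streamable-HTTP MCP client entry might be configured like this. The /documcp path and the exact key names are assumptions -- consult docs/OAUTH_CLIENT_GUIDE.md and your client's documentation for the authoritative format:

```json
{
  "mcpServers": {
    "documcp": {
      "type": "http",
      "url": "https://documcp.example.com/documcp"
    }
  }
}
```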

Development

Prerequisites

  • Go 1.26.2
  • Node.js 24 (frontend)
  • PostgreSQL (with pg_trgm and unaccent extensions)
  • Redis 8+ (distributed rate limiting and SSE events)

Build and Run

go build -o bin/documcp ./cmd/documcp    # Build binary

# Cobra subcommands:
go run ./cmd/documcp serve --with-worker # HTTP server + queue workers (dev default)
go run ./cmd/documcp serve               # HTTP server only (River insert-only)
go run ./cmd/documcp worker              # Queue workers only + health endpoint (:9090)
go run ./cmd/documcp migrate             # Run database migrations and exit
go run ./cmd/documcp version             # Print version info
go run ./cmd/documcp health              # Check readiness (for Docker healthchecks)
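The health subcommand is designed to back a container healthcheck. A minimal Docker Compose fragment might look like the following; the /app/documcp binary path is an assumption about the image layout:

```yaml
services:
  app:
    healthcheck:
      test: ["CMD", "/app/documcp", "health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
```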

Test

go test ./...                        # All tests
go test -race ./...                  # With race detection
go test -cover ./...                 # With coverage
go test -tags integration ./...      # Integration tests (needs Docker)

Code Quality

gofmt -w .                           # Format
goimports -w .                       # Fix imports
golangci-lint run                    # Lint (v2.11.4)

Frontend

cd frontend
npm ci
npm run build              # OpenAPI codegen + vue-tsc + Vite build -> web/frontend/dist/
npm run dev                # Dev server with HMR
npm run test               # Vitest
npm run lint               # vue-tsc + ESLint

Architecture

cmd/documcp/             Entry point (serve, worker, migrate, version, health)
internal/
  app/                   App lifecycle (Foundation + ServerApp + WorkerApp)
  auth/oauth/            OAuth 2.1 server (PKCE, device flow, dynamic registration)
  auth/oidc/             OIDC client for user authentication
  client/kiwix/          ZIM archive reader (Kiwix)
  client/git/            Git template repository sync
  config/                Configuration loading (env, YAML)
  cron/                  Cron schedule definitions
  crypto/                AES-256-GCM encryption for secrets at rest
  database/              PostgreSQL connection and migrations (goose)
  dto/                   Data transfer objects
  extractor/             Text extraction (PDF, DOCX, XLSX, HTML, EPUB, Markdown)
  handler/api/           REST API handlers
  handler/mcp/           MCP tool and prompt handlers
  handler/oauth/         OAuth endpoint handlers
  model/                 Domain models
  observability/         Tracing, metrics, structured logging, Sentry
  queue/                 River job queue (workers, events, periodic jobs)
  repository/            Data access layer (pgx, handwritten SQL)
  search/                Full-text search (tsvector/tsquery + pg_trgm)
  security/              Path traversal and SSRF guards
  server/                HTTP server setup and routing (chi v5)
  service/               Business logic orchestration
  stringutil/            Shared string utilities
  testutil/              Test helpers and fixtures
frontend/                Vue 3 + TypeScript SPA source (admin panel)
web/frontend/            Embedded SPA (//go:embed dist/)
migrations/              SQL migration files (goose)
docs/contracts/          OpenAPI spec, MCP contract, database schema

The application uses a single Cobra binary with serve, worker, migrate, and health subcommands for independent scaling. A shared Foundation holds dependencies (database pool, Redis client, repositories, search). ServerApp handles HTTP; WorkerApp handles River queue processing. Redis provides distributed rate limiting across server instances and cross-instance SSE event delivery via Pub/Sub. Repositories use pgxpool.Pool directly, services accept repository interfaces, and handlers accept services. Background jobs run via River, a Postgres-native job queue.
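The secrets-at-rest pattern handled by internal/crypto (AES-256-GCM, per the tree above) can be sketched with the standard library alone. Function names here are illustrative, not DocuMCP's actual API:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// seal encrypts plaintext with AES-256-GCM (a 32-byte key selects AES-256)
// and prepends the random nonce so open can recover it.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits the nonce off the sealed blob and authenticates + decrypts.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, fmt.Errorf("sealed blob too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	ct, err := seal(key, []byte("git-token"))
	if err != nil {
		panic(err)
	}
	pt, err := open(key, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt))
}
```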

MCP Tools

| Tool | Description |
| --- | --- |
| list_documents | List documents with optional filters |
| search_documents | Full-text search across documents |
| read_document | Retrieve document content by UUID |
| create_document | Create a new document |
| update_document | Modify document metadata |
| delete_document | Remove a document |
| unified_search | Cross-source search (documents, ZIM, Git templates) |
| list_zim_archives | List available ZIM archives |
| search_zim | Search within a specific ZIM archive |
| read_zim_article | Retrieve a ZIM article |
| list_git_templates | List available Git templates |
| search_git_templates | Search across template READMEs |
| get_template_structure | View folder tree and variables |
| get_template_file | Retrieve a file with variable substitution |
| get_deployment_guide | Deployment instructions with essential files |
| download_template | Download template as base64-encoded archive |

ZIM and Git template tools are registered conditionally based on whether the corresponding external services are configured.
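On the wire, an agent invokes one of these tools with a standard MCP tools/call request. The argument name below is illustrative -- the authoritative tool schemas live in docs/contracts/:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_documents",
    "arguments": { "query": "deployment checklist" }
  }
}
```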

Configuration

Application

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| APP_URL | No | http://localhost | Public application URL |
| INTERNAL_API_TOKEN | No | -- | Token for internal API endpoints |
| ENCRYPTION_KEY | No | -- | 32-byte key for AES-256-GCM encryption of stored Git tokens |

Server & TLS

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| SERVER_HOST | No | 0.0.0.0 | Listen address |
| SERVER_PORT | No | 8080 | Listen port |
| TRUSTED_PROXIES | No | -- | CIDR ranges for trusted reverse proxies |
| TLS_ENABLED | No | false | Terminate TLS directly (no reverse proxy needed) |
| TLS_PORT | No | 8443 | HTTPS listen port (SERVER_PORT becomes HTTP→HTTPS redirect) |
| TLS_CERT_FILE | No | -- | PEM certificate path (empty + TLS enabled = self-signed) |
| TLS_KEY_FILE | No | -- | PEM private key path |

Database (PostgreSQL)

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| DB_HOST | Yes | -- | PostgreSQL host |
| DB_PORT | No | 5432 | PostgreSQL port |
| DB_DATABASE | Yes | -- | Database name |
| DB_USERNAME | Yes | -- | Database user |
| DB_PASSWORD | Yes | -- | Database password |
| DB_SSLMODE | No | require | PostgreSQL SSL mode |
| DB_MAX_OPEN_CONNS | No | 25 | Maximum database connections (increase to 40-50 for combined serve+worker mode) |
| DB_PGX_MIN_CONNS | No | 5 | Minimum idle database connections |

Redis

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| REDIS_ADDR | Yes | -- | Redis address (host:port) |
| REDIS_USERNAME | No | -- | Redis 6+ ACL username |
| REDIS_PASSWORD | No | -- | Redis password |
| REDIS_DB | No | 0 | Redis database number |
| REDIS_POOL_SIZE | No | 10 | Redis connection pool size |
| REDIS_DIAL_TIMEOUT | No | 5s | Redis connection timeout |

Authentication & OAuth

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OIDC_PROVIDER_URL | No | -- | OpenID Connect provider URL |
| OIDC_CLIENT_ID | No | -- | OIDC client ID |
| OIDC_CLIENT_SECRET | No | -- | OIDC client secret |
| OIDC_REDIRECT_URI | No | -- | OIDC callback URL |
| OAUTH_SESSION_SECRET | Yes | -- | Session secret (min 32 bytes); derives CSRF and token HMAC keys via HKDF |

Storage

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| STORAGE_DRIVER | No | local | Blob backend: local (or fs) for filesystem, s3 for any S3-compatible service |
| STORAGE_BASE_PATH | No | ./storage | Filesystem root (always required -- workers use it for git clones and extraction scratch, even with s3) |
| STORAGE_MAX_UPLOAD_SIZE | No | 52428800 | Max upload file size in bytes (50 MiB) |
| STORAGE_MAX_EXTRACTED_TEXT | No | 52428800 | Max decompressed text per file in bytes (50 MiB) |
| STORAGE_MAX_ZIP_FILES | No | 100 | Max files in a DOCX/EPUB ZIP archive |
| STORAGE_MAX_SHEETS | No | 100 | Max sheets in an XLSX file |
| STORAGE_S3_ENDPOINT | No† | -- | S3 endpoint URL (empty = AWS default). Required for Garage, SeaweedFS, R2, B2, etc. |
| STORAGE_S3_BUCKET | No† | -- | Target bucket name |
| STORAGE_S3_REGION | No† | -- | AWS region string (us-east-1 is a safe placeholder for Garage/SeaweedFS) |
| STORAGE_S3_ACCESS_KEY_ID | No† | -- | Static access key |
| STORAGE_S3_SECRET_ACCESS_KEY | No† | -- | Static secret key |
| STORAGE_S3_USE_PATH_STYLE | No | true | Force path-style addressing; required for most self-hosted backends |
| STORAGE_S3_FORCE_SSL | No | true | Reject plaintext endpoints at startup |

† Required when STORAGE_DRIVER=s3. The s3 driver speaks the S3 API and works against AWS S3, Cloudflare R2, Backblaze B2, Wasabi, Garage, SeaweedFS, and any other S3-compatible service. Keys use the same {file_type}/{uuid}.{ext} layout as the filesystem driver, so switching backends requires no database migration.
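The shared key layout is what makes backends interchangeable: the database stores the same key either way, so only the driver changes. A sketch of the documented layout (objectKey is a hypothetical helper; the UUID below is the canonical example value):

```go
package main

import "fmt"

// objectKey builds the documented {file_type}/{uuid}.{ext} key, used
// identically by the filesystem and s3 drivers.
func objectKey(fileType, uuid, ext string) string {
	return fmt.Sprintf("%s/%s.%s", fileType, uuid, ext)
}

func main() {
	fmt.Println(objectKey("pdf", "123e4567-e89b-12d3-a456-426614174000", "pdf"))
}
```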

External Services (Kiwix)

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| KIWIX_FEDERATED_SEARCH_TIMEOUT | No | 3s | Deadline for Kiwix fan-out during unified search |
| KIWIX_FEDERATED_MAX_ARCHIVES | No | 10 | Max archives to search in parallel |
| KIWIX_FEDERATED_PER_ARCHIVE_LIMIT | No | 3 | Max results per archive |

Queue Workers

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| QUEUE_HIGH_WORKERS | No | 10 | River queue concurrency for high-priority jobs |
| QUEUE_DEFAULT_WORKERS | No | 5 | River queue concurrency for default jobs |
| QUEUE_LOW_WORKERS | No | 2 | River queue concurrency for low-priority jobs |
| WORKER_HEALTH_PORT | No | 9090 | Health endpoint port for worker-only mode |

Observability

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OTEL_ENABLED | No | false | Enable OpenTelemetry tracing |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | -- | OTLP HTTP exporter endpoint (e.g., tempo:4318) |
| OTEL_SERVICE_NAME | No | documcp | Service name in traces |
| OTEL_INSECURE | No | false | Use HTTP instead of HTTPS for OTLP exporter |
| OTEL_SAMPLE_RATE | No | 1.0 | Trace sampling rate (0.0--1.0); ignores upstream sampling decisions |
| OTEL_ENVIRONMENT | No | -- | deployment.environment resource attribute |
| SENTRY_DSN | No | -- | Sentry/GlitchTip DSN for error tracking (empty = disabled) |
| SENTRY_SAMPLE_RATE | No | 1.0 | Error sample rate (0.0--1.0) |

See .env.example for the full list of configurable variables with defaults.

Running multiple replicas

DocuMCP is designed to scale horizontally behind a load balancer. Three things have to be true:

  1. Shared storage -- set STORAGE_DRIVER=s3 and point it at any S3-compatible service. The filesystem driver is node-local and will not work with more than one replica.
  2. Sticky MCP sessions -- the MCP SDK keeps session state in memory per replica, so a client that runs initialize() on replica A must keep landing on replica A for the rest of its session. The docker-compose file's Traefik labels set a documcp_affinity cookie on the documcp service to handle this automatically. If you deploy behind a different load balancer, enable cookie-based session affinity on the /documcp route (or the whole service). When a replica restarts, its sessions are gone and clients reinitialize on the next request -- the StreamableHTTP transport tolerates this.
  3. At least one worker replica -- scheduled jobs (document extraction, soft-delete purge, orphan cleanup, OAuth token cleanup, external-service health checks, expired scope-grant cleanup) run via River's periodic-job enqueuer. River elects a single leader across the cluster via the river_leader table, and only the leader enqueues periodic jobs. If every replica runs in insert-only serve mode (without --with-worker), no leader is elected and scheduled jobs never fire. Run at least one serve --with-worker or worker replica.
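The sticky-session requirement above can be reproduced on any Traefik deployment with service labels along these lines (the service name documcp is illustrative; adjust to your compose service):

```yaml
labels:
  - "traefik.http.services.documcp.loadbalancer.sticky.cookie=true"
  - "traefik.http.services.documcp.loadbalancer.sticky.cookie.name=documcp_affinity"
```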

Cross-replica cache invalidation is already handled: admin edits to Kiwix external services publish a message on a dedicated Redis pub/sub channel (documcp:control:cache.kiwix.invalidate) that all replicas subscribe to. Other caches are read-through against Postgres and don't need invalidation.

Per-replica health is reported at /health/ready (which checks Postgres and Redis), and documcp_mcp_active_sessions is exposed as a Prometheus gauge so operators can detect sticky-session hot-spotting -- a large spread between max and min across replicas means one node is holding disproportionately many sessions.
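The hot-spotting check described above can be expressed as a single PromQL query, assuming each replica is scraped as a separate target so the aggregations range over replicas:

```
max(documcp_mcp_active_sessions) - min(documcp_mcp_active_sessions)
```

Alert on this expression exceeding a threshold appropriate to your session volume.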

Documentation

| Document | Description |
| --- | --- |
| OAuth Client Guide | Connecting AI agents, CLI tools, and Claude.ai |
| Observability | Tracing, metrics, logging, error tracking, Grafana dashboard |
| Prometheus Metrics | Metric listing, PromQL examples, scrape configuration |
| Redis | ACL requirements, client architecture, troubleshooting |
| OpenAPI Spec | REST API specification |
| MCP Contract | MCP tools and prompts schema |
| OAuth Flows | OAuth 2.1 flow diagrams |

License

MIT

Repository

c-premus/documcp

Created: March 25, 2026
Updated: April 13, 2026
Language: Go
Category: Search & Knowledge