Offline-first MDN Web Docs RAG-MCP server ready for semantic search with hybrid vector (1024-d) and full‑text (BM25) retrieval.

The dataset covers the core MDN documentation sections. See the dataset repo on Hugging Face for more details.
```
npx -y @deepsweet/mdn@latest download
```

Both the dataset (~260 MB) and the embedding model GGUF file (~438 MB) will be downloaded directly from Hugging Face and stored in its default cache location (typically `~/.cache/huggingface/`), just like the `hf download` command does.
```json
{
  "mcpServers": {
    "mdn": {
      "command": "npx",
      "args": [
        "-y",
        "@deepsweet/mdn@latest",
        "server"
      ],
      "env": {}
    }
  }
}
```

> [!TIP]
> Remove `@latest` for a fully offline experience, but keep in mind that this will cache a fixed version without auto-updating.
The stdio server will spawn llama.cpp under the hood, load the embedding model (~655 MB RAM/VRAM), and query the dataset – all on demand.
| Env variable | Default value | Description |
|---|---|---|
| `MDN_DATASET_PATH` | Hugging Face cache | Custom dataset directory path |
| `MDN_MODEL_PATH` | Hugging Face cache | Custom model file path |
| `MDN_MODEL_TTL` | `1800` | How long llama.cpp with the embedding model should be kept loaded in memory, in seconds; `0` to prevent unloading |
| `MDN_QUERY_DESCRIPTION` | Natural language query for hybrid vector and full-text search | Custom search query description, in case your LLM does a poor job asking the MCP tool |
| `MDN_SEARCH_RESULTS_LIMIT` | `3` | Total search results limit |
| `HF_TOKEN` | | Optional Hugging Face access token; helps with occasional "HTTP 429 Too Many Requests" errors |
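These variables go into the `env` section of the MCP server config. A hypothetical example pointing the server at a pre-downloaded dataset and model and keeping the model loaded indefinitely (the paths shown are illustrative, not defaults):

```json
{
  "mcpServers": {
    "mdn": {
      "command": "npx",
      "args": ["-y", "@deepsweet/mdn@latest", "server"],
      "env": {
        "MDN_DATASET_PATH": "/data/mdn-dataset",
        "MDN_MODEL_PATH": "/data/models/embedding.gguf",
        "MDN_MODEL_TTL": "0"
      }
    }
  }
}
```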
To clean up the Hugging Face cache afterwards:

```
hf cache prune
```

The RAG-MCP server itself and the processing scripts are available under MIT.