MCP Server using OpenRouter models to get descriptions for images
MCP OpenVision is a Model Context Protocol (MCP) server that provides image analysis capabilities powered by OpenRouter vision models. It enables AI assistants to analyze images via a simple interface within the MCP ecosystem.
To install mcp-openvision for Claude Desktop automatically via Smithery:
```bash
npx -y @smithery/cli install @Nazruden/mcp-openvision --client claude
```

You can also install it with pip:

```bash
pip install mcp-openvision
```

Or with uv:

```bash
uv pip install mcp-openvision
```

MCP OpenVision requires an OpenRouter API key and can be configured through environment variables:

- `OPENROUTER_API_KEY` (required): your OpenRouter API key
- `OPENROUTER_DEFAULT_MODEL` (optional): the vision model to use by default
MCP OpenVision works with any OpenRouter model that supports vision capabilities. The default model is qwen/qwen2.5-vl-32b-instruct:free, but you can specify any other compatible model.
Some popular vision models available through OpenRouter include:
- `qwen/qwen2.5-vl-32b-instruct:free` (default)
- `anthropic/claude-3-5-sonnet`
- `anthropic/claude-3-opus`
- `anthropic/claude-3-sonnet`
- `openai/gpt-4o`

You can specify custom models by setting the `OPENROUTER_DEFAULT_MODEL` environment variable or by passing the `model` parameter directly to the `image_analysis` function.
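For example, you can override the default model for a single request. This is a minimal sketch that assumes the same `image_analysis` call style as the usage examples later in this README; the model ID shown is just one of the options listed above:

```python
# Override the default vision model for one analysis call
result = await image_analysis(
    image="https://example.com/image.jpg",
    query="Describe this image in detail",
    model="openai/gpt-4o",  # any vision-capable OpenRouter model ID
)
```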
The easiest way to test MCP OpenVision is with the MCP Inspector tool:
```bash
npx @modelcontextprotocol/inspector uvx mcp-openvision
```

To use MCP OpenVision with an MCP client, add it to your MCP configuration file (`%USERPROFILE%\.cursor\mcp.json` on Windows, `~/.cursor/mcp.json` on macOS/Linux, or `~/Library/Application Support/Claude/claude_desktop_config.json` for Claude Desktop):

```json
{
"mcpServers": {
"openvision": {
"command": "uvx",
"args": ["mcp-openvision"],
"env": {
"OPENROUTER_API_KEY": "your_openrouter_api_key_here",
"OPENROUTER_DEFAULT_MODEL": "anthropic/claude-3-sonnet"
}
}
}
}
```

You can also run the server directly:

```bash
# Set the required API key
export OPENROUTER_API_KEY="your_api_key"
# Run the server module directly
python -m mcp_openvision
```

MCP OpenVision provides the following core tool, `image_analysis`, which accepts these parameters:
- `image`: Can be provided as a URL, a local file path, or base64-encoded data
- `query`: User instruction for the image analysis task
- `system_prompt`: Instructions that define the model's role and behavior (optional)
- `model`: Vision model to use
- `temperature`: Controls randomness (0.0-1.0)
- `max_tokens`: Maximum response length

The `query` parameter is crucial for getting useful results from the image analysis. A well-crafted query provides context about why you need the analysis and what specific information you're seeking, as the examples below show:
| Basic Query | Enhanced Query |
|---|---|
| "Describe this image" | "Identify all retail products visible in this store shelf image and estimate their price range" |
| "What's in this image?" | "Analyze this medical scan for abnormalities, focusing on the highlighted area and providing possible diagnoses" |
| "Analyze this chart" | "Extract the numerical data from this bar chart showing quarterly sales, and identify the key trends from 2022-2023" |
| "Read the text" | "Transcribe all visible text in this restaurant menu, preserving the item names, descriptions, and prices" |
By providing context about why you need the analysis and what specific information you're seeking, you help the model focus on relevant details and produce more valuable insights.
```python
# Analyze an image from a URL
result = await image_analysis(
image="https://example.com/image.jpg",
query="Describe this image in detail"
)
# Analyze an image from a local file with a focused query
result = await image_analysis(
image="path/to/local/image.jpg",
query="Identify all traffic signs in this street scene and explain their meanings for a driver education course"
)
# Analyze with a base64-encoded image and a specific analytical purpose
result = await image_analysis(
image="SGVsbG8gV29ybGQ=...", # base64 data
query="Examine this product packaging design and highlight elements that could be improved for better visibility and brand recognition"
)
# Customize the system prompt for specialized analysis
result = await image_analysis(
image="path/to/local/image.jpg",
query="Analyze the composition and artistic techniques used in this painting, focusing on how they create emotional impact",
system_prompt="You are an expert art historian with deep knowledge of painting techniques and art movements. Focus on formal analysis of composition, color, brushwork, and stylistic elements."
)
```
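The sampling parameters can be tuned in the same way. The sketch below assumes the same call signature as the examples above; the specific values are only illustrative:

```python
# Ask for a short, deterministic answer by tuning sampling parameters
result = await image_analysis(
    image="path/to/local/image.jpg",
    query="List the dominant colors in this image",
    temperature=0.0,  # 0.0-1.0: lower values give more deterministic output
    max_tokens=200,   # cap the length of the response
)
```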
The `image_analysis` tool accepts several types of image inputs:

- Base64-encoded image data
- Image URLs
- Absolute local file paths
- Relative local file paths (together with the `project_root` parameter to specify a base directory)

When using relative file paths (like `"examples/image.jpg"`), you have two options:
1. Use paths relative to the server's current working directory, or
2. Provide the `project_root` parameter:

```python
# Example with relative path and project_root
result = await image_analysis(
image="examples/image.jpg",
project_root="/path/to/your/project",
query="What is in this image?"
)
```

This is particularly useful in applications where the current working directory may not be predictable or when you want to reference files using paths relative to a specific directory.
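If you prefer to pass base64-encoded data yourself (as in the base64 example above), a small standard-library helper can produce the string. This is a generic Python sketch, not part of the mcp-openvision API; `encode_image_to_base64` is a hypothetical helper name:

```python
import base64

def encode_image_to_base64(path: str) -> str:
    """Read an image file and return its contents as a base64-encoded string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

# Pass the encoded string as the image argument
result = await image_analysis(
    image=encode_image_to_base64("examples/image.jpg"),
    query="What is in this image?",
)
```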
This project uses Black for automatic code formatting, and formatting is enforced through GitHub Actions.
You can also run Black locally to format your code before committing:
```bash
# Format all Python code in the src and tests directories
black src tests
```

Tests are run with pytest:

```bash
pytest
```

This project uses an automated release process:
1. Bump the version in `pyproject.toml` following Semantic Versioning principles (for example with `python scripts/bump_version.py [major|minor|patch]`)
2. Update `CHANGELOG.md` with details about the new version
3. Push the changes to the `main` branch

This automation helps maintain a consistent release process and ensures that every release is properly versioned and documented.
If you find this project helpful, consider buying me a coffee to support ongoing development and maintenance.
<a href="https://www.buymeacoffee.com/nazruden" target="_blank"> <img src="https://img.buymeacoffee.com/button-api/?text=Buy me a coffee&emoji=&slug=nazruden&button_colour=FFDD00&font_colour=000000&font_family=Lato&outline_colour=000000&coffee_colour=ffffff" alt="Buy Me A Coffee" width="217" height="60"> </a>This project is licensed under the MIT License - see the LICENSE file for details.