# MCP OpenVision

An MCP server that uses OpenRouter vision models to get descriptions for images.
MCP OpenVision is a Model Context Protocol (MCP) server that provides image analysis capabilities powered by OpenRouter vision models. It enables AI assistants to analyze images via a simple interface within the MCP ecosystem.
## Installation

To install mcp-openvision for Claude Desktop automatically via Smithery:

```bash
npx -y @smithery/cli install @Nazruden/mcp-openvision --client claude
```

Alternatively, install from PyPI with pip:

```bash
pip install mcp-openvision
```

Or with uv:

```bash
uv pip install mcp-openvision
```
## Configuration

MCP OpenVision requires an OpenRouter API key and can be configured through environment variables:

- `OPENROUTER_API_KEY` (required): your OpenRouter API key
- `OPENROUTER_DEFAULT_MODEL` (optional): the vision model to use
MCP OpenVision works with any OpenRouter model that supports vision capabilities. The default model is `qwen/qwen2.5-vl-32b-instruct:free`, but you can specify any other compatible model.
Some popular vision models available through OpenRouter include:

- `qwen/qwen2.5-vl-32b-instruct:free` (default)
- `anthropic/claude-3-5-sonnet`
- `anthropic/claude-3-opus`
- `anthropic/claude-3-sonnet`
- `openai/gpt-4o`
You can specify a custom model by setting the `OPENROUTER_DEFAULT_MODEL` environment variable or by passing the `model` parameter directly to the `image_analysis` function.
## Testing

The easiest way to test MCP OpenVision is with the MCP Inspector tool:

```bash
npx @modelcontextprotocol/inspector uvx mcp-openvision
```
## Client Configuration

Add the server to your MCP client configuration file:

- Cursor (Windows): `%USERPROFILE%\.cursor\mcp.json`
- Cursor (macOS/Linux): `~/.cursor/mcp.json`
- Claude Desktop (macOS): `~/Library/Application Support/Claude/claude_desktop_config.json`
```json
{
  "mcpServers": {
    "openvision": {
      "command": "uvx",
      "args": ["mcp-openvision"],
      "env": {
        "OPENROUTER_API_KEY": "your_openrouter_api_key_here",
        "OPENROUTER_DEFAULT_MODEL": "anthropic/claude-3-sonnet"
      }
    }
  }
}
```
You can also run the server manually:

```bash
# Set the required API key
export OPENROUTER_API_KEY="your_api_key"

# Run the server module directly
python -m mcp_openvision
```
## Usage

MCP OpenVision provides one core tool, `image_analysis`, which accepts the following parameters:

- `image`: the image to analyze, provided as a URL, a local file path, or base64-encoded data
- `query`: user instruction for the image analysis task
- `system_prompt` (optional): instructions that define the model's role and behavior
- `model`: vision model to use
- `temperature`: controls randomness (0.0-1.0)
- `max_tokens`: maximum response length

The `query` parameter is crucial for getting useful results from the image analysis. A well-crafted query provides context about why you need the analysis and what specific information you are seeking:
| Basic Query | Enhanced Query |
|---|---|
| "Describe this image" | "Identify all retail products visible in this store shelf image and estimate their price range" |
| "What's in this image?" | "Analyze this medical scan for abnormalities, focusing on the highlighted area and providing possible diagnoses" |
| "Analyze this chart" | "Extract the numerical data from this bar chart showing quarterly sales, and identify the key trends from 2022-2023" |
| "Read the text" | "Transcribe all visible text in this restaurant menu, preserving the item names, descriptions, and prices" |
By providing context about why you need the analysis and what specific information you're seeking, you help the model focus on relevant details and produce more valuable insights.
### Examples

```python
# Analyze an image from a URL
result = await image_analysis(
    image="https://example.com/image.jpg",
    query="Describe this image in detail"
)

# Analyze an image from a local file with a focused query
result = await image_analysis(
    image="path/to/local/image.jpg",
    query="Identify all traffic signs in this street scene and explain their meanings for a driver education course"
)

# Analyze with a base64-encoded image and a specific analytical purpose
result = await image_analysis(
    image="SGVsbG8gV29ybGQ=...",  # base64 data
    query="Examine this product packaging design and highlight elements that could be improved for better visibility and brand recognition"
)

# Customize the system prompt for specialized analysis
result = await image_analysis(
    image="path/to/local/image.jpg",
    query="Analyze the composition and artistic techniques used in this painting, focusing on how they create emotional impact",
    system_prompt="You are an expert art historian with deep knowledge of painting techniques and art movements. Focus on formal analysis of composition, color, brushwork, and stylistic elements."
)
```
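To produce base64 data like that shown above from a local file, something like the following works. This is a minimal sketch; `encode_image` is a hypothetical helper, not part of the package:

```python
import base64
from pathlib import Path

def encode_image(path: str) -> str:
    """Read a local image file and return its contents as base64 text."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

The resulting string can be passed directly as the `image` argument.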
## Image Input Types

The `image_analysis` tool accepts several types of image inputs:

- Remote image URLs
- Local file paths (absolute or relative)
- Base64-encoded image data

When using relative file paths (like `"examples/image.jpg"`), you have two options:

1. Use a path relative to the server's current working directory
2. Pass the `project_root` parameter to specify a base directory:

```python
# Example with relative path and project_root
result = await image_analysis(
    image="examples/image.jpg",
    project_root="/path/to/your/project",
    query="What is in this image?"
)
```
This is particularly useful in applications where the current working directory may not be predictable or when you want to reference files using paths relative to a specific directory.
## Development

This project uses Black for automatic code formatting, enforced through GitHub Actions. You can also run Black locally to format your code before committing:

```bash
# Format all Python code in the src and tests directories
black src tests
```

Run the test suite with pytest:

```bash
pytest
```
## Release Process

This project uses an automated release process:

1. Update the version in `pyproject.toml` following Semantic Versioning principles (or run `python scripts/bump_version.py [major|minor|patch]`)
2. Update `CHANGELOG.md` with details about the new version
3. Merge your changes into the `main` branch

This automation helps maintain a consistent release process and ensures that every release is properly versioned and documented.
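The bump performed by `scripts/bump_version.py` presumably follows standard SemVer rules, along these lines (an illustrative sketch, not the actual script):

```python
def bump_version(version: str, part: str) -> str:
    """Increment the major, minor, or patch component of a SemVer string.

    Bumping a component resets all lower-order components to zero.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```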
## Support

If you find this project helpful, consider buying me a coffee to support ongoing development and maintenance.

<a href="https://www.buymeacoffee.com/nazruden" target="_blank"> <img src="https://img.buymeacoffee.com/button-api/?text=Buy me a coffee&emoji=&slug=nazruden&button_colour=FFDD00&font_colour=000000&font_family=Lato&outline_colour=000000&coffee_colour=ffffff" alt="Buy Me A Coffee" width="217" height="60"> </a>

## License

This project is licensed under the MIT License - see the LICENSE file for details.