A high-performance proxy middleware to fix Qwen3.5 tool calling issues and enable seamless MCP client compatibility.
Author: Le Xiaodong
This proxy sits between MCP (Model Context Protocol) clients (such as Cline/Pencil) and the LLM backend server (Sglang / vLLM / Ollama / llama.cpp), fixing common tool call formatting defects in Qwen3.5 deployments in real time so that tool calls work seamlessly end to end.
Features:

- Converts `<function=...>` XML tags leaked into message content into the standard `tool_calls` format.
- Corrects stop reasons to `tool_use`, ensuring Agents can trigger tool execution logic.
- Rewrites the `model` parameter to the configured backend model name.
- Fully asynchronous `httpx` architecture.
- Generates tool call IDs in `tool_{uuid}` format to prevent ID collisions.
- Provides `MCPError` and an `ErrorType` enum for structured errors.

Architecture:

```mermaid
graph LR
    Client[Cline / MCP Client] -- "OpenAI/Anthropic API" --> Proxy[Qwen Tool Fix Proxy]
    Proxy -- "Fixed/Normalized Request" --> Backend[Sglang / vLLM / Claude]
    Backend -- "Raw Response (with XML leaks)" --> Proxy
    Proxy -- "Cleaned Response (Standard Tool Calls)" --> Client
```

Installation:

```bash
pip install -r qwen_tool_fix/requirements.txt
```

Copy `qwen_tool_fix/.env.example` to `qwen_tool_fix/.env` and modify:
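The core transformation — lifting leaked `<function=...>` XML out of message content and into OpenAI-style `tool_calls` — can be sketched roughly as below. This is a simplified illustration, not the project's actual parser (`parse_qwen_xml_tools` handles more edge cases); only the XML tag format and the `tool_{...}` ID convention come from this README.

```python
import json
import re
import uuid

# Matches Qwen's leaked XML tool-call syntax:
#   <function=NAME><parameter=KEY>VALUE</parameter>...</function>
FUNC_RE = re.compile(r"<function=([\w.-]+)>(.*?)</function>", re.DOTALL)
PARAM_RE = re.compile(r"<parameter=([\w.-]+)>(.*?)</parameter>", re.DOTALL)

def extract_tool_calls(content: str):
    """Return (cleaned_content, tool_calls) in OpenAI chat format."""
    tool_calls = []
    for match in FUNC_RE.finditer(content):
        name, body = match.group(1), match.group(2)
        arguments = dict(PARAM_RE.findall(body))
        tool_calls.append({
            "id": f"tool_{uuid.uuid4().hex[:8]}",  # collision-resistant ID
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(arguments)},
        })
    cleaned = FUNC_RE.sub("", content).strip()
    return cleaned, tool_calls
```

When tool calls are extracted, the proxy can also rewrite `finish_reason` to `tool_use` so downstream agents recognize that a tool should be executed.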
```env
# Proxy Server Configuration
PORT=8123

# Backend API Configuration
BACKEND_TYPE=sglang
BACKEND_URL=http://127.0.0.1:5005
API_KEY=empty

# Other Configuration
TIMEOUT=120.0
LOG_LEVEL=INFO
ENABLE_CORS=true
ALLOWED_ORIGINS=*
```

```bash
# Basic usage
python start_proxy.py --port 8123

# With custom backend URL
python start_proxy.py --port 8123 --backend-url http://localhost:5005

# With all options
python start_proxy.py --port 8123 --backend-url http://localhost:5005 --log-level DEBUG

# Development mode (auto-reload)
python start_proxy.py --reload
```

Command Line Options:
| Option | Short | Default | Description |
|---|---|---|---|
| `--port` | `-p` | `8123` | Proxy server port |
| `--host` | `-H` | `0.0.0.0` | Listen address |
| `--backend-url` | `-b` | `http://127.0.0.1:5005` | Backend server URL |
| `--backend-model` | `-m` | `qwen3.5-27b` | Backend model name |
| `--log-level` | `-l` | `INFO` | Log level (DEBUG/INFO/WARNING/ERROR) |
| `--reload` | - | `False` | Enable auto-reload (development mode) |
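An `argparse` parser mirroring the option table above would look roughly like this. This is an illustrative sketch, not the actual contents of `start_proxy.py`; only the flags, short forms, and defaults come from the table.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of a CLI parser matching the documented options."""
    p = argparse.ArgumentParser(description="Qwen Tool Fix Proxy")
    p.add_argument("--port", "-p", type=int, default=8123,
                   help="Proxy server port")
    p.add_argument("--host", "-H", default="0.0.0.0",
                   help="Listen address")
    p.add_argument("--backend-url", "-b", default="http://127.0.0.1:5005",
                   help="Backend server URL")
    p.add_argument("--backend-model", "-m", default="qwen3.5-27b",
                   help="Backend model name")
    p.add_argument("--log-level", "-l", default="INFO",
                   choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                   help="Log level")
    p.add_argument("--reload", action="store_true",
                   help="Enable auto-reload (development mode)")
    return p
```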
Use the provided scripts in qwen_tool_fix/:
- Windows: `qwen_tool_fix/run_proxy.bat`
- Linux/macOS: `bash qwen_tool_fix/run_proxy.sh`
- As a Python module: `python -m qwen_tool_fix.proxy_server`

In Cline settings, change the API Endpoint to: `http://127.0.0.1:8123/v1`
| Endpoint | Method | Description | Compatibility |
|---|---|---|---|
| `/v1/chat/completions` | POST | Chat Completion | OpenAI |
| `/v1/messages` | POST | Messages API | Anthropic |
| `/health` | GET | Health Check | - |
| `/models` | GET | Model List | OpenAI |
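Because the proxy exposes the standard OpenAI surface, clients talk to it exactly as they would to any OpenAI-compatible server. A minimal sketch of building such a request with the standard library (the model name and message payload are illustrative):

```python
import json
import urllib.request

PROXY_BASE = "http://127.0.0.1:8123/v1"  # the proxy endpoint from the table above

def build_chat_request(messages, model="qwen3.5-27b"):
    """Build a standard OpenAI-style chat completion request aimed at the proxy."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{PROXY_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending works like any OpenAI-compatible call (requires the proxy to be running):
#   with urllib.request.urlopen(build_chat_request([{"role": "user", "content": "hi"}])) as resp:
#       print(resp.read())
```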
```python
from qwen_tool_fix.tool_parser import parse_qwen_xml_tools, strip_think_tags, parse_json_with_fallback

# Example 1: Extract tool calls
text_with_tools = "Please execute <function=bash><parameter=command>ls</parameter></function>"
tools = parse_qwen_xml_tools(text_with_tools)

# Example 2: Remove think tags
text_with_think = "<think>Thinking...</think> The answer is 42"
cleaned = strip_think_tags(text_with_think)

# Example 3: Parse JSON with fallback
json_text = "{'key': 'value'}"  # Single quotes
parsed = parse_json_with_fallback(json_text)  # Returns {'key': 'value'}
```

```python
from qwen_tool_fix.mcp_enhancements import (
    generate_tool_call_id,
    normalize_tool_arguments,
    validate_chat_request,
    MCPError,
    ErrorType,
    ToolCall
)

# Generate unique tool call ID
tool_id = generate_tool_call_id()  # tool_1712999999123_a1b2c3d4

# Normalize arguments
args = normalize_tool_arguments('{"key": "value"}')

# Validate request
result = validate_chat_request({"messages": [{"role": "user", "content": "hello"}]})
if not result.is_valid:
    for error in result.errors:
        print(f"Error: {error}")

# Create structured error
error = MCPError(ErrorType.BAD_REQUEST, "Invalid request", {"field": "messages"})
```

Notes:

- Malformed JSON falls back to `ast.literal_eval`.
- Argument normalization covers `command`, `arguments`, and other parameters.
- Tool call IDs use the `tool_{uuid}` format.

Backend compatibility:

| Server | XML Parsing | Think Leak | finish_reason | Recommendation |
|---|---|---|---|---|
| Sglang | ✅ Good | ✅ Fixed | ✅ Stable | Recommended |
| vLLM | ⚠️ Partial | ✅ Fixed | ⚠️ Fluctuating | Use --tool-call-parser qwen3_coder |
| Ollama | ⚠️ Intermittent | ✅ Fixed | ⚠️ Errors | Upgrade to latest version |
| llama.cpp | ❌ Poor | ❌ Severe | ❌ Errors | Must use with this proxy |
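The `ast.literal_eval` fallback mentioned earlier can be sketched as follows. This is a simplified stand-in for `parse_json_with_fallback`, not the project's actual implementation; the function name and the `None` return on failure are illustrative choices.

```python
import ast
import json

def parse_json_lenient(text: str):
    """Parse strict JSON first; fall back to ast.literal_eval for
    Python-style literals (e.g. single-quoted keys from model output)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        try:
            # Safely evaluates Python literals such as {'key': 'value'}
            # without executing arbitrary code.
            return ast.literal_eval(text)
        except (ValueError, SyntaxError):
            return None  # caller decides how to handle unparseable input
```

`ast.literal_eval` is a good fit here because it accepts the single-quoted dicts Qwen sometimes emits while refusing anything that is not a plain literal.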
Distributed under the MIT License. See LICENSE for more information.
lexiaodong/qwen3.5_fix · April 12, 2026 · April 13, 2026