Newcontext-mode—Save 98% of your AI coding agent's context windowLearn more
MCP Directory
ServersClientsBlog

context-mode

Save 98% of your AI coding agent's context window. Works with Claude Code, Cursor, Copilot, Codex, and more.

Try context-mode
MCP Directory

Model Context Protocol Directory

MKSF LTD
Suite 8805 5 Brayford Square
London, E1 0SG

MCP Directory

  • About
  • Blog
  • Documentation
  • Contact

Menu

  • Servers
  • Clients

© 2026 model-context-protocol.com

The Model Context Protocol (MCP) is an open standard for AI model communication.
Powered by Mert KoseogluSoftware Forge
  1. Home
  2. Servers
  3. artguard

artguard

GitHub

Scan AI artifacts like agent skills and config files for security risks, privacy issues, and instruction-level attacks with a Python CLI tool.

1
0

artguard

License: MIT
Built with Claude Code
Python 3.11+

A Claude Code prompt that autonomously scaffolds a full AI artifact scanner CLI.

Paste the prompt into Claude Code and it builds artguard — a working Python CLI
that scans agent skills, MCP server configs, and IDE rule files for security threats,
privacy violations, and instruction-level attacks.

The problem

Enterprises are installing AI agent skills, MCP servers, and IDE rule files
(.cursorrules, .clinerules, .windsurfrules) with zero security review.
No existing scanner covers them.

Traditional scanners are built for code packages. AI artifacts are hybrid —
part code, part natural language instructions — and the attack surface lives
in the instructions themselves.

A YARA rule won't catch a skill that tells your coding agent to approve vulnerable
PRs. Static analysis won't surface an artifact that claims "no data stored" while
writing to disk.

What artguard scans

Artifact TypeExamples
Agent skill filesskills.md, skill.json, tool definitions
MCP server configsmcp.json, server manifests
IDE rule files.cursorrules, .clinerules, .windsurfrules
Plugin manifestsmanifest.json, API schemas

Three detection layers

Layer 1 — Privacy posture analysis (differentiator)
Detects the gap between what an artifact claims to do with your data and what it
actually does. Undisclosed storage, covert telemetry, third-party sharing,
retention policy mismatches.

Layer 2 — Semantic instruction analysis (differentiator)
LLM-powered detection of behavioral manipulation, prompt injection, context
poisoning, and goal hijacking in the instruction content itself.

Layer 3 — Static pattern matching (table stakes)
Traditional malware patterns — credential harvesting, exfiltration endpoints,
obfuscated code — backed by the best free scanners available.

Output

Every scan produces a Trust Profile JSON — a structured AI Bill of Materials
designed to feed policy engines, audit trails, and access controls. Not a
safe/unsafe binary.

Composite Trust Score: 14  ██░░░░░░░░░░░░░░░░░░  MALICIOUS 🔴

┌─ PRIVACY POSTURE ─────────────────────────── Score: 32/100 ┐
│  [CRITICAL] PV3 Retention claim mismatch                   │
│             Claims "no data stored" but writes to ~/.cache  │
└─────────────────────────────────────────────────────────────┘

┌─ BEHAVIORAL INTENT ────────────────────────────────────────┐
│  [HIGH]     S4 System prompt override detected              │
│             "Ignore all previous instructions and..."       │
└─────────────────────────────────────────────────────────────┘

Usage

Requirements: Claude Code, Python 3.11+, an Anthropic API key (for Layer 2).

# 1. Create a new directory
mkdir artguard && cd artguard

# 2. Open Claude Code
claude

# 3. Paste the contents of prompt.md
# Claude Code will scaffold the full project autonomously

# 4. Scan an artifact
artguard scan path/to/skill.md
artguard scan path/to/mcp.json --deep     # enables Layer 2 LLM analysis
artguard batch ./skills-directory/

Architecture

Layer 3 integrates YARA rules, heuristic engines, hash lookups, and
IP reputation feeds from the best available open-source and free-tier
sources — so you get broad coverage without vendor lock-in.

Project structure (what Claude Code generates)

artguard/
├── artguard/
│   ├── cli.py                    # Click CLI entry point
│   ├── schema.py                 # Finding, TrustProfile dataclasses
│   ├── db.py                     # SQLite feedback corpus
│   ├── parsers/                  # One parser per artifact type
│   ├── analyzers/
│   │   ├── layer1_privacy.py     # Privacy posture analysis
│   │   ├── layer2_semantic.py    # LLM semantic instruction analysis
│   │   └── layer3_static.py      # Static pattern matching (extractable)
│   ├── trust_profile/            # Trust Profile builder + scorer
│   └── output/                   # Terminal + export formatting
├── tests/
│   └── fixtures/                 # Benign + malicious sample artifacts
├── scan_profiles/                # YAML policy configs
└── prompt.md                     # ← This file is the source of truth

Contributing

The prompt is the source of truth. Improvements to detection patterns,
new artifact parsers, or better YARA rules are all welcome — either as
prompt edits or as PRs against the generated codebase.

If you stress-test this against real skill registries (ClawHub, skills.sh,
npm MCP packages), findings and false positive rates are especially valuable.

License

MIT

Repository

ZO
Zorropiscina

Zorropiscina/artguard

Created

March 17, 2026

Updated

April 13, 2026

Category

AI