Prompt Chain

Evaluate MCP Tool Calling Security

An example that uses the MCP provider to evaluate MCP servers by calling their tools directly.


Version code-scan-action-0.1

Why it matters

This asset evaluates MCP servers by directly testing their tool-calling capabilities. It's designed to uncover security vulnerabilities and edge cases in tool behavior.

Outcomes

What it gets done

01

Test MCP server tool calling

02

Identify security flaws in tool execution

03

Evaluate edge cases for MCP tools

04

Automate MCP tool behavior testing

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-simple-mcp | bash

Capabilities

What this chain does

Scan for vulnerabilities

Scans code or infrastructure for security vulnerabilities.

Audit access

Reviews permissions and logs to flag unauthorized activity.

Write tests

Creates unit, integration, or end-to-end test cases.

Overview

Simple MCP

What it does

This example shows how the MCP provider evaluates MCP servers: it calls their tools directly and asserts on the results instead of generating text.

How it connects

Use this example as a starting point when you want to test an MCP server's tools directly, for instance to check how they handle malicious or malformed input.

Source README

simple-mcp (Simple MCP Provider)

This example demonstrates how to use the MCP provider for evaluating MCP servers. The MCP provider is designed for direct tool calling evaluation rather than text generation, making it ideal for testing tool behavior, security vulnerabilities, and edge cases.

Quick Start

You can set up this example with:

npx promptfoo@latest init --example simple-mcp
cd simple-mcp

Getting Started

  1. Initialize the example:

    npx promptfoo@latest init --example simple-mcp
    
  2. Navigate to the example directory:

    cd simple-mcp
    
  3. Install the example dependencies, including the optional MCP SDK used by example-server.js:

    npm install
    
  4. Configure your MCP server in promptfooconfig.yaml

  5. Run the evaluation:

    npx promptfoo eval
    

Configuration Examples

Basic Security Testing

providers:
  - id: mcp
    config:
      enabled: true
      servers:
        - name: security-test-server
          path: ./example-server.js

tests:
  # Test path traversal prevention
  - vars:
      prompt: '{"tool": "read_file", "args": {"path": "../../../etc/passwd"}}'
    assert:
      - type: contains
        value: 'Path traversal not allowed'

  # Test command injection prevention
  - vars:
      prompt: '{"tool": "execute_command", "args": {"command": "rm -rf /"}}'
    assert:
      - type: contains
        value: 'Dangerous command blocked'
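
Each prompt in the configuration above is a JSON string with a tool name and an args object. When you want many variations of the same attack, generating those strings programmatically avoids quoting mistakes. The following is a minimal sketch, not part of the example itself: the helper name toolCallPrompt and the second payload are hypothetical, and the assertion text is simply copied from the configuration above.

```javascript
// Hypothetical helper: build the JSON prompt string in the same
// {"tool": ..., "args": {...}} shape used by the test cases above.
function toolCallPrompt(tool, args = {}) {
  return JSON.stringify({ tool, args });
}

// Illustrative payloads. The first comes from the configuration above;
// the second is an assumption added for demonstration.
const traversalPaths = ['../../../etc/passwd', '../../secrets.txt'];

// One test case per payload, mirroring the YAML structure (vars + assert).
const tests = traversalPaths.map((path) => ({
  vars: { prompt: toolCallPrompt('read_file', { path }) },
  assert: [{ type: 'contains', value: 'Path traversal not allowed' }],
}));

console.log(JSON.stringify(tests, null, 2));
```

The printed structure has the same shape as the tests list in the YAML configuration, so it can be pasted in or serialized to a file. Whether the server rejects every generated payload depends on how its tools are implemented.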

Advanced Security Testing

Test various security scenarios and edge cases:

tests:
  # SSRF prevention
  - vars:
      prompt: '{"tool": "fetch_url", "args": {"url": "http://localhost:8080/admin"}}'
    assert:
      - type: contains
        value: 'Internal network access blocked'

  # SQL injection prevention
  - vars:
      prompt: '{"tool": "query_database", "args": {"query": "SELECT * FROM users; DROP TABLE users;"}}'
    assert:
      - type: contains
        value: 'dangerous SQL query blocked'

  # Data previewing
  - vars:
      prompt: '{"tool": "process_data", "args": {"data": "Hello from the MCP example", "operation": "preview"}}'
    assert:
      - type: contains
        value: 'Preview: Hello from the MCP example'
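
The assertions above expect the server to reject dangerous input with specific messages. The source of example-server.js is not reproduced here, so the following is only a sketch of the kind of checks a tool handler might run to produce two of those messages. The function names checkPath and checkUrl are hypothetical, and the rules are deliberately simple: they do not cover DNS rebinding, redirects, encoded segments, or every private address range.

```javascript
// Hypothetical guard: reject any path that contains a ".." segment.
function checkPath(requestedPath) {
  const segments = requestedPath.split(/[\\/]+/);
  if (segments.includes('..')) {
    return { ok: false, message: 'Path traversal not allowed' };
  }
  return { ok: true };
}

// Hypothetical guard: reject URLs whose host is loopback or a private IPv4 range.
function checkUrl(rawUrl) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname;
  } catch {
    return { ok: false, message: 'Invalid URL' };
  }
  const internal =
    hostname === 'localhost' ||
    /^127\./.test(hostname) ||
    /^10\./.test(hostname) ||
    /^192\.168\./.test(hostname) ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(hostname);
  return internal
    ? { ok: false, message: 'Internal network access blocked' }
    : { ok: true };
}

console.log(checkPath('../../../etc/passwd'));
console.log(checkUrl('http://localhost:8080/admin'));
```

Returning a fixed, recognizable message for each rejection is what makes a contains assertion practical: the test does not need to know how the check works, only what the refusal looks like.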

Debug Mode

Enable debug mode to see detailed information about MCP connections and tool calls:

providers:
  - id: mcp
    config:
      enabled: true
      debug: true
      verbose: true
      servers:
        - name: my-server
          url: http://localhost:3000/mcp

Custom Response Parsing

The example also includes response-parser.js, which reads structuredContent from the raw MCP
tool result and falls back to Promptfoo's normalized content string:

providers:
  - id: mcp
    config:
      enabled: true
      servers:
        - name: security-test-server
          path: ./example-server.js
      transformResponse: 'file://response-parser.js'

// response-parser.js
export default function parseMcpResponse(result, content) {
  return result.structuredContent?.summary ?? content;
}

The get_user_profile test proves the parser is reading structured MCP output by asserting on
the string 'Ada Lovelace is active', which is not present in the tool's text content.
Function and file-based transforms may also be async when parsing requires additional work.
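
To make the fallback behavior concrete, the sketch below runs the same parsing logic against two made-up tool results. The function is written as a plain declaration rather than a default export so the snippet runs on its own. The sample objects and the async variant are assumptions for illustration; real MCP results carry additional fields.

```javascript
// Same logic as the parser described above: prefer the structured summary,
// fall back to the normalized content string.
function parseMcpResponse(result, content) {
  return result.structuredContent?.summary ?? content;
}

// Hypothetical async variant, for cases where parsing needs extra work.
// The awaited step here is a stand-in for any asynchronous operation.
async function parseMcpResponseAsync(result, content) {
  const summary = await Promise.resolve(result.structuredContent?.summary);
  return summary ?? content;
}

// Made-up results: one with structured output, one without.
const structured = { structuredContent: { summary: 'Ada Lovelace is active' } };
const textOnly = {};

console.log(parseMcpResponse(structured, 'raw text')); // Ada Lovelace is active
console.log(parseMcpResponse(textOnly, 'raw text')); // raw text
```

Because the fallback uses ??, only a null or undefined summary triggers it. An empty-string summary would be returned as-is.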

Relative file reads in example-server.js are resolved from the example directory, so the bundled
tests behave the same whether you run them from the copied example folder or from the promptfoo repo
root during local development.

Example MCP Servers

For testing, you can point the provider at any of the following kinds of MCP server:

  • Local Node.js Server: Create a simple MCP server using the @modelcontextprotocol/sdk
  • Python Server: Use the Python MCP SDK to create custom tools
  • HTTP Server: Any HTTP endpoint that implements the MCP protocol

Next Steps

  • Explore the MCP specification for creating your own servers
  • Check the redteam-mcp example for security testing of MCP implementations
  • Combine MCP providers with other providers for comprehensive evaluations
