
Evaluate and Improve AI Assistant Performance

Mandoline MCP Server connects AI assistants to Mandoline's evaluation framework so they can measure and improve their own performance.

Works with Claude, Cursor
⚠️ This tool looks unmaintained — no upstream commits in 6+ months.

Spark score: 7 out of 100
Updated 8 months ago
Version 0.2.0

Why it matters

Enable AI assistants like Claude and Cursor to critically evaluate and continuously improve their own performance using the Mandoline evaluation framework via the Model Context Protocol.

Outcomes

What it gets done

  1. Define custom evaluation metrics for specific tasks.
  2. Evaluate prompt/response pairs against defined metrics.
  3. Monitor AI assistant performance and identify areas for improvement.
  4. Integrate with AI assistants to facilitate self-evaluation.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/vb-mandoline | bash

Capabilities

Tools your agent gets

get_server_health

Confirms the MCP server is reachable and returns a healthy status

create_metric

Defines custom evaluation criteria for your specific tasks

batch_create_metrics

Creates multiple evaluation metrics in a single operation

get_metric

Retrieves details about a specific metric

get_metrics

Views your metrics with filtering and pagination

update_metric

Modifies existing metric definitions

create_evaluation

Evaluates prompt/response pairs against your metrics

batch_create_evaluations

Evaluates the same content across multiple metrics

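Three more tools (get_evaluation, get_evaluations, update_evaluation) are listed under Usage in the source README.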

Overview

Mandoline MCP Server

What it does

The Mandoline MCP Server enables AI assistants to leverage Mandoline's evaluation framework via the Model Context Protocol (MCP), allowing them to reflect on, critique, and continuously improve their own performance.

How it connects

Use Mandoline MCP Server to integrate evaluation tools into your AI assistant workflows for objective performance analysis and iterative refinement of AI responses or generated code. It requires a client that supports the Model Context Protocol.

Source README

Mandoline MCP Server

Enable AI assistants like Claude Code, Claude Desktop, and Cursor to reflect on, critique, and continuously improve their own performance using Mandoline's evaluation framework via the Model Context Protocol.


Client Setup

Most users should start here. Use Mandoline's hosted MCP server to integrate evaluation tools into your AI assistant.

For each integration below, replace sk_**** with your actual API key from mandoline.ai/account.

Claude Code

Use the CLI to add the Mandoline MCP server to Claude Code:

claude mcp add --scope user --transport http mandoline https://mandoline.ai/mcp --header "x-api-key: sk_****"

You can use --scope user (across projects) or --scope project (current project only).

Note: Restart any active Claude Code sessions after configuration changes.

Verify: Run /mcp in Claude Code to see Mandoline listed as a connected server.

Tutorial: Watch Claude evaluate multiple code solutions and pick the best one.

Official Documentation: Claude Code MCP Guide

Codex

Use the CLI to add the Mandoline MCP server to Codex:

codex mcp add mandoline --env MANDOLINE_API_KEY=sk_**** -- npx -y mcp-remote https://mandoline.ai/mcp --header 'x-api-key: ${MANDOLINE_API_KEY}'
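The header value in the command above is wrapped in single quotes, so the shell passes the literal text ${MANDOLINE_API_KEY} through instead of expanding it. The placeholder is then expected to be filled in from the variable set with --env. The following is a minimal sketch of that kind of substitution, written only to illustrate the idea. It is not mcp-remote's actual code, and the function name expand_placeholders is hypothetical.

```python
import re

# Matches ${NAME} where NAME looks like an environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(text: str, env: dict) -> str:
    """Replace each ${NAME} in `text` with env[NAME] (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Fail loudly rather than sending a header with an empty key.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)


header = expand_placeholders(
    "x-api-key: ${MANDOLINE_API_KEY}",
    {"MANDOLINE_API_KEY": "sk_****"},
)
print(header)  # x-api-key: sk_****
```

Keeping the key in an environment variable means the header template itself never contains the secret.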

Note: Restart any active Codex sessions after configuration changes.

Verify: Run /mcp in Codex to see Mandoline listed as a connected server.

Official Documentation: Codex MCP Configuration

Claude Desktop

Edit your configuration file (Settings > Developer > Edit Config):

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "Mandoline": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mandoline.ai/mcp",
        "--header",
        "x-api-key: ${MANDOLINE_API_KEY}"
      ],
      "env": {
        "MANDOLINE_API_KEY": "sk_****"
      }
    }
  }
}

This configuration applies globally to all conversations.
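If claude_desktop_config.json already lists other servers, the Mandoline entry needs to be merged in rather than pasted over the file. Below is a minimal sketch of such a merge, assuming the config has already been loaded into a dict. The helper name add_mandoline_server and the "other" server entry are hypothetical; the Mandoline entry itself mirrors the JSON shown above.

```python
import copy
import json

HOSTED_URL = "https://mandoline.ai/mcp"


def add_mandoline_server(config: dict, api_key: str, url: str = HOSTED_URL) -> dict:
    """Return a copy of `config` with a Mandoline entry under mcpServers."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["Mandoline"] = {
        "command": "npx",
        "args": ["-y", "mcp-remote", url, "--header", "x-api-key: ${MANDOLINE_API_KEY}"],
        "env": {"MANDOLINE_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_mandoline_server(existing, "sk_****")
print(json.dumps(updated, indent=2))
```

Writing the result back to disk is left out on purpose, so the sketch can be run without touching a real configuration file.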

Note: Restart Claude Desktop after configuration changes.

Verify: Look for Mandoline tools when you click the "Search and tools" button.

Official Documentation: MCP Quickstart Guide

Cursor

Create or edit your MCP configuration file:

{
  "mcpServers": {
    "Mandoline": {
      "url": "https://mandoline.ai/mcp",
      "headers": {
        "x-api-key": "sk_****"
      }
    }
  }
}

You can use your global configuration (affects all projects) at ~/.cursor/mcp.json, or a project-local configuration (current project only) at .cursor/mcp.json in the project root.
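The choice between the two Cursor locations can be sketched as a small helper that picks the file path and builds the entry to store there. This is only an illustration: the function names cursor_config_path and cursor_entry and the scope labels "global" and "project" are hypothetical, while the paths and JSON shape come from the text above.

```python
from pathlib import Path

HOSTED_URL = "https://mandoline.ai/mcp"


def cursor_config_path(scope: str, project_root: str = ".") -> Path:
    """Return the mcp.json location for a given scope (hypothetical helper)."""
    if scope == "global":
        # Affects all projects.
        return Path.home() / ".cursor" / "mcp.json"
    if scope == "project":
        # Affects only the project rooted at `project_root`.
        return Path(project_root) / ".cursor" / "mcp.json"
    raise ValueError(f"unknown scope: {scope!r}")


def cursor_entry(api_key: str, url: str = HOSTED_URL) -> dict:
    """Build the Cursor configuration shown above for the given API key."""
    return {
        "mcpServers": {
            "Mandoline": {"url": url, "headers": {"x-api-key": api_key}}
        }
    }


print(cursor_config_path("project", "/tmp/app"))
print(cursor_entry("sk_****"))
```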

Note: Restart Cursor after configuration changes.

Verify: Check the Output panel (Ctrl+Shift+U) → "MCP Logs" for successful connection, or look for Mandoline tools in the Composer Agent.

Official Documentation: Cursor MCP Guide


Server Setup

Only needed if you want to run the server locally or contribute to development. Most users should use the hosted server above.

Prerequisites: Node.js 18+ and npm

Installation

  1. Clone and build

    git clone https://github.com/mandoline-ai/mandoline-mcp-server.git
    cd mandoline-mcp-server
    npm install
    npm run build
    
  2. Configure environment (optional)

    cp .env.example .env.local
    # Edit .env.local to customize PORT, LOG_LEVEL, etc.
    
  3. Start the server

    npm start
    

The server runs on http://localhost:8080 by default.

Using Local Server

To use your local server instead of the hosted one, replace https://mandoline.ai/mcp with http://localhost:8080/mcp in the client configurations above.
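The URL sits in a different place in each client configuration: a "url" field for Cursor, an entry in the "args" list for Claude Desktop. One way to make the swap without caring about the shape is to walk the loaded config and replace the hosted URL wherever it appears. The sketch below assumes the config is already a Python dict; the helper name use_local_server is hypothetical.

```python
HOSTED_URL = "https://mandoline.ai/mcp"
LOCAL_URL = "http://localhost:8080/mcp"


def use_local_server(value, local_url: str = LOCAL_URL):
    """Return a copy of `value` with the hosted URL replaced by `local_url`."""
    if isinstance(value, dict):
        return {key: use_local_server(item, local_url) for key, item in value.items()}
    if isinstance(value, list):
        return [use_local_server(item, local_url) for item in value]
    if value == HOSTED_URL:
        return local_url
    return value  # any other string, number, etc. is kept as-is


cursor = {
    "mcpServers": {
        "Mandoline": {"url": HOSTED_URL, "headers": {"x-api-key": "sk_****"}}
    }
}
desktop_args = ["-y", "mcp-remote", HOSTED_URL, "--header", "x-api-key: ${MANDOLINE_API_KEY}"]

print(use_local_server(cursor))
print(use_local_server(desktop_args))
```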


Usage

Once integrated, you can use Mandoline evaluation tools directly in your AI assistant conversations.

Tools

Health

Tool               Purpose
get_server_health  Confirm the MCP server is reachable and returning a healthy status payload.
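An MCP client normally builds tool calls for you, but it can help to see what one looks like on the wire. MCP uses JSON-RPC 2.0, and a tool is invoked with the tools/call method. The sketch below builds such a request for get_server_health. It assumes the tool takes no arguments, which the source does not state, and the helper name tools_call_request is hypothetical.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def tools_call_request(tool_name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": tool_name,
            # Assumption: get_server_health accepts an empty argument object.
            "arguments": arguments or {},
        },
    }
    return json.dumps(payload)


request = tools_call_request("get_server_health")
print(request)
```

The sketch only constructs the message. Sending it also requires the transport, the x-api-key header, and the MCP initialization handshake, all of which the client handles.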

Metrics

Tool                  Purpose
create_metric         Define custom evaluation criteria for your specific tasks
batch_create_metrics  Create multiple evaluation metrics in one operation
get_metric            Retrieve details about a specific metric
get_metrics           Browse your metrics with filtering and pagination
update_metric         Modify existing metric definitions

Evaluations

Tool                      Purpose
create_evaluation         Score prompt/response pairs against your metrics
batch_create_evaluations  Evaluate the same content against multiple metrics
get_evaluation            Retrieve evaluation results and scores
get_evaluations           Browse evaluation history with filtering and pagination
update_evaluation         Add metadata or context to evaluations

Resources

Resource  Description
llms.txt  Mandoline docs index (tools, tutorials, blogs, leaderboards, SDKs); mirrored from https://mandoline.ai/llms.txt.
mcp       MCP setup guide for assistants; mirrored from https://mandoline.ai/mcp.
