Prompt Chain

Evaluate Search Results with a Rubric

A prompt chain example that demonstrates building evaluation rubrics for search quality assessment, using promptfoo's multi-step workflow capabilities.


Spark score: 80 out of 100
Updated 3 months ago
Version 1.0.0
Models
Claude 3 Opus, GPT-4o


Why it matters

Automate the evaluation of search engine results using a predefined rubric to ensure quality and relevance. This asset helps in systematically assessing the output of search queries.

Outcomes

What it gets done

  1. Define and apply a structured rubric for evaluating search results.
  2. Classify search results based on relevance and quality criteria.
  3. Summarize findings from evaluated search results.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-eval-search-rubric | bash

Capabilities

What this chain does

Search the web

Searches the web and retrieves relevant sources.

Summarize

Condenses long documents or threads into key takeaways.

Classify

Labels or categorizes text, files, or data points.

Overview

Eval Search Rubric

What it does

This prompt chain example from the promptfoo repository demonstrates the search-rubric assertion type, which grades an LLM's output against a rubric while checking its claims with a live web search. The example is runnable as-is and serves as a reference implementation for verifying answers that depend on current information.

How it connects

Use this when you need a starting point for building search evaluation workflows or want to understand how to structure assessment rubrics as prompt chains. It's particularly useful when you're setting up quality standards for search systems and need a concrete example of how to organize multi-step evaluation logic using promptfoo.

Source README

eval-search-rubric (Search Rubric)

You can run this example with:

npx promptfoo@latest init --example eval-search-rubric
cd eval-search-rubric

This example demonstrates how to use the search-rubric assertion type to verify that LLM outputs contain accurate, current information.

Overview

The search-rubric assertion lets you verify facts by searching the web in real time. This is particularly useful for:

  • Current events and news
  • Stock prices and financial data
  • Weather information
  • Recent company information
  • Any time-sensitive data

Running the Example

npx promptfoo eval

How It Works

  1. The LLM generates a response to your prompt
  2. The search-rubric assertion extracts the claim you want to verify
  3. A provider with web search capabilities searches for current information
  4. The assertion passes or fails based on whether the output matches current web data

Provider Support

Anthropic Claude

  • Web search capabilities via tool configuration (launched in 2025)
  • Requires explicit web_search_20250305 tool configuration
  • Pricing: $10 per 1,000 searches plus standard token costs

OpenAI

  • Requires web_search_preview tool configuration
  • Works with gpt-5.1, o4-mini, and other Responses API models

Perplexity

  • Built-in web search capabilities
  • No additional configuration needed
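
As a hedged illustration of the provider notes above, the fragment below sketches how a grader with web search might be selected in promptfooconfig.yaml. It assumes the grader can be set through defaultTest.options.provider, as with promptfoo's other model-graded assertions. The model IDs and the max_uses value are illustrative choices, not taken from the example itself.

```yaml
# Sketch only: choose ONE grading provider with web search.
# Assumption: the grader is set via defaultTest.options.provider.
defaultTest:
  options:
    provider:
      # Anthropic: web search must be enabled explicitly as a tool.
      id: anthropic:messages:claude-sonnet-4-20250514 # illustrative model ID
      config:
        tools:
          - type: web_search_20250305
            name: web_search
            max_uses: 5 # illustrative cap; each search is billed

      # OpenAI alternative (Responses API):
      # id: openai:responses:o4-mini
      # config:
      #   tools:
      #     - type: web_search_preview

      # Perplexity alternative (search is built in, no tool config):
      # id: perplexity:sonar # illustrative model ID
```

Keeping the alternatives commented out leaves the file valid YAML, and switching graders means moving the comment markers.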

Configuration

assert:
  - type: search-rubric
    value: 'search query to verify'
    threshold: 0.8 # Optional: minimum accuracy score (0-1)
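
To show where this fragment sits, here is a minimal sketch of a complete promptfooconfig.yaml. It is not the example's actual file: the prompt, the question, the rubric text, and the openai:gpt-4o provider are all illustrative assumptions.

```yaml
# promptfooconfig.yaml (illustrative; not the example's actual file)
description: Check time-sensitive answers with search-rubric

prompts:
  - 'Answer in one sentence: {{question}}'

providers:
  - openai:gpt-4o # model under test (illustrative)

tests:
  - vars:
      question: Who is the current CEO of Microsoft?
    assert:
      - type: search-rubric
        value: Names the person who is currently CEO of Microsoft
        threshold: 0.8 # optional minimum accuracy score (0-1)
```

With a file like this in the working directory, npx promptfoo eval runs the prompt against the provider and then applies the assertion to each output.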

Notes

  • Search rubric assertions add latency (2-5 seconds per assertion)
  • Use caching during development: npx promptfoo eval --cache
  • Be specific with your search queries for better results
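
The last note can be illustrated with a contrast between a vague and a more specific assertion value. Both rubric strings below are hypothetical, written only to show the difference in wording.

```yaml
assert:
  # Vague: leaves the grader to guess what to look up and how strict to be.
  - type: search-rubric
    value: 'Stock price is correct'

  # More specific: names the entity, the time frame, and the tolerance.
  - type: search-rubric
    value: 'States the most recent closing price of AAPL, within 2% of the actual value'
```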
