Prompt Chain

Evaluate Search Results with a Rubric

Name: Evaluate Search Results with a Rubric
Availability: OnlineOnly
Author: Promptfoo

A prompt chain example that demonstrates building evaluation rubrics for search quality assessment, using promptfoo's multi-step workflow capabilities.

Updated 3 months ago

Version 1.0.0

Models

Claude 3 Opus, GPT-4o

Why it matters

Automates the evaluation of search results against a predefined rubric, so that quality and relevance are assessed systematically rather than case by case.

Outcomes

What it gets done

Define and apply a structured rubric for evaluating search results.

Classify search results based on relevance and quality criteria.

Summarize findings from evaluated search results.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-eval-search-rubric | bash

Capabilities

What this chain does

Search the web

Searches the web and retrieves relevant sources.

Summarize

Condenses long documents or threads into key takeaways.

Classify

Labels or categorizes text, files, or data points.

Overview

Eval Search Rubric

What it does

This is a prompt chain example from the promptfoo repository that demonstrates how to build evaluation rubrics for assessing search quality. It structures the evaluation process as a multi-step workflow, showing how to organize assessment criteria systematically. The example is runnable and serves as a reference implementation for search quality evaluation patterns.

How it connects

Use this when you need a starting point for building search evaluation workflows or want to understand how to structure assessment rubrics as prompt chains. It's particularly useful when you're setting up quality standards for search systems and need a concrete example of how to organize multi-step evaluation logic using promptfoo.

Source README

eval-search-rubric (Search Rubric)

You can run this example with:

npx promptfoo@latest init --example eval-search-rubric
cd eval-search-rubric

This example demonstrates how to use the search-rubric assertion type to verify that LLM outputs contain accurate, current information.

Overview

The search-rubric assertion lets you verify facts by searching the web in real time. This is particularly useful for:

Current events and news
Stock prices and financial data
Weather information
Recent company information
Any time-sensitive data

Running the Example

npx promptfoo eval

How It Works

The LLM generates a response to your prompt
The search-rubric assertion extracts the claim you want to verify
A provider with web search capabilities searches for current information
The assertion passes or fails based on whether the output matches current web data
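
The four steps above map onto a promptfoo config roughly as follows. This is a minimal sketch, not the example's actual promptfooconfig.yaml: the prompt, the company variable, the provider IDs, and the rubric text are illustrative assumptions. Perplexity is used as the grader only because the provider notes in this README say it needs no extra tool configuration.

```yaml
# promptfooconfig.yaml (illustrative sketch; all values below are assumptions)
description: Verify a time-sensitive answer against live web data

prompts:
  - 'What is the current stock price of {{company}}?'

providers:
  # Step 1: this model generates the response under test.
  - openai:chat:gpt-4o

defaultTest:
  options:
    # Step 3: grading provider with web search, used by the assertion.
    provider: perplexity:sonar

tests:
  - vars:
      company: Apple
    assert:
      # Steps 2 and 4: the claim to verify. The test passes or fails
      # depending on whether the output matches current web data.
      - type: search-rubric
        value: 'States the current Apple (AAPL) stock price'
```

Keeping the grader under defaultTest means every test case shares one web-search provider, while the model under test can be swapped independently.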

Provider Support

Anthropic Claude

Web search capabilities via tool configuration (launched in 2025)
Requires explicit web_search_20250305 tool configuration
Pricing: $10 per 1,000 searches plus standard token costs
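
A grading provider for Anthropic might be configured along these lines. The tool type comes from this README. The model ID, the name field, and the max_uses cap are assumptions based on the published shape of Anthropic's web search tool, so check them against the current promptfoo and Anthropic documentation.

```yaml
defaultTest:
  options:
    provider:
      id: anthropic:messages:claude-sonnet-4-20250514 # assumed model ID
      config:
        tools:
          - type: web_search_20250305 # tool type named in this README
            name: web_search # assumed tool name
            max_uses: 3 # assumed cap on searches per request
```

At the quoted $10 per 1,000 searches, a cap of 3 bounds the search cost at about $0.03 per request, before token costs.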

OpenAI

Requires web_search_preview tool configuration
Works with gpt-5.1, o4-mini, and other Responses API models
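
For OpenAI, the equivalent grader setup could look like the sketch below. The model and tool type are the ones named in this README. Placing the provider under defaultTest.options is an assumption about where the grader is wired in.

```yaml
defaultTest:
  options:
    provider:
      id: openai:responses:gpt-5.1 # Responses API model named in this README
      config:
        tools:
          - type: web_search_preview # tool type named in this README
```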

Perplexity

Built-in web search capabilities
No additional configuration needed

Configuration

assert:
  - type: search-rubric
    value: 'search query to verify'
    threshold: 0.8 # Optional: minimum accuracy score (0-1)

Notes

Search rubric assertions add latency (2-5 seconds per assertion)
Use caching during development: npx promptfoo eval --cache
Be specific with your search queries for better results
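
To illustrate the last note, the sketch below contrasts a vague query with a more specific one. Both values are hypothetical and are not taken from the example's config.

```yaml
assert:
  # Vague: unclear which company, metric, or date is being checked.
  - type: search-rubric
    value: 'stock price is right'

  # More specific: names the entity, the metric, and the time frame.
  - type: search-rubric
    value: 'Apple (AAPL) closing share price on the most recent trading day'
    threshold: 0.8 # same optional threshold as in the Configuration section
```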
