Prompt Chain

Compare LLM Responses for Quality

Compare Claude and GPT models side-by-side. Evaluate their performance on various prompts to determine the best LLM for your needs.


Spark score: 54 out of 100
Updated yesterday
Version code-scan-action-0.1
Models

Why it matters

Evaluate and compare the outputs of different large language models (LLMs) like Claude and GPT. This asset helps you determine which model provides superior responses for your specific needs.

Outcomes

What it gets done

01 Run prompts against multiple LLMs.

02 Score and rank LLM responses.

03 Identify the best performing LLM for a given task.
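
The scoring and ranking step can be pictured with a small sketch. It assumes each evaluation produces a score between 0 and 1 per provider and per test; the function name rankProviders, the result shape, and the labels model-a and model-b are hypothetical and are not part of the chain itself.

```javascript
// Rank providers by mean score, highest first.
// `results` is a hypothetical list of { provider, score } entries with score in [0, 1].
function rankProviders(results) {
  const totals = new Map();
  for (const { provider, score } of results) {
    const t = totals.get(provider) ?? { sum: 0, count: 0 };
    t.sum += score;
    t.count += 1;
    totals.set(provider, t);
  }
  return [...totals.entries()]
    .map(([provider, { sum, count }]) => ({ provider, mean: sum / count }))
    .sort((a, b) => b.mean - a.mean);
}

// Illustrative scores only; these are not measured results.
const ranking = rankProviders([
  { provider: 'model-a', score: 1 },
  { provider: 'model-a', score: 0.5 },
  { provider: 'model-b', score: 1 },
  { provider: 'model-b', score: 1 },
]);
console.log(ranking);
```

Averaging is only one possible aggregation; a pass rate or a weighted sum may suit some tasks better.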

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-claude-vs-gpt | bash

Capabilities

What this chain does

Chatbot

Handles multi-turn conversations within a defined domain.

Summarize

Condenses long documents or threads into key takeaways.

Classify

Labels or categorizes text, files, or data points.

Overview

Claude vs GPT

What it does

This prompt chain enables a direct comparison of responses between Claude and GPT models. It allows users to input a series of prompts and receive side-by-side evaluations of how each LLM handles them. The goal is to facilitate a clear understanding of their performance differences.
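
As a rough illustration of the side-by-side idea, the sketch below sends every prompt to every provider and collects one row per prompt with one column per provider. The provider functions are hypothetical stubs that return canned strings; a real run would call each model's API, and compareSideBySide is an illustrative name, not something the chain exposes.

```javascript
// Hypothetical stand-ins for real model calls.
const providers = {
  'Claude Sonnet 4.6': (prompt) => `claude answer to: ${prompt}`,
  'gpt-4.1-mini': (prompt) => `gpt answer to: ${prompt}`,
};

// Build one row per prompt, with one column per provider label.
function compareSideBySide(prompts, providers) {
  return prompts.map((prompt) => {
    const row = { prompt };
    for (const [label, call] of Object.entries(providers)) {
      row[label] = call(prompt);
    }
    return row;
  });
}

const grid = compareSideBySide(["What has keys but can't open locks?"], providers);
console.table(grid);
```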

How it connects

Use this when you need to make an informed decision between using Claude or GPT for a specific application. It's ideal for A/B testing and empirical evaluation of LLM outputs to determine which model better suits your needs.

Source README

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'GPT vs Claude example'

prompts:
  - file://prompt.yaml

providers:
  - id: anthropic:claude-sonnet-4-6
    label: Claude Sonnet 4.6
  - openai:gpt-4.1-mini

defaultTest:
  assert:
    - type: cost
      threshold: 0.01
    - type: latency
      threshold: 3000
    - type: javascript
      value: 'output.length <= 100 ? 1 : output.length > 1000 ? 0 : 1 - (output.length - 100) / 900'

tests:
  - vars:
      riddle: 'I speak without a mouth and hear without ears. I have no body, but I come alive with wind. What am I?'
    assert:
      # Make sure the LLM output contains this word
      - type: icontains
        value: echo
      # Use model-graded assertions to enforce free-form instructions
      - type: llm-rubric
        value: Do not apologize
  - vars:
      riddle: "You see a boat filled with people. It has not sunk, but when you look again you don't see a single person on the boat. Why?"
    assert:
      - type: llm-rubric
        value: explains that the people are below deck, or they are all in a relationship
  - vars:
      riddle: 'The more of this there is, the less you see. What is it?'
    assert:
      - type: icontains
        value: darkness
  - vars:
      riddle: >-
        I have keys but no locks. I have space but no room. You can enter, but
        can't go outside. What am I?
    assert:
      - type: icontains
        value: keyboard
  - vars:
      riddle: >-
        I am not alive, but I grow; I don't have lungs, but I need air; I don't
        have a mouth, but water kills me. What am I?
    assert:
      - type: icontains-any
        value:
          - fire
          - flame
  - vars:
      riddle: What can travel around the world while staying in a corner?
    assert:
      - type: icontains
        value: stamp
  - vars:
      riddle: Forward I am heavy, but backward I am not. What am I?
    assert:
      - type: icontains
        value: ton
  - vars:
      riddle: >-
        The person who makes it, sells it. The person who buys it, never uses
        it. The person who uses it, doesn't know they're using it. What is it?
    assert:
      - type: icontains
        value: coffin
  - vars:
      riddle: I can be cracked, made, told, and played. What am I?
    assert:
      - type: icontains
        value: joke
  - vars:
      riddle: What has keys but can't open locks?
    assert:
      - type: icontains
        value: piano
  - vars:
      riddle: >-
        I'm light as a feather, yet the strongest person can't hold me for much
        more than a minute. What am I?
    assert:
      - type: icontains
        value: breath
  - vars:
      riddle: >-
        I can fly without wings, I can cry without eyes. Whenever I go, darkness
        follows me. What am I?
    assert:
      - type: icontains
        value: cloud
  - vars:
      riddle: >-
        I am taken from a mine, and shut up in a wooden case, from which I am
        never released, and yet I am used by almost every person. What am I?
  - vars:
      riddle: >-
        David's father has three sons: Snap, Crackle, and _____? What is the
        name of the third son?
    assert:
      - type: contains
        value: David
  - vars:
      riddle: >-
        I am light as a feather, but even the world's strongest man couldn't
        hold me for much longer than a minute. What am I?
    assert:
      - type: icontains
        value: breath
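
To make the assertions in this config easier to reason about, here is a minimal sketch of what they compute. lengthScore wraps the exact expression from the javascript assertion in defaultTest. icontains and icontainsAny are hypothetical helpers that assume these assertion types perform a case-insensitive substring match; they are not promptfoo's own implementation.

```javascript
// Same expression as the `javascript` assertion in defaultTest:
// full score up to 100 characters, zero above 1000, linear in between.
function lengthScore(output) {
  return output.length <= 100 ? 1 : output.length > 1000 ? 0 : 1 - (output.length - 100) / 900;
}

// Assumed behaviour of `icontains`: case-insensitive substring match.
function icontains(output, value) {
  return output.toLowerCase().includes(value.toLowerCase());
}

// Assumed behaviour of `icontains-any`: passes if any listed value matches.
function icontainsAny(output, values) {
  return values.some((v) => icontains(output, v));
}

console.log(lengthScore('x'.repeat(550)));
console.log(icontains('The answer is an Echo.', 'echo'));
console.log(icontainsAny('A flame needs air.', ['fire', 'flame']));
console.log(icontains('Press the button.', 'ton'));
```

Under this substring assumption, a short value such as ton also matches inside longer words like button, so a wrong answer can still pass the check. Pairing it with an llm-rubric assertion is one way to tighten such a test.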
