Prompt Chain

Evaluate Image Classification Prompts

Eval Image Classification is a prompt workflow example that shows how to evaluate image classification prompts with promptfoo's testing framework, using GPT-4o and GPT-4o-mini on the Fashion MNIST dataset.


Spark score: 54 out of 100
Updated 2 days ago
Version code-scan-action-0.1


Why it matters

This asset evaluates how well image classification prompts perform. It shows how accurately each prompt and model labels the test images, which helps you find where classification goes wrong and which prompt changes are worth making.

Outcomes

What it gets done

01

Run image classification prompts against a dataset.

02

Analyze and score the accuracy of classification results.

03

Provide insights into prompt effectiveness for image classification tasks.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-eval-image-classification | bash

Capabilities

What this chain does

Classify

Labels or categorizes text, files, or data points.

Summarize

Condenses long documents or threads into key takeaways.

Extract

Pulls structured data fields from unstructured text.

Overview

Eval Image Classification

What it does

Eval Image Classification is a multi-step prompt workflow example from the promptfoo repository. It demonstrates how to configure and run evaluations for image classification models using promptfoo's testing framework. The example provides a reference implementation for testing computer vision classification tasks.

How it connects

Use this example when you need to set up automated testing for image classification models or want to learn promptfoo's evaluation patterns for computer vision tasks. It's particularly useful when building quality assurance workflows for AI systems that classify images.

Source README

eval-image-classification (Image Classification Example with Promptfoo)

You can run this example with:

npx promptfoo@latest init --example eval-image-classification
cd eval-image-classification

This example demonstrates how to use Promptfoo for image classification tasks using the Fashion MNIST dataset. It uses GPT-4o and GPT-4o-mini with a structured JSON schema to analyze images, covering classification, color analysis, and additional attributes.
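To make the "structured JSON schema" idea concrete, here is a minimal sketch of what such a schema and a matching output check might look like. The field names (`classification`, `color`, `attributes`) and the helper `check_output` are illustrative assumptions, not the schema shipped in the example's promptfooconfig.yaml; only the ten Fashion MNIST class names are taken from the dataset itself.

```python
import json

# The ten class names defined by the Fashion MNIST dataset.
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Hypothetical schema: the field names are assumptions for illustration,
# not the example's actual schema.
IMAGE_ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "classification": {"type": "string", "enum": FASHION_MNIST_CLASSES},
        "color": {"type": "string"},
        "attributes": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["classification", "color", "attributes"],
    "additionalProperties": False,
}


def check_output(raw: str, schema: dict = IMAGE_ANALYSIS_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the output fits the sketch schema.

    This is a small hand-rolled check covering required fields, unexpected
    fields, and the class enum. It is not a full JSON Schema validator.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for field in schema["required"]:
        if field not in data:
            problems.append(f"missing field: {field}")
    for field in sorted(set(data) - set(schema["properties"])):
        problems.append(f"unexpected field: {field}")

    label = data.get("classification")
    allowed = schema["properties"]["classification"]["enum"]
    if label is not None and label not in allowed:
        problems.append(f"unknown class: {label!r}")
    return problems
```

Constraining `classification` to an enum is one way to keep model answers comparable with the dataset labels; free-text answers such as "sneakers" would otherwise need normalizing before scoring.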

Getting Started

  1. Set up your OpenAI API key:

    export OPENAI_API_KEY='your-api-key'
    
  2. Run the evaluation:

    npx promptfoo@latest eval
    
  3. View the results:

    npx promptfoo@latest view
    
  4. Optionally, re-generate or update the dataset:

    python dataset_gen.py
    

    Note: You may need to install dependencies with:

    pip install -r requirements.txt
    

    This script creates a CSV file with 100 random images from the Fashion MNIST dataset and their labels. A CSV with 10 sample images is included so you can skip this step if preferred.

  5. Experiment with the configuration:

    • Modify the JSON schema in promptfooconfig.yaml to add or adjust required fields
    • Try different models such as llama3.2 or Claude Sonnet 4.6 by changing the provider in the config
    • Adjust the system prompt to improve classification accuracy
    • Add additional assertions to validate model outputs
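As a rough illustration of what step 4's dataset script produces (a CSV of randomly sampled images with their labels), here is a standard-library-only sketch. It is not the example's dataset_gen.py: loading Fashion MNIST is left out, images are assumed to already be in memory as 2-D lists of 0-255 grayscale values, and the column names `image_base64` and `label` as well as both function names are assumptions.

```python
import base64
import csv
import random
import struct
import zlib

# The ten class names defined by the Fashion MNIST dataset, indexed by label id.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def encode_png_gray(pixels: list[list[int]]) -> bytes:
    """Encode a 2-D list of 0-255 values as an 8-bit grayscale PNG."""
    height, width = len(pixels), len(pixels[0])

    def chunk(tag: bytes, data: bytes) -> bytes:
        # PNG chunk layout: length, type, data, CRC over type + data.
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", zlib.compress(raw))
        + chunk(b"IEND", b"")
    )


def write_sample_csv(images, labels, path, n=100, seed=0):
    """Write up to n randomly chosen images (base64 PNG) and class names to a CSV.

    Column names are hypothetical; returns the number of rows written.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    picks = rng.sample(range(len(images)), k=min(n, len(images)))
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["image_base64", "label"])
        for i in picks:
            b64 = base64.b64encode(encode_png_gray(images[i])).decode("ascii")
            writer.writerow([b64, CLASS_NAMES[labels[i]]])
    return len(picks)
```

Storing the class name rather than the numeric label id keeps each CSV row directly comparable with a model's text answer.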
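For the last bullet in step 5, one possible additional assertion is a small Python check that compares the model's predicted class with the dataset label. The sketch assumes promptfoo's Python assertion hook (a `get_assert(output, context)` function in a file referenced from the config with `type: python`), and it assumes a test variable named `label` and an output field named `classification`; both names are hypothetical and should be matched to your own config and schema.

```python
import json
from typing import Any, Dict, Union


def get_assert(output: Union[str, Dict[str, Any]], context: Dict[str, Any]) -> Dict[str, Any]:
    """Pass when the predicted class equals the expected label (case-insensitive)."""
    # Assumed variable name: "label" holds the ground-truth class for this test case.
    expected = str(context.get("vars", {}).get("label", "")).strip().lower()

    # The output may arrive as a JSON string or as an already-parsed object.
    if isinstance(output, dict):
        parsed = output
    else:
        try:
            parsed = json.loads(output)
        except (TypeError, json.JSONDecodeError):
            return {"pass": False, "score": 0.0, "reason": "output is not valid JSON"}
    if not isinstance(parsed, dict):
        return {"pass": False, "score": 0.0, "reason": "output is not a JSON object"}

    # Assumed field name: "classification" carries the model's predicted class.
    predicted = str(parsed.get("classification", "")).strip().lower()
    ok = bool(expected) and predicted == expected
    return {
        "pass": ok,
        "score": 1.0 if ok else 0.0,
        "reason": f"expected {expected!r}, got {predicted!r}",
    }
```

Returning a reason string alongside the pass flag makes misclassifications easier to spot when browsing results, since each failure states which class was expected and which was predicted.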
