Evaluate Image Classification Prompts
Eval Image Classification is a prompt workflow example that demonstrates how to evaluate image classification models using promptfoo's testing framework.
Why it matters
This example measures how accurately image classification prompts label images. The results show where a prompt or model falls short and what to improve.
Outcomes
What it gets done
Run image classification prompts against a dataset.
Analyze and score the accuracy of classification results.
Provide insights into prompt effectiveness for image classification tasks.
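As a rough illustration of the scoring step, the sketch below computes overall and per-class accuracy from predicted and expected labels. It is not promptfoo's own scoring code: the function name `score_predictions` and the sample labels are hypothetical, and the case-insensitive comparison is an assumption about how strictly you want to match.

```python
from collections import defaultdict


def score_predictions(predicted, expected):
    """Return (overall_accuracy, per_class_accuracy) for two equal-length label lists."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one example is required")

    correct = defaultdict(int)
    total = defaultdict(int)
    for guess, truth in zip(predicted, expected):
        total[truth] += 1
        # Compare case-insensitively, ignoring surrounding whitespace.
        if guess.strip().lower() == truth.strip().lower():
            correct[truth] += 1

    overall = sum(correct.values()) / len(expected)
    per_class = {label: correct[label] / count for label, count in total.items()}
    return overall, per_class


if __name__ == "__main__":
    # Hypothetical predictions and ground-truth labels.
    predicted = ["Sneaker", "Bag", "Shirt", "Coat"]
    expected = ["Sneaker", "Bag", "T-shirt/top", "Coat"]
    overall, per_class = score_predictions(predicted, expected)
    print(f"overall accuracy: {overall:.2f}")
    for label, value in sorted(per_class.items()):
        print(f"{label}: {value:.2f}")
```

Per-class accuracy is worth reporting alongside the overall figure because visually similar classes (for example shirts and T-shirts) can hide behind a good average.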
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/pfoo-eval-image-classification | bash
Capabilities
What this chain does
Labels or categorizes text, files, or data points.
Condenses long documents or threads into key takeaways.
Pulls structured data fields from unstructured text.
Overview
Eval Image Classification
What it does
Eval Image Classification is a multi-step prompt workflow example from the promptfoo repository. It demonstrates how to configure and run evaluations for image classification models using promptfoo's testing framework. The example provides a reference implementation for testing computer vision classification tasks.
How it connects
Use this example when you need to set up automated testing for image classification models or want to learn promptfoo's evaluation patterns for computer vision tasks. It's particularly useful when building quality assurance workflows for AI systems that classify images.
Source README
eval-image-classification (Image Classification Example with Promptfoo)
You can run this example with:
npx promptfoo@latest init --example eval-image-classification
cd eval-image-classification
This example demonstrates how to use Promptfoo for image classification tasks using the Fashion MNIST dataset. It uses GPT-4o and GPT-4o-mini with a structured JSON schema to analyze images, covering classification, color analysis, and additional attributes.
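To make the idea of a structured response concrete, here is a minimal sketch of what such a schema and a check against it might look like. The field names `classification`, `color`, and `attributes` are assumptions chosen for illustration; the schema the example actually uses lives in promptfooconfig.yaml and may differ. Only the ten Fashion MNIST class names are taken from the dataset itself.

```python
import json

# The ten Fashion MNIST class names.
FASHION_MNIST_LABELS = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Hypothetical schema; the field names are assumptions for illustration only.
IMAGE_ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "classification": {"type": "string", "enum": FASHION_MNIST_LABELS},
        "color": {"type": "string"},
        "attributes": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["classification", "color", "attributes"],
    "additionalProperties": False,
}


def check_response(raw_text, schema=IMAGE_ANALYSIS_SCHEMA):
    """Return a list of problems found in a JSON response; empty means it passed.

    This is a deliberately small check covering only the keywords used above,
    not a full JSON Schema validator.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for key in schema["required"]:
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, value in data.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected field: {key}")
        elif spec["type"] == "string":
            if not isinstance(value, str):
                problems.append(f"{key} should be a string")
            elif "enum" in spec and value not in spec["enum"]:
                problems.append(f"{key} has an unknown value: {value}")
        elif spec["type"] == "array":
            if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
                problems.append(f"{key} should be a list of strings")
    return problems
```

Restricting `classification` to an enum of the known class names means a free-text answer such as "running shoe" is reported as a schema problem rather than silently counted as a wrong label.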
Getting Started
Set up your OpenAI API key:
export OPENAI_API_KEY='your-api-key'
Run the evaluation:
npx promptfoo@latest eval
View the results:
npx promptfoo@latest view
Optionally, re-generate or update the dataset:
python dataset_gen.py
Note: You may need to install dependencies with:
pip install -r requirements.txt
This script creates a CSV file with 100 random images from the Fashion MNIST dataset and their labels. A CSV with 10 sample images is included so you can skip this step if preferred.
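The sampling step described above could be sketched roughly as follows. This is not the contents of dataset_gen.py: it uses only the standard library, takes images that are already in memory instead of downloading Fashion MNIST, and the column names `image_base64` and `label`, the PNG encoding, and the function names are all assumptions made for illustration.

```python
import base64
import csv
import random
import struct
import zlib

# The ten Fashion MNIST class names, indexed by label id.
FASHION_MNIST_LABELS = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def encode_png(pixels):
    """Encode rows of 0-255 grayscale values as a minimal PNG file."""
    height, width = len(pixels), len(pixels[0])
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)

    def chunk(tag, data):
        body = tag + data
        return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

    # 8-bit depth, color type 0 (grayscale), default compression/filter/interlace.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", header)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))


def write_sample_csv(images, label_ids, path, sample_size=100, seed=0):
    """Write a random sample of (image, label) pairs to a CSV file; return the row count."""
    rng = random.Random(seed)
    count = min(sample_size, len(images))
    indices = rng.sample(range(len(images)), count)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_base64", "label"])  # hypothetical column names
        for index in indices:
            encoded = base64.b64encode(encode_png(images[index])).decode("ascii")
            writer.writerow([encoded, FASHION_MNIST_LABELS[label_ids[index]]])
    return count


if __name__ == "__main__":
    # Stand-in data: random 28x28 images, NOT real Fashion MNIST samples.
    rng = random.Random(1)
    images = [[[rng.randrange(256) for _ in range(28)] for _ in range(28)]
              for _ in range(200)]
    label_ids = [rng.randrange(10) for _ in range(200)]
    print(write_sample_csv(images, label_ids, "fashion_sample.csv"))
```

Seeding the sampler keeps the sample reproducible, which makes accuracy numbers comparable between runs of the evaluation.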
Experiment with the configuration:
- Modify the JSON schema in promptfooconfig.yaml to add or adjust required fields
- Try different models such as llama3.2 or Claude Sonnet 4.6 by changing the provider in the config
- Adjust the system prompt to improve classification accuracy
- Add additional assertions to validate model outputs
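One way to add an assertion is a small Python check. promptfoo's Python assertion type calls a function named `get_assert(output, context)` and exposes the test variables under `context["vars"]`; consult the promptfoo documentation for the exact contract. The sketch below assumes a test variable called `label` and a response field called `classification`, both of which are hypothetical names that may not match the example's config.

```python
import json


def get_assert(output, context):
    """Pass when the model's classification matches the expected label."""
    expected = context["vars"]["label"]  # hypothetical variable name
    try:
        # The output may arrive as a JSON string or as an already-parsed object.
        data = json.loads(output) if isinstance(output, str) else output
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    predicted = str(data.get("classification", ""))  # hypothetical field name
    # Compare case-insensitively, ignoring surrounding whitespace.
    return predicted.strip().lower() == str(expected).strip().lower()
```

Returning a plain boolean keeps the check simple; a malformed response counts as a failure instead of raising an error.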