Prompt Chain

Test Policy Plugin Code Generation

Name: Test Policy Plugin Code Generation
Availability: OnlineOnly
Author: Promptfoo

Test suite that evaluates the PolicyPlugin test generator to ensure it produces valid policy violation tests for AI red-teaming workflows.

Works with GitHub

Updated 2 days ago

Version code-scan-action-0.1

Models

GPT-4o

Why it matters

This asset evaluates the PolicyPlugin test generator. It checks that the policy violation tests the plugin generates for AI red-teaming workflows are valid.

Outcomes

What it gets done

Evaluate the PolicyPlugin's test generation.

Verify the accuracy of generated test code.

Debug potential issues in the test generation process.

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/pfoo-evals | bash

Capabilities

What this chain does

Write tests

Creates unit, integration, or end-to-end test cases.

Review code

Analyzes code for bugs, style issues, and improvements.

Debug

Traces errors to their root cause and suggests fixes.

Overview

Evals

What it does

This evaluation suite tests the PolicyPlugin test generator itself. It validates that the PolicyPlugin produces valid policy violation tests, confirming that the test generation infrastructure works as expected.

How it connects

Use this suite when you need to validate the PolicyPlugin test generator before deploying it in production red-teaming workflows, or when troubleshooting issues with policy test generation quality.

Source README

policy evals

This suite evaluates the PolicyPlugin test generator itself.

It compares five native promptfoo redteam generate cases:

normal single-input generation
policy text with explicit test-generation instructions
modifier-driven generation (Spanish output)
multi-input generation with coordinated document / query attacks
log-analysis generation with "PromptBlock:" output

The eval flow is:

1. run promptfoo redteam generate against each case config under cases/
2. normalize the generated YAML into a stable JSON payload
3. feed that payload into Promptfoo assertions and llm-rubric checks through an executable prompt

That keeps the suite on Promptfoo's real CLI generation path instead of using a custom harness provider.
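
The normalization step in the flow above can be sketched as follows. This is a minimal illustration, not the code in generatePolicyEvalPrompt.cjs: the function names sortKeys and normalizeGenerated are hypothetical, the input is assumed to be the generated YAML already parsed into a plain object with a tests array, and the choice of fields to keep (vars and metadata) is an assumption. The idea is that sorting keys and keeping a fixed set of fields makes the payload identical for identical content, regardless of key order in the YAML.

```javascript
// Hypothetical sketch: NOT the actual generatePolicyEvalPrompt.cjs logic.

// Recursively sort object keys so equal data always serializes identically.
function sortKeys(value) {
  if (Array.isArray(value)) {
    return value.map(sortKeys);
  }
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  return value;
}

// Assumed input shape: `generated` is the parsed YAML with a `tests` array.
// Only `vars` and `metadata` are kept per test (an illustrative choice).
function normalizeGenerated(caseId, generated) {
  const tests =
    generated && Array.isArray(generated.tests) ? generated.tests : [];
  const payload = {
    caseId,
    testCount: tests.length,
    tests: tests.map((t) => ({
      vars: t.vars || {},
      metadata: t.metadata || {},
    })),
  };
  return JSON.stringify(sortKeys(payload));
}
```

Two generated files that differ only in key order produce the same string, which keeps downstream assertions and llm-rubric checks from seeing spurious differences.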

Prerequisites

OPENAI_API_KEY available in your environment or in .env
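
A quick pre-flight check for this prerequisite might look like the sketch below. The helper name hasOpenAiKey is hypothetical and is not part of Promptfoo. It only covers simple KEY=value lines, so it is a rough approximation of how a .env file is read rather than a full parser.

```javascript
// Hypothetical pre-flight helper: not part of Promptfoo.
// `env` is an environment object (for example process.env);
// `dotenvText` is the raw contents of a .env file, or undefined if absent.
function hasOpenAiKey(env, dotenvText) {
  const fromEnv = env && env.OPENAI_API_KEY;
  if (typeof fromEnv === 'string' && fromEnv.trim() !== '') {
    return true;
  }
  const lines = (dotenvText || '').split(/\r?\n/);
  return lines.some((line) => {
    // Matches `OPENAI_API_KEY=value` and `export OPENAI_API_KEY=value`;
    // commented-out lines start with `#` and therefore do not match.
    const match = /^\s*(?:export\s+)?OPENAI_API_KEY\s*=\s*(.*)$/.exec(line);
    if (!match) {
      return false;
    }
    // Strip one layer of surrounding quotes before checking for emptiness.
    const value = match[1].trim().replace(/^['"]|['"]$/g, '');
    return value !== '';
  });
}
```

In a script it could be called as hasOpenAiKey(process.env, text), where text is the .env file read with fs.readFileSync, before starting an eval run.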

Run

From the repository root:

npm run local -- validate -c src/redteam/plugins/policy/evals/promptfooconfig.yaml
npm run local -- eval -c src/redteam/plugins/policy/evals/promptfooconfig.yaml --env-file .env --no-cache

To generate any single comparison case directly:

npm run local -- redteam generate -c src/redteam/plugins/policy/evals/cases/normal-single-input.yaml -o /tmp/policy-normal.yaml --force
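
To script the same command for several case files, the argument list can be built programmatically. The sketch below is an assumption-laden illustration: buildGenerateArgs is a hypothetical helper, and it only reproduces the flags shown in the command above (-c, -o, --force).

```javascript
// Hypothetical helper: builds the npm arguments for one redteam generate run,
// mirroring the command shown above. It does not execute anything itself.
function buildGenerateArgs(casePath, outputPath) {
  if (typeof casePath !== 'string' || !casePath.endsWith('.yaml')) {
    throw new Error('casePath must point to a .yaml case config');
  }
  if (typeof outputPath !== 'string' || outputPath === '') {
    throw new Error('outputPath is required');
  }
  return [
    'run', 'local', '--',
    'redteam', 'generate',
    '-c', casePath,
    '-o', outputPath,
    '--force',
  ];
}

// Example with the case file named in the command above.
const exampleArgs = buildGenerateArgs(
  'src/redteam/plugins/policy/evals/cases/normal-single-input.yaml',
  '/tmp/policy-normal.yaml'
);
// The array can be passed to Node's child_process.spawnSync('npm', exampleArgs)
// from the repository root.
```

Passing arguments as an array rather than a single shell string avoids quoting problems when a path contains spaces.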

Files

promptfooconfig.yaml - eval suite
generatePolicyEvalPrompt.cjs - executable prompt that runs redteam generate for one case and emits normalized JSON
cases/*.yaml - native redteam generation configs being compared
tests/policy-generation.yaml - case metadata and Promptfoo assertions
