Multi-step prompt sequences for complex AI workflows.
62 tools found
Automate code generation and review for Nvidia provider integrations, enhancing development efficiency and code quality.
A prompt workflow example that compares responses from Llama and GPT language models side-by-side using promptfoo's evaluation framework.
A prompt workflow example that benchmarks and compares Mistral and Llama language models side-by-side using promptfoo's evaluation framework.
Prompt workflow that benchmarks DeepSeek, Mistral, Llama, and Qwen models on factual assertion tasks using OpenRouter to compare open-source LLM performance.
A prompt workflow example that benchmarks and compares outputs from Microsoft Phi and Meta Llama language models side-by-side using promptfoo's evaluation framework.
Example configuration demonstrating how to evaluate Cerebras Inference API models using promptfoo for high-performance LLM inference testing.
Llama Guard moderation prompt chain example for the Replicate provider, demonstrating content safety filtering workflows with the promptfoo testing framework.
Evaluate OpenAI agents with Langfuse for reliable production deployment. Monitor traces, debug, and improve performance with online/offline metrics.
Analyze 10-K financial documents with LlamaIndex. Extract insights and synthesize information across multiple reports.
Compare models from OpenAI, Anthropic Claude, Meta Llama, and Mistral on Azure AI Foundry. Evaluate performance across providers.
Prompt workflow demonstrating integration of Meta Llama models on Azure AI Foundry with promptfoo for testing and evaluation.
Evaluate Cerebras Inference API models like Llama with promptfoo for high-performance LLM inference. Test and compare model outputs.