Integrate LLM Data with Helicone
Promptfoo example demonstrating Helicone integration for LLM observability and monitoring in prompt evaluation workflows.
code-scan-action-0.1Add to Favorites
Why it matters
Connect your LLM applications to Helicone for enhanced observability and data management. This asset facilitates the integration, allowing you to track, analyze, and manage your LLM interactions.
Outcomes
What it gets done
Integrate LLM data streams with Helicone.
Enable data extraction and summarization from LLM interactions.
Facilitate ETL synchronization for LLM operational data.
Query and manage LLM interaction data.
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/pfoo-integration-helicone | bash Capabilities
What this chain does
Writes and executes SQL or NoSQL queries on databases.
Moves and transforms data between systems on a schedule.
Pulls structured data fields from unstructured text.
Condenses long documents or threads into key takeaways.
Overview
Integration Helicone
What it does
This is a working example that demonstrates how to integrate Helicone observability into promptfoo prompt evaluation workflows. It provides configuration and setup patterns for connecting Helicone's monitoring capabilities to track LLM API calls, costs, and performance during prompt testing.
How it connects
Use this example when you need to add observability to your prompt evaluation process, want to monitor LLM usage and costs during testing, or need to debug and track prompt performance metrics in your AI application development workflow.
Source README
integration-helicone (Helicone AI Gateway)
You can run this example with:
npx promptfoo@latest init --example integration-helicone
cd integration-helicone
This example demonstrates how to use the Helicone AI Gateway provider in promptfoo to route requests through a self-hosted Helicone AI Gateway instance for unified provider access.
What This Example Shows
- Unified Interface: Use the same OpenAI-compatible syntax to access multiple providers
- Load Balancing: Smart routing based on provider availability and performance
- Self-Hosted Gateway: Full control over your LLM routing infrastructure
- Provider Comparison: Compare responses from different providers through a single interface
- Flexible Configuration: Easy switching between providers and models
Prerequisites
- Helicone AI Gateway: A running instance (we'll start one locally)
- API Keys: You'll need at least one provider API key:
- OpenAI API key (recommended)
- Anthropic API key (optional)
- Groq API key (optional)
Setup
Set Environment Variables:
# Set your provider API keys export OPENAI_API_KEY=your_openai_api_key_here export ANTHROPIC_API_KEY=your_anthropic_api_key_here # Optional export GROQ_API_KEY=your_groq_api_key_here # OptionalStart Helicone AI Gateway:
# In a separate terminal, start the gateway npx @helicone/ai-gateway@latestThe gateway will start on
http://localhost:8080by default.Install promptfoo (if you haven't already):
npm install -g promptfoo
Running the Example
From this directory, run:
promptfoo eval
This will:
- Send the same prompts to all three providers through the Helicone AI Gateway
- Compare responses and performance across providers
- Generate a detailed comparison report
- Show differences in model capabilities and response patterns
What Happens
- Request Routing: Each request is sent to the local Helicone AI Gateway at
http://localhost:8080 - Provider Selection: The gateway routes each request to the appropriate provider (OpenAI, Anthropic, or Groq)
- Unified Interface: All providers use the same OpenAI-compatible request/response format
- Response Comparison: promptfoo compares the responses from each provider
Gateway Features
The Helicone AI Gateway provides several powerful features:
- Load Balancing: Automatic routing to the fastest/most reliable provider
- Caching: Built-in response caching to reduce costs and improve latency
- Rate Limiting: Configurable rate limits to prevent abuse
- Observability: Optional integration with Helicone's observability platform
- Self-Hosted: Full control over your infrastructure and data
Configuration Details
The example configuration includes:
Provider Setup
providers:
- id: helicone:openai/gpt-4o-mini
label: 'OpenAI via Helicone Gateway'
config:
temperature: 0.7
max_tokens: 500
Key Features Demonstrated
- Unified Interface: All providers use the same
helicone:provider/modelformat - OpenAI Compatibility: Standard OpenAI parameters work across all providers
- Easy Switching: Change providers by simply updating the model name
- Local Gateway: All requests go through your self-hosted gateway instance
Customization
You can modify the configuration to:
- Add More Providers: Include any providers supported by your Helicone AI Gateway
- Change Models: Specify different models using the
provider/modelformat - Custom Gateway: Point to a different Helicone AI Gateway instance
- Router Configuration: Use custom routers for different environments
Example with Custom Gateway and Router
providers:
- id: helicone:openai/gpt-4o
config:
baseUrl: http://my-gateway.company.com:8080
router: production
temperature: 0.5
Advanced Features
Using Different Gateway Endpoints
Route to different environments using routers:
providers:
- id: helicone:openai/gpt-4o
config:
router: production
- id: helicone:openai/gpt-4o-mini
config:
router: development
Custom Gateway Configuration
If you're running your own Helicone AI Gateway with custom configuration:
providers:
- id: helicone:custom-provider/custom-model
config:
baseUrl: http://localhost:9000
headers:
Custom-Header: value
Troubleshooting
Common Issues
- Authentication Error: Verify your
HELICONE_API_KEYis correct - Provider API Key Missing: Ensure you have valid API keys for the providers you're testing
- No Data in Dashboard: Check that requests are successfully completing
Debug Mode
For detailed request logging:
LOG_LEVEL=debug promptfoo eval
Learn More
Next Steps
- Explore the Dashboard: Review the analytics in your Helicone dashboard
- Set Up Alerts: Configure cost and usage alerts in Helicone
- Optimize Costs: Use caching and rate limiting to reduce expenses
- Scale Testing: Add more providers and test cases for comprehensive evaluation
Discussion
Questions & comments · 0
Sign In Sign in to leave a comment.