OpenRouter AI-Powered Article Summarization
Terraphim AI supports AI-powered article summarization through OpenRouter integration. This feature replaces basic text excerpts with intelligent, contextual summaries generated by state-of-the-art language models.
Overview
The OpenRouter integration provides:
- Intelligent Summaries: AI-generated summaries instead of simple text excerpts
- Multiple Models: Choose from GPT-4, Claude, Mixtral, and more
- Cost Control: Optional feature with configurable models and usage limits
- Feature-Gated: Zero overhead when disabled, optional compilation
Quick Start
1. Enable the OpenRouter Feature
The OpenRouter integration is behind a feature flag for optional compilation:
# Build with OpenRouter support
# Or for the server
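The commands these comments belong to appear to have been elided. Assuming a standard Cargo workspace, they likely resembled the following (the server package name is an assumption):

```shell
# Build with OpenRouter support
cargo build --features openrouter

# Or for the server (package name assumed)
cargo build -p terraphim_server --features openrouter
```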
2. Get an OpenRouter API Key
- Sign up at OpenRouter
- Create an API key (starts with `sk-or-v1-`)
- Fund your account with credits
3. Configure a Role
Using the Configuration Wizard in the UI:
- Navigate to `/config/wizard`
- Select or create a role
- Enable "AI-Enhanced Summaries (OpenRouter)"
- Optionally enable "Auto-summarize search results"
- Enable "Chat interface" and pick a Chat model, optionally add a System prompt
- Enter your API key
- Choose a model (e.g., "GPT-3.5 Turbo" for fast, affordable summaries)
- Save the configuration
Supported Models
| Model | Provider | Speed | Quality | Cost | Best For |
|-------|----------|-------|---------|------|----------|
| openai/gpt-3.5-turbo | OpenAI | Fast | Good | Low | General summaries |
| openai/gpt-4 | OpenAI | Slow | Excellent | High | High-quality summaries |
| anthropic/claude-3-sonnet | Anthropic | Medium | Excellent | Medium | Balanced performance |
| anthropic/claude-3-haiku | Anthropic | Very Fast | Good | Low | High throughput |
| mistralai/mixtral-8x7b-instruct | Mistral | Fast | Good | Very Low | Open source option |
Configuration
Role Configuration (JSON)
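The JSON example appears to be missing here. A plausible role configuration fragment, with field names assumed from the options described in this document, might look like:

```json
{
  "name": "Researcher",
  "openrouter_enabled": true,
  "openrouter_api_key": "sk-or-v1-...",
  "openrouter_model": "openai/gpt-3.5-turbo",
  "openrouter_auto_summarize": true,
  "openrouter_chat_enabled": true,
  "openrouter_chat_model": "anthropic/claude-3-sonnet",
  "openrouter_chat_system_prompt": "You are a helpful research assistant."
}
```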
Environment Variables
For production deployments, you can use environment variables:
# Set default OpenRouter configuration
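The variable definitions themselves seem to have been elided; a sketch, with variable names assumed, could be:

```shell
# Set default OpenRouter configuration (variable names assumed)
export OPENROUTER_API_KEY="sk-or-v1-..."
export OPENROUTER_MODEL="openai/gpt-3.5-turbo"
```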
How It Works
Search Pipeline Integration
When OpenRouter is enabled for a role, the search pipeline:
- Executes Normal Search: Retrieves documents using standard indexing
- Filters Candidates: Identifies documents suitable for AI summarization
- Generates Summaries (optional): Calls OpenRouter API to create intelligent summaries when auto-summarize is enabled
- Replaces Descriptions: Substitutes basic excerpts with AI summaries
- Returns Enhanced Results: Users see contextual, meaningful descriptions
Chat API
POST /chat
Request:
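The request body appears to have been elided; a hedged sketch (all field names are assumptions) could be:

```json
{
  "role": "Researcher",
  "messages": [
    { "role": "user", "content": "Summarize the key points of this document." }
  ]
}
```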
Response:
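The response body also appears to have been elided; a hedged sketch (field names assumed) could be:

```json
{
  "status": "success",
  "message": "The document covers ..."
}
```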
Content Filtering
The system automatically determines which documents should receive AI summaries:
- ✅ Included: Documents with substantial content (>200 characters)
- ✅ Included: Documents without existing high-quality descriptions
- ❌ Excluded: Very short documents (<200 characters)
- ❌ Excluded: Very large documents (>8000 characters, cost control)
- ❌ Excluded: Documents with existing detailed descriptions
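The filtering rules above can be sketched as a simple predicate. This is a hypothetical illustration, not the actual Terraphim function; the 100-character threshold for "detailed descriptions" is an assumption, while the 200/8000 limits come from this document:

```rust
/// Decide whether a document should receive an AI summary,
/// following the inclusion/exclusion rules listed above.
fn should_summarize(content: &str, existing_description: Option<&str>) -> bool {
    let len = content.chars().count();
    // Excluded: very short (<200 chars) or very large (>8000 chars) documents.
    if len < 200 || len > 8000 {
        return false;
    }
    // Excluded: documents that already have a detailed description
    // (threshold assumed for illustration).
    if let Some(desc) = existing_description {
        if desc.chars().count() > 100 {
            return false;
        }
    }
    true
}

fn main() {
    // A 500-character document with no description qualifies.
    println!("{}", should_summarize(&"a".repeat(500), None)); // prints `true`
}
```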
Summary Generation
Each summary is:
- Length-Controlled: Target ~250 characters for search result compatibility
- Context-Aware: Includes main ideas, key points, and essential information
- Search-Optimized: Designed to help users evaluate document relevance
- Fallback-Safe: Original descriptions retained if AI generation fails
Cost Management
Controlling Costs
- Model Selection: Choose cost-effective models for high-volume use
- Content Filtering: Automatic filtering prevents summarizing unsuitable content
- Length Limits: Content truncated to 4000 characters max before API calls
- Error Handling: Graceful fallback to original descriptions on API failures
Estimated Costs
Based on OpenRouter pricing (approximate):
| Model | Cost per 1K tokens | Typical Summary Cost | 1000 Summaries |
|-------|-------------------|---------------------|-----------------|
| GPT-3.5 Turbo | $0.0015 | $0.002 | $2.00 |
| Claude 3 Haiku | $0.0005 | $0.0007 | $0.70 |
| Mixtral 8x7B | $0.0004 | $0.0005 | $0.50 |
| GPT-4 | $0.03 | $0.04 | $40.00 |
Costs vary based on content length and model pricing changes.
Development
Feature Flag Architecture
The OpenRouter integration uses conditional compilation:
// Only compiled when 'openrouter' feature is enabled
// Graceful fallback when feature is disabled
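A sketch of the conditional-compilation pattern the comments above describe. Module and function names are assumptions, not the actual Terraphim API:

```rust
// Compiled only when the 'openrouter' feature is enabled.
#[cfg(feature = "openrouter")]
mod openrouter {
    // Real implementation would call the OpenRouter API here.
    pub fn summarize(content: &str) -> Option<String> {
        Some(format!("AI summary ({} source chars)", content.len()))
    }
}

// Stub compiled when the feature is disabled: zero overhead,
// and callers fall back to the original description.
#[cfg(not(feature = "openrouter"))]
mod openrouter {
    pub fn summarize(_content: &str) -> Option<String> {
        None
    }
}

fn main() {
    // Without `--features openrouter`, this prints `None`.
    println!("{:?}", openrouter::summarize("document body"));
}
```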
Adding New Models
To add support for new OpenRouter models:
- Update the model list in `ConfigWizard.svelte`:
  `<option value="new/model-name">New Model (Description)</option>`
- Add to the recommended models in `openrouter.rs`

Testing
Run OpenRouter-specific tests:
# Run all OpenRouter tests
# Test without feature (should use stubs)
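The test commands themselves appear to have been elided; assuming standard Cargo tooling, they likely resembled:

```shell
# Run all OpenRouter tests
cargo test --features openrouter

# Test without feature (should use stubs)
cargo test
```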
Troubleshooting
Common Issues
API Key Not Working
- Verify the key starts with `sk-or-v1-`
- Check the account has sufficient credits
- Ensure the key has the necessary permissions
No Summaries Generated
- Check the role has `openrouter_enabled: true`
- Verify documents meet the content filtering criteria
- Check logs for API errors or rate limiting
High Costs
- Switch to a more cost-effective model
- Review content filtering settings
- Monitor usage through OpenRouter dashboard
Error Messages
| Error | Cause | Solution |
|-------|-------|----------|
| Feature not enabled | OpenRouter feature not compiled | Build with --features openrouter |
| API key cannot be empty | Missing API key | Configure valid OpenRouter API key |
| Rate limit exceeded | Too many requests | Wait and retry, or upgrade OpenRouter plan |
| Content too long | Document >4000 chars | Automatic truncation, no action needed |
Logging
Enable debug logging to troubleshoot:
`RUST_LOG=terraphim_service::openrouter=debug`

Logs will show:
- API requests and responses
- Content filtering decisions
- Summary generation progress
- Error details and fallbacks
Best Practices
Model Selection
- Development: Use `anthropic/claude-3-haiku` for fast, cheap testing
- Production: Use `openai/gpt-3.5-turbo` for balanced performance
- High Quality: Use `anthropic/claude-3-sonnet` for premium results
Configuration
- Test with a small document set first
- Monitor costs through OpenRouter dashboard
- Use read-only haystacks to prevent unexpected large-scale summarization
- Configure appropriate rate limiting for high-volume deployments
Performance
- AI summarization adds 200-500ms per document
- Consider enabling only for important roles
- Use knowledge graph ranking to prioritize important documents
- Cache results when possible (future enhancement)
Roadmap
Future enhancements planned:
- Caching: Cache generated summaries to reduce API calls
- Streaming: Stream responses for real-time summary generation
- Custom Prompts: User-configurable prompt templates
- Local Models: Support for local LLM deployment
- Batch Processing: Optimize multiple document summarization
- Analytics: Usage analytics and cost tracking
Security Considerations
- API keys are stored in role configuration (consider encryption for production)
- Sensitive document content is sent to OpenRouter (review data policies)
- Use environment variables for API keys in production deployments
- Consider network policies for OpenRouter API access
- Implement proper access controls for configuration management
For more information: