Perplexity Integration for Terraphim AI
This document describes the integration of Perplexity AI's web search API as a haystack type in Terraphim AI, providing real-time, AI-powered web search capabilities.
Overview
The Perplexity integration allows Terraphim AI to access current information from the web through Perplexity's AI-powered search API. This enables users to get real-time insights, current events, and up-to-date information that complements local knowledge bases.
Features
- Real-time Web Search: Access current information from across the web
- AI-Powered Summaries: Get concise, relevant summaries of search results
- Citation Tracking: Automatic source attribution and verification
- Configurable Models: Support for different Perplexity models (sonar-small, medium, large)
- Domain Filtering: Restrict searches to specific domains for focused results
- Response Caching: Reduce API costs and improve performance with intelligent caching
- Graceful Degradation: System continues to work even if API is unavailable
- Multiple Search Contexts: Configure different search profiles for different use cases
Configuration
API Key Setup
First, obtain a Perplexity API key from https://perplexity.ai and set it as an environment variable:
Alternatively, you can specify the API key directly in the haystack configuration.
Basic Configuration
Here's a minimal configuration for a Perplexity haystack:
Advanced Configuration
For more control, use the full configuration options:
Configuration Parameters
Required Parameters
- service: Must be set to
"Perplexity" - location: Should be
"https://api.perplexity.ai" - API key: Either via
PERPLEXITY_API_KEYenvironment variable orapi_keyin extra_parameters
Optional Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| model | "sonar-medium-online" | Perplexity model to use |
| max_tokens | None | Maximum tokens in response |
| temperature | None | Response randomness (0.0-1.0) |
| cache_ttl_hours | 1 | How long to cache responses |
| search_domains | None | Comma-separated list of domains to search |
| search_recency | None | Time filter: "hour", "day", "week", "month" |
Available Models
Perplexity offers several models optimized for different use cases:
- sonar-small-online: Fastest, most cost-effective for simple queries
- sonar-medium-online: Balanced performance and quality (recommended)
- sonar-large-online: Highest quality responses for complex queries
Example Configurations
Research Scientist Profile
News Analyst Profile
General Research Profile
Usage Examples
Running a Search
Once configured, searches work through the standard Terraphim interface:
# Using the server API
# Or through the desktop interface
# Just type your query and the Perplexity results will be includedTesting the Integration
Run the test suite to verify your configuration:
# Run basic configuration tests
# Run live API test (requires PERPLEXITY_API_KEY)
Caching and Performance
Response Caching
The integration includes intelligent caching to:
- Reduce API costs by avoiding duplicate requests
- Improve response times for repeated queries
- Maintain performance during high-usage periods
Cache behavior:
- Default TTL: 1 hour (configurable)
- Cache Key: Based on normalized query text
- Storage: Uses Terraphim's persistence layer
- Invalidation: Time-based expiration
Performance Optimization Tips
-
Choose the right model:
- Use
sonar-small-onlinefor simple factual queries - Use
sonar-medium-onlinefor balanced performance - Use
sonar-large-onlineonly for complex research tasks
- Use
-
Configure caching appropriately:
- Longer cache times for stable information
- Shorter cache times for rapidly changing topics
-
Use domain filtering:
- Speeds up searches by focusing on relevant sources
- Improves result quality for specialized topics
-
Set appropriate token limits:
- Higher limits for detailed research
- Lower limits for quick facts and summaries
Error Handling and Resilience
The integration includes robust error handling:
Graceful Degradation
- System continues working if Perplexity API is unavailable
- Returns empty results rather than failing entire searches
- Logs warnings for debugging without breaking user experience
Common Error Scenarios
- Missing API Key: Returns empty results with warning log
- API Rate Limits: Respects limits and caches responses
- Network Issues: Timeout handling with configurable retries
- Invalid Configuration: Clear error messages for debugging
Monitoring and Debugging
Enable debug logging to monitor API usage:
LOG_LEVEL=debug This will show:
- API request/response times
- Cache hit/miss ratios
- Token usage statistics
- Error details for debugging
Best Practices
API Usage
- Set reasonable token limits to control costs
- Use caching effectively to reduce redundant requests
- Choose appropriate models based on query complexity
- Monitor usage to stay within API limits
Configuration
- Environment variables for sensitive data (API keys)
- Domain filtering for focused, relevant results
- Appropriate recency filters based on content type
- Role-based configurations for different user needs
Integration
- Combine with local haystacks for comprehensive coverage
- Use different cache TTLs based on information freshness needs
- Configure multiple Perplexity profiles for different use cases
- Test configurations before deploying to users
Troubleshooting
Common Issues
No results returned
- Check API key is set correctly
- Verify network connectivity
- Check Perplexity API status
- Review log messages for specific errors
Slow response times
- Try a smaller/faster model
- Reduce max_tokens setting
- Check cache hit rates
- Consider domain filtering
High API costs
- Increase cache TTL
- Reduce max_tokens
- Use smaller models where appropriate
- Monitor usage patterns
Configuration errors
- Validate JSON syntax
- Check parameter names and types
- Verify API key format
- Test with minimal configuration first
Debug Commands
# Test basic configuration
# Test live API (requires API key)
PERPLEXITY_API_KEY=your-key
# Check server logs
LOG_LEVEL=debug API Reference
Supported Perplexity API Features
- Chat Completions: Primary interface for search queries
- Online Models: Real-time web search capabilities
- Domain Filtering: Restrict search to specific websites
- Recency Filtering: Time-based result filtering
- Citation Tracking: Automatic source attribution
Response Format
Perplexity responses are converted to Terraphim Documents with:
- Title: Generated from query with "[Perplexity]" prefix
- Body: AI-generated summary with citations
- Description: Context about the search
- URL: Custom perplexity:// scheme with encoded query
- Tags: ["perplexity", "ai-search", "web-search", "real-time"]
- Rank: High ranking (1000) for prioritization
- Sources: Separate documents for each citation (optional)
Security Considerations
API Key Management
- Store API keys as environment variables
- Never commit API keys to version control
- Rotate keys regularly
- Monitor usage for unauthorized access
Data Privacy
- Search queries are sent to Perplexity's servers
- Responses are cached locally
- Consider data sensitivity when using external APIs
- Review Perplexity's privacy policy for compliance
Future Enhancements
Potential improvements for the integration:
- Streaming Responses: Real-time result streaming
- Advanced Caching: Smart cache invalidation based on content type
- Usage Analytics: Detailed API usage and cost tracking
- Custom Prompts: Configurable system prompts for different use cases
- Multi-language Support: International search capabilities
- Result Ranking: Integration with Terraphim's relevance scoring
Contributing
To contribute improvements to the Perplexity integration:
- Follow the existing code patterns in
crates/terraphim_middleware/src/haystack/perplexity.rs - Add tests for new functionality in
tests/perplexity_haystack_test.rs - Update this documentation for any new features
- Ensure all tests pass:
cargo test -p terraphim_middleware - Submit a pull request with clear description of changes
Support
For issues specific to the Perplexity integration:
- Check this documentation for configuration guidance
- Run the test suite to verify setup
- Review logs with debug level enabled
- Check Perplexity API documentation for service updates
- Open an issue in the Terraphim AI repository
This integration enables Terraphim AI to access real-time web information through Perplexity's advanced AI search capabilities, making it a powerful tool for research, current events analysis, and staying up-to-date with rapidly changing information.