Perplexity Integration for Terraphim AI

This document describes the integration of Perplexity AI's web search API as a haystack type in Terraphim AI, providing real-time, AI-powered web search capabilities.

Overview

The Perplexity integration allows Terraphim AI to access current information from the web through Perplexity's AI-powered search API. This enables users to get real-time insights, current events, and up-to-date information that complements local knowledge bases.

Features

Real-time Web Search: Access current information from across the web
AI-Powered Summaries: Get concise, relevant summaries of search results
Citation Tracking: Automatic source attribution and verification
Configurable Models: Support for different Perplexity models (sonar-small, medium, large)
Domain Filtering: Restrict searches to specific domains for focused results
Response Caching: Reduce API costs and improve performance with intelligent caching
Graceful Degradation: System continues to work even if API is unavailable
Multiple Search Contexts: Configure different search profiles for different use cases

Configuration

API Key Setup

First, obtain a Perplexity API key from https://perplexity.ai and set it as an environment variable:

export PERPLEXITY_API_KEY="your-api-key-here"

Alternatively, you can specify the API key directly in the haystack configuration.

Basic Configuration

Here's a minimal configuration for a Perplexity haystack:

{
  "haystacks": [
    {
      "location": "https://api.perplexity.ai",
      "service": "Perplexity",
      "read_only": true,
      "extra_parameters": {
        "model": "sonar-medium-online"
      }
    }
  ]
}

Advanced Configuration

For more control, use the full configuration options:

{
  "haystacks": [
    {
      "location": "https://api.perplexity.ai",
      "service": "Perplexity",
      "read_only": true,
      "extra_parameters": {
        "api_key": "your-api-key",
        "model": "sonar-large-online",
        "max_tokens": "2000",
        "temperature": "0.1",
        "cache_ttl_hours": "2",
        "search_domains": "arxiv.org,github.com,stackoverflow.com",
        "search_recency": "week"
      }
    }
  ]
}

Configuration Parameters

Required Parameters

service: Must be set to "Perplexity"
location: Should be "https://api.perplexity.ai"
API key: Either via PERPLEXITY_API_KEY environment variable or api_key in extra_parameters

Optional Parameters

| Parameter | Default | Description | |-----------|---------|-------------| | model | "sonar-medium-online" | Perplexity model to use | | max_tokens | None | Maximum tokens in response | | temperature | None | Response randomness (0.0-1.0) | | cache_ttl_hours | 1 | How long to cache responses | | search_domains | None | Comma-separated list of domains to search | | search_recency | None | Time filter: "hour", "day", "week", "month" |

Available Models

Perplexity offers several models optimized for different use cases:

sonar-small-online: Fastest, most cost-effective for simple queries
sonar-medium-online: Balanced performance and quality (recommended)
sonar-large-online: Highest quality responses for complex queries

Example Configurations

Research Scientist Profile

{
  "name": "AI Research Scientist",
  "haystacks": [
    {
      "location": "https://api.perplexity.ai",
      "service": "Perplexity",
      "read_only": true,
      "extra_parameters": {
        "model": "sonar-large-online",
        "max_tokens": "2000",
        "temperature": "0.1",
        "search_domains": "arxiv.org,huggingface.co,papers.nips.cc",
        "search_recency": "month",
        "cache_ttl_hours": "4"
      }
    }
  ]
}

News Analyst Profile

{
  "name": "News Analyst",
  "haystacks": [
    {
      "location": "https://api.perplexity.ai",
      "service": "Perplexity",
      "read_only": true,
      "extra_parameters": {
        "model": "sonar-medium-online",
        "max_tokens": "1500",
        "temperature": "0.3",
        "search_domains": "reuters.com,bbc.com,cnn.com,npr.org",
        "search_recency": "day",
        "cache_ttl_hours": "0.5"
      }
    }
  ]
}

General Research Profile

{
  "name": "General Researcher",
  "haystacks": [
    {
      "location": "https://api.perplexity.ai",
      "service": "Perplexity",
      "read_only": true,
      "extra_parameters": {
        "model": "sonar-medium-online",
        "max_tokens": "1000",
        "temperature": "0.2",
        "cache_ttl_hours": "2"
      }
    }
  ]
}

Usage Examples

Running a Search

Once configured, searches work through the standard Terraphim interface:

# Using the server API
curl -X POST "http://localhost:8080/documents/search" \
  -H "Content-Type: application/json" \
  -d '{"search_term": "latest developments in AI safety", "limit": 5}'

# Or through the desktop interface
# Just type your query and the Perplexity results will be included

Testing the Integration

Run the test suite to verify your configuration:

# Run basic configuration tests
cargo test -p terraphim_middleware test_perplexity

# Run live API test (requires PERPLEXITY_API_KEY)
cargo test -p terraphim_middleware perplexity_live_api_test -- --ignored

Caching and Performance

Response Caching

The integration includes intelligent caching to:

Reduce API costs by avoiding duplicate requests
Improve response times for repeated queries
Maintain performance during high-usage periods

Cache behavior:

Default TTL: 1 hour (configurable)
Cache Key: Based on normalized query text
Storage: Uses Terraphim's persistence layer
Invalidation: Time-based expiration

Performance Optimization Tips

Choose the right model:
- Use sonar-small-online for simple factual queries
- Use sonar-medium-online for balanced performance
- Use sonar-large-online only for complex research tasks
Configure caching appropriately:
- Longer cache times for stable information
- Shorter cache times for rapidly changing topics
Use domain filtering:
- Speeds up searches by focusing on relevant sources
- Improves result quality for specialized topics
Set appropriate token limits:
- Higher limits for detailed research
- Lower limits for quick facts and summaries

Error Handling and Resilience

The integration includes robust error handling:

Graceful Degradation

System continues working if Perplexity API is unavailable
Returns empty results rather than failing entire searches
Logs warnings for debugging without breaking user experience

Common Error Scenarios

Missing API Key: Returns empty results with warning log
API Rate Limits: Respects limits and caches responses
Network Issues: Timeout handling with configurable retries
Invalid Configuration: Clear error messages for debugging

Monitoring and Debugging

Enable debug logging to monitor API usage:

LOG_LEVEL=debug cargo run

This will show:

API request/response times
Cache hit/miss ratios
Token usage statistics
Error details for debugging

Best Practices

API Usage

Set reasonable token limits to control costs
Use caching effectively to reduce redundant requests
Choose appropriate models based on query complexity
Monitor usage to stay within API limits

Configuration

Environment variables for sensitive data (API keys)
Domain filtering for focused, relevant results
Appropriate recency filters based on content type
Role-based configurations for different user needs

Integration

Combine with local haystacks for comprehensive coverage
Use different cache TTLs based on information freshness needs
Configure multiple Perplexity profiles for different use cases
Test configurations before deploying to users

Troubleshooting

Common Issues

No results returned

Check API key is set correctly
Verify network connectivity
Check Perplexity API status
Review log messages for specific errors

Slow response times

Try a smaller/faster model
Reduce max_tokens setting
Check cache hit rates
Consider domain filtering

High API costs

Increase cache TTL
Reduce max_tokens
Use smaller models where appropriate
Monitor usage patterns

Configuration errors

Validate JSON syntax
Check parameter names and types
Verify API key format
Test with minimal configuration first

Debug Commands

# Test basic configuration
cargo test -p terraphim_middleware test_perplexity_config_parsing

# Test live API (requires API key)
PERPLEXITY_API_KEY=your-key cargo test -p terraphim_middleware perplexity_live_api_test -- --ignored

# Check server logs
LOG_LEVEL=debug cargo run -- --config perplexity_researcher_config.json

API Reference

Supported Perplexity API Features

Chat Completions: Primary interface for search queries
Online Models: Real-time web search capabilities
Domain Filtering: Restrict search to specific websites
Recency Filtering: Time-based result filtering
Citation Tracking: Automatic source attribution

Response Format

Perplexity responses are converted to Terraphim Documents with:

Title: Generated from query with "[Perplexity]" prefix
Body: AI-generated summary with citations
Description: Context about the search
URL: Custom perplexity:// scheme with encoded query
Tags: ["perplexity", "ai-search", "web-search", "real-time"]
Rank: High ranking (1000) for prioritization
Sources: Separate documents for each citation (optional)

Security Considerations

API Key Management

Store API keys as environment variables
Never commit API keys to version control
Rotate keys regularly
Monitor usage for unauthorized access

Data Privacy

Search queries are sent to Perplexity's servers
Responses are cached locally
Consider data sensitivity when using external APIs
Review Perplexity's privacy policy for compliance

Future Enhancements

Potential improvements for the integration:

Streaming Responses: Real-time result streaming
Advanced Caching: Smart cache invalidation based on content type
Usage Analytics: Detailed API usage and cost tracking
Custom Prompts: Configurable system prompts for different use cases
Multi-language Support: International search capabilities
Result Ranking: Integration with Terraphim's relevance scoring

Contributing

To contribute improvements to the Perplexity integration:

Follow the existing code patterns in crates/terraphim_middleware/src/haystack/perplexity.rs
Add tests for new functionality in tests/perplexity_haystack_test.rs
Update this documentation for any new features
Ensure all tests pass: cargo test -p terraphim_middleware
Submit a pull request with clear description of changes

Support

For issues specific to the Perplexity integration:

Check this documentation for configuration guidance
Run the test suite to verify setup
Review logs with debug level enabled
Check Perplexity API documentation for service updates
Open an issue in the Terraphim AI repository

This integration enables Terraphim AI to access real-time web information through Perplexity's advanced AI search capabilities, making it a powerful tool for research, current events analysis, and staying up-to-date with rapidly changing information.