Terraphim Agent Session Search - Feature Specification
Version: 1.2.0
Status: Phase 3 Complete
Created: 2025-12-03
Updated: 2025-12-04
Inspired by: Coding Agent Session Search (CASS)
Executive Summary
This specification defines enhancements to terraphim-agent that enable cross-agent session search, AI-friendly CLI interfaces, and knowledge graph-enhanced session analysis. The goal is to unify coding assistant history across multiple tools while leveraging Terraphim's unique knowledge graph capabilities.
Problem Statement
Current Limitations
- Fragmented Knowledge: Developers use multiple AI coding assistants (Claude Code, Cursor, Copilot, Aider, Cline). Solutions discovered in one tool are invisible to others.
- AI Integration Barriers: The current CLI is designed for humans, not AI agents. It lacks structured output, tolerant parsing, and self-documentation.
- No Session Persistence: terraphim-agent maintains command history but has no conversation/session tracking or cross-session search.
- Limited Discoverability: Past solutions are hard to find without remembering the exact terms used.
Goals
| Goal | Description | Success Metric |
|------|-------------|----------------|
| G1 | Enable search across all AI coding assistant sessions | Search latency <100ms for 10K sessions |
| G2 | Make CLI usable by AI agents | Zero parse failures from typos |
| G3 | Self-documenting API | Complete JSON schema for all commands |
| G4 | Knowledge graph enrichment | Connect sessions via shared concepts |
| G5 | Token-aware output | Precise control over response size |
Non-Goals
- Real-time sync with cloud services (privacy-first, local only)
- Training or fine-tuning models on session data
- Replacing existing search functionality (augmenting it)
Feature Specifications
F1: Robot Mode
F1.1 Structured Output
Description: All commands support machine-readable output formats.
Formats:
- json: Pretty-printed JSON (default for robot mode)
- jsonl: Newline-delimited JSON for streaming
- table: Human-readable tables (default for interactive)
- minimal: Compact single-line JSON
Syntax:
Output Schema:
Error Schema:
F1.2 Exit Codes
| Code | Name | Description |
|------|------|-------------|
| 0 | SUCCESS | Operation completed successfully |
| 1 | ERROR_GENERAL | Unspecified error |
| 2 | ERROR_USAGE | Invalid arguments or syntax |
| 3 | ERROR_INDEX_MISSING | Required index not initialized |
| 4 | ERROR_NOT_FOUND | No results for query |
| 5 | ERROR_AUTH | Authentication required |
| 6 | ERROR_NETWORK | Network/connectivity issue |
| 7 | ERROR_TIMEOUT | Operation timed out |
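The table above maps naturally onto an enum. The following is a minimal sketch, assuming a hypothetical type name `AgentExitCode`; the specification only fixes the numeric codes and symbolic names, not how they are represented in code.

```rust
/// Exit codes from the table above. `AgentExitCode` is a hypothetical name
/// chosen for this sketch; only the numbers and names come from the spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum AgentExitCode {
    Success = 0,
    ErrorGeneral = 1,
    ErrorUsage = 2,
    ErrorIndexMissing = 3,
    ErrorNotFound = 4,
    ErrorAuth = 5,
    ErrorNetwork = 6,
    ErrorTimeout = 7,
}

impl AgentExitCode {
    /// Numeric code handed to the operating system.
    pub fn code(self) -> u8 {
        self as u8
    }

    /// Symbolic name as listed in the table.
    pub fn name(self) -> &'static str {
        match self {
            AgentExitCode::Success => "SUCCESS",
            AgentExitCode::ErrorGeneral => "ERROR_GENERAL",
            AgentExitCode::ErrorUsage => "ERROR_USAGE",
            AgentExitCode::ErrorIndexMissing => "ERROR_INDEX_MISSING",
            AgentExitCode::ErrorNotFound => "ERROR_NOT_FOUND",
            AgentExitCode::ErrorAuth => "ERROR_AUTH",
            AgentExitCode::ErrorNetwork => "ERROR_NETWORK",
            AgentExitCode::ErrorTimeout => "ERROR_TIMEOUT",
        }
    }

    /// Reverse lookup; codes outside 0..=7 yield `None`.
    pub fn from_code(code: u8) -> Option<AgentExitCode> {
        match code {
            0 => Some(AgentExitCode::Success),
            1 => Some(AgentExitCode::ErrorGeneral),
            2 => Some(AgentExitCode::ErrorUsage),
            3 => Some(AgentExitCode::ErrorIndexMissing),
            4 => Some(AgentExitCode::ErrorNotFound),
            5 => Some(AgentExitCode::ErrorAuth),
            6 => Some(AgentExitCode::ErrorNetwork),
            7 => Some(AgentExitCode::ErrorTimeout),
            _ => None,
        }
    }
}
```

A binary could end with `std::process::exit(i32::from(AgentExitCode::ErrorNotFound.code()))`, so a calling agent can branch on the status without parsing any output.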
F1.3 Token Budget Management
Description: Control output size for LLM context windows.
Parameters:
- --max-tokens <n>: Maximum tokens in response (estimated)
- --max-results <n>: Maximum number of results
- --max-content-length <n>: Truncate content fields at n characters
- --fields <mode>: Field selection mode
Field Modes:
- full: All fields including body content
- summary: title, url, description, score, concepts
- minimal: title, url, score only
- custom:field1,field2,...: Specific fields
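The sketch below shows one way the --max-content-length and --max-tokens limits could be applied. It rests on assumptions: the `Truncated` struct, its field names, and the four-characters-per-token estimate are illustrative and do not come from the specification, which only states that the token count is estimated.

```rust
/// Result of applying a character limit to one content field.
/// The field names are illustrative, not the specified output schema.
#[derive(Debug, PartialEq)]
pub struct Truncated {
    pub text: String,
    pub truncated: bool,
    pub original_chars: usize,
}

/// Truncate `content` to at most `max_chars` characters (not bytes),
/// so multi-byte UTF-8 text is never cut inside a character.
pub fn truncate_content(content: &str, max_chars: usize) -> Truncated {
    let original_chars = content.chars().count();
    if original_chars <= max_chars {
        return Truncated {
            text: content.to_string(),
            truncated: false,
            original_chars,
        };
    }
    let text: String = content.chars().take(max_chars).collect();
    Truncated {
        text,
        truncated: true,
        original_chars,
    }
}

/// Rough token estimate: about four characters per token, rounded up.
/// This is a rule of thumb, not a real tokenizer.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keep results in rank order until the estimated token budget is used up.
pub fn fit_to_budget<'a>(results: &[&'a str], max_tokens: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut kept = Vec::new();
    for &result in results {
        let cost = estimate_tokens(result);
        if used + cost > max_tokens {
            break;
        }
        used += cost;
        kept.push(result);
    }
    kept
}
```

Counting characters rather than bytes keeps the limit meaningful for non-ASCII content. Stopping at the first result that does not fit preserves rank order instead of skipping ahead to smaller results.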
Truncation Indicators:
F2: Forgiving CLI
F2.1 Typo Tolerance
Description: Auto-correct command typos using edit distance matching.
Algorithm: Jaro-Winkler similarity (existing in terraphim_automata)
Thresholds:
- Edit distance ≤ 2: Auto-correct with notification
- Edit distance 3-4: Suggest alternatives, don't auto-correct
- Edit distance > 4: Treat as unknown command
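Because the thresholds above are expressed as edit distances, the sketch below uses plain Levenshtein distance rather than the Jaro-Winkler similarity named as the algorithm. That substitution is made for illustration only, and `Resolution` and `resolve_command` are hypothetical names.

```rust
/// Levenshtein edit distance between two strings, counted in characters.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1).min(curr[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Outcome of matching user input against the known commands.
#[derive(Debug, PartialEq)]
pub enum Resolution {
    Exact(String),
    AutoCorrected(String),
    Suggestions(Vec<String>),
    Unknown,
}

/// Apply the thresholds: distance <= 2 auto-corrects, 3-4 suggests,
/// anything larger is unknown. Ties resolve to the alphabetically first command.
pub fn resolve_command(input: &str, known: &[&str]) -> Resolution {
    let mut scored: Vec<(usize, &str)> = known
        .iter()
        .map(|&cmd| (edit_distance(input, cmd), cmd))
        .collect();
    scored.sort();
    match scored.first() {
        Some(&(0, cmd)) => Resolution::Exact(cmd.to_string()),
        Some(&(d, cmd)) if d <= 2 => Resolution::AutoCorrected(cmd.to_string()),
        Some(&(d, _)) if d <= 4 => Resolution::Suggestions(
            scored
                .iter()
                .filter(|(dist, _)| *dist <= 4)
                .map(|(_, cmd)| cmd.to_string())
                .collect(),
        ),
        _ => Resolution::Unknown,
    }
}
```

A transposition such as serach for search counts as two substitutions under Levenshtein, so it lands exactly on the auto-correct threshold. Absolute thresholds are lenient for very short commands, so an implementation may want to scale them by command length.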
Behavior:
$ terraphim-agent serach "query"
⚡ Auto-corrected: serach → search
[search results...]
Robot Mode Behavior:
F2.2 Command Aliases
Built-in Aliases:
| Alias | Canonical Command |
|-------|-------------------|
| /q, /query, /find | /search |
| /h, /? | /help |
| /c | /config |
| /r | /role |
| /s | /sessions |
| /ac | /autocomplete |
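The built-in alias table can be expressed as a single lookup. This is a minimal sketch with a hypothetical function name, `canonical_command`; inputs that are not aliases pass through unchanged so canonical commands keep working.

```rust
/// Map a built-in alias to its canonical command, following the table above.
/// Anything that is not an alias is returned unchanged.
pub fn canonical_command(cmd: &str) -> &str {
    match cmd {
        "/q" | "/query" | "/find" => "/search",
        "/h" | "/?" => "/help",
        "/c" => "/config",
        "/r" => "/role",
        "/s" => "/sessions",
        "/ac" => "/autocomplete",
        other => other,
    }
}
```

Resolving aliases before any typo correction keeps short aliases such as /q from being fuzzy-matched against longer command names.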
Custom Aliases (via config):
[aliases]
ss = "sessions search"
si = "sessions import"
F2.3 Argument Flexibility
Features:
- Case-insensitive flags: --Format is treated as --format
- Flag value separators: --format=json is equivalent to --format json
- Boolean flag variations: --verbose, -v, --verbose=true
- Quoted argument handling: "multi word query" or 'multi word query'
F3: Self-Documentation API
F3.1 Capabilities Endpoint
Command: terraphim-agent robot capabilities
Output:
F3.2 Schema Documentation
Command: terraphim-agent robot schemas [command]
Output (for search):
F3.3 Examples Endpoint
Command: terraphim-agent robot examples [command]
Provides runnable examples with expected outputs.
F4: Session Search & Indexing
F4.1 Session Connectors
Supported Sources:
| Source | Format | Location |
|--------|--------|----------|
| Claude Code | JSONL | ~/.claude/ |
| Cursor | SQLite | ~/.cursor/ |
| Aider | Markdown | .aider.chat.history.md |
| Cline | JSON | ~/.cline/ |
| OpenCode | JSONL | ~/.opencode/ |
| Codex | JSONL | ~/.codex/ |
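As an illustration of how rows in this table might map onto code, here is a hypothetical trait with two implementations. The trait name, method names, and source identifiers are assumptions made for this sketch; they are not the connector interface defined by the project.

```rust
use std::path::{Path, PathBuf};

/// On-disk formats listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceFormat {
    Jsonl,
    Sqlite,
    Markdown,
    Json,
}

/// Hypothetical connector trait: one implementation per supported source.
pub trait SessionConnector {
    /// Short identifier for the source (illustrative values).
    fn source_id(&self) -> &'static str;
    /// Storage format used by the source tool.
    fn format(&self) -> SourceFormat;
    /// Default data location, resolved against the user's home directory.
    fn default_location(&self, home: &Path) -> PathBuf;
}

pub struct ClaudeCodeConnector;
pub struct CursorConnector;

impl SessionConnector for ClaudeCodeConnector {
    fn source_id(&self) -> &'static str {
        "claude-code"
    }
    fn format(&self) -> SourceFormat {
        SourceFormat::Jsonl
    }
    fn default_location(&self, home: &Path) -> PathBuf {
        home.join(".claude")
    }
}

impl SessionConnector for CursorConnector {
    fn source_id(&self) -> &'static str {
        "cursor"
    }
    fn format(&self) -> SourceFormat {
        SourceFormat::Sqlite
    }
    fn default_location(&self, home: &Path) -> PathBuf {
        home.join(".cursor")
    }
}
```

Passing the home directory in, rather than reading it from the environment inside the connector, keeps path resolution easy to test. Aider would need a different shape, since its history file is listed by file name rather than under the home directory.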
Connector Interface:
F4.2 Session Data Model
F4.3 Session Index
Technology: Tantivy (Rust full-text search, same as CASS)
Index Schema:
Tokenization:
- Edge n-gram for code patterns (handles snake_case, camelCase, symbols)
- Standard tokenizer for natural language
- Language-specific tokenizers for code
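To make the edge n-gram idea concrete, the sketch below splits identifiers into words and emits the prefixes of each word. It is a standalone illustration using only the standard library and does not use Tantivy's tokenizer API; the function names are hypothetical.

```rust
/// Split an identifier into lowercase words at non-alphanumeric characters
/// and at lower-to-upper case boundaries (snake_case, kebab-case, camelCase).
pub fn split_identifier(ident: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in ident.chars() {
        if !ch.is_alphanumeric() {
            // Separator: close the current word, if any.
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // camelCase boundary: an uppercase letter right after a lowercase one.
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower = ch.is_lowercase() || ch.is_numeric();
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}

/// Edge n-grams: every prefix of `word` from `min` to `max` characters long.
/// `min` is expected to be at least 1.
pub fn edge_ngrams(word: &str, min: usize, max: usize) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let upper = max.min(chars.len());
    (min..=upper)
        .map(|n| chars[..n].iter().collect::<String>())
        .collect()
}
```

Indexing the prefixes of each word lets a query such as tok match tokio at search time without a wildcard scan, at the cost of a larger index.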
F4.4 Session Commands
# Import sessions
# Search sessions
# Timeline and analysis
# Export
F5: Knowledge Graph Enhancement
F5.1 Session Enrichment
Process:
- On import, extract text from messages
- Run through terraphim_automata to identify concepts
- Store concept matches with sessions
- Update RoleGraph with session-concept relationships
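The steps above can be sketched end to end with standard-library types. Two simplifications are assumed: a case-insensitive substring search stands in for the terraphim_automata matcher, and a flat concept-to-sessions map stands in for RoleGraph. All function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Count how often each known concept term occurs in a session's messages.
/// Case-insensitive substring search stands in for the automaton-based matcher.
pub fn extract_concepts(messages: &[&str], concepts: &[&str]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for message in messages {
        let haystack = message.to_lowercase();
        for concept in concepts {
            let needle = concept.to_lowercase();
            if needle.is_empty() {
                continue;
            }
            let hits = haystack.matches(needle.as_str()).count();
            if hits > 0 {
                *counts.entry(concept.to_string()).or_insert(0) += hits;
            }
        }
    }
    counts
}

/// Invert per-session concept counts into a concept -> session ids map,
/// a flat stand-in for session-concept edges in a graph.
pub fn concept_index(
    sessions: &[(&str, BTreeMap<String, usize>)],
) -> BTreeMap<String, Vec<String>> {
    let mut index: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (session_id, counts) in sessions {
        for concept in counts.keys() {
            index
                .entry(concept.clone())
                .or_default()
                .push(session_id.to_string());
        }
    }
    index
}
```

With the inverted map in place, finding sessions by concept is a single lookup, and two sessions are related when they appear under the same key.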
Enrichment Data:
F5.2 Concept-Based Discovery
Commands:
# Find sessions by concept
# Find concept paths between sessions
# Cluster sessions by concept similarity
F5.3 Cross-Session Learning
Integration with Agent Evolution:
- Successful solutions become "lessons learned"
- Patterns across sessions inform future recommendations
- Concept frequency informs knowledge graph weighting
User Experience
Interactive Mode
$ terraphim-agent
🔮 Terraphim Agent v0.1.0
> /sessions search "async database"
╭──────┬────────────────────────────────┬──────────┬───────────╮
│ Rank │ Session                        │ Source   │ Date      │
├──────┼────────────────────────────────┼──────────┼───────────┤
│ 1    │ Fixing async pool exhaustion   │ claude   │ 2024-12-01│
│ 2    │ SQLx connection handling       │ cursor   │ 2024-11-28│
│ 3    │ Tokio runtime in tests         │ aider    │ 2024-11-15│
╰──────┴────────────────────────────────┴──────────┴───────────╯
Concepts matched: async, database, connection_pool, tokio
3 results in 42ms
> /sessions expand 1 --context 5
[Expands session 1 with 5 messages of context]
Robot Mode
{
}
Security & Privacy
Data Handling
- Local Only: All session data stored locally, never transmitted
- Source Paths: Configurable, defaults respect source tool conventions
- Encryption at Rest: Optional encryption for session index
- Access Control: Sessions inherit file system permissions
Sensitive Data
- API Keys: Redacted during import (regex patterns)
- Secrets: Optional secret scanning with configurable patterns
- PII: No special handling (user responsibility)
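A minimal sketch of redaction at import time follows. It assumes simple prefix matching on space-separated tokens in place of the regex patterns mentioned above, and the prefixes shown in the usage note are examples only, not a recommended pattern set.

```rust
/// Replace space-separated tokens that look like API keys with a placeholder.
/// A token is redacted when it starts with one of `prefixes` and is at least
/// `min_len` characters long. Prefix matching stands in for regex patterns.
pub fn redact_secrets(text: &str, prefixes: &[&str], min_len: usize) -> String {
    text.split(' ')
        .map(|word| {
            let looks_secret = word.chars().count() >= min_len
                && prefixes.iter().any(|p| word.starts_with(*p));
            if looks_secret {
                "[REDACTED]"
            } else {
                word
            }
        })
        .collect::<Vec<&str>>()
        .join(" ")
}
```

For example, `redact_secrets(line, &["sk-", "ghp_"], 20)` would mask long tokens beginning with those prefixes. Keys embedded in a larger token, such as KEY=sk-..., are not caught by this sketch, which is one reason configurable patterns are preferable in practice.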
Performance Requirements
| Metric | Target | Notes |
|--------|--------|-------|
| Import speed | >1000 sessions/sec | Batch processing |
| Search latency | <100ms | For 10K sessions |
| Index size | <10MB per 1K sessions | With compression |
| Memory usage | <100MB | During search |
| Startup time | <500ms | With warm index |
Compatibility
Minimum Requirements
- Rust 1.75+
- 50MB disk space (base)
- 100MB RAM
Platform Support
- Linux (primary)
- macOS
- Windows (via WSL recommended)
Integration Points
- MCP server (existing)
- HTTP API (existing)
- Unix pipes (new)
- JSON-RPC (future)
Success Criteria
Phase 1 (Robot Mode)
- [x] All commands support --format json via --robot and --format flags
- [x] Exit codes defined (OutputFormat enum)
- [ ] Token budget management working
- [x] Forgiving CLI implemented (ForgivingParser with Jaro-Winkler)
- [x] Self-documentation API (CapabilitiesDoc, CommandDoc)
Phase 2 (Session Search)
- [x] Claude Code connector (via terraphim-session-analyzer integration)
- [x] Cursor SQLite connector (via CLA CursorConnector)
- [x] Basic session commands (/sessions sources|import|list|search|stats|show)
- [x] Feature-gated architecture (terraphim_sessions crate)
Phase 3 (Knowledge Graph)
- [x] Session enrichment pipeline (SessionEnricher, feature-gated via enrichment)
- [x] Concept-based session discovery (/sessions concepts, /sessions related)
- [x] Timeline and export (/sessions timeline, /sessions export)
- [ ] Cross-session learning integration (future enhancement)