Research Document: Underutilized Terraphim Features for Pre/Post-LLM Knowledge Graph Workflows
1. Problem Restatement and Scope
Problem Statement
Terraphim has powerful knowledge graph capabilities that are currently underutilized. Four specific features could be leveraged to create a local-first workflow that:
- Pre-LLM: Validates and enriches context before sending to LLMs
- Post-LLM: Validates domain model compliance in LLM outputs
IN Scope
- Graph connectivity (
is_all_terms_connected_by_path) for semantic coherence validation - Fuzzy autocomplete for suggesting alternatives when no exact match exists
- Role-based replacement with different thesauruses per role
- Paragraph extraction for smarter commit message handling
- New/updated skills and hooks leveraging these capabilities
- Local-first knowledge graph validation workflows
OUT of Scope
- Changes to core automata algorithms (already optimized)
- New LLM integrations (use existing OpenRouter/Ollama)
- Remote/cloud knowledge graph storage
- UI/frontend changes
2. User & Business Outcomes
For AI Coding Agents (Primary User)
| Outcome | Benefit | |---------|---------| | Pre-LLM semantic validation | Catch nonsensical queries before wasting LLM tokens | | Post-LLM domain checklist | Verify outputs use correct terminology | | Fuzzy term suggestions | Recover from typos/near-matches gracefully | | Role-aware context | Different domains get appropriate knowledge graphs |
For Developers Using Terraphim
| Outcome | Benefit | |---------|---------| | Smarter commit messages | Auto-extract relevant concepts from changed files | | Hook-based validation | Prevent commits that violate domain model | | Skill-based workflows | Reusable patterns for pre/post-LLM validation |
Business Value
- Reduced LLM API costs (filter bad queries)
- Higher quality AI outputs (domain-validated)
- Better knowledge retention (local-first graphs)
- Improved developer experience (intelligent suggestions)
3. System Elements and Dependencies
Current Feature Implementations
3.1 Graph Connectivity
| Element | Location | Status |
|---------|----------|--------|
| Core algorithm | terraphim_rolegraph/src/lib.rs:204-277 | β
Complete |
| MCP tool wrapper | terraphim_mcp_server/src/lib.rs:1027-1138 | β οΈ Placeholder (doesn't call real implementation) |
| Unit tests | terraphim_rolegraph/src/lib.rs:1226-1246 | β οΈ 1 ignored test |
| Integration tests | terraphim_mcp_server/tests/test_advanced_automata_functions.rs | β
Multiple scenarios |
| Benchmarks | terraphim_rolegraph/benches/throughput.rs:190-196 | β
Available |
| CLI exposure | None | β Missing |
Algorithm: DFS backtracking to find if single path connects all matched terms. O(n!) worst case but optimized for β€8 nodes with fast-fail isolation check.
3.2 Fuzzy Autocomplete
| Element | Location | Status |
|---------|----------|--------|
| Jaro-Winkler (default) | terraphim_automata/src/autocomplete.rs:328-412 | β
Complete |
| Levenshtein (baseline) | terraphim_automata/src/autocomplete.rs:236-321 | β
Complete |
| MCP tools | terraphim_mcp_server/src/lib.rs:471-620 | β
4 tools exposed |
| CLI exposure | None | β Missing |
| Hook integration | None | β Missing |
Performance: Jaro-Winkler is 2.3x faster than Levenshtein with better prefix weighting.
3.3 Role-Based Replacement
| Element | Location | Status |
|---------|----------|--------|
| Role configuration | terraphim_config/src/lib.rs:175-249 | β
Complete |
| KnowledgeGraph per role | terraphim_config/src/lib.rs:393-420 | β
Complete |
| RoleGraph loading | terraphim_config/src/lib.rs:865-930 | β
Complete |
| PreToolUse hook | .claude/hooks/npm_to_bun_guard.sh | β
Single role only |
| Multi-role hook support | None | β Missing |
| Role selection in replace | terraphim-agent replace | β οΈ Uses default role only |
Current Hook Flow:
PreToolUse β detect npm/yarn/pnpm β terraphim-agent replace β KG lookup β modified command3.4 Paragraph Extraction
| Element | Location | Status |
|---------|----------|--------|
| Core function | terraphim_automata/src/matcher.rs:101-125 | β
Complete |
| find_paragraph_end | terraphim_automata/src/matcher.rs:130-148 | β
Complete |
| MCP tool | terraphim_mcp_server/src/lib.rs:843-911 | β
Complete |
| CLI exposure | None | β Missing |
| Commit message integration | None | β Missing |
Dependency Graph
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Skills & Hooks Layer β
β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ β
β β pre-llm- β β post-llm- β β smart-commit β β
β β validation β β checklist β β hook β β
β ββββββββ¬ββββββββ ββββββββ¬ββββββββ ββββββββ¬ββββββββ β
βββββββββββΌββββββββββββββββββΌββββββββββββββββββΌββββββββββββββββββββββββ
β β β
βΌ βΌ βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β terraphim-agent CLI β
β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ β
β β replace β β validate β β extract β β
β β --role X β β --checklist β β --paragraphs β β
β ββββββββ¬ββββββββ ββββββββ¬ββββββββ ββββββββ¬ββββββββ β
βββββββββββΌββββββββββββββββββΌββββββββββββββββββΌββββββββββββββββββββββββ
β β β
βΌ βΌ βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Core Crate Layer β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β terraphim_service β β
β β - orchestrates config, rolegraph, automata β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β β β
β βΌ βΌ βΌ β
β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β
β βterraphim_configβ βterraphim_role- β β terraphim_ β β
β β - Role struct β βgraph β β automata β β
β β - KG loading β β - connectivityβ β - fuzzy AC β β
β β β β - query_graph β β - paragraph β β
β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββCross-Cutting Concerns
- Thesaurus format: JSON with
{id, nterm, url}structure - Aho-Corasick: LeftmostLongest matching (longer patterns win)
- Role resolution: Case-insensitive via RoleName struct
- Async boundaries: RoleGraph behind
Arc<Mutex<>>(RoleGraphSync)
4. Constraints and Their Implications
Technical Constraints
| Constraint | Why It Matters | Implication | |------------|----------------|-------------| | Graph connectivity O(n!) | Exponential for >8 matched terms | Must limit term count or use heuristics | | Hooks are shell scripts | Must pipe through terraphim-agent | Need CLI commands for all features | | MCP placeholder for connectivity | Current MCP tool doesn't call real impl | Must fix before MCP-based workflows | | Role loading at startup | ConfigState builds all RoleGraphs | Heavy startup if many roles with large KGs | | WASM compatibility | terraphim_automata targets wasm32 | Cannot use filesystem in WASM builds |
Business/UX Constraints
| Constraint | Why It Matters | Implication | |------------|----------------|-------------| | Local-first requirement | Privacy, offline capability | Cannot require network for validation | | Sub-second latency | Hooks must not slow down coding | Optimize hot paths, cache aggressively | | Backward compatibility | Existing hooks/skills must work | Additive changes only |
Security Constraints
| Constraint | Why It Matters | Implication | |------------|----------------|-------------| | Hooks run arbitrary commands | Could be exploited if input not sanitized | Validate all hook inputs | | Knowledge graphs contain URLs | Could leak sensitive paths | Sanitize KG content |
5. Risks, Unknowns, and Assumptions
UNKNOWNS
-
U1: What is the typical matched term count in real queries?
- Risk: If >8 terms common, connectivity check becomes slow
- De-risk: Add telemetry to measure in production
-
U2: Which roles need different thesauruses?
- Currently only "Terraphim Engineer" has KG
- Need to understand user role patterns
-
U3: What paragraph boundaries work for code vs docs?
- Current: blank lines only
- Code uses different conventions (function boundaries, etc.)
-
U4: MCP placeholder - why wasn't real implementation connected?
- Need to investigate technical blockers
ASSUMPTIONS
- A1: Users want pre-LLM validation to reduce costs (needs validation)
- A2: Fuzzy autocomplete threshold of 0.6 is appropriate default (based on tests)
- A3: Role-based replacement is more valuable than global replacement (needs validation)
- A4: Commit messages benefit from concept extraction (hypothesis)
- A5: Existing hook infrastructure can handle additional complexity (likely true)
RISKS
Technical Risks
| Risk | Likelihood | Impact | Mitigation | |------|------------|--------|------------| | Connectivity check too slow | Medium | High | Add term count limit, timeout | | MCP fix breaks existing tests | Low | Medium | Run full test suite before/after | | Role loading increases startup time | Medium | Medium | Lazy loading, caching | | Paragraph extraction misses code boundaries | High | Low | Add code-aware extraction mode |
Product/UX Risks
| Risk | Likelihood | Impact | Mitigation | |------|------------|--------|------------| | Pre-LLM validation too strict | Medium | High | Allow bypass, tunable thresholds | | Fuzzy suggestions irrelevant | Medium | Medium | User feedback loop, adjust similarity | | Hook complexity confuses users | Low | Medium | Clear documentation, examples |
Security Risks
| Risk | Likelihood | Impact | Mitigation | |------|------------|--------|------------| | Malicious KG injection | Low | High | Validate KG sources, sanitize | | Hook command injection | Low | High | Input validation, sandboxing |
6. Context Complexity vs. Simplicity Opportunities
Sources of Complexity
-
Multiple thesaurus loading paths
- Remote URL (automata_path)
- Local markdown (knowledge_graph_local)
- Direct JSON
- Each role can use different path
-
Async/sync boundary in RoleGraph
- RoleGraphSync wraps in Arc<Mutex<>>
- Can cause contention with many concurrent queries
-
MCP vs CLI vs Direct Rust API
- Three ways to access same functionality
- Inconsistent feature availability across interfaces
-
Hook shell script complexity
- JSON parsing with jq
- Agent discovery logic
- Error handling scattered
Simplification Opportunities
S1: Unified CLI Interface
Create consistent terraphim-agent subcommands that expose ALL features:
S2: Single Hook Entry Point
Replace multiple shell scripts with single Rust-based hook handler:
Benefits: Better error handling, type safety, testability
S3: Phased Validation Pipeline
Create composable validation stages:
Input β [Term Extraction] β [Connectivity Check] β [Fuzzy Fallback] β [Role Replacement] β OutputEach stage can be enabled/disabled, making workflows flexible.
S4: Checklist as Knowledge Graph
Model checklists as specialized KG entries:
7. Questions for Human Reviewer
Critical Questions
-
Q1: Should pre-LLM validation be blocking (reject query) or advisory (add warnings)?
- Affects UX and implementation complexity
-
Q2: What's the acceptable latency budget for hook-based validation?
- Current hooks are <100ms; adding connectivity check may exceed this
-
Q3: Should we fix the MCP connectivity placeholder before building skills on top?
- Blocking for MCP-based workflows
Design Questions
-
Q4: Should fuzzy suggestions be automatic (always try) or opt-in?
- Trade-off: convenience vs. unexpected behavior
-
Q5: How should role selection work in hooks?
- Options: config file, env var, auto-detect from project
-
Q6: What code boundary detection is needed for paragraph extraction?
- Options: language-aware (complex) vs. heuristic (simpler)
Validation Questions
-
Q7: Do you have specific use cases for post-LLM domain validation?
- Need concrete examples to design checklist format
-
Q8: Which existing skills should be updated vs. creating new ones?
- Affects scope and backward compatibility
Appendix: Current Feature Usage Summary
| Feature | Core Impl | MCP | CLI | Hooks | Tests | |---------|-----------|-----|-----|-------|-------| | Graph Connectivity | β | β οΈ Placeholder | β | β | β | | Fuzzy Autocomplete | β | β | β | β | β | | Role-Based Replacement | β | β | β οΈ Default only | β οΈ Single role | β | | Paragraph Extraction | β | β | β | β | β |
Legend: β Complete | β οΈ Partial | β Missing