Research Document: Fix terraphim-agent/terraphim-cli Search Regressions (#578, #579)

Status: Review
Author: Codex
Date: 2026-02-24
Reviewers: Maintainers of terraphim_agent, terraphim_cli, terraphim_service

Executive Summary

Issues #578 and #579 share a backend behavior: TerraphimGraph search currently returns empty result sets for common queries even when haystack search finds matches. In parallel, terraphim-agent exposes global --robot and --format flags but does not route command output through those settings for normal subcommands.

Primary finding: search results come back empty because the TerraphimGraph branch retrieves documents only through the rolegraph and has no lexical fallback when query_graph returns no hits. The output noise and format mismatch come from a direct eprintln! call and from CLI output-mode flags that are parsed but never consumed.

Essential Questions Check

| Question | Answer | Evidence |
|----------|--------|----------|
| Energizing? | Yes | Search correctness is a core user path and active production bug reports were filed today. |
| Leverages strengths? | Yes | This repo already has service, CLI, and rolegraph layers with strong test infrastructure for targeted fixes. |
| Meets real need? | Yes | Both issues block automation and machine-readable workflows (--robot, --format json, predictable search output). |

Proceed: Yes (3/3 YES)

Problem Statement

Description

Two user-visible defects:

  1. terraphim-agent search ignores the global output-mode flags (--robot, --format) and emits a TerraphimGraph status line alongside its output instead of producing only structured data (#578).
  2. terraphim-cli search can return {"count":0} for terms that are expected in haystack content, and users perceive thesaurus output as incomplete because of default limit/pagination behavior and ranking visibility gaps (#579).

Impact

  • AI agents cannot reliably parse output for automation.
  • Users receive empty search results despite indexed content.
  • Confidence in knowledge graph relevance mode drops for default roles.

Success Criteria

  • search returns relevant docs when haystack matches exist.
  • No noisy TerraphimGraph status line in machine-oriented paths.
  • terraphim-agent --robot --format ... search ... produces structured output.
  • thesaurus output communicates completeness and limits clearly.

Current State Analysis

Existing Implementation

  • terraphim-agent parses global --robot/--format in CLI struct, but dispatch paths do not pass these flags into command execution.
  • TerraphimGraph branch in service emits direct eprintln! status text.
  • TerraphimGraph retrieval calls rolegraph query methods and can return empty even when haystack layer discovered docs.
  • terraphim-cli outputs JSON wrapper correctly, but relies on service result set that may be empty after graph-only query.
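
The first bullet above (flags parsed but never passed to command execution) could be addressed by resolving the two flags into a single value once and handing that value to every renderer. The following is a minimal sketch, not the crate's actual API: the Format and OutputMode enums, the function names, and the precedence rule (an explicit --format wins, otherwise --robot implies JSON) are all assumptions made for illustration.

```rust
/// Hypothetical mirror of the `--format` flag values (the real enum may differ).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Human,
    Json,
}

/// Hypothetical resolved output mode, passed to every non-interactive command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputMode {
    Human,
    Json,
}

/// Assumed precedence: an explicit `--format` wins; otherwise `--robot` implies JSON.
pub fn resolve_output_mode(robot: bool, format: Option<Format>) -> OutputMode {
    match (format, robot) {
        (Some(Format::Json), _) => OutputMode::Json,
        (Some(Format::Human), _) => OutputMode::Human,
        (None, true) => OutputMode::Json,
        (None, false) => OutputMode::Human,
    }
}

/// Hypothetical renderer: the output shape is decided in one place,
/// so a parsed flag cannot be silently dropped on the way to the command.
pub fn render_search_summary(mode: OutputMode, count: usize) -> String {
    match mode {
        OutputMode::Human => format!("Found {count} result(s)"),
        OutputMode::Json => format!("{{\"count\":{count}}}"),
    }
}
```

With this shape, dispatch functions such as run_offline_command would take an OutputMode parameter instead of reading nothing from the CLI struct; whether the real fix uses an enum or passes the two flags separately is a design decision for Phase 2.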

Code Locations

| Component | Location | Purpose |
|-----------|----------|---------|
| Agent global output flags | crates/terraphim_agent/src/main.rs (CLI struct) | Declares robot and format flags |
| Agent dispatch | crates/terraphim_agent/src/main.rs (main, run_offline_command, run_server_command) | Executes subcommands but ignores global output mode |
| TerraphimGraph search | crates/terraphim_service/src/lib.rs (RelevanceFunction::TerraphimGraph arm) | Graph-based ranking and doc retrieval |
| Rolegraph search API | crates/terraphim_config/src/lib.rs (search_indexed_documents) | Executes query_graph / query_graph_with_operators |
| CLI surface | crates/terraphim_cli/src/main.rs + src/service.rs | JSON command output and service wiring |

Data Flow

Current search flow:

  1. CLI/agent builds SearchQuery.
  2. TerraphimService::search executes haystack search + relevance branch.
  3. TerraphimGraph branch calls search_indexed_documents (rolegraph query).
  4. Result documents are mapped from scored indexed docs.
  5. If graph query returns empty, user can receive zero results despite haystack matches.
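
Step 5 is where the false zero-result response arises. A graph-first retrieval with a lexical fallback, gated strictly on the empty-graph case, could look roughly like the sketch below. The Doc type and the select_results function are hypothetical stand-ins for the service's real types, and the optional limit is included only because the fallback's truncation policy is still an open question.

```rust
/// Hypothetical, simplified stand-in for a search result document.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Doc {
    pub id: String,
    pub rank: u64,
}

/// Graph-first retrieval with lexical fallback.
///
/// Graph hits are returned untouched whenever there are any, so existing
/// ranking semantics are preserved. Haystack hits are used only when the
/// graph query produced nothing.
pub fn select_results(
    graph_hits: Vec<Doc>,
    haystack_hits: Vec<Doc>,
    limit: Option<usize>, // assumption: truncation policy is not yet decided
) -> Vec<Doc> {
    let mut results = if graph_hits.is_empty() {
        haystack_hits
    } else {
        graph_hits
    };
    if let Some(n) = limit {
        results.truncate(n);
    }
    results
}
```

Because the fallback branch is taken only when graph_hits is empty, queries that already return graph-ranked documents behave exactly as before.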

Integration Points

  • terraphim_middleware ripgrep haystack search (rg --json ...).
  • terraphim_config::ConfigState rolegraph query.
  • terraphim_types search query and role types.

Constraints

Technical Constraints

  • Must preserve existing command signatures and compatibility.
  • Must avoid breaking REPL/TUI behavior while fixing non-interactive CLI behavior.
  • Search currently couples graph ranking and retrieval in TerraphimGraph mode.

Business Constraints

  • Bugfix scope should be minimal and merge-safe (high-confidence, test-backed).
  • No speculative redesign of ranking architecture in this fix.

Non-Functional Requirements

| Requirement | Target | Current |
|-------------|--------|---------|
| Machine-readable output for automation | Deterministic JSON when requested | Not guaranteed for terraphim-agent search |
| Search usability | Non-empty results when haystack has lexical matches | Empty results observed for kimi |
| Noise in CLI output | No unsolicited status banners in robot/json modes | eprintln!("🧠 TerraphimGraph search initiated...") present |

Vital Few (Essentialism)

Essential Constraints (Max 3)

| Constraint | Why It's Vital | Evidence |
|------------|----------------|----------|
| Preserve command compatibility | Avoid breaking existing scripts/users | Existing CLI tests and command docs |
| Ensure lexical fallback when graph query is empty | Prevent false zero-result responses | Reproduced: terraphim-cli search kimi returned count:0 with haystack scan running |
| Make output mode explicit and enforced | Required for AI automation and #578 | --robot/--format currently not applied in command output pipeline |

Eliminated from Scope

| Eliminated Item | Why Eliminated |
|-----------------|----------------|
| Rewriting full relevance/ranking architecture | Too broad for bugfix; high regression risk |
| New external indexing backend | Not required to fix observed failures |
| Multi-role aggregated search feature | Not part of issue requirements |
| Major UX redesign of all CLI outputs | Out of scope; focus on correctness and machine parsing |
| Persistence subsystem overhaul | Not necessary for targeted behavior fixes |

Dependencies

Internal Dependencies

| Dependency | Impact | Risk |
|------------|--------|------|
| terraphim_service search logic | Core behavior for both binaries | High |
| terraphim_config rolegraph query behavior | Determines when docs are considered matches | High |
| terraphim_agent CLI dispatch | Determines whether --robot/--format are applied | High |
| terraphim_cli wrappers | JSON envelope and user-visible contract | Medium |

External Dependencies

| Dependency | Version | Risk | Alternative |
|------------|---------|------|-------------|
| ripgrep command availability | Environment-provided | Low/Medium | none in current design |

Risks and Unknowns

Known Risks

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Fallback logic changes ranking semantics | Med | Med | Gate fallback to empty-graph case only |
| Output contract drift for human mode | Low | Med | Add explicit tests for human vs json/robot modes |
| Hidden callers rely on status banner | Low | Low | Move banner to debug log only |

Open Questions

  1. Should the lexical fallback return all haystack matches, or only the top N allowed by the requested limit? (Maintainers to confirm.)
  2. For --robot, should errors use a common envelope from robot/schema.rs for all commands, or only for the search path in this fix? (Maintainers to confirm.)

Assumptions Explicitly Stated

| Assumption | Basis | Risk if Wrong | Verified? |
|------------|-------|---------------|-----------|
| #578 expects machine-readable output for search when --robot/--format json is set | Issue text and examples | Could under-fix if expectation is different | Yes |
| #579 expects non-empty results when haystack has lexical match, even if graph term matching is sparse | Issue text and reproduction | Could conflict with intended strict graph-only semantics | Partially |
| TerraphimGraph status line is unintended for automation output | Behavior conflicts with issue expectations and --robot docs | Could remove desired user feedback | Partially |

Multiple Interpretations Considered

| Interpretation | Implications | Why Chosen/Rejected |
|----------------|--------------|---------------------|
| Strict graph-only search is intended, zero results are acceptable | No fallback; docs must match graph terms only | Rejected for this issue set because users expect haystack-backed search |
| Graph-first with lexical fallback when graph returns none | Preserves graph relevance while avoiding false empties | Chosen as minimal behavior-preserving fix |

Research Findings

Key Insights

  1. terraphim-agent global flags are declared but not propagated into non-interactive command execution paths.
  2. TerraphimGraph branch uses a direct eprintln! status line that pollutes output for automation.
  3. Graph query can return empty independently of haystack discovery, producing user-facing false-negative search results.
  4. terraphim-cli thesaurus already supports an explicit limit and reports a total count, but users can still mistake default-limited output for the complete thesaurus.
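
For insight 4, one way to make limited output self-describing is to report both counts and an explicit truncation signal. The sketch below assumes a hypothetical ThesaurusPage type and page function; only the field names total_count and shown_count come from this document, and the real thesaurus entries are richer than plain strings.

```rust
/// Hypothetical view over a limited slice of thesaurus terms.
#[derive(Debug, PartialEq, Eq)]
pub struct ThesaurusPage<'a> {
    /// Number of terms in the full thesaurus.
    pub total_count: usize,
    /// Number of terms actually included in this output.
    pub shown_count: usize,
    /// The terms being shown (a prefix of the full list).
    pub terms: &'a [String],
}

impl ThesaurusPage<'_> {
    /// True when the output is a strict subset of the thesaurus.
    pub fn is_truncated(&self) -> bool {
        self.shown_count < self.total_count
    }
}

/// Builds a page of at most `limit` terms without copying them.
pub fn page(all_terms: &[String], limit: usize) -> ThesaurusPage<'_> {
    let shown = limit.min(all_terms.len());
    ThesaurusPage {
        total_count: all_terms.len(),
        shown_count: shown,
        terms: &all_terms[..shown],
    }
}
```

A consumer can then check is_truncated() (or compare the two counts in JSON) instead of guessing whether the list is complete.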

Relevant Prior Art

  • Existing CLI tests in crates/terraphim_cli/tests/cli_command_tests.rs validate JSON structure and can be expanded.
  • Agent integration tests in crates/terraphim_agent/tests/comprehensive_cli_tests.rs already exercise search command pathways.

Technical Spikes Needed

| Spike | Purpose | Estimated Effort |
|-------|---------|------------------|
| None required | Bugs are localizable and reproducible now | 0 |

Recommendations

Proceed/No-Proceed

Proceed with focused bugfix design covering both issues together, since they share search/output layers and can be addressed without architectural rewrite.

Scope Recommendations

  1. Implement graph-empty lexical fallback in terraphim_service TerraphimGraph branch.
  2. Remove/redirect TerraphimGraph status banner from eprintln! to log::debug! or structured output path.
  3. Thread agent global output mode (robot, format) into command output for search and other non-interactive commands.
  4. Add/adjust tests for:
    • agent search in robot/json mode
    • TerraphimGraph fallback behavior when rolegraph query is empty but haystack docs exist
    • thesaurus limit clarity (total_count vs shown_count)
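
Item 2 names log::debug! as the preferred destination for the banner. As a std-only illustration of the same intent (no banner in machine-oriented paths), the sketch below gates the status line on a hypothetical OutputMode and writes to an injected writer so the behavior is testable. The emit_status function and the OutputMode enum are assumptions, not existing code, and the service layer may not have access to the output mode in the real design.

```rust
use std::io::{self, Write};

/// Hypothetical output mode; see the assumptions in the paragraph above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputMode {
    Human,
    Json,
}

/// Writes the status line to `diag` only in human mode.
/// Returns whether anything was written.
pub fn emit_status<W: Write>(diag: &mut W, mode: OutputMode, role: &str) -> io::Result<bool> {
    match mode {
        OutputMode::Human => {
            writeln!(diag, "TerraphimGraph search initiated for role: {role}")?;
            Ok(true)
        }
        // Machine-oriented paths stay silent so parsers see only structured data.
        OutputMode::Json => Ok(false),
    }
}
```

Injecting the writer also gives the regression tests in item 4 a direct way to assert that robot/json runs produce no banner bytes.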

Risk Mitigation Recommendations

  • Keep fallback conditional and minimal.
  • Add regression tests before refactoring.
  • Preserve existing human-readable output when no robot/json flags are set.

Next Steps

If approved:

  1. Produce Phase 2 implementation design document with file-level changes, APIs, and test matrix.
  2. Run disciplined quality evaluation on the design before implementation phase.

Appendix

Reproduction Evidence (local)

  • cargo run -q -p terraphim-cli -- search kimi --format json returned:
    • TerraphimGraph status banner
    • {"count":0,"query":"kimi","results":[],"role":"Terraphim Engineer"}
  • cargo run -q -p terraphim-agent -- search kimi emitted:
    • 🧠 TerraphimGraph search initiated for role: Terraphim Engineer
    • no structured robot/json output path in this command mode

Key References