Implementation Plan: Fix Search Output/Result Regressions (#578, #579)

Status: Review Research Doc: docs/plans/issues-578-579-research-2026-02-24.md Author: Codex Date: 2026-02-24 Estimated Effort: 1-2 days

Overview

Summary

This plan fixes two linked regressions:

terraphim-agent search ignores global output mode expectations (--robot, --format) for non-interactive commands.
TerraphimGraph search can return empty results even when haystack search has matches.

Approach

Use a minimal, behavior-preserving design:

Add explicit output-mode plumbing in terraphim-agent non-interactive command paths.
Keep TerraphimGraph ranking first, but add lexical fallback when graph query returns no hits.
Remove unstructured TerraphimGraph status output from stderr in machine-oriented flows.

Scope

In Scope:

Output-mode enforcement for terraphim-agent non-interactive commands (--robot => JSON output path).
TerraphimGraph empty-result fallback in service search.
Regression and contract tests for output shape and fallback behavior.

Out of Scope:

Full scoring model redesign.
New ranking algorithms or ontology schema changes.
REPL/TUI UX redesign.

Avoid At All Cost (from 5/25 analysis):

Re-architecting search subsystem.
Introducing new external search/index dependencies.
Expanding this fix into multi-role/aggregated search.
Changing existing command names/flags.
Coupling this fix with unrelated performance refactors.

Architecture

Component Diagram

terraphim-agent CLI
  -> Command dispatch (offline/server)
     -> Output mode adapter (human/json/json-compact/robot-json)
        -> terraphim_service search
           -> TerraphimGraph branch
              -> graph query
              -> (if empty) lexical fallback from haystack docs

Data Flow

[CLI args + command]
  -> [Agent output mode resolver]
  -> [Command execution]
  -> [Service search]
  -> [Graph result?]
      yes -> [ranked docs]
      no  -> [lexical fallback docs]
  -> [Structured or human formatter]

Key Design Decisions

| Decision | Rationale | Alternatives Rejected | |----------|-----------|----------------------| | Keep graph-first semantics | Preserves existing TerraphimGraph behavior when it works | Lexical-first always | | Fallback only when graph result is empty | Minimal and safe bugfix; avoids broad relevance drift | Always merge graph+lexical | | Route --robot into non-interactive formatter pipeline | Matches user expectation and issue #578 | Keep --robot as no-op for subcommands | | Replace eprintln! status banner with logger | Avoids machine-output pollution | Keep banner and ask callers to filter |

Eliminated Options (Essentialism)

| Option Rejected | Why Rejected | Risk of Including | |-----------------|--------------|-------------------| | Full rolegraph query rewrite | Not required to fix current defect | High regression and scope creep | | Dual ranked merge of graph + lexical every request | Additional complexity with unclear ranking expectations | Hard-to-debug ordering changes | | New robot envelope for all commands in one pass | Too broad for current bugfix | Breaks existing automation contracts |

Simplicity Check

"Minimum code that solves the problem. Nothing speculative."

Simplest effective design:

Introduce a small output context in terraphim-agent command path.
Add one fallback branch in TerraphimGraph search after graph query returns empty.
Add focused tests for exactly the failing behaviors.

Senior Engineer Test: This is not overcomplicated; it is a localized behavior fix.

Nothing Speculative Checklist:

[x] No features user didn’t request
[x] No future-proof abstractions without need
[x] No speculative flexibility layers
[x] No impossible-scenario handling additions
[x] No premature optimization

5/25 Rule

Candidate Work Items Considered

Agent output mode enforcement
Service TerraphimGraph empty-result fallback
Remove noisy TerraphimGraph status print
Agent integration tests for robot/json search
Service tests for fallback behavior
CLI docs refresh
REPL behavior cleanup
Graph ranking tuning
Add new query operators
Add perf benchmark suite

Vital 5 (IN scope)

Agent output mode enforcement
TerraphimGraph empty-result fallback
Remove noisy status print
Robot/json output regression tests
Fallback regression tests

Avoid At All Cost (OUT of scope)

6-10 above are explicitly deferred.

File Changes

New Files

| File | Purpose | |------|---------| | crates/terraphim_agent/tests/robot_output_mode_tests.rs | Contract tests for --robot + --format in non-interactive commands | | crates/terraphim_service/tests/terraphim_graph_fallback_tests.rs | Verifies fallback when graph query is empty |

Modified Files

| File | Changes | |------|---------| | crates/terraphim_agent/src/main.rs | Add output-mode context and route command output through structured formatter when --robot enabled | | crates/terraphim_service/src/lib.rs | Replace TerraphimGraph status eprintln!; add lexical fallback branch after empty graph results | | crates/terraphim_cli/tests/cli_command_tests.rs | Add assertion coverage around non-empty search behavior for fallback scenario (where test fixture supports it) | | crates/terraphim_agent/tests/comprehensive_cli_tests.rs | Extend tests to check robot-mode structured output path |

Deleted Files

| File | Reason | |------|--------| | None | Not required |

API Design

Public/Module Types (agent command output)

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CommandOutputMode {
    Human,
    Json,
    JsonCompact,
}

#[derive(Debug, Clone, Copy)]
struct CommandOutputConfig {
    mode: CommandOutputMode,
    robot: bool,
}

Function Signatures

fn resolve_output_config(robot: bool, format: OutputFormat) -> CommandOutputConfig;

async fn run_offline_command(
    command: Command,
    output: CommandOutputConfig,
) -> Result<()>;

async fn run_server_command(
    command: Command,
    server_url: &str,
    output: CommandOutputConfig,
) -> Result<()>;

// terraphim_service internal helper (private)
fn fallback_documents_from_index(index: &Index, limit: Option<usize>) -> Vec<Document>;

Error Types

No new public error enums required. Existing anyhow::Result and crate error types remain.

Test Strategy

Unit Tests

| Test | Location | Purpose | |------|----------|---------| | test_resolve_output_config_robot_forces_json | crates/terraphim_agent/src/main.rs | Ensures --robot overrides to JSON output mode | | test_terraphim_graph_no_stderr_banner | crates/terraphim_service/tests/terraphim_graph_fallback_tests.rs | Ensures no direct status banner leaks from service path |

Integration Tests

| Test | Location | Purpose | |------|----------|---------| | test_search_robot_outputs_valid_json | crates/terraphim_agent/tests/robot_output_mode_tests.rs | Verifies non-interactive search emits parseable JSON with --robot | | test_search_robot_respects_format_json_compact | crates/terraphim_agent/tests/robot_output_mode_tests.rs | Verifies compact JSON behavior | | test_graph_empty_uses_lexical_fallback | crates/terraphim_service/tests/terraphim_graph_fallback_tests.rs | Ensures fallback returns docs when graph query returns empty | | test_cli_search_not_false_zero_when_haystack_matches | crates/terraphim_cli/tests/cli_command_tests.rs | Guards #579 behavior with fixture-compatible config |

Property Tests

Not required for this bugfix; deterministic behavior and command contracts are better covered by integration tests.

Implementation Steps

Step 1: Output Mode Plumbing in `terraphim-agent`

Files: crates/terraphim_agent/src/main.rs Description: Add output config resolver and pass into non-interactive command execution paths. Tests: Unit tests for mode resolution. Estimated: 3 hours

Step 2: Structured Search Output in Robot Mode

Files: crates/terraphim_agent/src/main.rs, crates/terraphim_agent/tests/robot_output_mode_tests.rs Description: Ensure search and related command outputs are JSON when robot mode is enabled. Tests: New robot output integration tests. Dependencies: Step 1 Estimated: 4 hours

Step 3: TerraphimGraph Empty-Result Fallback

Files: crates/terraphim_service/src/lib.rs, crates/terraphim_service/tests/terraphim_graph_fallback_tests.rs Description: If rolegraph query returns empty, return lexical haystack docs (bounded by limit), preserving graph-first behavior. Tests: Fallback-specific service tests. Dependencies: None Estimated: 4 hours

Step 4: Banner/Noise Removal + Regression Hardening

Files: crates/terraphim_service/src/lib.rs, crates/terraphim_agent/tests/comprehensive_cli_tests.rs, crates/terraphim_cli/tests/cli_command_tests.rs Description: Replace direct eprintln! status text with logger and add regression coverage across binaries. Tests: Updated integration test suites. Dependencies: Steps 2-3 Estimated: 3 hours

Step 5: Verification

Files: N/A (test execution) Description: Run project quality gates for touched crates:

cargo fmt
cargo clippy -p terraphim-agent -p terraphim-cli -p terraphim-service -- -D warnings
cargo test -p terraphim-agent
cargo test -p terraphim-cli
cargo test -p terraphim-service terraphim_graph_fallback Estimated: 1-2 hours

Rollback Plan

If regressions appear:

Revert output-mode changes in terraphim_agent/src/main.rs only.
Revert fallback branch in terraphim_service/src/lib.rs.
Keep new tests and mark with #[ignore] only if temporary stabilization is needed.

Migration

No database or config schema migration required.

Dependencies

New Dependencies

None.

Dependency Updates

None planned.

Performance Considerations

Expected Performance

| Metric | Target | Measurement | |--------|--------|-------------| | Search latency with graph hits | No regression | Existing search tests/log timing | | Search latency with fallback | Bounded by existing haystack result size/limit | Integration test with fixed fixture | | Memory overhead | Minimal | Code inspection + test run |

Benchmarks to Add

Not required for this patch-sized bugfix.

Open Items

| Item | Status | Owner | |------|--------|-------| | Final JSON schema for robot-mode search envelope | Resolved (use existing JSON command output contract) | Maintainers/Codex | | Whether to extend robot-force-json to every non-interactive command in same PR | In scope per user decision | Codex |

Approval

[ ] Technical review complete
[ ] Test strategy approved
[ ] Performance targets agreed
[ ] Human approval received