Aggregator decoupling sketch (Stage B, Gitea #1910)
Design sketch only -- no code changes in Stage B. Targets the two fan-out hubs the DSM flags as modularity drags. Both feed the polyrepo split: today each hub depends directly on every crate it dispatches to, so they cannot be cut into a separate repo without dragging all providers with them.
Measured intra-workspace dependencies (cargo metadata, post-Stage-A):
terraphim_middleware (normal): terraphim_types, terraphim_config, terraphim_automata, terraphim_rolegraph, terraphim_persistence, terraphim_file_search, terraphim-session-analyzer, haystack_jmap.
terraphim_service (normal): terraphim_types, terraphim_config, terraphim_automata, terraphim_rolegraph, terraphim_persistence, terraphim_middleware, terraphim_router.
1. terraphim_middleware -- haystack provider dispatch
Current coupling
crates/terraphim_middleware/src/indexer/mod.rs defines trait IndexMiddleware (the right
abstraction), but the orchestrator dispatches with a hardcoded match ServiceType { ... } over roughly
11 arms, including Ripgrep, Atomic, QueryRs, ClickUp, Mcp, Perplexity, GrepApp, AiAssistant,
Quickwit, and Jmap. Each arm constructs a concrete provider, so the match site (and Cargo.toml)
gains a hard dependency on every provider crate. Adding or removing a haystack means editing
middleware's match, its imports, and its manifest -- the textbook open/closed violation, and the
reason middleware fans in/out so widely.
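The shape of that coupling can be sketched as below. This is a minimal illustration, not the actual source: three arms stand in for the full match, the dispatch is synchronous for brevity, and the provider types (RipgrepIndexer, AtomicIndexer, JmapIndexer), the Haystack fields, and the search_haystack function are hypothetical stand-ins.

```rust
// Illustrative stand-ins only: not the real crate code. Three arms stand in
// for the full match.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceType {
    Ripgrep,
    Atomic,
    Jmap,
}

pub struct Haystack {
    pub location: String,
    pub service: ServiceType,
}

// Hypothetical provider types. In the workspace each would come from its own
// crate, which the orchestrator then has to list in its Cargo.toml.
pub struct RipgrepIndexer;
pub struct AtomicIndexer;
pub struct JmapIndexer;

impl RipgrepIndexer {
    pub fn index(&self, needle: &str, haystack: &Haystack) -> Vec<String> {
        vec![format!("ripgrep:{}:{}", haystack.location, needle)]
    }
}

impl AtomicIndexer {
    pub fn index(&self, needle: &str, haystack: &Haystack) -> Vec<String> {
        vec![format!("atomic:{}:{}", haystack.location, needle)]
    }
}

impl JmapIndexer {
    pub fn index(&self, needle: &str, haystack: &Haystack) -> Vec<String> {
        vec![format!("jmap:{}:{}", haystack.location, needle)]
    }
}

// The orchestrator names every concrete provider, so adding or removing one
// means editing this function, its imports, and the manifest.
pub fn search_haystack(needle: &str, haystack: &Haystack) -> Vec<String> {
    match haystack.service {
        ServiceType::Ripgrep => RipgrepIndexer.index(needle, haystack),
        ServiceType::Atomic => AtomicIndexer.index(needle, haystack),
        ServiceType::Jmap => JmapIndexer.index(needle, haystack),
    }
}
```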
Target seam: provider trait in haystack_core + a registry
- Move (or confirm) trait IndexMiddleware into haystack_core so providers depend on the trait, not on terraphim_middleware. The trait stays dyn-safe: async fn index(&self, needle: &str, haystack: &Haystack) -> Result<Index> behind #[async_trait].
- Each provider crate (haystack_jmap, the ripgrep/atomic/clickup/quickwit/etc. indexers) implements IndexMiddleware for its own type and exposes a constructor + a ServiceType tag.
- Replace the match with a HashMap<ServiceType, Arc<dyn IndexMiddleware>> registry built once at startup. search_haystacks looks the provider up by the haystack's service field.
- The binary (terraphim_server / clients) owns registration -- it depends on the provider crates and inserts them into the registry. terraphim_middleware then depends only on haystack_core + the domain crates (types, config, rolegraph, automata, persistence), not on any provider.
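A minimal sketch of the seam, under stated assumptions: the trait is synchronous here so the example needs no external crates (the sketch above keeps it async behind #[async_trait]), the error type is a plain String, and ProviderRegistry, build_registry, RipgrepIndexer, and the Haystack fields are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for the existing ServiceType; it must be Hash + Eq to key the map.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ServiceType {
    Ripgrep,
    Jmap,
}

pub struct Haystack {
    pub location: String,
    pub service: ServiceType,
}

pub type Index = Vec<String>;

// Would live in haystack_core. Synchronous in this sketch only.
pub trait IndexMiddleware: Send + Sync {
    fn index(&self, needle: &str, haystack: &Haystack) -> Result<Index, String>;
}

// Hypothetical registry type wrapping the HashMap described above.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: HashMap<ServiceType, Arc<dyn IndexMiddleware>>,
}

impl ProviderRegistry {
    pub fn register(&mut self, tag: ServiceType, provider: Arc<dyn IndexMiddleware>) {
        self.providers.insert(tag, provider);
    }

    pub fn get(&self, tag: ServiceType) -> Option<Arc<dyn IndexMiddleware>> {
        self.providers.get(&tag).cloned()
    }
}

// Middleware side: knows the trait and the registry, never a concrete provider.
pub fn search_haystacks(
    registry: &ProviderRegistry,
    needle: &str,
    haystacks: &[Haystack],
) -> Result<Index, String> {
    let mut merged = Index::new();
    for haystack in haystacks {
        let provider = registry
            .get(haystack.service)
            .ok_or_else(|| format!("no provider registered for {:?}", haystack.service))?;
        merged.extend(provider.index(needle, haystack)?);
    }
    Ok(merged)
}

// Provider crate side (hypothetical type).
pub struct RipgrepIndexer;

impl IndexMiddleware for RipgrepIndexer {
    fn index(&self, needle: &str, haystack: &Haystack) -> Result<Index, String> {
        Ok(vec![format!("ripgrep:{}:{}", haystack.location, needle)])
    }
}

// Composition root (the binary): the only place that names provider types.
pub fn build_registry() -> ProviderRegistry {
    let mut registry = ProviderRegistry::default();
    registry.register(ServiceType::Ripgrep, Arc::new(RipgrepIndexer));
    registry
}
```

In this sketch an unregistered ServiceType surfaces as an error rather than a compile-time failure; that is the trade the registry makes for removing the exhaustive match.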
Effect
- middleware's provider fan-out drops from ~11 to 0; providers become leaves under haystack_core.
- New haystacks register without touching middleware (open for extension, closed for modification).
- Feature-gating per provider moves to the composition root, shrinking the default build graph.
- Enables cutting terraphim_middleware and the haystack providers into separate repos cleanly.
Constraints
- search_haystacks is consumed by terraphim_service (lib.rs:1540) and build_thesaurus_from_haystack (lib.rs:11). The registry must be threaded through ConfigState (already cloned into the call) so the public search_haystacks signature is preserved or extended additively (changelog if changed).
- Keep ServiceType in terraphim_config/terraphim_types as the registry key to avoid a new enum.
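One way the first constraint could be met is sketched below, assuming the registry rides inside ConfigState behind an Arc so that callers keep passing the same arguments. The ConfigState fields, the with_providers builder, the Registry alias, EchoProvider, and the simplified synchronous search_haystacks signature are all hypothetical; the real types carry more state.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ServiceType {
    Ripgrep,
}

pub trait IndexMiddleware: Send + Sync {
    fn index(&self, needle: &str, location: &str) -> Vec<String>;
}

// Keyed by the existing ServiceType, so no new enum is introduced.
pub type Registry = HashMap<ServiceType, Arc<dyn IndexMiddleware>>;

// Hypothetical, heavily reduced stand-in for the real ConfigState.
#[derive(Clone, Default)]
pub struct ConfigState {
    // (location, service) pairs stand in for the configured haystacks.
    pub haystacks: Vec<(String, ServiceType)>,
    // Additive field: cloning ConfigState only bumps the Arc refcount.
    pub providers: Arc<Registry>,
}

impl ConfigState {
    // Builder-style setter so existing constructors keep compiling unchanged.
    pub fn with_providers(mut self, providers: Registry) -> Self {
        self.providers = Arc::new(providers);
        self
    }
}

// Parameter list unchanged: the registry arrives inside ConfigState.
// Haystacks with no registered provider are skipped in this sketch.
pub fn search_haystacks(config_state: ConfigState, needle: &str) -> Vec<String> {
    let mut hits = Vec::new();
    for (location, service) in &config_state.haystacks {
        if let Some(provider) = config_state.providers.get(service) {
            hits.extend(provider.index(needle, location));
        }
    }
    hits
}

// Hypothetical provider used only to exercise the sketch.
pub struct EchoProvider;

impl IndexMiddleware for EchoProvider {
    fn index(&self, needle: &str, location: &str) -> Vec<String> {
        vec![format!("{}:{}", location, needle)]
    }
}
```

Whether unregistered providers should be skipped or reported as an error is a design choice the sketch does not settle.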
2. terraphim_service -- capability aggregation
Current coupling
terraphim_service/src/lib.rs is the god file (~3876 lines) and aggregates search + KG/thesaurus +
document + LLM-chat behind one type, pulling middleware, router, and the four domain crates. It is
the E3 extraction target and the largest single contributor to complex_fn_count.
Target seam: capability traits, lib.rs becomes facade
- Define capability traits (search / KG-thesaurus / document / LLM-chat) -- aligns with architecture-improvement-plan Phase 2 and ADR-002.
- Move each capability's implementation into its own module/crate; lib.rs becomes a thin facade that wires capabilities and exposes the existing public API unchanged.
- The LLM-chat capability depends on terraphim_router; isolating it lets the router dependency be feature-gated and keeps it out of the search hot path.
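The facade shape might look roughly like the following. Only two of the four capabilities are shown, the methods are synchronous for brevity, and every name (SearchCapability, ChatCapability, ServiceFacade, InMemorySearch, CannedChat) is a hypothetical placeholder rather than the existing API. Chat is optional to mirror the idea that the router-backed capability can be feature-gated.

```rust
// Hypothetical capability traits; the real ones would be async and richer.
pub trait SearchCapability {
    fn search(&self, query: &str) -> Vec<String>;
}

pub trait ChatCapability {
    fn chat(&self, prompt: &str) -> String;
}

// Thin facade: holds capabilities and forwards calls, no logic of its own.
pub struct ServiceFacade {
    search: Box<dyn SearchCapability>,
    // Optional so a build without the router-backed capability still works.
    chat: Option<Box<dyn ChatCapability>>,
}

impl ServiceFacade {
    pub fn new(search: Box<dyn SearchCapability>) -> Self {
        ServiceFacade { search, chat: None }
    }

    pub fn with_chat(mut self, chat: Box<dyn ChatCapability>) -> Self {
        self.chat = Some(chat);
        self
    }

    pub fn search(&self, query: &str) -> Vec<String> {
        self.search.search(query)
    }

    // Returns None when no chat capability was wired in.
    pub fn chat(&self, prompt: &str) -> Option<String> {
        self.chat.as_ref().map(|c| c.chat(prompt))
    }
}

// Stand-in implementations, each testable without the facade.
pub struct InMemorySearch {
    pub docs: Vec<String>,
}

impl SearchCapability for InMemorySearch {
    fn search(&self, query: &str) -> Vec<String> {
        self.docs
            .iter()
            .filter(|doc| doc.contains(query))
            .cloned()
            .collect()
    }
}

pub struct CannedChat;

impl ChatCapability for CannedChat {
    fn chat(&self, prompt: &str) -> String {
        format!("echo: {}", prompt)
    }
}
```

Because the search path only touches SearchCapability, nothing from the chat side is pulled into it in this arrangement.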
Effect
- god_file_count 1 -> 0 (a Stage E0b / DoD quality_signal driver).
- Each capability becomes independently testable and the multi_agent -> service -> config edge stops being tangled through one monolith.
- Precondition for the clean terraphim-service repo extraction (E3).
Constraints
- terraphim_service carries no frozen public API itself, but it re-exports/consumes terraphim_types (frozen). Decomposition must keep the facade's public surface stable; run cargo public-api -p terraphim_service diff before/after to confirm.
Sequencing
Both sketches MUST precede their respective Stage E extractions (middleware/providers before the haystack repos; service before E3) and should land while the code is still in the single workspace, where the trait moves and registry wiring are a single reviewable change rather than a cross-repo migration.