Research Document: Next PR Selection -- Post 5-PR Merge Sprint
Status: Draft
Author: Planning Orchestrator
Date: 2026-03-30
Reviewers: Alex
Executive Summary
After merging 5 PRs (#732-#736) into upstream/main, we need to select the best next task. Analysis of 3 conflicting PRs, 8 Gitea ready issues, and 7 code improvement issues shows that Gitea #138 (Wire up read_url for remote thesaurus loading) is the strongest candidate. It is a small fix confined to one function in a single file, builds directly on our just-merged #733 URL validation fix, and unblocks the remote thesaurus loading feature, whose fetch path (read_url) is currently dead code.
Essential Questions Check
| Question | Answer | Evidence |
|----------|--------|----------|
| Energizing? | Yes | Completes unfinished work from #733; dead code in the codebase is a clear signal |
| Leverages strengths? | Yes | We just fixed the URL validation in the same file -- deep context is fresh |
| Meets real need? | Yes | Remote thesaurus loading is a documented feature that currently silently fails |
Proceed: Yes (3/3 YES)
Candidate Analysis
Category A: Conflicting PRs Requiring Rebase
PR #731 (ADF agent spawning, 10 files, +991/-31)
- Scope: Large, touches spawner, config, CLI
- Conflict risk: Medium -- depends on what changed in spawner recently
- Effort: 2-4 hours for rebase + testing
- Value: Important but not urgent; agent spawning is experimental
PR #726 (terraphim-agent fixes, 17 files, +464/-167)
- Scope: Medium-large, touches automata, config, middleware
- Conflict risk: HIGH -- directly conflicts with #733 (AutomataPath::from_remote) and #734 (LazyLock)
- Effort: 3-5 hours for rebase + conflict resolution
- Value: High (fixes real user-facing bugs) but conflict resolution is risky
PR #426 (RLM orchestration, 106 files, +12544/-200)
- Scope: Massive, experimental
- Verdict: ELIMINATED -- too large, too risky for single session
Category B: Well-Scoped Gitea Issues
#138: Wire up read_url for remote thesaurus loading
- Location: `crates/terraphim_automata/src/lib.rs` lines 346-400
- Problem: `read_url()` function exists (lines 348-378) but is `#[allow(dead_code)]`. The `Remote` match arm (lines 391-395) returns an error instead of calling `read_url()`.
- Fix: Change `Remote` arm to call `read_url(url.clone()).await?` instead of returning error
- Tests: Existing `test_load_thesaurus_from_url` test (line 449, currently `#[ignore]`) validates this
- Effort: 30 minutes
- Risk: Very low -- the function already exists and was tested before being broken
- Value: HIGH -- unblocks remote thesaurus loading, the primary use case for `AutomataPath::Remote`
- Dependency: Builds directly on #733 (URL validation fix we just merged)
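The shape of the #138 change can be sketched as a before/after pair of match arms. This is a simplified, synchronous stand-in so it runs without a network or an async runtime: the enum layout, `LoadError`, `load_before`, `load_after`, and the injected `read_url` closure are all assumptions for illustration. The real fix is the `read_url(url.clone()).await?` call described above.

```rust
/// Simplified stand-in for the crate's `AutomataPath` (hypothetical shape).
#[derive(Debug, Clone)]
pub enum AutomataPath {
    Local(String),
    Remote(String),
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    RemoteUnsupported(String),
    Fetch(String),
}

/// Before: the `Remote` arm rejects every URL, so the fetch helper is never called.
pub fn load_before(path: &AutomataPath) -> Result<String, LoadError> {
    match path {
        // Placeholder for reading a local file.
        AutomataPath::Local(p) => Ok(format!("local:{}", p)),
        AutomataPath::Remote(url) => Err(LoadError::RemoteUnsupported(url.clone())),
    }
}

/// After: the `Remote` arm delegates to the fetch helper.
/// The helper is injected as a closure here so the sketch stays offline;
/// the real code would await the existing async `read_url` instead.
pub fn load_after<F>(path: &AutomataPath, read_url: F) -> Result<String, LoadError>
where
    F: Fn(&str) -> Result<String, LoadError>,
{
    match path {
        AutomataPath::Local(p) => Ok(format!("local:{}", p)),
        AutomataPath::Remote(url) => read_url(url),
    }
}
```

Passing a stub closure to `load_after` exercises the `Remote` arm without network access. That is the same property the ignored live test cannot offer in CI.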
#143: Remove unnecessary memoization from magic_pair/magic_unpair
- Location: `crates/terraphim_rolegraph/src/lib.rs` lines 1268-1287
- Problem: `magic_pair(x,y)` and `magic_unpair(z)` use `#[memoize(CustomHasher: ahash::AHashMap)]` but the functions are pure arithmetic (2-3 operations). Memoization adds hash table overhead that exceeds computation cost.
- Fix: Remove `#[memoize]` attributes, remove `memoize = "0.5.1"` from Cargo.toml, remove `use memoize::memoize;` import
- Callers: 26 occurrences across 5 files (but all call the same 2 functions)
- Effort: 30 minutes
- Risk: Very low -- pure functions, removing cache can only make behavior more predictable
- Value: Medium -- removes unnecessary dependency, simplifies code, may improve performance for small inputs
- Bonus: Can be combined with #138 in same PR
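A minimal sketch of what the two functions might look like once `#[memoize]` is removed. The formulas are an assumption: this document does not quote the function bodies, so the sketch uses a Szudzik-style pairing as a stand-in, and `isqrt` is a hypothetical helper. The point is only that each call is a handful of integer operations with no cache to maintain.

```rust
/// Pair two integers into one (Szudzik-style; assumed formula, not quoted from the crate).
/// Note: the multiplication can overflow for inputs near 2^32 and above.
pub fn magic_pair(x: u64, y: u64) -> u64 {
    if x >= y {
        x * x + x + y
    } else {
        y * y + x
    }
}

/// Integer square root (hypothetical helper): float estimate, then exact correction.
fn isqrt(z: u64) -> u64 {
    let mut q = (z as f64).sqrt() as u64;
    // Widen to u128 so the correction steps cannot overflow.
    while (q as u128) * (q as u128) > z as u128 {
        q -= 1;
    }
    while ((q + 1) as u128) * ((q + 1) as u128) <= z as u128 {
        q += 1;
    }
    q
}

/// Inverse of `magic_pair`: recover `(x, y)` from the paired value.
pub fn magic_unpair(z: u64) -> (u64, u64) {
    let q = isqrt(z);
    let l = z - q * q;
    if l < q {
        (l, q)
    } else {
        (q, l - q)
    }
}
```

Because the signatures are unchanged, callers would not need edits. A round-trip check such as `magic_unpair(magic_pair(x, y)) == (x, y)` is a cheap regression test to run before and after the removal.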
#142: Remove redundant async wrappers in RoleGraph
- Location: `crates/terraphim_rolegraph/src/lib.rs`
- Problem: `RoleGraph::new()` (line 287) and `from_serializable()` (line 360) are `async fn` that just call sync versions. `RoleGraphSync` methods (lines 1210-1241) are legitimately async (they acquire locks).
- Fix: Could deprecate async wrappers on RoleGraph, but callers use them widely
- Effort: 1-2 hours (need to update all callers across crates)
- Risk: Medium -- changing public API signatures ripples across workspace
- Value: Low -- the wrapper is harmless (zero-cost when optimized)
- Verdict: Defer -- high effort-to-value ratio, breaking API change
#141: Improve ID generation for Concept uniqueness per KG
- Effort: 2-4 hours (design needed)
- Risk: Medium-high -- changes fundamental data model
- Verdict: Defer -- needs its own research phase
#140: Reduce cloning in RoleGraph hot paths
- Effort: 2-3 hours
- Risk: Medium -- needs profiling to identify actual hot paths
- Verdict: Defer -- needs benchmarking first
#137: Make TriggerIndex threshold and stopwords configurable
- Location: `crates/terraphim_rolegraph/src/lib.rs` lines 52-211
- Problem: Threshold is hardcoded to 0.3, stopwords hardcoded in `is_stopword()`
- Fix: Accept threshold and optional stopword set in constructor, thread through from config
- Effort: 1-2 hours
- Risk: Low -- additive change, backward compatible with defaults
- Value: Medium -- enables tuning for different domains
- Verdict: Good second-tier candidate
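One way the #137 fix could keep backward compatibility is a config struct whose defaults reproduce today's hardcoded values. The sketch below is illustrative only: `TriggerIndexConfig`, `with_config`, `passes_threshold`, the `f64` threshold type, and the built-in stopword list are hypothetical names and values. Only the 0.3 default and the `is_stopword()` name come from the issue description above.

```rust
use std::collections::HashSet;

/// Hypothetical configuration for the trigger index.
#[derive(Debug, Clone)]
pub struct TriggerIndexConfig {
    /// Minimum score for a match; 0.3 mirrors the currently hardcoded value.
    pub threshold: f64,
    /// `None` means "use the built-in stopword list".
    pub stopwords: Option<HashSet<String>>,
}

impl Default for TriggerIndexConfig {
    fn default() -> Self {
        TriggerIndexConfig { threshold: 0.3, stopwords: None }
    }
}

/// Placeholder list; the real built-in stopwords are not shown in this document.
const BUILTIN_STOPWORDS: &[&str] = &["a", "an", "and", "of", "the", "to"];

pub struct TriggerIndex {
    config: TriggerIndexConfig,
}

impl TriggerIndex {
    /// Existing call sites keep working and get the old behaviour.
    pub fn new() -> Self {
        Self::with_config(TriggerIndexConfig::default())
    }

    /// New entry point for callers that want to tune the index.
    pub fn with_config(config: TriggerIndexConfig) -> Self {
        TriggerIndex { config }
    }

    pub fn is_stopword(&self, word: &str) -> bool {
        match &self.config.stopwords {
            Some(set) => set.contains(word),
            None => BUILTIN_STOPWORDS.contains(&word),
        }
    }

    pub fn passes_threshold(&self, score: f64) -> bool {
        score >= self.config.threshold
    }
}
```

Keeping `new()` as a thin wrapper over `with_config(Default::default())` is what makes the change additive: no existing caller has to pass a config.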
#135: Clean up dead code and deprecated functions
- Effort: 1-2 hours (audit + removal)
- Risk: Low but tedious
- Verdict: Could combine with #138 (read_url IS the dead code issue)
Category C: High-Priority Gitea Issues (Too Large for One Session)
#116, #144, #45, #100, #57: All require multi-session design and implementation. ELIMINATED from consideration.
Recommendation: Combined #138 + #143
Rationale
- #138 is the natural next step after #733. We fixed `from_remote()` URL validation; now we need to actually wire up the remote loading that was broken. The fix changes one match arm to call a function that already exists.
- #143 is a clean dependency removal. The `memoize` crate adds a runtime hash map cache for two small arithmetic functions. Removing it simplifies the dependency tree and makes the code more predictable.
- Both changes are orthogonal -- they touch different crates (`terraphim_automata` vs `terraphim_rolegraph`), so no merge conflicts between them.
- Combined effort: ~1 hour. Both are well-scoped with existing test coverage.
- Combined value: HIGH. One unblocks a documented feature. The other removes unnecessary complexity.
Constraints
Technical Constraints
- #138 requires `remote-loading` feature to be enabled for testing
- #138's live test (`test_load_thesaurus_from_url`) requires network access and is `#[ignore]`
- #143's `memoize` removal needs verification that no other code in the workspace uses the `memoize` crate
Vital Few (Max 3)
| Constraint | Why Vital | Evidence |
|------------|-----------|----------|
| Must not break WASM build | terraphim_automata has WASM target; remote-loading is feature-gated | Feature gate already handles this |
| Must keep backward compatibility | magic_pair/magic_unpair are public API | Removing memoize doesn't change signatures |
| Must have test coverage | Both changes touch core functionality | Existing tests cover both paths |
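The feature-gate constraint can be illustrated with conditional compilation. This is a sketch under assumptions: only the `remote-loading` feature name comes from this document, while `fetch_thesaurus`, `load_remote`, and their bodies are hypothetical. The real gated code would perform the HTTP request. Built without the feature (as a WASM target presumably would be), only the fallback is compiled.

```rust
/// Compiled only when the `remote-loading` feature is enabled.
/// Placeholder body; the real code would perform the HTTP fetch.
#[cfg(feature = "remote-loading")]
fn fetch_thesaurus(url: &str) -> Result<String, String> {
    Ok(format!("fetched:{}", url))
}

/// Fallback compiled when the feature is off, so the crate still builds
/// on targets that cannot pull in an HTTP client.
#[cfg(not(feature = "remote-loading"))]
fn fetch_thesaurus(url: &str) -> Result<String, String> {
    Err(format!("remote-loading feature is disabled; cannot fetch {}", url))
}

/// Callers see one signature regardless of how the crate was built.
pub fn load_remote(url: &str) -> Result<String, String> {
    fetch_thesaurus(url)
}
```

With this layout, a test for the remote path would itself be gated on `#[cfg(feature = "remote-loading")]`, which matches the constraint that #138 can only be tested with the feature enabled.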
Eliminated from Scope
| Eliminated Item | Why Eliminated |
|-----------------|----------------|
| PR #426 (RLM) | 12K+ lines, experimental, too large |
| PR #726 rebase | High conflict with #733/#734, 3-5 hours |
| PR #731 rebase | Medium scope but agent spawning is not urgent |
| #142 async wrappers | Breaking API change, low value |
| #141 ID generation | Needs design phase, changes data model |
| #140 reduce cloning | Needs profiling first |
| #116/#144/#45/#100/#57 | Multi-session features, not one-session tasks |
Risks and Unknowns
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Remote thesaurus URL in test is stale | Low | Low | Test is #[ignore], can verify manually |
| memoize crate used elsewhere in workspace | Low | Low | Initial grep finds it only in terraphim_rolegraph; re-check the whole workspace before removal |
| Removing memoize regresses performance for repeated calls | Very Low | Low | Pure math is faster than hash lookup for u64 |
Assumptions
| Assumption | Basis | Risk if Wrong | Verified? |
|------------|-------|---------------|-----------|
| read_url function body is correct | It was written intentionally, just not wired up | Would need debugging | Partially (code reads correctly) |
| reqwest is available when remote-loading feature is enabled | Feature gate in Cargo.toml | Build failure | Yes (feature exists in Cargo.toml) |
| No other workspace crate depends on memoize | Grep search | Build failure | Need to verify |
Code Locations
| Component | Location | Purpose |
|-----------|----------|---------|
| load_thesaurus (remote) | crates/terraphim_automata/src/lib.rs:346-400 | Broken Remote arm |
| read_url (dead code) | crates/terraphim_automata/src/lib.rs:348-378 | HTTP fetch for thesaurus |
| magic_pair | crates/terraphim_rolegraph/src/lib.rs:1268-1271 | Memoized pairing function |
| magic_unpair | crates/terraphim_rolegraph/src/lib.rs:1282-1287 | Memoized unpairing function |
| memoize import | crates/terraphim_rolegraph/src/lib.rs:3 | Dependency import |
| memoize dep | crates/terraphim_rolegraph/Cargo.toml:23 | Crate dependency |
Next Steps
If approved, proceed to Phase 2 (Design) for the combined #138 + #143 implementation plan.