Knowledge Graph Ranking Expansion Testing

Overview

The Knowledge Graph Ranking Expansion Test validates how adding synonyms to the knowledge graph dramatically improves search ranking performance. This test demonstrates the power of semantic connections in the Terraphim graph-based search system.

Test Location

File: crates/terraphim_middleware/tests/knowledge_graph_ranking_expansion_test.rs

Test Objectives

The test validates five core requirements:

  1. Knowledge Graph Construction - Build KG from docs/src/kg markdown files
  2. Node/Edge Counting - Measure graph structure before and after expansion
  3. Synonym Addition - Add new records with defined synonyms including target terms
  4. Structure Changes - Verify nodes/edges increased after expansion
  5. Ranking Validation - Confirm "terraphim-graph" rank changed using Terraphim Engineer role

Test Workflow

Phase 1: Setup Test Environment

// Create isolated temporary directory
let temp_dir = TempDir::new().expect("Failed to create temp directory");
let temp_kg_path = temp_dir.path().join("kg");

// Copy existing KG files to temp directory
// Files: terraphim-graph.md, service.md, haystack.md

Phase 2: Build Initial Knowledge Graph

let logseq_builder = Logseq::default();
let initial_thesaurus = logseq_builder
    .build("Terraphim Engineer".to_string(), &temp_kg_path)
    .await?;

let initial_rolegraph = RoleGraph::new(role_name, initial_thesaurus.clone()).await?;

Initial State Metrics:

  • Thesaurus: 10 terms extracted from 3 markdown files
  • Graph Structure: 3 nodes, 5 edges, 3 documents
  • Search Rank: "terraphim-graph" baseline rank measured

Phase 3: Add New Knowledge Graph Record

Create graph-analysis.md with comprehensive synonym set:

# Graph Analysis

## Advanced Graph Processing

Graph Analysis is a comprehensive approach to understanding complex data relationships.

synonyms:: data analysis, network analysis, graph processing, relationship mapping, connectivity analysis, terraphim-graph, graph embeddings

This concept extends the capabilities of graph-based systems by providing deeper insights.

New Synonyms Added:

  • data analysis
  • network analysis
  • graph processing
  • relationship mapping
  • connectivity analysis
  • terraphim-graph (target term)
  • graph embeddings

Phase 4: Rebuild and Measure Expansion

let expanded_thesaurus = logseq_builder
    .build("Terraphim Engineer".to_string(), &temp_kg_path)
    .await?;

let expanded_rolegraph = RoleGraph::new(role_name, expanded_thesaurus.clone()).await?;

Phase 5: Validate Changes

Growth Verification:

assert!(expanded_thesaurus_size > initial_thesaurus_size);
assert!(expanded_nodes_count > initial_nodes_count);
assert!(expanded_edges_count > initial_edges_count);
assert_ne!(expanded_rank, initial_rank);

Test Results

Dramatic Performance Improvement

| Metric | Initial | Expanded | Growth | |--------|---------|----------|--------| | Thesaurus Terms | 10 | 16 | +6 (+60%) | | Nodes | 3 | 4 | +1 (+33%) | | Edges | 5 | 8 | +3 (+60%) | | Documents | 3 | 4 | +1 (+33%) | | "terraphim-graph" Rank | 28 | 117 | +89 (+318%) |

Key Findings

  1. Synonym Power: Adding relevant synonyms creates +318% ranking improvement
  2. Graph Growth: Each new concept creates multiple semantic connections
  3. Search Enhancement: New terms become immediately discoverable
  4. Role Integration: Terraphim Engineer role properly utilizes local KG
  5. Ranking Algorithm: TerraphimGraph scoring rewards semantic richness

New Synonym Searchability

All 6 new synonyms are immediately searchable and return results:

  • βœ… "data analysis" β†’ 1 result
  • βœ… "network analysis" β†’ 1 result
  • βœ… "graph processing" β†’ 1 result
  • βœ… "relationship mapping" β†’ 1 result
  • βœ… "connectivity analysis" β†’ 1 result
  • βœ… "graph embeddings" β†’ 2 results

Technical Implementation

Key Components

  1. Logseq Builder: Extracts synonyms using synonyms:: syntax from markdown
  2. RoleGraph: Manages graph structure with precise node/edge counting
  3. Terraphim Engineer Role: Uses TerraphimGraph relevance function
  4. TempDir: Provides isolated, safe testing environment

Graph Structure Access

// Precise counting methods
let nodes_count = rolegraph.nodes_map().len();
let edges_count = rolegraph.edges_map().len();
let documents_count = rolegraph.get_all_documents().count();

Search Ranking Analysis

let results = rolegraph.query_graph("terraphim-graph", Some(0), Some(10))?;
let rank = results.first()
    .map(|(_, indexed_doc)| indexed_doc.rank)
    .unwrap_or(0);

Usage for Knowledge Graph Expansion

Strategy for Improving Search Rankings

  1. Identify Target Terms: Find important terms that should rank higher
  2. Research Synonyms: Discover related terms users might search for
  3. Create Knowledge Records: Add markdown files with synonym mappings
  4. Measure Impact: Use this test pattern to validate improvements
  5. Iterate: Refine synonym sets based on ranking improvements

Best Practices

  1. Relevant Synonyms: Use domain-appropriate terms users actually search for
  2. Semantic Connections: Connect related concepts through shared synonyms
  3. Measurement: Always measure before/after to validate improvements
  4. Testing: Use isolated environments to prevent production interference

Production Applications

Knowledge Graph Enhancement Workflow

  1. Baseline Measurement: Record current search rankings for key terms
  2. Synonym Research: Identify related terms and concepts
  3. Content Creation: Add markdown files with strategic synonym mappings
  4. Impact Validation: Measure ranking improvements using test framework
  5. Production Deployment: Apply successful synonym strategies to live KG

Performance Monitoring

Track these metrics when expanding knowledge graphs:

  • Thesaurus Growth: Number of new terms added
  • Graph Structure: Node and edge count increases
  • Search Performance: Ranking improvements for target terms
  • Discoverability: New synonym search success rates

Conclusion

The Knowledge Graph Ranking Expansion Test demonstrates that strategic synonym addition can create dramatic search ranking improvements. The test framework provides a reliable method for validating knowledge graph enhancements before production deployment.

Key Takeaway: Adding 6 relevant synonyms improved "terraphim-graph" search ranking by 318%, proving the power of semantic connections in graph-based search systems.