module Memo::RRF
Overview
Reciprocal Rank Fusion (RRF) for combining ranked search results
TODO Decide on RRF utility approach:
- Remove entirely - too specialized, apps can implement (only ~20 lines)
- Make generic - accept any type with #id and #score (but Search::Result has chunk_id)
- Keep current - but requires conversion to RRF::Item (loses metadata)
Current implementation loses Search::Result metadata (source_type, offset, etc.) when converting to RRF::Item. Apps need to re-lookup full results after merge.
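One way an app could avoid the re-lookup is to fuse the original result objects directly, keyed by chunk id. The following is a minimal sketch under assumptions, not the library's code: `Hit` is a hypothetical stand-in for Search::Result (only `chunk_id` and `source_type` come from the text above), `merge_hits` is an illustrative name, ranks are assumed to start at 1, and the `source_type` values are made up.

```crystal
# Hypothetical stand-in for Search::Result; the real type may differ.
record Hit, chunk_id : Int32, score : Float64, source_type : String

# Fuse ranked lists by chunk_id and return the original Hit objects
# (metadata intact) paired with their fused RRF score, best first.
def merge_hits(lists : Array(Array(Hit)), k : Int32 = 60) : Array(Tuple(Hit, Float64))
  fused = Hash(Int32, Float64).new(0.0)
  originals = {} of Int32 => Hit

  lists.each do |list|
    list.each_with_index do |hit, index|
      # Assumption: list order is rank order and ranks start at 1.
      fused[hit.chunk_id] += 1.0 / (k + index + 1)
      # Keep the first object seen for each chunk so its metadata survives.
      originals[hit.chunk_id] ||= hit
    end
  end

  ranked = fused.to_a.sort_by { |pair| -pair[1] }
  ranked.map { |pair| {originals[pair[0]], pair[1]} }
end

keyword = [Hit.new(1, 10.0, "page"), Hit.new(2, 8.0, "page")]
semantic = [Hit.new(2, 0.95, "page"), Hit.new(3, 0.85, "note")]

merge_hits([keyword, semantic]).each do |hit, rrf_score|
  puts "#{hit.chunk_id} #{hit.source_type} #{rrf_score}"
end
```

The trade-off is that the merge is tied to one result type; a generic version would need a way to extract the key, which is the `#id` versus `chunk_id` mismatch noted in the TODO.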
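RRF merges multiple ranked lists by scoring each item from its rank position alone. Every list an item appears in contributes 1 / (k + rank), and the contributions are summed: score = sum over lists of 1 / (k + rank)
Where k is a constant (typically 60) that dampens the advantage of top-ranked items, so that no single list dominates the merged order.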
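To make the arithmetic concrete, here is a small sketch of the per-list contribution. `rrf_contribution` is a hypothetical helper, not part of Memo::RRF, and it assumes ranks start at 1.

```crystal
# Hypothetical helper: the amount one list adds to an item's fused score,
# assuming 1-based ranks.
def rrf_contribution(rank : Int32, k : Int32 = 60) : Float64
  1.0 / (k + rank)
end

# An item ranked 1st in one list and 3rd in another sums both contributions.
fused = rrf_contribution(1) + rrf_contribution(3)
puts fused

# With k = 60, rank 1 is worth only slightly more than rank 2 (ratio 62/61).
puts rrf_contribution(1) / rrf_contribution(2)

# With k = 0, rank 1 would be worth twice as much as rank 2.
puts rrf_contribution(1, k: 0) / rrf_contribution(2, k: 0)
```

A larger k flattens the curve further, so agreement between lists counts for more than a single top placement.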
Benefits over simple score merging:
- Rank-based, not score-based (handles different scoring scales)
- No score normalization needed
- Proven effective in information retrieval (IR research)
Example:
keyword_results = [
  RRF::Item.new(id: 1, score: 10.0),
  RRF::Item.new(id: 2, score: 8.0),
]
semantic_results = [
  RRF::Item.new(id: 2, score: 0.95),
  RRF::Item.new(id: 3, score: 0.85),
]
merged = RRF.merge([keyword_results, semantic_results])
# Result: [id=2 (in both lists), id=1, id=3]
Extended Modules
Defined in:
memo/rrf.cr
Constant Summary
- DEFAULT_K = 60
Default k constant for RRF algorithm
Instance Method Summary
- #merge(lists : Array(Array(Item)), k : Int32 = DEFAULT_K) : Array(Item)
Merge multiple ranked result lists using RRF
Instance Method Detail
Merge multiple ranked result lists using RRF
Returns results sorted by RRF score (highest first)
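For orientation, a merge with this signature could be sketched as follows. This is not the library's source: `Item` and `DEFAULT_K` are redefined here as stand-ins so the snippet is self-contained, and it assumes that list order is rank order, that ranks start at 1, and that the returned items carry the fused RRF score in `score`.

```crystal
# Stand-ins for illustration only; the real Memo::RRF::Item may differ.
record Item, id : Int32, score : Float64

DEFAULT_K = 60

def merge(lists : Array(Array(Item)), k : Int32 = DEFAULT_K) : Array(Item)
  # Accumulate 1 / (k + rank) per id across all lists.
  fused = Hash(Int32, Float64).new(0.0)
  lists.each do |list|
    list.each_with_index do |item, index|
      # Assumption: position in the list is the rank, starting at 1.
      fused[item.id] += 1.0 / (k + index + 1)
    end
  end

  # Rebuild items with the fused score and sort highest first.
  items = fused.map { |id, score| Item.new(id: id, score: score) }
  items.sort_by { |item| -item.score }
end

keyword_results = [Item.new(id: 1, score: 10.0), Item.new(id: 2, score: 8.0)]
semantic_results = [Item.new(id: 2, score: 0.95), Item.new(id: 3, score: 0.85)]

p merge([keyword_results, semantic_results]).map(&.id) # => [2, 1, 3]
```

Note that the input scores are never read; only the position of each item in its list affects the result.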