module Memo::RRF

Overview

Reciprocal Rank Fusion (RRF) for combining ranked search results

TODO Decide on RRF utility approach:

  1. Remove entirely - too specialized, apps can implement (only ~20 lines)
  2. Make generic - accept any type with #id and #score (but Search::Result has chunk_id)
  3. Keep current - but requires conversion to RRF::Item (loses metadata)

The current implementation drops Search::Result metadata (source_type, offset, etc.) when results are converted to RRF::Item, so apps must look up the full results again after the merge.
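The re-lookup described above might be sketched as follows. This is only an illustration: `SearchResult` and `Item` are hypothetical stand-ins for Search::Result and RRF::Item (the real field types may differ), and `to_items` and `rehydrate` are names invented for this example, not part of the module.

```crystal
# Hypothetical stand-ins; the real Search::Result and RRF::Item may differ.
record SearchResult, chunk_id : Int32, score : Float64, source_type : String, offset : Int32
record Item, id : Int32, score : Float64

# Convert full results to bare items (this is the step that drops metadata).
def to_items(results : Array(SearchResult)) : Array(Item)
  results.map { |r| Item.new(id: r.chunk_id, score: r.score) }
end

# Map merged items back to full results by chunk_id, keeping the merged order.
# Items with no matching result are skipped.
def rehydrate(merged : Array(Item), results : Array(SearchResult)) : Array(SearchResult)
  by_id = results.index_by(&.chunk_id)
  merged.compact_map { |item| by_id[item.id]? }
end
```

An app would keep the original result arrays around, pass `to_items` output to the merge, and call `rehydrate` on the merged list. Note that the rehydrated results still carry their original per-search scores, not the RRF score.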

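RRF merges multiple ranked lists by scoring each item on its rank position rather than its original score. Each list in which an item appears contributes 1 / (k + rank), and the contributions are summed: score = sum of 1 / (k + rank)

Here k is a constant (typically 60) that dampens the advantage of the top-ranked positions, so that no single list dominates the merged order.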

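Benefits over simple score merging: RRF uses only rank positions, so scores on different scales (such as the keyword and semantic scores in the example below) need no normalization before merging.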

Example:

keyword_results = [
  RRF::Item.new(id: 1, score: 10.0),
  RRF::Item.new(id: 2, score: 8.0),
]
semantic_results = [
  RRF::Item.new(id: 2, score: 0.95),
  RRF::Item.new(id: 3, score: 0.85),
]
merged = RRF.merge([keyword_results, semantic_results])
# Result: [id=2 (in both lists), id=1, id=3]
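The ordering in the example can be checked by hand. The calculation below assumes ranks start at 1 and k = 60 (DEFAULT_K); whether the implementation counts ranks from 0 or 1 is not stated here, but the resulting order is the same either way.

```latex
\begin{align*}
\text{score}(d) &= \sum_{L \,:\, d \in L} \frac{1}{k + \text{rank}_L(d)} \\
\text{score}(\text{id}=2) &= \frac{1}{60 + 2} + \frac{1}{60 + 1} \approx 0.01613 + 0.01639 = 0.03252 \\
\text{score}(\text{id}=1) &= \frac{1}{60 + 1} \approx 0.01639 \\
\text{score}(\text{id}=3) &= \frac{1}{60 + 2} \approx 0.01613
\end{align*}
```

The item with id=2 comes first because it collects a contribution from both lists, while id=1 edges out id=3 only because it held rank 1 in its list rather than rank 2.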

Defined in:

memo/rrf.cr

Constant Summary

DEFAULT_K = 60

Default k constant for RRF algorithm

Instance Method Detail

def merge(lists : Array(Array(Item)), k : Int32 = DEFAULT_K) : Array(Item)

Merge multiple ranked result lists using RRF

Returns results sorted by RRF score (highest first)
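A merge with this signature could look roughly like the sketch below. It is not the source in memo/rrf.cr: `Item` is a hypothetical stand-in for RRF::Item, `rrf_merge` is a name chosen for this example, and the sketch assumes 1-based ranks and input lists that are already sorted best-first.

```crystal
# Hypothetical stand-in for RRF::Item; the real type may use other field types.
record Item, id : Int32, score : Float64

DEFAULT_K = 60

# Sketch of an RRF merge: sum 1 / (k + rank) per id across all lists.
# Assumes each list is sorted best-first and ranks start at 1.
def rrf_merge(lists : Array(Array(Item)), k : Int32 = DEFAULT_K) : Array(Item)
  totals = Hash(Int32, Float64).new(0.0)
  lists.each do |list|
    list.each_with_index do |item, index|
      rank = index + 1
      totals[item.id] += 1.0 / (k + rank)
    end
  end
  # The returned items carry the RRF score, not the original search score.
  merged = totals.map { |id, score| Item.new(id: id, score: score) }
  merged.sort_by { |item| -item.score }
end

keyword_results = [Item.new(id: 1, score: 10.0), Item.new(id: 2, score: 8.0)]
semantic_results = [Item.new(id: 2, score: 0.95), Item.new(id: 3, score: 0.85)]
merged = rrf_merge([keyword_results, semantic_results])
puts merged.map(&.id) # prints [2, 1, 3]
```

Because only the id and the summed score survive, this sketch shows the same metadata loss noted in the overview.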
