class HClust::UnionFind

Overview

A UnionFind is a data structure that stores a partition of a set into disjoint subsets encoded as contiguous zero-based indexes.

This is a specialized implementation of a union–find data structure for linkage, where subsets correspond to cluster labels. It supports efficient lookups and frequent unions.

Internally, it is implemented as an array of size N + N - 1 such that each element points to the enclosing cluster label of the corresponding cluster. Disjoint clusters (root) are labeled with 0. Labels of the newly created clusters follows the SciPy convention, where new labels start at N.

Defined in:

hclust/union_find.cr

Constructors

Instance Method Summary

Constructor Detail

def self.new(size : Int32) #

Creates a new UnionFind with the cluster labels in the range [0, size * size - 1).


[View source]

Instance Method Detail

def find(index : Int) : Int32 | Nil #

Returns the root cluster for the given cluster label, or nil if out of bounds.

Iteratively goes through all parent elements until a root (parent = 0) is found. To make subsequent lookups faster, the label for the given cluster and all its parents is updated with the found root element.


[View source]
def union(i : Int, j : Int) : Int32 | Nil #

Joins two clusters with the given labels and returns the label of the newly created cluster, or nil if the two clusters are already joined. Raises IndexError if either i or j is out of bounds.

If the two clusters belongs to the same parent cluster, this method does nothing.


[View source]