class HClust::Dendrogram

Overview

A step-wise dendrogram that encodes the arrangement of the clusters produced by hierarchical clustering as a binary tree.

A dendrogram consists of a sequence of N - 1 merge steps (see Step), where N is the number of elements or observations that were clustered, and a step corresponds to a merge between two distinct clusters.

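The labeling of the clusters follows the SciPy convention: the original observations are labeled from 0 to N - 1, and the cluster created by the i-th merge step (counting from zero) is labeled N + i.

Consequently, the labels of the newly created clusters range from N to 2N - 2.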
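The labeling convention can be illustrated with a small Python model. This is only a sketch of the SciPy convention, not the library's Crystal implementation; the function name `label_merges` and the tuple layout `(c_i, c_j, distance)` are assumptions made for the example.

```python
def label_merges(n, merges):
    """Attach the SciPy-style label of the cluster created by each merge.

    `merges` is a list of (c_i, c_j, distance) tuples (hypothetical layout).
    The k-th merge, counting from zero, creates the cluster labeled n + k.
    """
    if len(merges) > n - 1:
        raise ValueError("n observations allow at most n - 1 merge steps")
    return [(c_i, c_j, dist, n + k) for k, (c_i, c_j, dist) in enumerate(merges)]


# Four observations (labels 0..3) merged in three steps.
steps = label_merges(4, [(0, 1, 1.0), (2, 3, 2.0), (4, 5, 3.0)])
new_labels = [step[3] for step in steps]
print(new_labels)  # [4, 5, 6]
```

With N = 4, the three merges create the clusters 4, 5, and 6, and the last step merges clusters 4 and 5, which were themselves created by the first two steps.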

Defined in:

hclust/dendrogram.cr

Constructor Detail

def self.new(observations : Int32)

Creates a new Dendrogram with the given number of original elements or observations.



Instance Method Detail

def <<(step : Step) : self

Appends the given merge step. Raises ArgumentError if the dendrogram is already full (contains N - 1 steps).


def ==(rhs : self) : Bool

Returns true if this dendrogram's merge steps are equal to those of rhs, false otherwise.


def add(c_i : Int32, c_j : Int32, distance : Float64) : Step

Creates and appends a merge step between clusters c_i and c_j with the given distance.


def flatten(height : Number) : Array(Array(Int32))

Returns flat clusters of the original observations obtained by cutting the dendrogram at height (cophenetic distance).
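A cut at a given height can be modeled in a few lines of Python. This is a sketch of the idea only, not the library's code. It assumes that the steps use SciPy-style labels, that they are sorted by non-decreasing distance, and that a merge is kept when its distance is less than or equal to the height (whether the library's comparison is inclusive is not stated here). The name `flatten_at_height` is hypothetical.

```python
def flatten_at_height(n, steps, height):
    """Return the flat clusters obtained by keeping merges with distance <= height.

    `steps` is a list of (c_i, c_j, distance) tuples with SciPy-style labels,
    assumed to be sorted by non-decreasing distance.
    """
    # Active clusters keyed by label; each observation starts on its own.
    members = {i: [i] for i in range(n)}
    for k, (c_i, c_j, dist) in enumerate(steps):
        if dist > height:
            break  # every later merge is at least this far apart
        # The k-th merge replaces its two inputs with the cluster labeled n + k.
        members[n + k] = members.pop(c_i) + members.pop(c_j)
    return sorted(sorted(group) for group in members.values())


steps = [(0, 1, 1.0), (2, 3, 2.0), (4, 5, 3.0)]
print(flatten_at_height(4, steps, 1.5))  # [[0, 1], [2], [3]]
print(flatten_at_height(4, steps, 2.0))  # [[0, 1], [2, 3]]
```

Cutting below the smallest merge distance leaves every observation in its own cluster, and cutting at or above the largest one returns a single cluster containing all observations.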


def flatten(*, count : Int) : Array(Array(Int32))

Returns count or fewer flat clusters of the original observations. Raises ArgumentError if count is not positive.

It internally computes the smallest height at which cutting the dendrogram would generate count or fewer clusters, and then flattens the dendrogram at the computed height.
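The height computation described above can be sketched in Python as follows. This is a model under stated assumptions, not the library's implementation: steps use SciPy-style labels, are sorted by non-decreasing distance, and form a full dendrogram of N - 1 steps. The names `flatten_at_height` and `flatten_to_count` are hypothetical. Since each merge reduces the number of clusters by one, reaching count clusters requires N - count merges, so the smallest suitable height is the distance of the (N - count)-th step.

```python
def flatten_at_height(n, steps, height):
    """Keep merges with distance <= height (steps sorted by distance)."""
    members = {i: [i] for i in range(n)}
    for k, (c_i, c_j, dist) in enumerate(steps):
        if dist > height:
            break
        members[n + k] = members.pop(c_i) + members.pop(c_j)
    return sorted(sorted(group) for group in members.values())


def flatten_to_count(n, steps, count):
    """Return `count` or fewer flat clusters by cutting at the smallest height."""
    if count <= 0:
        raise ValueError("count must be positive")
    needed = min(n - count, len(steps))  # merges required to reach `count`
    # With no merges needed, cut below every distance.
    height = steps[needed - 1][2] if needed > 0 else float("-inf")
    return flatten_at_height(n, steps, height)


# The first two merges are tied at distance 1.0.
tied = [(0, 1, 1.0), (2, 3, 1.0), (4, 5, 3.0)]
print(flatten_to_count(4, tied, 3))  # [[0, 1], [2, 3]]
```

The example shows why the result may contain fewer clusters than requested: the cut that performs the first merge also performs the second one, because both happen at the same height.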


def observations : Int32

Returns the number of original elements or observations that were clustered.


def relabel(ordered : Bool = false) : self

Returns a new Dendrogram with relabeled clusters. If ordered is true, the dendrogram's steps are sorted by dissimilarity before relabeling.

Internally, it uses a UnionFind data structure for creating merge steps with the new cluster labels efficiently.

NOTE Cluster labels will follow the SciPy convention, where new clusters start at N with N equal to the number of observations (see UnionFind).
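The following Python sketch shows how a union-find structure can produce SciPy-style labels. It is an illustration of the technique, not the library's code. It assumes that each input step identifies its two clusters by any member observation (a common output form of some linkage algorithms); the form of the library's input steps is not stated here. The `UnionFind` class and `relabel` function below are hypothetical stand-ins.

```python
class UnionFind:
    """Union-find whose unions allocate new cluster labels starting at n."""

    def __init__(self, n):
        self.parent = list(range(2 * n - 1))  # n observations + n - 1 merges
        self.next_label = n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        label = self.next_label
        self.parent[a] = self.parent[b] = label
        self.next_label += 1
        return label


def relabel(n, steps, ordered=False):
    """Rewrite (a, b, distance) steps so clusters carry SciPy-style labels."""
    if ordered:
        steps = sorted(steps, key=lambda step: step[2])
    uf = UnionFind(n)
    relabeled = []
    for a, b, dist in steps:
        c_i, c_j = sorted((uf.find(a), uf.find(b)))  # current cluster labels
        relabeled.append((c_i, c_j, dist))
        uf.union(c_i, c_j)  # creates the next label: n, n + 1, ...
    return relabeled


raw = [(2, 3, 2.0), (0, 1, 1.0), (1, 3, 3.0)]
print(relabel(4, raw, ordered=True))  # [(0, 1, 1.0), (2, 3, 2.0), (4, 5, 3.0)]
```

Each `find` resolves an observation to the label of the cluster that currently contains it, so the last step, given as observations 1 and 3, is rewritten as a merge between clusters 4 and 5.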


def steps : Array::View(Step)

Returns a view of the merge steps.

