class Cadmium::Summarizer::TextRank
Overview
An implementation of the TextRank algorithm for summarization.

Step 1: Create a stochastic matrix for PageRank.

From the sumy source code: the element at row i and column j of the matrix is the similarity of sentences i and j, computed as the number of words they have in common divided by the sum of the logarithms of their lengths. Once this matrix is built, it is turned into a stochastic matrix by normalizing over columns, i.e. making each column sum to one. TextRank uses the PageRank algorithm with damping, so a damping factor is incorporated as explained in the TextRank paper. The resulting matrix is a stochastic matrix ready for the power method.

Source: https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf
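
The matrix construction described above can be sketched as follows. This is an illustrative Python sketch, not Cadmium's Crystal code: the function names (`similarity`, `build_matrix`, `power_method`), the damping value of 0.85, the zero diagonal, and the handling of all-zero columns are assumptions made for the example.

```python
import math


def similarity(words_a, words_b):
    """Common words divided by the sum of the logarithms of the lengths."""
    if not words_a or not words_b:
        return 0.0
    common = len(set(words_a) & set(words_b))
    denom = math.log(len(words_a)) + math.log(len(words_b))
    # Two one-word sentences give log(1) + log(1) = 0; treat as no similarity.
    return common / denom if denom > 0 else 0.0


def build_matrix(sentences, damping=0.85):
    """Build a column-stochastic, damped similarity matrix (assumed d = 0.85)."""
    n = len(sentences)
    # Raw similarities; the diagonal is left at zero (an assumption).
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    # Normalize over columns so that each column sums to one.
    for j in range(n):
        total = sum(w[i][j] for i in range(n))
        for i in range(n):
            # A sentence similar to nothing gets a uniform column (an assumption).
            w[i][j] = w[i][j] / total if total > 0 else 1.0 / n
    # Incorporate the damping factor; columns still sum to one.
    return [[(1.0 - damping) / n + damping * w[i][j] for j in range(n)]
            for i in range(n)]


def power_method(matrix, tol=1e-10, max_iter=200):
    """Iterate r <- M r from a uniform start until the L1 change is below tol."""
    n = len(matrix)
    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(matrix[i][j] * ranks[j] for j in range(n)) for i in range(n)]
        if sum(abs(a - b) for a, b in zip(new, ranks)) < tol:
            return new
        ranks = new
    return ranks


sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "cat", "ate", "the", "fish"],
    ["stocks", "fell", "sharply", "today"],
]
matrix = build_matrix(sentences)
ranks = power_method(matrix)
```

Here the first two sentences share words and so reinforce each other, while the third shares none and ends up with the lowest score. A summarizer would then keep the highest-ranked sentences.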
Included Modules
- Apatite