Top Level Namespace
Defined in:
Constant Summary
-
CACHE_TTL = 60 * 60
Method Summary
-
append_section(sections : Array(NamedTuple(number: String, title: String, content: String)), toc : NamedTuple(id: String, tocnumber: String, toctext: String), parser : Myhtml::Parser, article : String)
Pushes a section of content onto the sections array.
-
breakpoint?(node : Myhtml::Node)
Returns true if it looks like the current section has ended or the next section has been reached.
-
get_article_contents(url : String | Nil = nil, type : ContentType | Nil = ContentType::ParsedHTML)
Returns the article contents as an Array of NamedTuples from the article.
-
get_article_contents_cache(url : String | Nil = nil, type : ContentType | Nil = ContentType::ParsedHTML, cache : Redis::PooledClient | Nil = nil)
Returns article contents from cache first if available.
-
get_article_name_from_parser(parser : Myhtml::Parser)
Returns the article name as a String from the HTML parser.
-
get_article_name_from_url(url : String)
Returns the article name as a String from the url.
-
get_base_url(url : String)
Returns the base url from the passed url.
-
get_query_parameters(params)
Returns the query parameter values.
-
ignore?(node : Myhtml::Node)
Returns true if the node should be ignored for text appending.
-
is_mediawiki?(parser : Myhtml::Parser)
Returns true if the body tag of the HTML contains the mediawiki class.
-
math_element?(node : Myhtml::Node | Nil)
Returns the math element if found in the node, else nil.
-
remove_references(inner_text : String)
Removes all the Wikipedia source references within the text.
-
replace_whitespaces(inner_text : String)
Replaces some of the whitespace with a single space.
-
skip_section?(toctext : String)
Returns true if the string matches a section to be skipped, else false.
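As an illustration of the two URL helpers listed above (get_base_url and get_article_name_from_url), the following is a minimal sketch rather than the documented implementation. It assumes article URLs of the form ".../wiki/<Article_Name>"; the method names base_url_sketch and article_name_from_url_sketch are hypothetical and only the standard library URI module is used.

```crystal
require "uri"

# Hypothetical sketch: scheme plus host of the passed URL.
# The real get_base_url may keep or drop other components.
def base_url_sketch(url : String) : String
  uri = URI.parse(url)
  "#{uri.scheme}://#{uri.host}"
end

# Hypothetical sketch: last path segment, percent-decoded.
# Assumes the article name is the final segment of the path.
def article_name_from_url_sketch(url : String) : String
  segment = URI.parse(url).path.split('/').last? || ""
  URI.decode(segment)
end

url = "https://en.wikipedia.org/wiki/Ruby_on_Rails?action=raw"
puts base_url_sketch(url)
puts article_name_from_url_sketch(url)
```

The query string is not part of URI#path, so it does not leak into the article name in this sketch.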
Method Detail
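append_section(sections : Array(NamedTuple(number: String, title: String, content: String)), toc : NamedTuple(id: String, tocnumber: String, toctext: String), parser : Myhtml::Parser, article : String)
Pushes a section of content onto the sections array.
Helper function for get_article_contents. Jumps to a section noted from the table of contents to begin parsing.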
breakpoint?(node : Myhtml::Node)
Returns true if it looks like the current section has ended or the next section has been reached.
get_article_contents(url : String | Nil = nil, type : ContentType | Nil = ContentType::ParsedHTML)
Returns the article contents as an Array of NamedTuples from the article.
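get_article_contents_cache(url : String | Nil = nil, type : ContentType | Nil = ContentType::ParsedHTML, cache : Redis::PooledClient | Nil = nil)
Returns article contents from cache first if available.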
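The cache-first behaviour described for get_article_contents_cache can be sketched as follows. This is an assumption-laden illustration, not the documented implementation: MemoryCache is a hypothetical in-memory stand-in for the Redis client, fetch_cached and CACHE_TTL_SKETCH are hypothetical names, and the sketch assumes CACHE_TTL is expressed in seconds (60 * 60, that is, one hour).

```crystal
# Assumed to be seconds, mirroring CACHE_TTL = 60 * 60.
CACHE_TTL_SKETCH = 60 * 60

# Hypothetical in-memory stand-in for a Redis client.
class MemoryCache
  def initialize
    @store = Hash(String, Tuple(String, Time)).new
  end

  # Returns the cached value, or nil when missing or expired.
  def get(key : String) : String?
    entry = @store[key]?
    return nil unless entry
    value, expires_at = entry
    if Time.utc >= expires_at
      @store.delete(key)
      return nil
    end
    value
  end

  # Stores a value that expires ttl seconds from now.
  def set(key : String, value : String, ttl : Int32) : Nil
    @store[key] = {value, Time.utc + ttl.seconds}
  end
end

# Cache-first lookup: return the cached value when present; otherwise
# run the block, store its result under the key, and return it.
def fetch_cached(cache : MemoryCache, key : String, ttl : Int32 = CACHE_TTL_SKETCH) : String
  if cached = cache.get(key)
    return cached
  end
  fresh = yield
  cache.set(key, fresh, ttl)
  fresh
end

cache = MemoryCache.new
fetches = 0
first = fetch_cached(cache, "article:Example") { fetches += 1; "parsed contents" }
second = fetch_cached(cache, "article:Example") { fetches += 1; "parsed contents" }
```

In this sketch the block stands in for the uncached get_article_contents call, so the second lookup is served from the cache and the block is not run again.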
get_article_name_from_parser(parser : Myhtml::Parser)
Returns the article name as a String from the HTML parser.
is_mediawiki?(parser : Myhtml::Parser)
Returns true if the body tag of the HTML contains the mediawiki class.
math_element?(node : Myhtml::Node | Nil)
Returns the math element if found in the node, else nil.
remove_references(inner_text : String)
Removes all the Wikipedia source references within the text.
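A minimal sketch of what remove_references might do, paired with a whitespace cleanup in the spirit of replace_whitespaces. The patterns are assumptions: this version strips bracketed citation markers such as "[1]" or "[citation needed]" and collapses every run of whitespace, which may be broader or narrower than the documented methods. Both method names are hypothetical.

```crystal
# Hypothetical sketch: strips bracketed citation markers such as
# "[1]", "[23]" or "[citation needed]". The real pattern may differ.
def remove_references_sketch(inner_text : String) : String
  inner_text.gsub(/\[(?:\d+|citation needed)\]/, "")
end

# Hypothetical sketch: collapses each run of whitespace into one space.
def replace_whitespaces_sketch(inner_text : String) : String
  inner_text.gsub(/\s+/, " ")
end

raw = "Crystal is compiled.[1][23]\n\n  It is statically typed.[citation needed]"
clean = replace_whitespaces_sketch(remove_references_sketch(raw))
puts clean
```

Removing the markers first and normalizing whitespace afterwards avoids leaving double spaces where a marker sat between two words.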
skip_section?(toctext : String)
Returns true if the string matches a section to be skipped, else false.
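The check performed by skip_section? could look like the sketch below. The list of skipped titles is not documented here, so SKIPPED_SECTIONS_SKETCH is a purely hypothetical set of typical non-content section headings, and skip_section_sketch? is a hypothetical name; the case-insensitive comparison is likewise an assumption.

```crystal
# Hypothetical list; the titles the real method skips are not documented.
SKIPPED_SECTIONS_SKETCH = ["references", "external links", "see also", "notes", "further reading"]

# Returns true when the table-of-contents text, trimmed and lowercased,
# matches one of the assumed titles above.
def skip_section_sketch?(toctext : String) : Bool
  SKIPPED_SECTIONS_SKETCH.includes?(toctext.strip.downcase)
end

["History", "References", " See also "].each do |title|
  puts "#{title.strip}: #{skip_section_sketch?(title)}"
end
```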