class HTML5::StreamingParser

Overview

StreamingParser extends the standard HTML5 parser and emits SAX-style events through a StreamingHandler as the document tree is constructed.

It subclasses Parser and overrides the tree-building methods (#add_child, #add_text) to emit events while preserving correct HTML5 parsing behavior. Element close events are emitted by a custom NodeStack that intercepts pop operations, and by the #oe= override when the stack of open elements is sliced.
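The pop-interception idea can be illustrated with a minimal, self-contained sketch. `StackEntry` and `ToyNodeStack` are hypothetical names invented for this example; the library's actual NodeStack and the StreamingHandler callbacks are not shown on this page and may look quite different.

```crystal
# Hypothetical stand-in for a node on the stack of open elements.
record StackEntry, tag : String

# A stack that reports every popped entry to a callback before returning it.
# This is an illustration of the technique, not the library's NodeStack.
class ToyNodeStack
  @on_pop : Proc(StackEntry, Nil)

  def initialize(&on_pop : StackEntry ->)
    @on_pop = on_pop
    @nodes = [] of StackEntry
  end

  def push(node : StackEntry) : Nil
    @nodes.push(node)
  end

  def pop : StackEntry
    node = @nodes.pop
    @on_pop.call(node) # this is where a close event would be emitted
    node
  end
end

closed = [] of String
stack = ToyNodeStack.new { |node| closed << node.tag }
stack.push(StackEntry.new("html"))
stack.push(StackEntry.new("body"))
stack.push(StackEntry.new("p"))
stack.pop
stack.pop
puts closed.join(", ") # prints "p, body"
```

Because the callback runs inside `pop`, callers need no changes to produce close events, which is presumably why the parser's own tree-construction code can stay untouched.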

Defined in:

html5/streaming.cr

Constructors

new(r : IO, handler : StreamingHandler, **opts)

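Instance Method Summary

add_child(n : Node)
Override add_child to emit events when nodes are added to the tree.

add_text(text : String)
Override add_text to emit text events correctly for all three code paths.

oe=(arr : Array(Node))
Override oe= to emit close events for elements removed by stack slicing.

parse
Run the parse loop.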

Instance methods inherited from class HTML5::Parser

acknowledge_self_closing_tag
add_child(n : Node)
add_element
add_formatting_element
add_text(text : String)
adjusted_current_node
clear_active_formatting_elements
clear_stack_to_context(s : Scope)
doc : Node
element_in_scope(s : Scope, *match_tags)
foster_parent(n : Node)
fragment : Bool
generate_implied_end_tags(*exceptions)
has_self_closing_token : Bool
has_self_closing_token=(has_self_closing_token : Bool)
in_body_end_tag_formatting(atom : Atom::Atom, tag_name : String)
in_body_end_tag_other(atom : Atom::Atom, tag_name : String)
in_foreign_content
index_of_element_in_scope(s, *match_tags)
oe=(arr : Array(Node))
parse
parse_current_token
parse_generic_raw_text_elements
parse_implied_token(t : TokenType, atom : Atom::Atom, data : String)
pop_until(s : Scope, *match_tags : Atom::Atom)
reconstruct_active_formatting_elements
reset_insertion_mode
set_original_im
should_foster_parent
top : Node

Constructor methods inherited from class HTML5::Parser

new(r : IO, **opts)

Constructor Detail

def self.new(r : IO, handler : StreamingHandler, **opts)


Instance Method Detail

def add_child(n : Node)

Override add_child to emit events when nodes are added to the tree.

def add_text(text : String)

Override add_text to emit text events correctly for all three code paths:

  1. Foster parenting (bypasses add_child)
  2. Appending to existing text node (bypasses add_child)
  3. New text node (goes through add_child)
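The three paths listed above can be sketched with a toy model. Everything here (`ToyKind`, `ToyNode`, `ToyStreamingBuilder`, the flat child list, and the `foster_parenting` flag) is an assumption made for illustration and does not reproduce the library's implementation. The point is only that paths 1 and 2 must emit the text event themselves, while path 3 must leave it to `add_child` to avoid a duplicate.

```crystal
# Toy model only: names and logic are assumptions, not the library's code.
enum ToyKind
  Element
  Text
end

class ToyNode
  getter kind : ToyKind
  property data : String

  def initialize(@kind : ToyKind, @data : String)
  end
end

class ToyStreamingBuilder
  getter events = [] of String
  property? foster_parenting = false

  def initialize
    @children = [] of ToyNode
  end

  # Every node added through here produces exactly one event.
  def add_child(node : ToyNode) : Nil
    @children << node
    events << (node.kind.text? ? "text:#{node.data}" : "open:#{node.data}")
  end

  def add_text(text : String) : Nil
    if foster_parenting?
      # Path 1: the text is placed elsewhere without add_child, so emit here.
      events << "text:#{text}"
    elsif (last = @children.last?) && last.kind.text?
      # Path 2: the text is merged into an existing text node, so emit here.
      last.data += text
      events << "text:#{text}"
    else
      # Path 3: add_child emits the event; emitting again would duplicate it.
      add_child(ToyNode.new(ToyKind::Text, text))
    end
  end
end

builder = ToyStreamingBuilder.new
builder.add_child(ToyNode.new(ToyKind::Element, "p"))
builder.add_text("Hello")   # path 3
builder.add_text(", world") # path 2
builder.foster_parenting = true
builder.add_text("!")       # path 1
puts builder.events.inspect
```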

def oe=(arr : Array(Node))

Override oe= to emit close events for elements removed by stack slicing. Many insertion modes pop several elements at once by assigning p.oe = p.oe[...i].
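A setter of this kind can be sketched as follows. `OpenElement`, `ToyOpenElements`, and the `closed` log are hypothetical names for this example, and the sketch assumes the new array is always a prefix of the old one, as in the `p.oe[...i]` idiom above.

```crystal
# Hypothetical stand-in for an entry on the stack of open elements.
record OpenElement, tag : String

class ToyOpenElements
  getter oe = [] of OpenElement
  getter closed = [] of String

  # Reports every element dropped from the end of the stack, innermost
  # first, then stores the shortened stack.
  def oe=(arr : Array(OpenElement))
    if arr.size < @oe.size
      (@oe.size - 1).downto(arr.size) { |i| @closed << @oe[i].tag }
    end
    @oe = arr
  end
end

parser = ToyOpenElements.new
parser.oe = %w(html body table tbody tr).map { |t| OpenElement.new(t) }
parser.oe = parser.oe[...2] # drops tr, tbody and table in one assignment
puts parser.closed.join(", ") # prints "tr, tbody, table"
```

Walking the removed range from the top of the stack downward keeps close events in the same order that repeated single pops would produce.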

def parse

Run the parse loop.
