module HTML5::StreamingHandler

Overview

StreamingHandler is a callback interface for SAX-style streaming HTML parsing.

Include this module in a class and pass an instance to HTML5.stream to receive events as the HTML5 parser constructs the document tree. Events are emitted in document order as the parser processes tokens, so you do not have to wait for the full document to be parsed.

The parser still builds the full DOM tree internally (required by the HTML5 spec for correct handling of misnested markup), but your handler receives events incrementally as nodes are created.

Example

class MyHandler
  include HTML5::StreamingHandler

  def on_element_open(tag : String, attrs : Array(HTML5::Attribute), namespace : String)
    puts "Open: <#{tag}>"
  end

  def on_element_close(tag : String, namespace : String)
    puts "Close: </#{tag}>"
  end

  def on_text(text : String)
    puts "Text: #{text}" unless text.strip.empty?
  end
end

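# io is assumed to be any IO holding the HTML input; an in-memory buffer is used here.
io = IO::Memory.new("<p>Hello <b>world</b></p>")
handler = MyHandler.new
HTML5.stream(io, handler)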

Defined in:

html5/streaming.cr

Instance Method Detail

def on_comment(text : String)

Called when a comment node is added to the tree.


def on_doctype(data : String)

Called when a doctype node is added to the tree.


def on_document_end(doc : Node)

Called when parsing is complete. The final document Node is provided for any post-processing that needs the full tree.
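
As a rough illustration of post-processing at the end of a parse, the sketch below counts elements while streaming and keeps the final document once on_document_end fires. The stand-in HTML5 module, the empty Node class, and the StatsHandler name are assumptions made so the snippet compiles on its own. With the real library you would drop the stand-in and let HTML5.stream drive the callbacks instead of calling them by hand.

```crystal
# Stand-in declarations (assumption) so this sketch compiles without the library.
# Only the callback signatures documented on this page are mirrored here.
module HTML5
  struct Attribute
  end

  class Node
  end

  module StreamingHandler
    def on_element_open(tag : String, attrs : Array(Attribute), namespace : String)
    end

    def on_document_end(doc : Node)
    end
  end
end

# Hypothetical handler: tallies elements during streaming, reports at the end.
class StatsHandler
  include HTML5::StreamingHandler

  getter elements = 0
  getter document : HTML5::Node?

  def on_element_open(tag : String, attrs : Array(HTML5::Attribute), namespace : String)
    @elements += 1
  end

  def on_document_end(doc : HTML5::Node)
    # Keep the tree for any work that needs the whole document.
    @document = doc
    puts "Parsed #{@elements} elements"
  end
end

# Events are fed by hand here to simulate what a parser might emit.
handler = StatsHandler.new
no_attrs = [] of HTML5::Attribute
handler.on_element_open("html", no_attrs, "")
handler.on_element_open("body", no_attrs, "")
handler.on_document_end(HTML5::Node.new)
```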


def on_element_close(tag : String, namespace : String)

Called when an element is closed (popped from the stack of open elements). Note: void elements like <br> and <img> still produce both an #on_element_open and an #on_element_close call.
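
Because every open call is paired with a close call, including for void elements, a handler can track nesting depth with a simple counter. The sketch below shows one way to do that. The stand-in HTML5 module and the OutlineHandler name are assumptions introduced so the snippet compiles on its own, and the events are fed by hand rather than produced by the real parser.

```crystal
# Stand-in declarations (assumption) so this sketch compiles without the library.
module HTML5
  struct Attribute
  end

  module StreamingHandler
    def on_element_open(tag : String, attrs : Array(Attribute), namespace : String)
    end

    def on_element_close(tag : String, namespace : String)
    end
  end
end

# Hypothetical handler: records an indented outline of the element tree.
class OutlineHandler
  include HTML5::StreamingHandler

  getter lines = [] of String
  getter depth = 0

  def on_element_open(tag : String, attrs : Array(HTML5::Attribute), namespace : String)
    indent = "  " * @depth
    @lines << indent + tag
    @depth += 1
  end

  def on_element_close(tag : String, namespace : String)
    # Void elements close too, so the counter always returns to its prior value.
    @depth -= 1
  end
end

# Hand-fed events simulating a paragraph containing <br> and <b>.
handler = OutlineHandler.new
no_attrs = [] of HTML5::Attribute
handler.on_element_open("p", no_attrs, "")
handler.on_element_open("br", no_attrs, "")
handler.on_element_close("br", "")
handler.on_element_open("b", no_attrs, "")
handler.on_element_close("b", "")
handler.on_element_close("p", "")

puts handler.lines.join("\n")
```

If void elements did not emit a close event, the counter would drift upward after each <br>; the pairing is what keeps the depth balanced.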


def on_element_open(tag : String, attrs : Array(Attribute), namespace : String)

Called when an element node is added to the tree. tag is the lower-cased tag name, attrs are the element's attributes, and namespace is empty for HTML elements or "math"/"svg" for foreign content.
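
One possible use of the namespace argument is to keep HTML and foreign elements apart when tallying tags. The sketch below qualifies each tag with its namespace when that namespace is non-empty. The stand-in HTML5 module, the TagCounter name, and the "namespace:tag" key format are illustrative assumptions, not part of the library, and the events are fed by hand.

```crystal
# Stand-in declarations (assumption) so this sketch compiles without the library.
module HTML5
  struct Attribute
  end

  module StreamingHandler
    def on_element_open(tag : String, attrs : Array(Attribute), namespace : String)
    end
  end
end

# Hypothetical handler: counts elements, keyed by namespace-qualified name.
class TagCounter
  include HTML5::StreamingHandler

  # Missing keys read as 0, so `+= 1` works on first sight of a tag.
  getter counts = Hash(String, Int32).new(0)

  def on_element_open(tag : String, attrs : Array(HTML5::Attribute), namespace : String)
    # An empty namespace means a plain HTML element.
    key = namespace.empty? ? tag : "#{namespace}:#{tag}"
    @counts[key] += 1
  end
end

# Hand-fed events: two HTML <div>s and an inline SVG with one <circle>.
counter = TagCounter.new
no_attrs = [] of HTML5::Attribute
counter.on_element_open("div", no_attrs, "")
counter.on_element_open("svg", no_attrs, "svg")
counter.on_element_open("circle", no_attrs, "svg")
counter.on_element_open("div", no_attrs, "")

counter.counts.each do |name, count|
  puts "#{name}: #{count}"
end
```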


def on_text(text : String)

Called when a text node is added to the tree.
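
A common use of this callback is extracting the plain text of a document. The sketch below gathers each chunk and joins them on demand. It assumes text may arrive across several calls, since this page does not say whether adjacent text is coalesced. The stand-in HTML5 module and the TextCollector name are hypothetical and exist only so the snippet compiles on its own.

```crystal
# Stand-in declarations (assumption) so this sketch compiles without the library.
module HTML5
  module StreamingHandler
    def on_text(text : String)
    end
  end
end

# Hypothetical handler: accumulates text chunks in arrival order.
class TextCollector
  include HTML5::StreamingHandler

  getter chunks = [] of String

  def on_text(text : String)
    @chunks << text
  end

  # Joins all chunks seen so far; safe to call more than once.
  def text : String
    @chunks.join
  end
end

# Hand-fed events simulating text split across several nodes.
collector = TextCollector.new
collector.on_text("Hello, ")
collector.on_text("world")
collector.on_text("\n  ")

puts collector.text.strip
```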

