module Luce
Overview
Parses text in a Markdown-like format and builds an AST that can then be rendered to HTML.
If you are only interested in rendering Markdown to HTML, please refer to the README, which explains the use of Luce.to_html.
The main entry point to the library is the Document, which encapsulates the parsing process, converting Markdown text into a tree of Node (Array(Node)).
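The two-step flow described above (parse into nodes, then render) can be sketched as follows. This is a minimal sketch, not the library's documented usage: it assumes that Document.new can be called with no arguments and that Document exposes a parse method returning Array(Node); neither signature appears on this page, so check the Document class reference before relying on them. Luce.render_html is taken from the class method summary on this page.

```crystal
require "luce"

# Assumption: Document.new works without arguments (all options defaulted)
# and Document#parse returns Array(Luce::Node). Verify against the
# Document class documentation.
document = Luce::Document.new
nodes = document.parse("# Title\n\nSome *text*.")

# render_html is listed in the class method summary of this module.
html = Luce.render_html(nodes)
puts html
```

Keeping the parse and render steps separate is useful when the node tree needs to be inspected or transformed before HTML is produced; otherwise Luce.to_html covers the common case in one call.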
The two main parsing mechanics used are:
- Blocks, representing top-level elements such as headers, paragraphs, blockquotes, and code blocks, implemented via BlockSyntax subclasses.
- Inlines, representing chunks of text within a block with special meaning, such as links, emphasis, and inlined code, implemented via InlineSyntax subclasses.
Looking closely at Document.new(), a few other concepts merit a mention:
- ExtensionSet, which provides configurations for common Markdown flavors.
- Resolver, which aids in resolving links and images.
If you want to extend the library to support custom formatting, you might want to:
- Implement your own InlineSyntax subclasses
- Implement your own BlockSyntax subclasses
- Instruct the library to use those by:
  - Creating a new ExtensionSet from one of the existing flavors and adding your syntaxes.
  - Passing your syntaxes to Document or Luce.to_html as parameters.
Defined in:
luce.cr
luce/assets/html_entities.cr
luce/ast.cr
luce/block_parser.cr
luce/block_syntaxes/alert_block_syntax.cr
luce/block_syntaxes/block_syntax.cr
luce/block_syntaxes/blockquote_syntax.cr
luce/block_syntaxes/code_block_syntax.cr
luce/block_syntaxes/dummy_block_syntax.cr
luce/block_syntaxes/empty_block_syntax.cr
luce/block_syntaxes/fenced_blockquote_syntax.cr
luce/block_syntaxes/fenced_code_block_syntax.cr
luce/block_syntaxes/footnote_def_syntax.cr
luce/block_syntaxes/header_syntax.cr
luce/block_syntaxes/header_with_id_syntax.cr
luce/block_syntaxes/horizontal_rule_syntax.cr
luce/block_syntaxes/html_block_syntax.cr
luce/block_syntaxes/link_reference_definition_syntax.cr
luce/block_syntaxes/list_syntax.cr
luce/block_syntaxes/ordered_list_syntax.cr
luce/block_syntaxes/ordered_list_with_checkbox_syntax.cr
luce/block_syntaxes/paragraph_syntax.cr
luce/block_syntaxes/setext_header_syntax.cr
luce/block_syntaxes/setext_header_with_id_syntax.cr
luce/block_syntaxes/table_syntax.cr
luce/block_syntaxes/unordered_list_syntax.cr
luce/block_syntaxes/unordered_list_with_checkbox_syntax.cr
luce/charcode.cr
luce/document.cr
luce/emojis.cr
luce/extension_set.cr
luce/html_renderer.cr
luce/inline_parser.cr
luce/inline_syntaxes/autolink_extension_syntax.cr
luce/inline_syntaxes/autolink_syntax.cr
luce/inline_syntaxes/code_syntax.cr
luce/inline_syntaxes/color_swatch_syntax.cr
luce/inline_syntaxes/decode_html_syntax.cr
luce/inline_syntaxes/delimiter_syntax.cr
luce/inline_syntaxes/email_autolink_syntax.cr
luce/inline_syntaxes/emoji_syntax.cr
luce/inline_syntaxes/emphasis_syntax.cr
luce/inline_syntaxes/escape_html_syntax.cr
luce/inline_syntaxes/escape_syntax.cr
luce/inline_syntaxes/footnote_ref_syntax.cr
luce/inline_syntaxes/image_syntax.cr
luce/inline_syntaxes/inline_html_syntax.cr
luce/inline_syntaxes/inline_syntax.cr
luce/inline_syntaxes/line_break_syntax.cr
luce/inline_syntaxes/link_syntax.cr
luce/inline_syntaxes/soft_line_break_syntax.cr
luce/inline_syntaxes/strikethrough_syntax.cr
luce/inline_syntaxes/text_syntax.cr
luce/legacy_emojis.cr
luce/line.cr
luce/link_parser.cr
luce/patterns.cr
luce/text_parser.cr
luce/util.cr
Constant Summary
-
VERSION = "0.5.0"
Class Method Summary
-
.alert_pattern : Regex
Alert type patterns.
-
.ascii_punctuation_characters : String
ASCII punctuation characters.
-
.ascii_punctuation_escaped : String
ASCII punctuation characters with some characters escaped, so that they can be used inside a regular-expression character set.
-
.blockquote_fence_pattern : Regex
Fenced blockquotes
-
.blockquote_pattern : Regex
The line starts with > with one optional space after.
-
.code_fence_pattern : Regex
Fenced code block.
-
.dummy_pattern : Regex
A pattern which should never be used.
-
.empty_pattern : Regex
The line contains only whitespace or is empty
-
.footnote_pattern : Regex
A line starting with [^ and containing ]:, with no special characters (\] \r\n\x00\t) in between.
-
.header_pattern : Regex
Leading (and trailing) # define atx-style headers.
-
.hr_pattern : Regex
Three or more hyphens, asterisks or underscores by themselves.
-
.html_block_pattern : Regex
A pattern to match the start of an HTML block.
-
.html_characters_pattern : Regex
A pattern to match HTML entity references and numeric character references.
-
.html_entities_map : Hash(String, String)
-
.indent_pattern : Regex
A line indented four spaces.
-
.link_reference_definition_pattern : Regex
A line that starts with [.
-
.list_pattern : Regex
Unordered list: a list starting with one of these markers: -, *, +.
-
.named_tag_definition : String
A String pattern to match a named tag like <table> or </table>.
-
.render_html(nodes : Array(Node), enable_tag_filter : Bool = false) : String
Render nodes to HTML.
-
.setext_pattern : Regex
A series of = or - (on the next line) define setext-style headers.
-
.table_pattern : Regex
A line of hyphens separated by at least one pipe.
-
.to_html(markdown : String, block_syntaxes = Array(BlockSyntax).new, inline_syntaxes = Array(InlineSyntax).new, extension_set : ExtensionSet | Nil = nil, link_resolver : Resolver | Nil = nil, image_link_resolver : Resolver | Nil = nil, inline_only : Bool = false, encode_html : Bool = true, enable_tag_filter : Bool = false, with_default_block_syntaxes : Bool = true, with_default_inline_syntaxes : Bool = true) : String
Converts the given string of Markdown to HTML
Class Method Detail
Alert type patterns.
An alert block is similar to a blockquote but starts with > [!TYPE]; only 5 types are supported (case-insensitive).
ASCII punctuation characters with some characters escaped, so that they can be used inside a regular-expression character set.
A pattern which should never be used.
It just satisfies non-nullability of pattern methods.
A line starting with [^ and containing ]:, with no special characters (\] \r\n\x00\t) in between. Same as GFM.
Leading (and trailing) # define atx-style headers.
Starts with 1-6 unescaped # characters, which must not be followed by a non-space character. The line may end with any number of # characters.
Three or more hyphens, asterisks or underscores by themselves.
Note that a line like ---- is valid as both HR and SETEXT. In case of a tie, SETEXT should win.
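The tie-break can be illustrated with a small, self-contained sketch. The two regular expressions below are simplified stand-ins written for this example; they are not the values returned by Luce.hr_pattern or Luce.setext_pattern, and the classify helper and its follows_paragraph_text flag are hypothetical. The sketch also assumes the CommonMark rule that a setext underline only applies when it follows paragraph text.

```crystal
# Simplified, hypothetical patterns (NOT the library's actual regexes).
# HR: up to three leading spaces, then three or more identical markers
# (-, * or _), optionally separated by spaces or tabs.
SKETCH_HR = /^ {0,3}([-*_])[ \t]*(?:\1[ \t]*){2,}$/
# Setext underline: up to three leading spaces, then a run of = or -.
SKETCH_SETEXT = /^ {0,3}(?:=+|-+)[ \t]*$/

# Hypothetical helper: decides how a line should be interpreted.
# follows_paragraph_text is true when the previous line is paragraph text.
def classify(line : String, follows_paragraph_text : Bool) : Symbol
  is_setext = SKETCH_SETEXT.matches?(line)
  is_hr = SKETCH_HR.matches?(line)
  # When both patterns match after paragraph text, setext wins the tie.
  return :setext if is_setext && follows_paragraph_text
  return :hr if is_hr
  :other
end

puts classify("----", true)   # matches both patterns; setext wins
puts classify("----", false)  # no paragraph above, so it is a rule
puts classify("* * *", true)  # only the HR pattern matches
```

Checking setext first keeps the precedence explicit in one place, which is the property the note above asks for.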
A pattern to match the start of an HTML block.
The 7 conditions here correspond one-to-one to the 7 start conditions in the CommonMark specification: https://spec.commonmark.org/0.30/#html-block.
A pattern to match HTML entity references and numeric character references.
Unordered list
A list starting with one of these markers: -, *, +. May have up to three leading spaces before the marker and any number of spaces or tabs after.
Ordered list
A line starting with a number like 123. May have up to three leading spaces before the marker and any number of spaces or tabs after.
Render nodes to HTML.
A series of = or - (on the next line) define setext-style headers.
Converts the given string of Markdown to HTML
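As a minimal sketch of calling this method, the example below uses only the markdown argument and the inline_only named parameter from the signature in the class method summary. The comment about what inline_only does is an assumption inferred from the parameter name; this page does not describe it.

```crystal
require "luce"

markdown = "Hello *Markdown*"

# Default call: the input is parsed as a full document.
html = Luce.to_html(markdown)
puts html

# Assumption: inline_only: true parses the input as inline content only,
# skipping block-level parsing.
inline_html = Luce.to_html(markdown, inline_only: true)
puts inline_html
```

The remaining named parameters (block_syntaxes, inline_syntaxes, extension_set, and the resolvers) are the hooks for the customizations described in the overview.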