class Arborist::Matcher
- Arborist::Matcher
- Reference
- Object
Included Modules
- Arborist::DSL
Defined in:
matcher.cr
Constructors
Instance Method Summary
- #add_rule(rule_name, expr : Expr)
- #add_rule_while_matching_is_in_progress_if_necessary(rule_name : String, expr : Expr)
- #add_skip_rule_if_necessary
- #apply_calls_in_call_stack : Array(ApplyCall)
- #apply_calls_that_resulted_in_left_recursion : Array(ApplyCall)
- #apply_skip_rule(expr : Expr)
  The application of the skip rule is internally represented as a parameterized rule, such that the skip rule is specialized for each expression that will follow.
- #consume(c : Char) : Bool
- #consume(count : Int32) : String | Nil
  Consumes a string of length count. Returns nil if unable to consume count characters.
- #eof?
- #expr_call_tree_controller : ExprCallTreeController
- #expr_call_tree_controller=(expr_call_tree_controller : ExprCallTreeController)
- #expr_failures : Hash(Int32, Set(MatchFailure))
- #expr_failures=(expr_failures : Hash(Int32, Set(MatchFailure)))
- #get_rule(rule_name) : Rule
- #get_seed(rule : Rule, pos : Int32)
- #get_seed?(rule : Rule, pos : Int32) : ParseTree | Nil
- #growing : Hash(Int32, Hash(Rule, ParseTree | Nil))
- #growing=(growing : Hash(Int32, Hash(Rule, ParseTree | Nil)))
- #has_memoized_result?(rule : Rule) : Bool
- #input : CharArray
- #log_match_failure(pos : Int32, expr : Expr) : Nil
- #lookup_most_recent_rule_application_in_call_stack(rule) : ApplyCall | Nil
  Returns the deepest/most-recent application of rule in the rule application stack.
- #lookup_oldest_rule_application_that_resulted_in_left_recursion(rule) : ApplyCall | Nil
  Returns the leftmost/earliest/oldest/shallowest application of rule in the rule application stack that resulted in left recursion.
- #lookup_rule_application_in_call_stack(rule, pos) : ApplyCall | Nil
  Returns the deepest/most-recent application of rule at position pos in the rule application stack.
- #mark_most_recent_applications_unsafe_to_memoize(oldest_application)
  Marks all ApplyCall calls on the call stack that are more recent than oldest_application as unsafe to memoize.
- #mark_parent_seed_growths_as_resulting_in_deeper_seed_growth(child_seed_growth_rule_application)
- #match(input : String, start_rule_name = (@rules.first_key? || "start")) : ApplyTree | Nil
  Returns nil if the grammar rules don't match the full input string.
- #memoize_result(pos, next_pos, rule, parse_tree : ParseTree | Nil) : MemoizedParseTree
- #mode : Symbol
- #mode=(mode : Symbol)
- #most_recent_rule_application : ApplyCall | Nil
- #ohm_mode?
  If we're operating in Ohm mode, then the syntactic rule semantics apply.
- #pop_off_of_call_stack(the_top_of_stack_expr_call_successfully_parsed : Bool) : ExprCall
- #pos : Int32
- #pos=(pos : Int32)
- #prepare_for_matching
  Per https://tratt.net/laurie/research/pubs/html/tratt__direct_left_recursive_parsing_expression_grammars/: growing is the data structure at the heart of the algorithm.
- #print_match_failure_error
- #push_onto_call_stack(expr_application : ExprCall)
- #python_mode?
- #remove_seed(rule : Rule, pos : Int32)
- #remove_seeds_between(start_pos, end_pos)
  Remove all seeds of any rule in the range [start_pos, end_pos].
- #remove_seeds_to_the_right_of(start_pos)
  Remove all seeds of any rule in the range (start_pos, ], that is, to the right of start_pos.
- #remove_seeds_used_to_grow_larger_seed(rule, larger_seed : ParseTree)
  Removes all "descendant" seeds that were used to grow a larger seed, <larger_seed>, that encompasses all the descendant seeds.
- #rule_in_recursion_call_stack_state : Array(Tuple(Int32, Rule))
  Returns an array of pairs of the form {pos, rule}, each summarizing an ApplyCall.
- #rules : Hash(String, Rule)
- #seed_defined?(rule : Rule, pos : Int32)
- #set_mode(desired_mode)
- #set_seed(rule : Rule, pos : Int32, seed_parse_tree : ParseTree | Nil) : ParseTree | Nil
  todo: decide whether this should return the previous seed value
- #simple_mode?
- #skip_whitespace_if_in_syntactic_context(expr : Expr)
- #use_memoized_result(rule_name) : ParseTree | Nil
Instance methods inherited from module Arborist::DSL
- alt(strings : Array(String)) : Expr
- alt(alts : Array(String | Expr)) : Expr
- alt(strings : Set(String)) : Expr
- alt(*alternatives : String | Expr) : Expr
- apply(rule_name : String) : Expr
- choice(alternatives : Array(Expr)) : Expr
- choice(*alternatives) : Expr
- dot : Expr
- label(label : String, expr : Expr) : Expr
- neg(expr : Expr) : Expr
- opt(expr : Expr) : Expr
- plus(expr : Expr) : Expr
- pos(expr : Expr) : Expr
- range(chars : Range(Char, Char)) : Expr
- seq(exprs : Array(Expr)) : Expr
- seq(*exprs) : Expr
- star(expr : Expr) : Expr
- term(string : String) : Expr
Constructor Detail
Instance Method Detail
#apply_skip_rule(expr : Expr)
The application of the skip rule is internally represented as a parameterized rule, such that the skip rule is specialized for each expression that will follow. This approach made it possible to implement the skip rule as:
parameterized_skip_rule[following_expr] <- !following_expr skip*
Eventually I concluded that implementing the skip rule as !{following expression} skip* was not what I wanted right now, but this establishes the pattern of implementing parameterized rules, and I may decide to go back to implementing the skip rule as !{following expression} skip*.
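The rule shape above can be read literally as a PEG sequence: a negative lookahead on the following expression, then zero or more skips. The sketch below is only an illustration of that reading and is not Arborist's implementation. `match_term` and `parameterized_skip` are hypothetical helpers, the following expression is simplified to a literal string, and the skip rule is simplified to a set of characters.

```crystal
# Hypothetical helper: returns the position just past `literal` if it occurs
# at `pos`, otherwise nil.
def match_term(input : String, pos : Int32, literal : String) : Int32?
  slice = input[pos, literal.size]?
  slice == literal ? pos + literal.size : nil
end

# Literal reading of: parameterized_skip_rule[following_expr] <- !following_expr skip*
# Assumption: `following` is a plain literal and `skip_chars` stands in for the skip rule.
def parameterized_skip(input : String, pos : Int32, following : String, skip_chars : Set(Char)) : Int32?
  # !following_expr: fail if the following expression already matches here
  return nil if match_term(input, pos, following)
  # skip*: greedily consume skippable characters
  while pos < input.size && skip_chars.includes?(input[pos])
    pos += 1
  end
  pos
end

puts parameterized_skip("  x", 0, "x", Set{' ', '\t'}).inspect # => 2
puts parameterized_skip("x", 0, "x", Set{' ', '\t'}).inspect   # => nil
```

Because the rule is specialized per following expression, each distinct `following` value behaves like its own rule, which is the parameterization pattern the note describes.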
#consume(count : Int32) : String | Nil
Consumes a string of length count. Returns nil if unable to consume count characters.
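A minimal sketch of the consume contract described above, assuming a plain `String` input rather than the `CharArray` that `#input` returns. The `Cursor` class is hypothetical and exists only for this example.

```crystal
# Hypothetical stand-in for the matcher's input position handling.
class Cursor
  getter pos : Int32 = 0

  def initialize(@input : String)
  end

  # Returns the next `count` characters and advances, or nil (without
  # advancing) if fewer than `count` characters remain.
  def consume(count : Int32) : String?
    return nil if count < 0 || @pos + count > @input.size
    str = @input[@pos, count]
    @pos += count
    str
  end

  def eof? : Bool
    @pos >= @input.size
  end
end

cursor = Cursor.new("hello")
puts cursor.consume(2).inspect # => "he"
puts cursor.consume(5).inspect # => nil
puts cursor.consume(3).inspect # => "llo"
puts cursor.eof?               # => true
```

Leaving the position untouched on failure keeps backtracking simple, since a failed consume needs no undo step. Whether Arborist does the same is not stated in this page.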
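#lookup_most_recent_rule_application_in_call_stack(rule) : ApplyCall | Nil
Returns the deepest/most-recent application of rule in the rule application stack.
#lookup_oldest_rule_application_that_resulted_in_left_recursion(rule) : ApplyCall | Nil
Returns the leftmost/earliest/oldest/shallowest application of rule in the rule application stack that resulted in left recursion.
#lookup_rule_application_in_call_stack(rule, pos) : ApplyCall | Nil
Returns the deepest/most-recent application of rule at position pos in the rule application stack.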
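The three lookups above differ only in search direction and filter. The sketch below illustrates that difference on a simplified stack. The `ApplyCall` record here is a hypothetical stand-in (rule name, position, and a left-recursion flag), not Arborist's `ApplyCall` class, and the stack is assumed to grow toward the end of the array.

```crystal
# Hypothetical, simplified stand-in for Arborist's ApplyCall.
record ApplyCall, rule : String, pos : Int32, left_recursive : Bool = false

# Deepest/most-recent application of `rule`: search from the top of the stack.
def most_recent_application(stack : Array(ApplyCall), rule : String) : ApplyCall?
  stack.reverse.find { |call| call.rule == rule }
end

# Oldest/shallowest application of `rule` that resulted in left recursion:
# search from the bottom of the stack.
def oldest_left_recursive_application(stack : Array(ApplyCall), rule : String) : ApplyCall?
  stack.find { |call| call.rule == rule && call.left_recursive }
end

# Deepest/most-recent application of `rule` at a specific position.
def application_at(stack : Array(ApplyCall), rule : String, pos : Int32) : ApplyCall?
  stack.reverse.find { |call| call.rule == rule && call.pos == pos }
end

stack = [
  ApplyCall.new("expr", 0, true),
  ApplyCall.new("term", 0),
  ApplyCall.new("expr", 2, true),
  ApplyCall.new("expr", 0),
]

puts most_recent_application(stack, "expr") == stack[3]           # => true
puts oldest_left_recursive_application(stack, "expr") == stack[0] # => true
puts application_at(stack, "expr", 2) == stack[2]                 # => true
```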
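#mark_most_recent_applications_unsafe_to_memoize(oldest_application)
Marks all ApplyCall calls on the call stack that are more recent than oldest_application as unsafe to memoize.
#match(input : String, start_rule_name = (@rules.first_key? || "start")) : ApplyTree | Nil
Returns nil if the grammar rules don't match the full input string.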
#ohm_mode?
If we're operating in Ohm mode, then the syntactic rule semantics apply. From https://github.com/harc/ohm/blob/master/doc/syntax-reference.md#syntactic-lexical:
Syntactic vs. Lexical Rules
A syntactic rule is a rule whose name begins with an uppercase letter, and lexical rule is one whose name begins with a lowercase letter. The difference between lexical and syntactic rules is that syntactic rules implicitly skip whitespace characters.
For the purposes of a syntactic rule, a "whitespace character" is anything that matches its enclosing grammar's "space" rule. The default implementation of "space" matches ' ', '\t', '\n', '\r', and any other character that is considered whitespace in the ES5 spec.
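The naming convention quoted above can be sketched as follows. This is an illustration under stated assumptions, not Arborist's code: `syntactic_rule?` and `skip_whitespace` are hypothetical helpers, and `SPACE_CHARS` covers only the four characters named in the quote rather than the full ES5 whitespace set.

```crystal
# Assumption: only the four whitespace characters named in the Ohm quote.
SPACE_CHARS = Set{' ', '\t', '\n', '\r'}

# A rule is syntactic when its name begins with an uppercase letter.
def syntactic_rule?(rule_name : String) : Bool
  first = rule_name[0]?
  first ? first.uppercase? : false
end

# Syntactic rules implicitly skip whitespace; lexical rules do not.
def skip_whitespace(input : String, pos : Int32, rule_name : String) : Int32
  return pos unless syntactic_rule?(rule_name)
  while pos < input.size && SPACE_CHARS.includes?(input[pos])
    pos += 1
  end
  pos
end

puts syntactic_rule?("Expr")               # => true
puts syntactic_rule?("ident")              # => false
puts skip_whitespace("  a + b", 0, "Expr") # => 2
puts skip_whitespace("  a + b", 0, "ident") # => 0
```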
#prepare_for_matching
Per https://tratt.net/laurie/research/pubs/html/tratt__direct_left_recursive_parsing_expression_grammars/: "growing is the data structure at the heart of the algorithm. A programming language-like type for it would be Map<Rule,Map<Int,Result>>. Since we statically know all the rules for a PEG, growing is statically initialised with an empty map for each rule at the beginning of the algorithm (line 1)."
So, we want to initialize the growing map just prior to using it, since that is the only point at which we know for sure that all of the rules have been added to the matcher.
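A minimal sketch of that initialization, following the Map<Rule,Map<Int,Result>> shape from the quoted paper. Rules are represented by their names and `Result` is a hypothetical stand-in for `ParseTree | Nil`. Note that the `#growing` signature in this class keys by position first, so this shows the paper's layout, not necessarily Arborist's.

```crystal
# Hypothetical stand-in for ParseTree | Nil.
alias Result = String | Nil

# Build the growing map once every rule is known: one empty map per rule.
def build_growing(rule_names : Array(String)) : Hash(String, Hash(Int32, Result))
  growing = {} of String => Hash(Int32, Result)
  rule_names.each do |name|
    growing[name] = {} of Int32 => Result
  end
  growing
end

# Called just before matching, after all rules have been added.
rule_names = ["expr", "term", "factor"]
growing = build_growing(rule_names)

growing["expr"][0] = nil         # plant an initial failing seed for expr at position 0
puts growing["expr"].has_key?(0) # => true
puts growing["term"].empty?      # => true
```

Building the map lazily at match time rather than in the constructor avoids missing any rule registered after the matcher was created.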
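#remove_seeds_between(start_pos, end_pos)
Remove all seeds of any rule in the range [start_pos, end_pos].
#remove_seeds_to_the_right_of(start_pos)
Remove all seeds of any rule in the range (start_pos, ], that is, to the right of start_pos.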
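The two range removals can be sketched over a position-keyed map, which mirrors the shape of the `#growing` signature, Hash(Int32, Hash(Rule, ParseTree | Nil)). In this sketch rules are names and seeds are strings, both hypothetical simplifications, and removal drops the whole per-position entry. Whether Arborist deletes the entry or clears it is an assumption here.

```crystal
# Simplified stand-in for Hash(Int32, Hash(Rule, ParseTree | Nil)).
alias Seeds = Hash(Int32, Hash(String, String?))

def set_seed(growing : Seeds, rule : String, pos : Int32, seed : String?) : Nil
  growing[pos] ||= {} of String => String?
  growing[pos][rule] = seed
end

# Inclusive range [start_pos, end_pos].
def remove_seeds_between(growing : Seeds, start_pos : Int32, end_pos : Int32) : Nil
  growing.reject! { |pos, _| start_pos <= pos <= end_pos }
end

# Every position strictly greater than start_pos.
def remove_seeds_to_the_right_of(growing : Seeds, start_pos : Int32) : Nil
  growing.reject! { |pos, _| pos > start_pos }
end

growing = Seeds.new
[0, 1, 2, 3, 5].each { |pos| set_seed(growing, "expr", pos, "seed@#{pos}") }

remove_seeds_between(growing, 1, 2)
puts growing.keys.inspect # => [0, 3, 5]

remove_seeds_to_the_right_of(growing, 3)
puts growing.keys.inspect # => [0, 3]
```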
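#remove_seeds_used_to_grow_larger_seed(rule, larger_seed : ParseTree)
Removes all "descendant" seeds that were used to grow a larger seed, <larger_seed>, that encompasses all the descendant seeds.
#rule_in_recursion_call_stack_state : Array(Tuple(Int32, Rule))
Returns an array of pairs of the form {pos, rule}, each summarizing an ApplyCall.
#set_seed(rule : Rule, pos : Int32, seed_parse_tree : ParseTree | Nil) : ParseTree | Nil
todo: decide whether this should return the previous seed value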