Acorn

🚧 Under Construction 👷

A state machine compiler with no runtime dependency. Define a grammar using a subset of regular expression notation, then compile it into a fast state machine. Acorn supports lexers and custom string-based state machines.

Installation

Add this to your application's shard.yml:

dependencies:
  acorn:
    github: "rmosolgo/acorn"

Usage

Regular Expressions

Tokens are defined with a small regular expression language:

Feature | Example
---|---
Character | a, 1, ❤️
Sequence | ab, 123
Alternation | a\|b
~~Grouping~~ | (ab)\|c
~~Any character~~ | .
One of | [abc]
~~Not one of~~ | [^abc]
Escape | \[, \.
Unicode character range | a-z, 0-9
Zero-or-more | a*
One-or-more | a+
Zero-or-one | a?
Specific number | a{3}
Between numbers | a{3,4}
At least | a{3,}

Build Step

An Acorn module is a Crystal program that generates code. To get a lexer, you have to run the Acorn module. Then, your main program should use the generated code.

For example, if you define a lexer:

# build/my_lexer.cr
class MyLexer < Acorn::Lexer
  # ...
  generate("./app/my_lexer.cr")
end

You should run the file with Crystal to generate the specified file:

crystal run build/my_lexer.cr

Then, your main program should require the generated file:

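# my_app.cr
# A relative require needs the leading "./"; without it Crystal searches the shard and stdlib paths.
require "./app/my_lexer"
MyLexer.scan(input) # => Array(Tuple(Symbol, String))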

The generated code has no dependency on Acorn, so you only need this library during development.
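
To make the return type of `scan` more concrete, here is a minimal sketch of consuming an `Array(Tuple(Symbol, String))`. The array is written by hand as a stand-in for the lexer output, and the token names `:int` and `:plus` are hypothetical; a real lexer returns whatever names its grammar defines.

```crystal
# Stand-in for the result of MyLexer.scan(input).
# The token names (:int, :plus) are hypothetical.
tokens = [{:int, "1"}, {:plus, "+"}, {:int, "22"}]

# Print each token as "name: value".
tokens.each do |token|
  name, value = token
  puts "#{name}: #{value}"
end

# Collect the values of all :int tokens as integers and add them up.
ints = tokens.select { |token| token[0] == :int }.map { |token| token[1].to_i }
puts ints.sum # => 23
```

Because the tokens are plain tuples, this code needs nothing from Acorn at runtime.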

Tokens

Acorn returns an array of tokens. Each token is a tuple with:

Line numbers and column numbers are 1-indexed, so the first character in the input is 1:1.
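
The 1-indexed convention can be illustrated with a minimal sketch that converts a character offset into a line and column pair. The `line_and_column` helper is hypothetical and not part of Acorn. It assumes lines are separated by `\n`, and it says nothing about how Acorn tracks positions internally.

```crystal
# Hypothetical helper, not part of Acorn: maps a 0-based character offset
# to a 1-indexed {line, column} pair, assuming "\n" line endings.
def line_and_column(input : String, offset : Int32) : Tuple(Int32, Int32)
  line = 1
  column = 1
  input.each_char_with_index do |char, index|
    # Stop once we reach the character we are locating.
    break if index >= offset
    if char == '\n'
      # A newline moves to the start of the next line.
      line += 1
      column = 1
    else
      column += 1
    end
  end
  {line, column}
end

input = "ab\ncd"
puts line_and_column(input, 0) # => {1, 1}, the first character
puts line_and_column(input, 4) # => {2, 2}, the "d"
```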

Custom Machines

Acorn lexers are actually a special case of state machine. You can specify a custom machine, too.

Development

Goals & Non-Goals

Goals:

Non-goals:

TODO

License

LGPLv3