module Noir::ZigCalleeExtractor
Overview
Shared structural helper for the Zig framework analyzers (jetzig, zap, httpz, tokamak). Zig has no vendored tree-sitter grammar in noir, so the analyzers lean on this hand-rolled miniparser for two jobs:
- #function_table / #function_bodies: index every fn name(...) { … } in a source file so that a route whose handler lives in a named function (httpz router.get("/x", getUser, .{}), tokamak .get("/", hello)) can resolve that handler's body for callee extraction.
- #callees_for_body: pull the 1-hop function calls out of a handler body, used by --include-callee and --ai-context.
The scanners blank out comments and string/char literals first
(#strip_non_code) so a call-shaped token inside a doc-string or a //
comment is never surfaced as a phantom callee, and they address the
source through an Array(Char) (O(1) indexing) so a single non-ASCII byte
anywhere in the file can't turn the per-character loops into O(n²).
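The blanking idea can be sketched as follows. This is a deliberately simplified, hypothetical stand-in for #strip_non_code (the name blank_line_comments is invented here): it handles only // line comments, ignores string and char literals, and assumes that blanked characters are replaced by spaces so offsets stay stable.

```crystal
# Hypothetical, simplified stand-in for #strip_non_code.
# Assumption: blanked characters become spaces, so offsets are preserved.
# Only `//` line comments are handled; strings and char literals are not.
def blank_line_comments(chars : Array(Char)) : Array(Char)
  out = chars.dup
  i = 0
  while i < out.size
    if out[i] == '/' && i + 1 < out.size && out[i + 1] == '/'
      # Blank everything up to (not including) the newline.
      while i < out.size && out[i] != '\n'
        out[i] = ' '
        i += 1
      end
    else
      i += 1
    end
  end
  out
end

# The comment contains a non-ASCII character and a call-shaped token.
src = "x(); // café: call y() here\nz();"
stripped = blank_line_comments(src.chars).join

puts stripped.includes?("y(")    # the phantom call is gone
puts stripped.size == src.size   # character offsets are unchanged
```

Working on src.chars rather than on the String keeps every out[i] access constant-time even though the comment holds a multi-byte character.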
Extended Modules
Defined in:
miniparsers/zig_callee_extractor.cr
Constant Summary
CALL_REGEX = /(?<![A-Za-z0-9_.@])(@?[A-Za-z_][A-Za-z0-9_]*(?:\s*\.\s*[A-Za-z_][A-Za-z0-9_]*)*)\s*\(/
A call expression: an optional @ (builtin) + identifier, then any number of .identifier accessors, followed by ( with optional whitespace in between. The lookbehind stops the match from starting in the middle of an identifier or chain, so foo.bar( yields foo.bar once, not also bar.
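As a rough illustration of what the pattern captures, the sketch below scans a made-up handler body. The regex is copied from the constant; the body string is an assumption for demonstration only.

```crystal
# Copied from the constant above.
CALL_REGEX = /(?<![A-Za-z0-9_.@])(@?[A-Za-z_][A-Za-z0-9_]*(?:\s*\.\s*[A-Za-z_][A-Za-z0-9_]*)*)\s*\(/

# Hypothetical handler body, for illustration only.
body = "const u = db.find(id); _ = @intCast(n); if (ok) respond(u);"

# Capture group 1 holds the (possibly dotted) callee name.
names = body.scan(CALL_REGEX).map { |m| m[1] }
puts names.inspect
```

Note that if is captured too: the regex alone cannot tell a keyword followed by ( from a call, which is why a keyword filter is needed afterwards.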
FUNCTION_REGEX = /(?:^|[^A-Za-z0-9_.])fn\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(/
Matches (?:pub )? (modifiers)* fn name ( and captures the function name. The extern "C" calling-convention string is already blanked by #strip_non_code, so only the keyword spacing has to be tolerated.
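A small sketch of the pattern applied to a made-up source snippet. The regex is copied from the constant; the Zig source is hypothetical.

```crystal
# Copied from the constant above.
FUNCTION_REGEX = /(?:^|[^A-Za-z0-9_.])fn\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(/

# Hypothetical Zig source, for illustration only.
source = "pub fn getUser(req: *Request) void {}\nfn hello() void {}\nconst r = my_fn(1);"

# Capture group 1 is the function name.
fn_names = source.scan(FUNCTION_REGEX).map { |m| m[1] }
puts fn_names.inspect
```

The call my_fn(1) is not picked up: the fn inside my_fn is preceded by an identifier character and is not followed by whitespace plus a name.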
KEYWORDS = Set {"addrspace", "align", "allowzero", "and", "anyframe", "anytype", "asm", "async", "await", "break", "callconv", "catch", "comptime", "const", "continue", "defer", "else", "enum", "errdefer", "error", "export", "extern", "fn", "for", "if", "inline", "noalias", "noinline", "nosuspend", "opaque", "or", "orelse", "packed", "pub", "resume", "return", "linksection", "struct", "suspend", "switch", "test", "threadlocal", "try", "union", "unreachable", "usingnamespace", "var", "volatile", "while"}
Zig keywords that are followed by ( in normal code (if (…), while (…), switch (…), catch (…)) and would otherwise be reported as callees. fn/return/try etc. never precede a call paren directly but are kept here for clarity.
NOISE_ROOTS = Set {"std"}
Receiver roots whose calls are pure noise for endpoint review context. std.* (std.debug.print, std.mem.eql, std.fmt.*) appears in nearly every handler and drowns out the meaningful callees.
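One way the two sets might be combined into a filter is sketched below. The helper keep_callee? and the constant KEYWORDS_SUBSET are hypothetical names; the actual filtering inside #callees_for_body is not shown in this page and may differ.

```crystal
# A small subset of KEYWORDS, for brevity (assumption: the real filter uses the full set).
KEYWORDS_SUBSET = Set{"if", "while", "switch", "catch"}
NOISE_ROOTS     = Set{"std"}

# Hypothetical helper: drop keywords and calls whose receiver root is noise.
def keep_callee?(name : String) : Bool
  return false if KEYWORDS_SUBSET.includes?(name)
  root = name.split('.').first.strip
  !NOISE_ROOTS.includes?(root)
end

candidates = ["db.find", "if", "std.debug.print", "respond", "stdlib.run"]
kept = candidates.select { |c| keep_callee?(c) }
puts kept.inspect
```

Comparing the whole root segment rather than a string prefix keeps names such as stdlib.run from being mistaken for std.* calls.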
TEST_BLOCK_RE = /(?:^|[^A-Za-z0-9_.])test\s*(?:"(?:[^"\\]|\\.)*"\s*)?\{/
test { … } / test "name" { … } block opener. Route registrations inside a test block are unit-test fixtures (or, in a framework's own source vendored as a loose file such as modules/httpz.zig or modules/router.zig, that framework's self-tests), never runtime endpoints.
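The following sketch checks the pattern against a few made-up lines. The regex is copied from the constant; the sample lines are assumptions chosen to show one named test, one anonymous test, and two near misses.

```crystal
# Copied from the constant above.
TEST_BLOCK_RE = /(?:^|[^A-Za-z0-9_.])test\s*(?:"(?:[^"\\]|\\.)*"\s*)?\{/

# Hypothetical lines, for illustration only.
samples = [
  "test \"routes\" {",        # named test block
  "test {",                   # anonymous test block
  "const latest = .{",        # `test` inside an identifier
  "fn test_helper() void {",  # `test` as an identifier prefix
]

results = samples.map { |s| !TEST_BLOCK_RE.match(s).nil? }
samples.each_with_index { |s, i| puts "#{s.inspect} => #{results[i]}" }
```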
VENDORED_FRAMEWORK_RE = /\/(?:deps|dep|lib|libs|vendor|vendored|pkg|pkgs|zig-pkg|packages|third_party|third-party|modules|subprojects|external|\.deps)\/(?:zap|httpz|http\.zig|tokamak|jetzig|zmpl|zmd)\//
A .zig file that belongs to a vendored copy of a framework checked into the source tree (zig-pkg/zap/…, src/deps/tokamak/…) rather than to the application. Such trees ship the framework's own tests and examples (zap/src/tests/test_auth.zig's .path = "/test", tokamak/example/'s @"GET /:name"), whose route literals would otherwise surface as phantom app endpoints. The standard fetched-dependency cache (.zig-cache) is already pruned upstream; this matches the manual-vendoring layouts: a vendor directory immediately followed by a framework package directory.
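To make the "vendor directory immediately followed by a framework package directory" rule concrete, here is a sketch that runs the pattern over a few invented paths. The regex is copied from the constant; the paths are hypothetical.

```crystal
# Copied from the constant above.
VENDORED_FRAMEWORK_RE = /\/(?:deps|dep|lib|libs|vendor|vendored|pkg|pkgs|zig-pkg|packages|third_party|third-party|modules|subprojects|external|\.deps)\/(?:zap|httpz|http\.zig|tokamak|jetzig|zmpl|zmd)\//

# Hypothetical paths, for illustration only.
paths = [
  "/app/zig-pkg/zap/src/tests/test_auth.zig", # vendor dir + framework dir
  "/app/src/deps/tokamak/example/main.zig",   # vendor dir + framework dir
  "/app/src/routes/zap_handlers.zig",         # application code
  "/app/lib/util/zap/router.zig",             # framework dir not directly under vendor dir
]

vendored = paths.map { |path| !VENDORED_FRAMEWORK_RE.match(path).nil? }
paths.each_with_index { |path, i| puts "#{path} => #{vendored[i]}" }
```

The last path is rejected because an extra directory sits between lib/ and zap/, so the two segments are not adjacent.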
Instance Method Summary
- #attach_to(endpoint : Endpoint, callees : Array(Entry))
- #callees_for_body(body : String, file_path : String, start_line : Int32) : Array(Entry)
- #find_matching(chars : Array(Char), open_index : Int32, open : Char, close : Char) : Int32 | Nil
  Find the index of the delimiter matching the opener at open_index.
- #function_bodies(source : String, file_path : String) : Hash(String, FunctionBody)
  Convenience map keyed by simple function name.
- #function_table(source : String, file_path : String) : Array(FunctionInfo)
- #in_test_block?(offset : Int32, ranges : Array(Tuple(Int32, Int32))) : Bool
- #line_at(chars : Array(Char), offset : Int32) : Int32
- #strip_comments(source : String) : String
  Like #strip_non_code but keeps the contents of double-quoted string literals: route paths live inside "…", so the framework analyzers scan this form to read the URL while still ignoring routes that sit in a comment, a \\ multiline doc-string, or a char literal.
- #strip_comments(chars : Array(Char)) : Array(Char)
- #strip_non_code(source : String) : String
- #strip_non_code(chars : Array(Char)) : Array(Char)
- #test_block_ranges(stripped : String) : Array(Tuple(Int32, Int32))
  Byte ranges of test blocks, brace-matched on the string-blanked source so a { / } inside a literal can't throw the matching off.
- #vendored_framework_path?(path : String) : Bool
Instance Method Detail
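#find_matching(chars : Array(Char), open_index : Int32, open : Char, close : Char) : Int32 | Nil
Find the index of the delimiter matching the opener at open_index.
Returns nil on imbalance. Operates on the already-stripped char array, so
braces inside strings and comments are gone.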
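A depth-counting scan is one plausible way to implement this contract. The sketch below is not the module's source; the name find_matching_sketch is hypothetical, and it assumes the input has already been stripped so every delimiter it sees is real.

```crystal
# Hypothetical re-implementation of the documented contract:
# return the index of the matching closer, or nil if the input is unbalanced.
def find_matching_sketch(chars : Array(Char), open_index : Int32, open : Char, close : Char) : Int32?
  depth = 0
  i = open_index
  while i < chars.size
    c = chars[i]
    if c == open
      depth += 1
    elsif c == close
      depth -= 1
      return i if depth == 0
    end
    i += 1
  end
  nil # ran off the end: unbalanced
end

chars = "fn f() { if (x) { y(); } }".chars
open_at = chars.index('{').not_nil!
close_at = find_matching_sketch(chars, open_at, '{', '}')
puts close_at.inspect

# An unbalanced input yields nil.
puts find_matching_sketch("{ {".chars, 0, '{', '}').inspect
```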
#function_bodies(source : String, file_path : String) : Hash(String, FunctionBody)
Convenience map keyed by simple function name. When two functions share a
name (e.g. several zap endpoint structs each defining get) the first
wins; callers that need struct scoping should filter #function_table
by offset instead.
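The first-wins rule and the offset-based alternative might look like the sketch below. FnInfo is a hypothetical stand-in for FunctionInfo (its real fields are not listed on this page), and the offsets and struct range are invented.

```crystal
# Hypothetical stand-in for FunctionInfo; the real fields may differ.
record FnInfo, name : String, offset : Int32

# Three functions, two of which share the simple name "get".
table = [
  FnInfo.new("get", 120),
  FnInfo.new("post", 340),
  FnInfo.new("get", 910),
]

# First-wins map: later duplicates do not overwrite earlier entries.
by_name = {} of String => FnInfo
table.each { |f| by_name[f.name] = f unless by_name.has_key?(f.name) }
puts by_name["get"].offset

# Struct scoping: pick the `get` that falls inside a known offset range
# (assumed here to be 800...1000 for the second endpoint struct).
scoped = table.select { |f| f.name == "get" && f.offset >= 800 && f.offset < 1000 }
puts scoped.first.offset
```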
#strip_comments(source : String) : String
Like #strip_non_code but keeps the contents of double-quoted string
literals: route paths live inside "…", so the framework analyzers scan
this form to read the URL while still ignoring routes that sit in a
comment, a \\ multiline doc-string, or a char literal.
#test_block_ranges(stripped : String) : Array(Tuple(Int32, Int32))
Byte ranges of test blocks, brace-matched on the string-blanked source so a
{ / } inside a literal can't throw the matching off. #in_test_block?
then lets an analyzer drop a route whose registration sits inside one.
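How an analyzer could use the ranges to drop test-only routes is sketched below. The helper in_range_sketch? is a hypothetical stand-in for #in_test_block?; the ranges and route offsets are invented, and treating both range bounds as inclusive is an assumption.

```crystal
# Hypothetical stand-in for #in_test_block?.
# Assumption: both ends of each range are inclusive.
def in_range_sketch?(offset : Int32, ranges : Array(Tuple(Int32, Int32))) : Bool
  ranges.any? { |r| r[0] <= offset && offset <= r[1] }
end

# Invented test-block ranges, shaped like the result of #test_block_ranges.
ranges = [{10, 50}, {200, 260}]

# Invented route registrations as {path, offset of the registration}.
registrations = [{"/users", 30}, {"/health", 120}, {"/fixture", 205}]

# Keep only routes registered outside every test block.
runtime = registrations.reject { |reg| in_range_sketch?(reg[1], ranges) }
puts runtime.map { |reg| reg[0] }.inspect
```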