module Noir::TreeSitterGoRouteExtractor

Overview

Tree-sitter-backed Go route extractor.

Scope for this first cut: recognise the idioms shared by Gin / Echo / Fiber / Hertz / Iris — a router or group object with HTTP-verb methods attached (r.GET("/path", handler)), plus .Group("/prefix") chaining so nested groups resolve correctly.

Deliberately not covered yet (legacy regex extractor still handles these):

All of the above can grow into this extractor once the PoC is proven.

Defined in:

miniparsers/go_route_extractor_ts.cr

Constant Summary

ANY_FAN_OUT_VERBS = ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"]

The seven canonical HTTP methods that catch-all registrations such as Gin's r.Any, Echo's e.Any, and Beego's * route stand for. Used by analyzer-level fan-out (see .fan_out_verbs).

BEEGO_CONTROLLER_HTTP_METHODS = {"Get" => "GET", "Post" => "POST", "Put" => "PUT", "Delete" => "DELETE", "Patch" => "PATCH", "Head" => "HEAD", "Options" => "OPTIONS"}

When a web.Router call carries no method-mapping string, Beego auto-maps incoming requests to controller methods whose names match an HTTP verb (Go-cased). Maps the receiver-method name to the HTTP verb it serves so a mapping-less registration emits exactly the methods the controller actually implements.

BEEGO_ROUTER_OPERANDS = Set {"web", "beego"}

Beego registers controllers with web.Router("/path", &Ctrl{}, "get:Method;post:Other"). The receiver is the web package (v2, github.com/beego/beego/v2/server/web) or the legacy beego package alias (v1, github.com/astaxie/beego). Restricting the operand to these two names keeps something.Router(...) calls on unrelated types from minting phantom endpoints.

ENGINE_CONSTRUCTORS = Set {"New", "Default", "NewRouter"}

Framework constructors that mint a root router/engine — the receiver they're assigned to carries no path prefix. A name bound to one of these is the application root, never a sub-group.

ENGINE_PARAM_TYPES = Set {"Engine", "Echo", "App", "Mux"}

Engine/root type names (final identifier of the parameter type, pointer stripped). A parameter of one of these types is the root router handed in by the caller: gin.Engine, echo.Echo, fiber.App, chi.Mux. mux.Router is intentionally omitted because its final identifier, Router, is shared with group types, so including it would exclude genuine group parameters.

HTTP_VERB_METHODS = Set {"GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS", "Get", "Post", "Put", "Delete", "Patch", "Head", "Options", "ANY", "Any", "All"}

HTTP verbs that Gin, Echo, Fiber, and similar frameworks accept as method names on router objects. Mixed case is allowed because both r.GET(...) (Gin) and r.Get(...) (Fiber and Gin alternatives) appear in the wild. The set also includes the catch-all names ANY, Any, and All.

NON_ROUTER_OPERANDS = Set {"gjson", "result", "results", "header", "headers", "Header", "Headers", "cookie", "cookies", "Cookie", "Cookies", "params", "Params", "values", "Values", "vars", "Vars", "url", "URL", "uri", "URI", "cache", "Cache", "db", "DB", "tx", "Tx", "conn", "Conn", "config", "cfg", "conf", "Config", "logger", "log", "client", "Client", "request", "Request", "req", "Req", "response", "Response", "resp", "Resp", "fixtures", "Fixtures", "slog", "zap", "http"}

Common non-router identifiers in Go code that expose .Get(string) or .Post(...) style methods but emit values, not routes. The selector-expression walk emits a verb route on every match of <operand>.<HttpVerb>(stringLit, ...), so without this guard patterns like gjson.Get(json, "Files.0.UID"), header.Get("Content-Type"), or params.Get("user") become bogus /Files.0.UID, /Content-Type, /user endpoints.

Keep this list conservative — it only rejects names that are almost never used to hold a real router instance. Generic names like r, c, app, mux, engine are intentionally not included.
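The false positives described above are easy to reproduce with the Go standard library alone. The sketch below is illustrative only: `lookups` shows two value-reading calls that have the same `<operand>.Get("literal")` shape as a route registration, and `looksLikeRoute` is a hypothetical, heavily simplified stand-in for the operand guard (it uses a three-name subset of the set, and is not the extractor's actual code).

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// nonRouterOperands is a small illustrative subset of NON_ROUTER_OPERANDS.
var nonRouterOperands = map[string]bool{"header": true, "params": true, "gjson": true}

// looksLikeRoute is a hypothetical guard: a Get/GET call counts as a route
// candidate only when its operand is not a known non-router name.
func looksLikeRoute(operand, method string) bool {
	if nonRouterOperands[operand] {
		return false
	}
	return method == "Get" || method == "GET"
}

// lookups performs two calls that look like r.Get("/path") syntactically
// but only read values; nothing is registered.
func lookups() (string, string) {
	header := http.Header{}
	header.Set("Content-Type", "application/json")
	params := url.Values{"user": {"alice"}}
	return header.Get("Content-Type"), params.Get("user")
}

func main() {
	contentType, user := lookups()
	fmt.Println(contentType, user)
	fmt.Println(looksLikeRoute("header", "Get")) // rejected by the guard
	fmt.Println(looksLikeRoute("r", "Get"))      // generic names stay eligible
}
```

Generic names such as `r` remain eligible in this sketch, which matches the note that the list only rejects names almost never bound to a router.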

PASSTHROUGH_CHAIN_METHODS = Set {"Use", "SetMeta", "RemoveMeta", "Middleware", "GlobalMiddleware", "CORS", "Bind", "Unbind", "BindFunc", "UnbindFunc"}

Chain methods that return the receiving router/group unchanged (middleware / metadata registration). Gin's RouterGroup.Use(...) and Engine.Use(...) return IRoutes, and Fiber's app.Use(...) returns its Router interface, so r.Use(mw).GET("/x", h) and r.Group("/api").Use(mw).POST(...) are valid, common shapes.

Goyave's router exposes a fluent builder whose configuration methods (SetMeta, Middleware, CORS, ...) all return the same *Router, so authRouter := subrouter.Group().SetMeta(k, v) binds authRouter to the group's prefix — the .SetMeta(...) tail must be peeled to reach the prefix-bearing .Group() call underneath (otherwise the parent prefix is lost and every route under authRouter falls back to /).

None of these add a path segment, so the operand walk peels them and resolves the prefix against the underlying router/group rather than dropping the route (or its prefix) entirely.
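The peeling step can be pictured as a walk over the receiver chain of a verb call. The following is a minimal sketch under stated assumptions: the `call` type, the `resolvePrefix` function, and the innermost-first chain representation are all hypothetical, and the real extractor works on tree-sitter nodes rather than a slice.

```go
package main

import "fmt"

// call is a hypothetical, flattened view of one method call in a chain.
type call struct {
	Method string
	Arg    string // first string argument, "" when absent
}

// passthrough mirrors PASSTHROUGH_CHAIN_METHODS from the constant above.
var passthrough = map[string]bool{
	"Use": true, "SetMeta": true, "RemoveMeta": true, "Middleware": true,
	"GlobalMiddleware": true, "CORS": true, "Bind": true, "Unbind": true,
	"BindFunc": true, "UnbindFunc": true,
}

// resolvePrefix walks the receiver chain (innermost call first), skips
// passthrough methods, and concatenates Group prefixes. It reports false
// when it meets a call it cannot interpret.
func resolvePrefix(chain []call) (string, bool) {
	prefix := ""
	for _, c := range chain {
		switch {
		case passthrough[c.Method]:
			continue // adds no path segment
		case c.Method == "Group":
			prefix += c.Arg
		default:
			return "", false
		}
	}
	return prefix, true
}

func main() {
	// r.Group("/api").Use(mw).POST("/x", h): receiver chain of the POST call.
	prefix, ok := resolvePrefix([]call{{"Group", "/api"}, {"Use", ""}})
	fmt.Println(prefix+"/x", ok)
}
```

Skipping `Use` rather than aborting on it is what keeps the `/api` prefix attached to the route.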

RESTFUL_PARAM_KINDS = {"PathParameter" => "path", "QueryParameter" => "query", "HeaderParameter" => "header", "BodyParameter" => "json", "FormParameter" => "form"}

RESTFUL_VERBS = ["GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"] of ::String


Class Method Detail

def self.fan_out_verbs(verb : String) : Array(String)

Returns the list of verbs to emit for a given extracted route verb. ANY and ALL expand to every canonical HTTP method (verbs are uppercased before they reach this helper, so the match is effectively case-insensitive). This lets downstream output formats list each method explicitly instead of carrying a non-HTTP "ANY" verb that consumers such as SARIF or Postman can't ingest. Anything else passes through as a single-element list.
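As an illustration of the fan-out rule, here is a minimal Go rendering of the described behaviour. It is a sketch, not the Crystal implementation: `fanOutVerbs` and `anyFanOutVerbs` are hypothetical names, and the input is assumed to be uppercased already, as the description states.

```go
package main

import "fmt"

// anyFanOutVerbs mirrors the ANY_FAN_OUT_VERBS constant documented above.
var anyFanOutVerbs = []string{"GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"}

// fanOutVerbs expands the catch-all verbs to every canonical method and
// passes any other verb through as a single-element slice.
func fanOutVerbs(verb string) []string {
	if verb == "ANY" || verb == "ALL" {
		out := make([]string, len(anyFanOutVerbs))
		copy(out, anyFanOutVerbs) // copy so callers cannot mutate the shared list
		return out
	}
	return []string{verb}
}

func main() {
	fmt.Println(fanOutVerbs("ANY"))
	fmt.Println(fanOutVerbs("GET"))
}
```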



Instance Method Detail

def collect_router_builder_callsites(source : String, builders : Set(String)) : Array(Tuple(String, String))

Finds calls to any of the named builder functions and returns [{func_name, first_arg_identifier}]. The first argument names the group passed in (addUserRoutes(v1) -> {"addUserRoutes", "v1"}), which the caller resolves to a prefix via the package group map.


def collect_router_group_builders(source : String) : Hash(String, RouterBuilder)

Detects top-level Gin router-builder helpers. The canonical gin project layout splits registration across func addXRoutes(rg *gin.RouterGroup) helpers called from a central getRoutes() with a versioned group (addUserRoutes(router.Group("/v1"))). The group prefix lives at the call site, not in the helper, so the helper's routes need that prefix grafted on (see #extract_routes_from_function). Returns {func_name => RouterBuilder}; only functions with exactly one *gin.RouterGroup parameter qualify (an ambiguous count can't be bound to a single prefix).


def extract_beego_routes(source : String, controller_methods : Hash(String, Array(String)) = Hash(String, Array(String)).new) : Array(Route)

Extracts Beego controller-style routes:

    web.Router("/health", ctrl, "get:Health")    -> GET /health
    web.Router("/x", c, "get,post:Handle")       -> GET /x, POST /x
    web.Router("/x", c, "get:Read;post:Write")   -> GET /x, POST /x
    web.Router("/any", c, "*:Any")               -> ANY /any (fan-out)
    web.Router("/", &MainController{})           -> verb routes for each HTTP method the controller implements

controller_methods (see #extract_controller_methods) supplies the method set for the mapping-less form; when the controller type can't be resolved (e.g. a cross-package &controllers.User{}), the route falls back to a single GET so the endpoint is still surfaced rather than dropped. The Route's handler carries the controller-method name so the analyzer can attribute it as a callee.
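The mapping-string forms listed above can be decoded with a two-level split. The sketch below is a hypothetical Go illustration inferred from those examples only (`;` separates clauses, `,` separates verbs, `*` stands for ANY); `parseBeegoMapping` and `beegoRoute` are invented names, and the extractor's real parsing may differ in edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// beegoRoute pairs an HTTP verb with the controller method that serves it.
type beegoRoute struct {
	Verb    string
	Handler string
}

// parseBeegoMapping decodes a string such as "get,post:Handle" or
// "get:Read;post:Write" into verb/handler pairs.
func parseBeegoMapping(mapping string) []beegoRoute {
	var routes []beegoRoute
	for _, clause := range strings.Split(mapping, ";") {
		parts := strings.SplitN(strings.TrimSpace(clause), ":", 2)
		if len(parts) != 2 {
			continue // not a "verbs:Handler" clause
		}
		handler := strings.TrimSpace(parts[1])
		for _, v := range strings.Split(parts[0], ",") {
			v = strings.ToUpper(strings.TrimSpace(v))
			if v == "" {
				continue
			}
			if v == "*" {
				v = "ANY" // left for the analyzer-level fan-out
			}
			routes = append(routes, beegoRoute{Verb: v, Handler: handler})
		}
	}
	return routes
}

func main() {
	fmt.Println(parseBeegoMapping("get,post:Handle"))
	fmt.Println(parseBeegoMapping("get:Read;post:Write"))
	fmt.Println(parseBeegoMapping("*:Any"))
}
```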


def extract_chi_routes(source : String, skip_functions : Set(String) = Set(String).new, external_string_values : Hash(String, String) = Hash(String, String).new) : Array(Route)

Recognises the net/http-style registrations that chi exposes alongside the verb shortcuts: r.MethodFunc("GET", "/x", h) (method as the first string argument, including custom verbs from chi.RegisterMethod) and r.HandleFunc("/x", h) / r.Handle("/x", h) (which match any method). These are gated behind ScopedConfig#net_http_methods? so that the gf walker, which shares this recognizer, is unaffected.


def extract_controller_methods(source : String) : Hash(String, Array(String))

Collects Beego controller types and the HTTP-verb-named methods they implement, keyed by the (package-unqualified) type name. Used to resolve mapping-less web.Router("/path", &Ctrl{}) registrations into the concrete set of methods the controller serves. Built once per package directory by the Beego analyzer (controllers and their router registrations usually share a package).

Only HTTP-verb method names are recorded — a MainController that defines Get, Health, Update contributes {"MainController" => ["Get"]}, because Beego's default mapping only routes verb-named methods; Health/Update are reachable solely via an explicit "get:Health" mapping string.
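The filtering rule in the MainController example amounts to a map lookup per method name. A minimal sketch, assuming a hypothetical `verbNamedMethods` helper and a Go copy of the BEEGO_CONTROLLER_HTTP_METHODS table:

```go
package main

import "fmt"

// beegoControllerHTTPMethods mirrors BEEGO_CONTROLLER_HTTP_METHODS.
var beegoControllerHTTPMethods = map[string]string{
	"Get": "GET", "Post": "POST", "Put": "PUT", "Delete": "DELETE",
	"Patch": "PATCH", "Head": "HEAD", "Options": "OPTIONS",
}

// verbNamedMethods keeps only the method names Beego's default mapping
// would route, preserving their original order.
func verbNamedMethods(methods []string) []string {
	var kept []string
	for _, m := range methods {
		if _, ok := beegoControllerHTTPMethods[m]; ok {
			kept = append(kept, m)
		}
	}
	return kept
}

func main() {
	// A controller defining Get, Health, Update contributes only Get.
	fmt.Println(verbNamedMethods([]string{"Get", "Health", "Update"}))
}
```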


def extract_engine_names(source : String) : Set(String)

Collects names that denote a root engine/router rather than a path-bearing group:

    r := gin.New() / r := gin.Default()
    r := chi.NewRouter() / e := echo.New()
    func setup(r *gin.Engine) / func setup(e *echo.Echo)

The cross-file group pre-pass excludes these so a same-named local group in a sibling file (e.g. r := v1.Group("/sysjob")) can't leak a prefix onto the root and contaminate every route in the package. Each file still resolves its own r locally during route extraction; this only governs what crosses file boundaries.


def extract_engine_names_and_groups(source : String, group_method : String = "Group", group_aliases : Array(String) = [] of String) : Tuple(Set(String), Hash(String, String))

Single-parse combination of #extract_engine_names + #extract_groups (with an empty external map). The Go engine's group pre-pass needs BOTH per file — the root-engine names to exclude from cross-file propagation and the file's own group declarations — so folding them into one tree-sitter parse halves the pre-pass parse count. Behaviour is identical to calling the two extractors separately; only the parse is shared.


def extract_gf_meta_routes(source : String) : Array(GfMetaRoute)

GoFrame standardized routing: scan every type X struct { ... } for an embedded g.Meta field whose tag declares a route (path:"/x" method:"get"). Each such struct is one endpoint (or several, when method lists more than one verb). The struct's own named fields become request params. This is method-/group-agnostic on purpose: the tag fully specifies the route, the same way gf's OpenAPI generator treats it, so we don't need to resolve the group.Bind(...) site (whose prefix is often a runtime config value we can't see statically).
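To make the tag-driven shape concrete, the sketch below reads the same kind of tag at runtime with Go's `reflect` package. This is a deliberately different technique from the extractor, which scans source text with tree-sitter and never runs the program. `Meta` is a local stand-in for `g.Meta`, `LoginReq` and `metaRoutes` are hypothetical, and the comma-separated `method` list is an assumption about GoFrame's tag syntax.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Meta stands in for GoFrame's g.Meta so the sketch has no third-party import.
type Meta struct{}

// LoginReq is a hypothetical request struct in the documented shape.
type LoginReq struct {
	Meta     `path:"/login" method:"post"`
	Username string
	Password string
}

// metaRoutes returns one "VERB /path" entry per verb declared on the embedded
// Meta field of a struct value (pass a struct, not a pointer).
func metaRoutes(v interface{}) []string {
	field, ok := reflect.TypeOf(v).FieldByName("Meta")
	if !ok || !field.Anonymous {
		return nil
	}
	path := field.Tag.Get("path")
	if path == "" {
		return nil
	}
	var routes []string
	for _, m := range strings.Split(field.Tag.Get("method"), ",") {
		if m = strings.ToUpper(strings.TrimSpace(m)); m != "" {
			routes = append(routes, m+" "+path)
		}
	}
	return routes
}

func main() {
	fmt.Println(metaRoutes(LoginReq{}))
}
```

Because the tag alone carries both path and method, nothing in this sketch needs to know where or how the struct is bound to a group.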


def extract_gf_routes(source : String) : Array(Route)

def extract_go_restful_routes(source : String) : Array(RestfulRoute)

def extract_goyave_statics(source : String) : Array(StaticPath)

Goyave-style <router>.Static(&fs, "/prefix", false): the first /-prefixed string argument is the URL prefix; the disk path is derived by stripping its leading slash (matching the legacy extractor's behaviour, which used the same identifier for both).


def extract_gozero_routes(source : String) : Array(Route)

go-zero registers routes as rest.Route struct literals rather than verb calls, in two shapes:

    server.AddRoutes(                        # generated routes.go
      []rest.Route{
        {Method: http.MethodPost, Path: "/user/login", Handler: h},
      },
      rest.WithPrefix("/usercenter/v1"),
    )

    server.AddRoute(rest.Route{Method: http.MethodGet, Path: "/"})
    apiGroup := server.Group("/api/v1")      # hand-written grouping
    apiGroup.AddRoute(rest.Route{Path: "/products", ...})

The verb/path live in the struct (not a .Get(...) call), and the mount prefix comes from a trailing rest.WithPrefix(...) option and/or a server.Group("/p") receiver — so the generic verb extractor sees nothing. This decodes every route to its full mounted path so it dedupes against the same route declared (prefix-applied) in a .api file. handler carries the registered handler expression for callee wiring.
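Decoding a struct-literal route comes down to two steps: mapping the `http.MethodX` expression to its verb and prepending the mount prefix. The sketch below illustrates that idea only; `restRoute`, `methodConstants`, and `mount` are hypothetical names, and it handles a single prefix without modelling how a group prefix and a `rest.WithPrefix` option combine.

```go
package main

import (
	"fmt"
	"net/http"
)

// methodConstants maps the source-level expression to the verb it denotes,
// using the real net/http constants as the values.
var methodConstants = map[string]string{
	"http.MethodGet":     http.MethodGet,
	"http.MethodPost":    http.MethodPost,
	"http.MethodPut":     http.MethodPut,
	"http.MethodPatch":   http.MethodPatch,
	"http.MethodDelete":  http.MethodDelete,
	"http.MethodHead":    http.MethodHead,
	"http.MethodOptions": http.MethodOptions,
}

// restRoute is a hypothetical view of one rest.Route literal as seen in source.
type restRoute struct {
	MethodExpr string // e.g. "http.MethodPost", exactly as written
	Path       string
}

// mount returns the verb and the full mounted path. ok is false when the
// method expression is not one of the known constants.
func mount(prefix string, r restRoute) (verb, path string, ok bool) {
	verb, ok = methodConstants[r.MethodExpr]
	return verb, prefix + r.Path, ok
}

func main() {
	verb, path, ok := mount("/usercenter/v1", restRoute{"http.MethodPost", "/user/login"})
	fmt.Println(verb, path, ok)
}
```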


def extract_groups(source : String, external_groups : Hash(String, String) = Hash(String, String).new, group_method : String = "Group", group_aliases : Array(String) = [] of String) : Hash(String, String)

Extracts only <name> := <parent>.<group_method>("/prefix") declarations. Used by the Go engine to run a cross-file fixpoint so group names defined in one file but referenced in another are known by the time #extract_routes runs on the referencing file.
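The cross-file fixpoint mentioned here can be sketched as repeated passes over the collected declarations until a pass resolves nothing new. This is an assumed shape, not the engine's actual code: `groupDecl` and `resolveGroups` are hypothetical, and `known` plays the role of the already-resolved names (for example a root engine with an empty prefix).

```go
package main

import "fmt"

// groupDecl is a hypothetical record of one
// <name> := <parent>.Group("/prefix") declaration.
type groupDecl struct {
	Name, Parent, Prefix string
}

// resolveGroups keeps resolving declarations whose parent is already known
// until a full pass makes no progress, so declaration order across files
// does not matter.
func resolveGroups(decls []groupDecl, known map[string]string) map[string]string {
	resolved := make(map[string]string, len(known)+len(decls))
	for name, prefix := range known {
		resolved[name] = prefix
	}
	for changed := true; changed; {
		changed = false
		for _, d := range decls {
			if _, done := resolved[d.Name]; done {
				continue
			}
			parent, ok := resolved[d.Parent]
			if !ok {
				continue // parent not resolved yet; retry on the next pass
			}
			resolved[d.Name] = parent + d.Prefix
			changed = true
		}
	}
	return resolved
}

func main() {
	decls := []groupDecl{
		{"users", "v1", "/users"}, // seen first, e.g. in routes.go
		{"v1", "r", "/v1"},        // declared in main.go
	}
	groups := resolveGroups(decls, map[string]string{"r": ""})
	fmt.Println(groups["v1"], groups["users"])
}
```

In this example `users` is skipped on the first pass because `v1` is still unknown, and resolved on the second.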


def extract_mux_statics(source : String) : Array(StaticPath)

Mux-style <router>.PathPrefix("/x/").Handler(<... http.Dir("./x/") ...>). URL prefix comes from the PathPrefix arg; disk path from the http.Dir(...) call nested somewhere inside the Handler(...) argument expression.


def extract_routes(source : String, external_groups : Hash(String, String) = Hash(String, String).new, group_method : String = "Group", handle_method : String | Nil = nil, handlefunc_methods : Bool = false, group_aliases : Array(String) = [] of String, extra_verbs : Array(String) = [] of String, handle_many_method : String | Nil = nil, closure_group_methods : Array(String) = [] of String) : Array(Route)

Parses source and returns every verb route it can resolve.

external_groups supplies group prefixes defined in other files of the same Go package, so cross-file patterns like routes.go calling v1.GET(...) under a v1 := r.Group("/v1") declared in main.go resolve correctly.

group_method is the method name used for grouping: Gin/Echo/Fiber/Hertz use .Group(...), Iris uses .Party(...). Mux uses the special two-call chain <parent>.PathPrefix("/prefix").Subrouter(); pass "Subrouter" and the collector will peek through the chain to pull the prefix from the .PathPrefix(...) call.

handle_method is the "method-first" shape some routers use (httprouter's .Handle("METHOD", "/path", handler)); set to nil to disable.

handlefunc_methods enables mux's <router>.HandleFunc("/path", h).Methods("METHOD") chain. The outer call is .Methods(...), so this piggybacks on the walk rather than decode_verb_call.


def extract_routes_from_function(source : String, func_name : String, external_groups : Hash(String, String), handle_method : String | Nil = nil) : Array(Route)

Extracts the verb routes registered inside one named function's body, seeding external_groups with the function's group parameter bound to a call-site prefix ({rg => "/v1"}). This grafts the call-site prefix onto routes a router-builder helper registers on its parameter group (users := rg.Group("/users"); users.GET("/") -> /v1/users/). Route line numbers stay relative to source so code paths remain accurate.


def extract_simple_statics(source : String, method_name : String = "Static") : Array(StaticPath)

<router>.<method_name>("/prefix", "./dir", ...). The first two string args are taken as (url_prefix, disk_path). Covers the Gin/Echo/Fiber/Hertz/GoZero shape.


def extract_string_values(source : String) : Hash(String, String)

Collects <name> := "literal" / const <name> = "literal" string bindings from source, keyed by name. Real chi/mux apps routinely declare route paths as package constants (const tokenPath = "/api/v2/token") and register them with r.Get(tokenPath, h); the analyzer merges these per-package so the scoped walker can resolve a constant/variable path argument to its literal value. Conflicting redefinitions are dropped by collect_string_values.
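For readers who want to experiment with the same idea, the sketch below collects string bindings using Go's own `go/parser` and `go/ast` packages. This swaps in a different parser from the tree-sitter one the extractor uses, and `collectStringValues` and `demoSrc` are hypothetical names. It drops any name bound to two different literals, which is one plausible reading of "conflicting redefinitions".

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// demoSrc is a made-up Go file: one constant, one local, one conflicting name.
const demoSrc = `package demo

const tokenPath = "/api/v2/token"

func setup() {
	health := "/healthz"
	dup := "/a"
	_, _ = health, dup
}

func other() {
	dup := "/b"
	_ = dup
}
`

// collectStringValues gathers const NAME = "lit" and NAME := "lit" bindings,
// dropping names that are bound to more than one distinct literal.
func collectStringValues(src string) (map[string]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	values := map[string]string{}
	conflicted := map[string]bool{}
	record := func(name string, expr ast.Expr) {
		lit, ok := expr.(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return
		}
		s, err := strconv.Unquote(lit.Value)
		if err != nil {
			return
		}
		if old, seen := values[name]; seen && old != s {
			conflicted[name] = true
		}
		values[name] = s
	}
	ast.Inspect(file, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.GenDecl:
			if x.Tok != token.CONST {
				break
			}
			for _, spec := range x.Specs {
				vs := spec.(*ast.ValueSpec) // const declarations hold only ValueSpecs
				for i, id := range vs.Names {
					if i < len(vs.Values) {
						record(id.Name, vs.Values[i])
					}
				}
			}
		case *ast.AssignStmt:
			if x.Tok != token.DEFINE || len(x.Lhs) != len(x.Rhs) {
				break
			}
			for i, lhs := range x.Lhs {
				if id, ok := lhs.(*ast.Ident); ok {
					record(id.Name, x.Rhs[i])
				}
			}
		}
		return true
	})
	for name := range conflicted {
		delete(values, name)
	}
	return values, nil
}

func main() {
	values, err := collectStringValues(demoSrc)
	fmt.Println(values, err)
}
```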


def walk_chi_public(node : LibTreeSitter::TSNode, source : String, sink : Array(Route), string_values : Hash(String, String) = Hash(String, String).new)

Exposes the closure-scoped walker against an arbitrary node (typically a function body captured elsewhere). Uses chi defaults incl. the net/http registrations (MethodFunc/HandleFunc/Handle) so a Mount-expanded router function body is parsed like any chi file.

