class Noir::JSRouteExtractor

Overview

JSRouteExtractor provides a unified interface for extracting routes from JavaScript files

Defined in:

miniparsers/js_route_extractor.cr

Constant Summary

BRACKET_ROUTE_CALL_PATTERN = /\[\s*['"](?:get|post|put|delete|del|patch|options|head|all)['"]\s*\]\s*\(/i
CLIENT_SIDE_FRAMEWORK_MARKERS = ["from \"vue\"", "from 'vue'", "from \"@vue/", "from '@vue/", "from \"vue-router\"", "from 'vue-router'", "from \"@vueuse/", "from '@vueuse/", "from \"pinia\"", "from 'pinia'", "from \"react\"", "from 'react'", "from \"react-dom", "from 'react-dom", "from \"react-router", "from 'react-router", "from \"@angular/", "from '@angular/", "from \"svelte\"", "from 'svelte'", "from \"svelte/", "from 'svelte/", "from \"solid-js", "from 'solid-js", "from \"preact\"", "from 'preact'", "from \"preact/", "from 'preact/", ".vue\"", ".vue'", ".svelte\"", ".svelte'"]

Client-side UI framework imports. A file that imports a browser UI framework (Vue, React, Angular, Svelte, Solid, Preact) or one of its satellite libs (pinia, vue-router, @vueuse, react-router, ...) is SPA/frontend code, not an HTTP server. Its route-shaped calls are outbound API-client requests against a configured client; for example, directus's admin app calls api.get(`/users/${userId}`), where api is a wrapped axios instance imported from @/api. The existing axios/got/ky markers miss these because the wrapper hides the raw client behind a local module, but the UI-framework import is an unambiguous "this is browser code" signal. directus's admin SPA alone parks ~61 phantom Express endpoints across app/src/{stores,composables,layouts,...} this way. Like the test-stub markers, this is gated by the HTTP-server-import exemption below: an SSR entrypoint that imports BOTH vue and express keeps its routes.

FLEXIBLE_ROUTE_CALL_PATTERN = /\.(?:\s|\n|\r)*(?:get|post|put|delete|del|patch|options|head|all|route|register|use)(?:\s|\n|\r)*\(/i
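
To illustrate which call shapes the two patterns accept, here is a small sketch. The regexes are copied verbatim from the constants above; the names BRACKET_PATTERN and FLEXIBLE_PATTERN and the sample inputs are made up for this illustration.

```crystal
# Copied verbatim from the documented constants; the local names are hypothetical.
BRACKET_PATTERN  = /\[\s*['"](?:get|post|put|delete|del|patch|options|head|all)['"]\s*\]\s*\(/i
FLEXIBLE_PATTERN = /\.(?:\s|\n|\r)*(?:get|post|put|delete|del|patch|options|head|all|route|register|use)(?:\s|\n|\r)*\(/i

# Bracket-notation call: the verb is a quoted property name.
puts BRACKET_PATTERN.matches?("app['get']('/users', handler)") # => true

# Chained call where the dot and the verb sit on a new line.
puts FLEXIBLE_PATTERN.matches?("router\n  .post('/users', handler)") # => true

# A longer method name that merely starts with a verb is not a match,
# because the verb must be followed (after optional whitespace) by "(".
puts FLEXIBLE_PATTERN.matches?("repo.getUser(id)") # => false
```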
HTTP_SERVER_LIBRARY_MARKERS = ["from \"express\"", "from 'express'", "require(\"express\")", "require('express')", "from \"fastify\"", "from 'fastify'", "require(\"fastify\")", "require('fastify')", "from \"koa\"", "from 'koa'", "require(\"koa\")", "require('koa')", "from \"hono\"", "from 'hono'", "require(\"hono\")", "require('hono')", "from \"restify\"", "from 'restify'", "require(\"restify\")", "require('restify')", "from \"polka\"", "from 'polka'", "from \"h3\"", "from 'h3'", "from \"@nestjs/", "from '@nestjs/"]

Real HTTP-server library imports. When any of these is present alongside a test-stub marker, the file is doing legitimate server work (e.g., spinning up a test instance of an Express app) and we still want to extract its routes.

MINIFIED_AVG_LINE_THRESHOLD = 1000

Average bytes-per-line above which a file is considered dominated by long lines, i.e. a bundle rather than hand-written source that merely carries one fat literal (a big inline JSON seed, an embedded base64 data URI, a long regex). Real code keeps the average low because it has many short lines around any such literal.

MINIFIED_LINE_THRESHOLD = 5000

Byte length above which a single source line is considered "long". Hand-written JS/TS keeps lines well under this even in dense route tables (noir's own widest fixture line is ~150 bytes); webpack/rollup/esbuild bundles and *.min.js assets routinely pack tens of thousands of bytes onto one line, so 5000 leaves a wide margin. NB: the metric is bytes, not characters, so a dense single-line non-Latin blob (>=5000 bytes but fewer chars) can trip it, which is acceptable since real route registrations are ASCII verbs/paths.

PARSER_ROUTE_CALL_HINTS = [".get(", ".post(", ".put(", ".delete(", ".patch(", ".options(", ".head(", ".all(", ".route(", ".register(", ".use(", ".get (", ".post (", ".put (", ".delete (", ".patch (", ".options (", ".head (", ".all (", ".route (", ".register (", ".use ("]

Pre-filter for .extract_routes: returns false when content contains no shape the JS parser knows how to emit (any verb invocation pattern like .get(/.post(/... or Fastify/Restify .route(, plus Express-style mounts .use( which feed into the cross-file router prefix table). Substring-checking is millions of times cheaper than tokenizing the file.
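
The pre-filter described above amounts to a plain substring scan. A minimal sketch, assuming a trimmed hint list; HINTS and route_call_hint? are hypothetical names used only for this illustration, not Noir's own code.

```crystal
# Hypothetical, trimmed hint list; the documented PARSER_ROUTE_CALL_HINTS is longer.
HINTS = [".get(", ".post(", ".route(", ".use(", ".get (", ".use ("]

# Returns true when at least one hint occurs in the content, i.e. the file
# is worth handing to the far more expensive parser.
def route_call_hint?(content : String) : Bool
  HINTS.any? { |hint| content.includes?(hint) }
end

puts route_call_hint?("app.get('/users', handler)") # => true
puts route_call_hint?("const total = a + b")        # => false
```

A false result lets the caller skip tokenizing entirely; a true result only means the file might contain a route, so the parser still has the final say.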

ROUTER_PREFIX_KEY = Analyzer::Javascript::ExpressConstants::ROUTER_PREFIX_KEY

Import constants for key generation

STRICT_TEST_PATH_MARKERS = ["/e2e/", "/cypress/", "/playwright/", "/__mocks__/", "/__tests__/", "/e2e-tests/", "/mirage/"]

True when the file's route-shaped calls are almost certainly mock-server stubs (Ember pretender, MSW, nock, ...) rather than real route registrations. Two routes:

Path markers strict enough that the HTTP-server-import exemption shouldn't apply: /e2e/, /cypress/, /playwright/, /__mocks__/, /__tests__/, /e2e-tests/, /mirage/. Real apps never park production handlers under any of these — even when the harness file imports express to spin up a faked service (Ghost's e2e/helpers/services/stripe/fake-stripe-server.ts is the canonical example). Keeping the exemption out of these paths catches the harness fakes without affecting legit backend code.

TEST_STUB_FILENAME_MARKERS = [".test.", ".spec.", "-spec.", "-test.", ".test-d."]

Hard test-file markers: when the filename itself follows a ubiquitous test convention, the file practically never defines real routes. Skip these even when the file imports a real HTTP server lib — NestJS e2e tests routinely import @nestjs/platform-express for type-only references, and supertest harnesses import the same modules they exercise. The supertest request(app).get(...) shape would otherwise ride the HTTP-server-import exemption straight back into the parser.

TEST_STUB_LIBRARY_MARKERS = ["pretender", "miragejs", "ember-cli-mirage", "from \"msw\"", "from 'msw'", "from \"msw/", "from 'msw/", "require(\"msw\")", "require('msw')", "from \"nock\"", "from 'nock'", "require(\"nock\")", "require('nock')", "setupApplicationTest", "setupRenderingTest", "/// <reference types=\"cypress\" />", "from \"cypress\"", "from 'cypress'", "require(\"cypress\")", "require('cypress')", "from \"@playwright/test\"", "from '@playwright/test'", "from \"playwright\"", "from 'playwright'", "from \"supertest\"", "from 'supertest'", "require(\"supertest\")", "require('supertest')", "from \"axios\"", "from 'axios'", "require(\"axios\")", "require('axios')", "from \"purest\"", "from 'purest'", "require(\"purest\")", "require('purest')", "from \"got\"", "from 'got'", "require(\"got\")", "require('got')", "from \"ky\"", "from 'ky'", "require(\"ky\")", "require('ky')", "from \"superagent\"", "from 'superagent'", "require(\"superagent\")", "require('superagent')", "from \"node-fetch\"", "from 'node-fetch'", "require(\"node-fetch\")", "require('node-fetch')", "from \"ofetch\"", "from 'ofetch'", "require(\"ofetch\")", "require('ofetch')", "from \"undici\"", "from 'undici'", "require(\"undici\")", "require('undici')", "from \"request\"", "from 'request'", "require(\"request\")", "require('request')", "from \"apollo-datasource-rest\"", "from 'apollo-datasource-rest'", "require(\"apollo-datasource-rest\")", "require('apollo-datasource-rest')", "from \"@apollo/datasource-rest\"", "from '@apollo/datasource-rest'", "require(\"@apollo/datasource-rest\")", "require('@apollo/datasource-rest')"]

Test-fixture libraries whose API mimics route registration: pretender/miragejs expose server.get("/x", ...), MSW and nock expose handler builders, sinon-via-faker likewise. When these libraries are imported, virtually every route-shaped call in the file is a stub, not a real registration. Substring match is enough — these tokens never appear in production HTTP server source under normal circumstances.

TEST_STUB_PATH_MARKERS = ["-pretender.", "-pretenders.", ".pretender.", "-mirage.", ".mirage.", "/tests/helpers/", "/test/helpers/", "/tests/api/", "/__tests__/", "/test/integration/", "/tests/integration/", "/test/e2e/", "/tests/e2e/", "/cypress/", "/playwright/", "/e2e-tests/", "/e2e/", "/mirage/", "/__mocks__/", "/dist/", "/build/", "/.next/", "/.nuxt/", "/.output/", "/coverage/", "/vendor/", "/app/javascript/", "/public/"]

Path-level evidence that a file is a mock-server fixture. Pretender helpers in particular get a helper/this arg and call this.get(...) / this.post(...) directly, so they have no library-name imports the content filter can hook on — fall back to the convention-based filename match.

Class Method Summary

Class Method Detail

def self.attach_callees(endpoint : Endpoint, callees_by_route : Hash(String, Array(JSCalleeExtractor::Entry)), method : String, path : String, line : Int32)

def self.extract_body_params(handler_body : String, endpoint : Endpoint)

def self.extract_cookie_params(handler_body : String, endpoint : Endpoint)

def self.extract_header_params(handler_body : String, endpoint : Endpoint)

def self.extract_params_from_context(content : String, pattern : JSRoutePattern, endpoint : Endpoint)

def self.extract_path_params(handler_body : String, endpoint : Endpoint)

def self.extract_query_params(handler_body : String, endpoint : Endpoint)

def self.extract_routes(file_path : String, content : String | Nil = nil, debug : Bool = false, *, include_callees : Bool = false, route_callees : Hash(String, Array(JSCalleeExtractor::Entry)) | Nil = nil) : Array(Endpoint)

def self.extract_static_paths(content : String, framework : Symbol | Nil = nil) : Array(Hash(String, String))

Extract static path declarations from JavaScript content. Returns an array of hashes with static_path (URL prefix) and file_path (directory).

framework scopes the scan to one framework's static-mount idiom so that a framework analyzer running over a sibling project's file (every JS analyzer walks all .js/.ts files) doesn't pick up another framework's static declaration and re-emit it under the wrong tech. nil runs every pattern (back-compat for un-scoped callers).


def self.find_matching_brace(content : String, open_brace_idx : Int32) : Int32 | Nil

Delegate to JSLiteralScanner for literal-aware brace matching


def self.find_matching_paren(content : String, open_paren_idx : Int32) : Int32 | Nil

Delegate to JSLiteralScanner for literal-aware paren matching
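
"Literal-aware" matching means that brackets inside string literals are not counted. The sketch below shows the idea for parentheses; matching_paren is a hypothetical helper, not JSLiteralScanner's implementation, and it deliberately ignores comments, regex literals, and ${...} interpolation. The same loop works for braces by swapping the two bracket characters.

```crystal
# Sketch only: returns the index of the parenthesis that closes the one at
# open_idx, skipping parentheses inside '...', "..." and `...` literals.
def matching_paren(content : String, open_idx : Int32) : Int32?
  return nil unless content[open_idx]? == '('
  depth = 0
  quote : Char? = nil # the quote character of the literal we are inside, if any
  escaped = false
  idx = open_idx
  while idx < content.size
    ch = content[idx]
    if q = quote
      if escaped
        escaped = false # this character was escaped by a backslash
      elsif ch == '\\'
        escaped = true
      elsif ch == q
        quote = nil # literal closed
      end
    elsif ch == '\'' || ch == '"' || ch == '`'
      quote = ch
    elsif ch == '('
      depth += 1
    elsif ch == ')'
      depth -= 1
      return idx if depth == 0
    end
    idx += 1
  end
  nil # unbalanced input
end

src = "app.get('/a)b', (req, res) => res.send(')'))"
open_idx = src.index('(') || 0
puts matching_paren(src, open_idx) == src.size - 1 # => true
```

A naive counter would stop at the ")" inside '/a)b'; tracking the open literal is what keeps the handler body intact.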


def self.minified_content?(content : String, line_threshold : Int32 = MINIFIED_LINE_THRESHOLD, avg_threshold : Int32 = MINIFIED_AVG_LINE_THRESHOLD) : Bool

True when content looks like a minified/bundled asset rather than hand-written source. Two conditions must BOTH hold so we never drop the routes of a normal file that just happens to carry one long line (issue #1903 review):

  1. at least one line reaches MINIFIED_LINE_THRESHOLD bytes, and
  2. the file's average line length reaches MINIFIED_AVG_LINE_THRESHOLD: long lines dominate and newline density is low.

webpack/rollup output and *.min.js satisfy both (the whole file is one or a few enormous lines); a route module with a 7 KB inline payload amid dozens of short route lines meets the first condition but not the second, so its real endpoints survive. Skipping a file is purely a parser optimization (small files lex fast regardless), so there is no need to skip one merely because it embeds a fat literal.
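
The two-condition check can be sketched as follows. This is an illustration under assumptions, not the method's source: looks_minified? is a hypothetical name, and the average is taken as total bytes divided by line count, which the documentation does not confirm.

```crystal
# Defaults mirror the documented MINIFIED_LINE_THRESHOLD and MINIFIED_AVG_LINE_THRESHOLD.
LINE_THRESHOLD = 5000
AVG_THRESHOLD  = 1000

# Sketch only: content counts as minified when BOTH the longest line and
# the average line length (in bytes) reach their thresholds.
def looks_minified?(content : String, line_threshold : Int32 = LINE_THRESHOLD, avg_threshold : Int32 = AVG_THRESHOLD) : Bool
  return false if content.empty?
  lines = content.lines
  longest = lines.max_of(&.bytesize)
  average = content.bytesize // lines.size # assumed definition of "average"
  longest >= line_threshold && average >= avg_threshold
end

# One enormous line, as a bundler would emit.
bundle = "a" * 20_000
# Fifty short route lines plus a single 7000-byte literal.
mixed = (["app.get('/x', h)"] * 50).join("\n") + "\n" + "x" * 7000

puts looks_minified?(bundle) # => true
puts looks_minified?(mixed)  # => false
```

The second sample has a line over 5000 bytes, yet the many short lines keep its average far below 1000, so it is still parsed.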

def self.normalize_http_method(method : String) : String

Normalize HTTP method names to standard format


def self.route_call_candidate?(content : String) : Bool

def self.strip_js_comments(content : String) : String

Replace JS/TS comments with whitespace of the same shape. Preserves newlines and column offsets so downstream line/column math (controller_start_line, regex .begin(0), etc.) stays accurate. Comment bodies are blanked to spaces so a commented-out decorator like // @Get('/old') never matches the route regex.
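
A rough sketch of shape-preserving comment blanking, assuming a simple character scanner. blank_comments is a hypothetical helper, not the method's source: it copies quoted and template literals verbatim, does not recognise regex literals, and preserves character positions (not byte positions).

```crystal
# Sketch only: blanks // and /* */ comments to spaces while keeping every
# newline, so line numbers and character columns are unchanged.
def blank_comments(src : String) : String
  chars = src.chars
  String.build do |io|
    i = 0
    while i < chars.size
      ch = chars[i]
      nxt = chars[i + 1]?
      if ch == '/' && nxt == '/'
        # Line comment: blank up to, but not including, the newline.
        while i < chars.size && chars[i] != '\n'
          io << ' '
          i += 1
        end
      elsif ch == '/' && nxt == '*'
        # Block comment: blank the opener, the body, and the closer.
        io << "  "
        i += 2
        while i < chars.size
          if chars[i] == '*' && chars[i + 1]? == '/'
            io << "  "
            i += 2
            break
          end
          io << (chars[i] == '\n' ? '\n' : ' ')
          i += 1
        end
      elsif ch == '\'' || ch == '"' || ch == '`'
        # Literal: copy verbatim, honouring backslash escapes.
        io << ch
        i += 1
        while i < chars.size
          c = chars[i]
          io << c
          i += 1
          if c == '\\' && i < chars.size
            io << chars[i]
            i += 1
          elsif c == ch
            break
          end
        end
      else
        io << ch
        i += 1
      end
    end
  end
end

src = "// @Get('/old')\n@Get('/new') /* keep\nlines */ x"
blanked = blank_comments(src)
puts blanked.size == src.size           # => true
puts blanked.includes?("/old")          # => false
puts blanked.includes?("@Get('/new')")  # => true
```

Because the output has the same length and the same newlines as the input, an offset found in the blanked text can be used directly against the original.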


def self.test_stub_only?(file_path : String, content : String, include_client_frameworks : Bool = true) : Bool

  • Filename markers fire unconditionally: foo.test.ts is a test no matter what it imports.
  • Strict path markers also fire unconditionally: e2e/, cypress/, etc. are dedicated test/mock trees that never contain production handlers, even when the harness file imports a server lib.
  • Library markers and the remaining directory markers honor an exemption: if the file also imports a real HTTP server lib (express, fastify, ...), keep it so legit test-server harnesses (e.g. mattermost's webhook_serve.js) keep their routes.

include_client_frameworks controls whether a client-side UI framework import (Vue/React/...) counts as a skip signal. It must be ON for the verb-DSL extractor (a React/Vue file calling api.get(...) is an outbound client call, not a route), but OFF for analyzers whose OWN route definitions live in client-side files: TanStack Router (createFileRoute) and tRPC route modules routinely import { ... } from 'react', and skipping them on that basis dropped every such route. The test-stub library markers (msw/supertest/...) and path/filename markers still apply in both modes.
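
The decision order described above can be sketched like this. Everything here is illustrative: the marker lists are heavily trimmed stand-ins for the documented constants, hits? and stub_only? are hypothetical names, and the real method may differ in details such as whether filename markers are tested against the full path.

```crystal
# Hypothetical, trimmed marker lists; the documented constants are much longer.
FILENAME_MARKERS    = [".test.", ".spec."]
STRICT_PATH_MARKERS = ["/e2e/", "/__mocks__/"]
SOFT_PATH_MARKERS   = ["/tests/helpers/", "/dist/"]
STUB_LIB_MARKERS    = ["from 'msw'", "from 'axios'"]
CLIENT_MARKERS      = ["from 'react'", "from 'vue'"]
SERVER_MARKERS      = ["from 'express'", "require('express')"]

def hits?(text : String, markers : Array(String)) : Bool
  markers.any? { |marker| text.includes?(marker) }
end

def stub_only?(file_path : String, content : String, include_client_frameworks : Bool = true) : Bool
  # 1. Filename and strict path markers fire unconditionally.
  return true if hits?(file_path, FILENAME_MARKERS)
  return true if hits?(file_path, STRICT_PATH_MARKERS)

  # 2. Collect the softer skip signals.
  skip = hits?(file_path, SOFT_PATH_MARKERS) || hits?(content, STUB_LIB_MARKERS)
  skip ||= include_client_frameworks && hits?(content, CLIENT_MARKERS)

  # 3. A real HTTP-server import exempts the file from the soft signals.
  skip && !hits?(content, SERVER_MARKERS)
end

vue_only = "import { ref } from 'vue'"
ssr = "import { ref } from 'vue'\nimport express from 'express'"

puts stub_only?("src/users.test.ts", "import express from 'express'")       # => true
puts stub_only?("src/store.ts", vue_only)                                   # => true
puts stub_only?("src/store.ts", vue_only, include_client_frameworks: false) # => false
puts stub_only?("src/ssr.ts", ssr)                                          # => false
```

Putting the unconditional checks first is what keeps a test file that imports express from slipping back into the parser through the exemption.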
