module NoirPassiveScan::FalsePositive
Overview
Heuristics that drop passive-scan results which trip a keyword/regex matcher but are demonstrably not hard-coded secrets — chiefly runtime indirections (environment-variable reads, CI templating expressions) where the real value lives outside the source tree.
Guiding invariant: never hide a real literal. Every form matched
here is one that cannot carry a checked-in secret — a
${{ secrets.FOO }} reference, an os.getenv("FOO") read, a
<your-token> placeholder — so suppressing it cannot turn a true
positive into a false negative. Anything that could be a literal
(an actual ghp_… token, a PEM block, a quoted high-entropy value)
is left untouched.
Scope is intentionally narrow: only secret-category findings are
eligible. The dominant false-positive source for the bundled secret
rules is their word matcher firing on a variable name
(GITHUB_TOKEN, AWS_ACCESS_KEY_ID, …) on lines that merely
reference the variable rather than assign it a literal value.
Defined in:
passive_scan/false_positive.cr
Constant Summary
ASSIGNMENT_VALUE = /(?::|=>?)\s*(.+?)\s*$/
Captures the value to the right of the first assignment separator (:, =, or the PHP/Ruby hash arrow =>), trimming surrounding whitespace. Lines with no assignment separator (e.g. a bare -----BEGIN RSA PRIVATE KEY-----) never match, so PEM blocks and similar literals fall through untouched.
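As a rough illustration of what the capture group yields, the sketch below applies the pattern to a few lines. Only the regex is taken from the summary above; the helper assigned_value is hypothetical and not part of the module.

```crystal
# Pattern copied from the constant summary.
ASSIGNMENT_VALUE = /(?::|=>?)\s*(.+?)\s*$/

# Hypothetical helper (not part of the module): the captured value, or nil
# when the line has no assignment separator followed by a value.
def assigned_value(line : String) : String?
  line.match(ASSIGNMENT_VALUE).try &.[1]
end

p assigned_value("password: hunter2  ")             # => "hunter2"
p assigned_value("TOKEN => $(vault read token)")    # => "$(vault read token)"
p assigned_value("-----BEGIN RSA PRIVATE KEY-----") # => nil
```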
COMMENT_PREFIXES = ["#", "//", "/*", "*", "<!--"]
Whole-line comment markers across the common languages noir scans (shell/Python/Ruby/YAML #; C-family //, /* and *; HTML/XML/MD <!--). A variable name mentioned in a comment is never a leaked secret; a real literal in a comment is still caught by the value-shape regex gate, so this can only drop false positives.
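A minimal sketch of how a whole-line comment check over these prefixes might look, assuming it strips leading whitespace and tests each prefix in turn. The method name comment_line_sketch? is hypothetical; the module's actual comment_line? body may differ.

```crystal
# Prefix list copied from the constant summary.
COMMENT_PREFIXES = ["#", "//", "/*", "*", "<!--"]

# Sketch only: mirrors the documented behaviour of comment_line?, not its source.
def comment_line_sketch?(line : String) : Bool
  stripped = line.lstrip
  COMMENT_PREFIXES.any? { |prefix| stripped.starts_with?(prefix) }
end

p comment_line_sketch?("  # export GITHUB_TOKEN before running") # => true
p comment_line_sketch?(" * @param AWS_ACCESS_KEY_ID the key id") # => true
p comment_line_sketch?("run(GITHUB_TOKEN) # needs a token")      # => false
```

A trailing comment does not count: only the leading non-space content is inspected.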
EMPTY_ASSIGNMENT = /(?::|=>?)\s*$/
Same separators as ASSIGNMENT_VALUE but with an empty value: AWS_ACCESS_KEY_ID= / password: in a .env.example or config template. An empty value cannot carry a secret, so a keyword match on such a line is always a false positive.
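For illustration, the pattern can be exercised directly against a few template-style lines. The helper empty_assignment? is a hypothetical wrapper added for the example.

```crystal
# Pattern copied from the constant summary.
EMPTY_ASSIGNMENT = /(?::|=>?)\s*$/

# Hypothetical wrapper (not part of the module).
def empty_assignment?(line : String) : Bool
  line.matches?(EMPTY_ASSIGNMENT)
end

p empty_assignment?("AWS_ACCESS_KEY_ID=") # => true
p empty_assignment?("password:   ")       # => true
p empty_assignment?("API_KEY => ")        # => true
p empty_assignment?("password: hunter2")  # => false
```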
ENV_ACCESSOR_MARKERS = ["process.env", "import.meta.env", "os.environ", "os.getenv", "getenv(", "System.getenv", "System.getProperty", "Deno.env", "ENV[", "ENV.fetch", "Environment.GetEnvironmentVariable", "Sys.getenv"]
Runtime environment-variable accessors. A line that pulls its value from the environment at runtime has, by construction, no literal secret to leak. These are substring-matched anywhere on the line so key = os.getenv("OPENAI_API_KEY") is covered regardless of where the accessor sits.
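The substring matching described above could be sketched as follows. The method name env_accessor? is hypothetical; only the marker list comes from the summary.

```crystal
# Marker list copied from the constant summary.
ENV_ACCESSOR_MARKERS = [
  "process.env", "import.meta.env", "os.environ", "os.getenv", "getenv(",
  "System.getenv", "System.getProperty", "Deno.env", "ENV[", "ENV.fetch",
  "Environment.GetEnvironmentVariable", "Sys.getenv",
]

# Hypothetical helper: true when any marker occurs anywhere on the line.
def env_accessor?(line : String) : Bool
  ENV_ACCESSOR_MARKERS.any? { |marker| line.includes?(marker) }
end

p env_accessor?("key = os.getenv(\"OPENAI_API_KEY\")")  # => true
p env_accessor?("const t = process.env.GITHUB_TOKEN;")  # => true
p env_accessor?("GITHUB_TOKEN=abc123")                  # => false
```

The match is case-sensitive, so the markers only fire on the exact spelling each language uses.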
PLACEHOLDER_VALUE = /\A(?:<[^>]*>|your[-_ ]|insert[-_ ]your|replace[-_ ](?:me|this|with)|(?:changeme|change[-_]me|replaceme|replace[-_]me|placeholder|redacted|dummy|todo|fixme|none|null|nil|undefined|x{4,}|\*{4,})\b)/i
Documentation/template placeholder values: what a reader is told to replace, never a real secret. Matched at the start of the value so a KEY=<token> … or KEY=your-access-key-id example is caught even with trailing text:
- angle-bracket stubs (<token>, <your-key>)
- your-… / your_… (your-access-key-id, your_api_key)
- explicit "replace this" / "insert your" prose
- bare null / dummy tokens (nil, null, none, changeme, placeholder, redacted, xxxx…, ****)
All are forms a genuine high-entropy literal can never take, so this only removes false positives.
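A short sketch exercising the pattern on already-extracted values. The helper placeholder? and the sample values are made up for the example.

```crystal
# Pattern copied from the constant summary.
PLACEHOLDER_VALUE = /\A(?:<[^>]*>|your[-_ ]|insert[-_ ]your|replace[-_ ](?:me|this|with)|(?:changeme|change[-_]me|replaceme|replace[-_]me|placeholder|redacted|dummy|todo|fixme|none|null|nil|undefined|x{4,}|\*{4,})\b)/i

# Hypothetical wrapper (not part of the module).
def placeholder?(value : String) : Bool
  value.matches?(PLACEHOLDER_VALUE)
end

p placeholder?("<your-token> # paste here")     # => true
p placeholder?("your-access-key-id")            # => true
p placeholder?("CHANGEME")                      # => true
p placeholder?("xxxxxxxx")                      # => true
p placeholder?("ghp_exampleexampleexample0000") # => false
```

One detail that may be worth checking against the real implementation: because the dummy-token alternatives end in \b, a value consisting only of asterisks has no word boundary after the final *, so the pattern as printed does not appear to match a bare **** on its own.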
PURE_REFERENCE = /\A(?:\$\{?\{?[A-Za-z_][\w.\- ]*\}?\}?|\$\(.+\)|%[A-Za-z_]\w*%|\{\{.+\}\}|<[^>]+>|env\(\s*['"][^'",]+['"]\s*\))\z/
Whole-value forms that are references or placeholders, never literals: shell/template variable substitutions ($VAR, ${VAR}, ${{ … }}, $(…), %VAR%, {{ … }}), angle-bracket placeholders (<your-token>), and single-argument env-helper calls (env('AWS_ACCESS_KEY_ID'), common in Laravel/Symfony/Rails config). Anchored so it only fires when the entire value is a reference: a real secret that merely contains a $ (e.g. P$ssw0rd…) is not matched. The env-call form is deliberately single-argument: a two-argument env('K', 'default') could hide a literal default, so it is left to fall through.
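The anchoring can be seen by running the pattern over a handful of whole values. This is a sketch: the helper pure_reference? and the sample values are illustrative only.

```crystal
# Pattern copied from the constant summary.
PURE_REFERENCE = /\A(?:\$\{?\{?[A-Za-z_][\w.\- ]*\}?\}?|\$\(.+\)|%[A-Za-z_]\w*%|\{\{.+\}\}|<[^>]+>|env\(\s*['"][^'",]+['"]\s*\))\z/

# Hypothetical wrapper (not part of the module).
def pure_reference?(value : String) : Bool
  value.matches?(PURE_REFERENCE)
end

p pure_reference?("${AWS_SECRET}")              # => true
p pure_reference?("$(cat /run/secrets/token)")  # => true
p pure_reference?("%API_KEY%")                  # => true
p pure_reference?("{{ vault_token }}")          # => true
p pure_reference?("env('AWS_ACCESS_KEY_ID')")   # => true
p pure_reference?("env('K', 'default')")        # => false, two-argument form
p pure_reference?("P$ssw0rd123")                # => false, $ is not at the start
```

Note that the first alternative requires a letter or underscore immediately after the optional braces, so a value written with a space after the opening braces, as in the conventional ${{ secrets.FOO }} spelling, does not appear to match this pattern as printed; such lines are presumably handled elsewhere in the module.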
SECRET_NAME = /\A(?=[A-Za-z0-9_]*[A-Z_])[A-Za-z_][A-Za-z0-9_]*\z/
A secret variable name. The bundled secret rules carry two kinds of word pattern: environment-variable names (GITHUB_TOKEN, DATABASE_URL, AWS_ACCESS_KEY_ID) and literal secret markers (-----BEGIN PRIVATE KEY-----). Only the former are eligible for the "merely mentioned" suppression below; a PEM marker is itself the secret and must never be dropped on a mention basis.
Names are required to look like an env var: an identifier carrying at least one uppercase letter or underscore (DATABASE_URL, github_pat_). This deliberately excludes bare lowercase words (token, secret) so a rule keyword that doubles as ordinary prose is never suppressed on a mention basis, keeping the change firmly on the false-positive-only side.
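To make the "looks like an env var" rule concrete, the sketch below checks a few candidate word patterns against the regex. The helper env_var_shaped? is hypothetical.

```crystal
# Pattern copied from the constant summary.
SECRET_NAME = /\A(?=[A-Za-z0-9_]*[A-Z_])[A-Za-z_][A-Za-z0-9_]*\z/

# Hypothetical wrapper (not part of the module).
def env_var_shaped?(word : String) : Bool
  word.matches?(SECRET_NAME)
end

p env_var_shaped?("GITHUB_TOKEN")                # => true
p env_var_shaped?("github_pat_")                 # => true, underscore suffices
p env_var_shaped?("token")                       # => false, bare lowercase word
p env_var_shaped?("-----BEGIN PRIVATE KEY-----") # => false, not an identifier
```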
Class Method Summary
.assigns_literal?(line : String, name : String) : Bool
True when name is assigned a (non-empty) value on the line: NAME=…, NAME: …, "NAME": …, NAME => ….
.comment_line?(line : String) : Bool
True when line's leading non-space content is a comment marker.
.matched_secret_name(rule : PassiveScan, line : String) : String | Nil
The first env-var-name-shaped word pattern of rule that occurs on the line, or nil.
.regex_value_hit?(rule : PassiveScan, line : String) : Bool
True when any value-shape (regex) matcher of rule matches the line; mirrors detect.cr's matching so the gate above agrees with what actually fired.
.secret_reference?(line : String) : Bool
True when line exposes its secret-bearing value only through an indirection (env read / templating) or a placeholder, i.e. there is no literal secret on the line to leak.
.suppress?(category : String, line : String) : Bool
Decide whether a result on line for a rule of category should be dropped as a false positive.
.suppress?(rule : PassiveScan, line : String) : Bool
Rule-aware suppression.
Class Method Detail
.assigns_literal?(line : String, name : String) : Bool
True when name is assigned a (non-empty) value on the line: NAME=…, NAME: …, "NAME": …, NAME => …. A bare mention ("DATABASE_URL", env.delete("DATABASE_URL"), prose) does not match, so it is treated as a non-secret reference.
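One way such a check could be written is sketched below. This is a hypothetical reconstruction, not the module's source: the method name assigns_literal_sketch?, the pattern, and the lookbehind guard against longer identifiers are all assumptions.

```crystal
# Hypothetical reconstruction: an optionally quoted NAME that is not the tail of
# a longer identifier, then a separator (:, = or =>) and a non-space character.
def assigns_literal_sketch?(line : String, name : String) : Bool
  pattern = /(?<![A-Za-z0-9_])["']?#{Regex.escape(name)}["']?\s*(?::|=>?)\s*\S/
  line.matches?(pattern)
end

p assigns_literal_sketch?("DATABASE_URL=postgres://db.internal/app", "DATABASE_URL") # => true
p assigns_literal_sketch?("\"DATABASE_URL\": \"postgres://db\"", "DATABASE_URL")     # => true
p assigns_literal_sketch?("env.delete(\"DATABASE_URL\")", "DATABASE_URL")            # => false
p assigns_literal_sketch?("DATABASE_URL=", "DATABASE_URL")                           # => false
```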
.comment_line?(line : String) : Bool
True when line's leading non-space content is a comment marker.
.matched_secret_name(rule : PassiveScan, line : String) : String | Nil
The first env-var-name-shaped word pattern of rule that occurs on the line, or nil. Literal markers like -----BEGIN …----- and bare lowercase words are not env-var-shaped and are excluded.
.regex_value_hit?(rule : PassiveScan, line : String) : Bool
True when any value-shape (regex) matcher of rule matches the line; mirrors detect.cr's matching so the gate above agrees with what actually fired.
.secret_reference?(line : String) : Bool
True when line exposes its secret-bearing value only through an indirection (env read / templating) or a placeholder, i.e. there is no literal secret on the line to leak.
.suppress?(category : String, line : String) : Bool
Decide whether a result on line for a rule of category should be dropped as a false positive. Only secret findings are eligible; everything else passes through unchanged. This category-only form is the reference/placeholder check; the rule-aware overload below adds matcher-type gating and the "merely mentioned" heuristic.
.suppress?(rule : PassiveScan, line : String) : Bool
Rule-aware suppression. In addition to the reference/placeholder check it gates on which matcher fired:
- If a value-shape regex matcher hits the line, a real secret literal is present (ghp_…, AKIA…, a credentialed URL): high confidence, never suppressed.
- Otherwise the finding is backed only by a word matcher on a variable name. When that name is merely mentioned (in a comment, a string literal, prose, a dependency list) rather than assigned a literal value, it is not a leaked secret and the finding is dropped.
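The gating described above can be sketched as a small decision function over precomputed facts about a line. Everything here is an assumption made for illustration: the LineFacts record, its field names, and in particular the order of the checks (the regex gate is placed first because a regex hit is described as never suppressed). The real method may evaluate its predicates in a different order.

```crystal
# Hypothetical record: each field stands in for one documented predicate
# (category check, regex_value_hit?, secret_reference?, matched_secret_name,
# comment_line?, assigns_literal?).
record LineFacts,
  secret_category : Bool = true,
  regex_value_hit : Bool = false,
  secret_reference : Bool = false,
  name_matched : Bool = true,
  comment_line : Bool = false,
  assigns_literal : Bool = false

# Sketch of the decision flow; true means "drop the finding".
def suppress_sketch?(f : LineFacts) : Bool
  return false unless f.secret_category # only secret findings are eligible
  return false if f.regex_value_hit     # a value-shape match is never suppressed
  return true if f.secret_reference     # env read, templating or placeholder
  return false unless f.name_matched    # no env-var-shaped word pattern on the line
  f.comment_line || !f.assigns_literal  # merely mentioned, not assigned a literal
end

p suppress_sketch?(LineFacts.new)                                            # => true
p suppress_sketch?(LineFacts.new(assigns_literal: true))                     # => false
p suppress_sketch?(LineFacts.new(regex_value_hit: true, comment_line: true)) # => false
```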