class PiiTagger
Overview
Flags endpoints that accept personally identifiable information (PII) or other sensitive personal data. These endpoints are prime review targets for data exposure, broken object-level authorization, and sensitive-data logging — knowing which routes touch PII lets a reviewer (or an AI consumer) prioritize accordingly.
Defined in:
tagger/taggers/pii.cr
Constant Summary
-
MEDIUM_NAMES =
Set {"email", "e_mail", "email_address", "phone", "phone_number", "phone_no", "mobile", "mobile_number", "mobile_phone", "cell_phone", "telephone", "first_name", "last_name", "full_name", "fullname", "given_name", "family_name", "middle_name", "maiden_name", "address", "street_address", "mailing_address", "billing_address", "shipping_address", "home_address", "postal_code", "zip", "zipcode", "zip_code", "gender", "nationality", "birthday", "place_of_birth", "marital_status"} -
Each of these names is a weak signal on its own (a single one shows up in countless benign forms), so at least two must match before the endpoint is tagged.
-
STRONG_NAMES =
Set {"ssn", "social_security", "social_security_number", "credit_card", "creditcard", "credit_card_number", "creditcardnumber", "card_number", "cardnumber", "card_no", "cardno", "cc_number", "ccnumber", "cvv", "cvc", "cvv2", "card_cvv", "card_security_code", "cardholder_name", "card_holder", "card_expiry", "card_expiration", "passport", "passport_number", "passport_no", "passportno", "national_id", "nationalid", "national_identity", "national_insurance_number", "tax_id", "taxid", "tax_number", "taxnumber", "aadhaar", "aadhar", "aadhaar_number", "aadhar_number", "drivers_license", "driver_license", "license_number", "iban", "bank_account", "bank_account_number", "routing_number", "sort_code", "date_of_birth", "dob", "birthdate", "birth_date"} -
Unambiguous, high-signal identifiers. A single one is enough to flag the endpoint because these names rarely appear outside a PII context.
-
STRONG_TOKENS =
Set {"ssn", "cvv", "cvc", "cvv2", "iban", "passport", "dob", "aadhaar", "aadhar"} -
Single, unambiguous tokens. Matched anywhere in a (normalized) param name so compound names like
userSsn, customer_cvv, or applicantPassport are caught without enumerating every prefix.
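
The three constants above imply a simple decision rule: one strong name or strong token is enough, while medium names need at least two hits. The following is a minimal sketch of that rule, not the tagger's actual implementation. The helper names normalize and pii? are hypothetical, the sets are abbreviated copies of the documented constants, and the normalization step (camelCase to snake_case, then splitting on underscores) is an assumption about what "normalized" means here.

```crystal
# Abbreviated, illustrative copies of the documented constants.
STRONG_NAMES  = Set{"ssn", "credit_card", "date_of_birth", "passport_number"}
MEDIUM_NAMES  = Set{"email", "phone", "first_name", "last_name", "zip"}
STRONG_TOKENS = Set{"ssn", "cvv", "iban", "passport", "dob"}

# Hypothetical normalization: camelCase -> snake_case, lowercase,
# and any run of non-alphanumeric characters collapsed to "_".
def normalize(name : String) : String
  name.underscore.downcase.gsub(/[^a-z0-9]+/, "_")
end

# Hypothetical decision rule derived from the constant descriptions.
def pii?(params : Array(String)) : Bool
  names = params.map { |p| normalize(p) }

  # A single exact strong name is enough.
  return true if names.any? { |n| STRONG_NAMES.includes?(n) }

  # A strong token inside a compound name is enough (assumed: tokens are
  # the underscore-separated parts of the normalized name).
  return true if names.any? { |n| n.split('_').any? { |t| STRONG_TOKENS.includes?(t) } }

  # Medium names are weak individually, so require at least two.
  names.count { |n| MEDIUM_NAMES.includes?(n) } >= 2
end

puts pii?(["userSsn"])        # strong token inside a compound name
puts pii?(["email"])          # a single medium name
puts pii?(["email", "phone"]) # two medium names
```

Splitting on underscores rather than doing a raw substring search is a deliberate choice in this sketch: a substring check would also fire on unrelated names such as classname, which happens to contain "ssn". Whether the real tagger matches tokens or substrings is not stated in the documentation above.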