alias Regex::CompileOptions

Overview

Represents compile options passed to Regex.new.

This alias is intended to replace Regex::Options.
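As a minimal sketch of how these options might be used: the constants below form a flags enum, so they can be combined with the bitwise OR operator before being passed to Regex.new. The pattern and subject strings are illustrative only.

```crystal
# Combine two compile options into a single value.
options = Regex::CompileOptions::IGNORE_CASE | Regex::CompileOptions::EXTENDED

# With EXTENDED, the white space and the trailing # comment in the pattern
# are ignored, so the effective pattern is "hello".
regex = Regex.new("h e l l o  # white space and this comment are ignored", options)

# With IGNORE_CASE, the upper-case subject still matches.
matched = regex.matches?("HELLO")

# The options a regex was compiled with can be queried afterwards.
has_ignore_case = regex.options.ignore_case?
has_extended = regex.options.extended?
```

Both flags are reported as set on regex.options, since the combined value carries each bit.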

Alias Definition

Regex::Options

Defined in:

regex.cr

Constant Summary

All = 6980118591_u64
ANCHORED = 16_u64

Force pattern anchoring at the start of the subject.
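The following sketch illustrates the effect of ANCHORED, assuming the default regex engine; the pattern and subjects are made up for illustration.

```crystal
anchored = Regex.new("bar", Regex::CompileOptions::ANCHORED)

# The pattern may only match at the very start of the subject.
at_start = anchored.matches?("barfoo")

# Without anchoring, "bar" would be found at offset 3; with ANCHORED it is not.
in_middle = anchored.matches?("foobar")
```

Here at_start is expected to be true and in_middle false.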

DOLLAR_ENDONLY = 32_u64
DOTALL = 2_u64
ENDANCHORED = 2147483648_u64

Force pattern anchoring at the end of the subject.

Unsupported with PCRE.

EXTENDED = 8_u64

Ignore white space and # comments.

FIRSTLINE = 262144_u64
IGNORE_CASE = 1_u64

Case insensitive match.

MATCH_INVALID_UTF = 4294967296_u64

Enable matching against subjects containing invalid UTF bytes. Invalid bytes never match anything. The entire subject string is effectively split into segments of valid UTF.

Read more in the PCRE2 documentation.

When this option is set, MatchOptions::NO_UTF_CHECK is ignored at match time.

Unsupported with PCRE.

NOTE This option was introduced in PCRE2 10.34, but a bug that can lead to an infinite loop was only fixed in 10.36 (https://github.com/PCRE2Project/pcre2/commit/e0c6029a62db9c2161941ecdf459205382d4d379).

MULTILINE = 6_u64

Multiline matching.

Equivalent to MULTILINE | DOTALL in PCRE and PCRE2.

MULTILINE_ONLY = 4_u64

Equivalent to MULTILINE in PCRE and PCRE2.
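The difference between MULTILINE and MULTILINE_ONLY can be sketched as follows; the subject text is illustrative. MULTILINE_ONLY changes only how ^ and $ behave, while MULTILINE additionally lets . match newlines.

```crystal
text = "first\nsecond"

# MULTILINE_ONLY: ^ and $ match at line boundaries inside the subject.
line_anchor = Regex.new("^second$", Regex::CompileOptions::MULTILINE_ONLY)
anchors_match = line_anchor.matches?(text)

# MULTILINE_ONLY does not include DOTALL, so . does not match the newline.
dot_only = Regex.new("first.second", Regex::CompileOptions::MULTILINE_ONLY)
dot_without_dotall = dot_only.matches?(text)

# MULTILINE includes DOTALL, so . also matches the newline.
dot_multi = Regex.new("first.second", Regex::CompileOptions::MULTILINE)
dot_with_dotall = dot_multi.matches?(text)
```

Only the middle case fails to match, because the newline between the two words is not matched by . there.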

NO_UTF_CHECK = 8192_u64

Do not check the pattern for valid UTF encoding.

This option is potentially dangerous and must only be used when the pattern is guaranteed to be valid UTF (e.g. after checking it with String#valid_encoding?). Using it with an invalid pattern can lead to undefined behaviour in the regex library and may crash the entire process.

NOTE String is supposed to be valid UTF-8, but this is not guaranteed or enforced. Data originating from external sources, in particular, should not be trusted.

UTF validation is comparatively expensive, so skipping it can produce a significant performance improvement.

pattern = "fo+"
# Validate the pattern once, then skip the library's own UTF check.
if pattern.valid_encoding?
  regex = Regex.new(pattern, options: Regex::CompileOptions::NO_UTF_CHECK)
  regex.match("foo")
else
  raise "Invalid UTF in regex pattern"
end

The standard library implicitly applies this option when it can be sure of the pattern's validity (e.g. on repeated matches in String#gsub).

None = 0_u64