class Cadmium::Tokenizer::WordPunctuation
Defined in:
cadmium/tokenizer/word_punctuation.cr
Constant Summary
- REGEX_PATTERN = /(\w+|[а-я0-9_]+|\.|\!|\'|\"")/i
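To show what the pattern listed above picks out of a string, here is a minimal sketch that applies it with the Crystal standard library only (`String#scan`). It does not call the Cadmium tokenizer itself, so the class may post-process matches differently; the `pattern`, `text`, and `tokens` names are illustrative assumptions, not part of the library.

```crystal
# Copy of the pattern from the constant summary; stdlib only, no Cadmium calls.
pattern = /(\w+|[а-я0-9_]+|\.|\!|\'|\"")/i

text = "Hello, world! It's 2 p.m."

# String#scan yields one Regex::MatchData per match; index 0 is the matched text.
tokens = text.scan(pattern).map { |m| m[0] }

puts tokens.inspect
# => ["Hello", "world", "!", "It", "'", "s", "2", "p", ".", "m", "."]
```

Word runs, periods, exclamation marks, and apostrophes come out as separate tokens, while characters the pattern does not list (the comma and the spaces here) are skipped.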
Cadmium::Tokenizer::Regex
Cadmium::Tokenizer::Base
Cadmium::Tokenizer::Diacritics
Cadmium::Tokenizer::StopWords