cadmium_tokenizer
master
Built with Crystal 1.9.2
2023-10-07 10:26:13 UTC
class Cadmium::Tokenizer::Case
Cadmium::Tokenizer::Case < Cadmium::Tokenizer::Base < Reference < Object
Defined in:
cadmium/tokenizer/case.cr
Constructors
.new(preserve_apostrophe : Bool = false)
Instance Method Summary
#tokenize(string : String) : Array(String)
Instance methods inherited from class Cadmium::Tokenizer::Base
tokenize(string : String) : Array(String), trim(arr)
Instance methods inherited from module Cadmium::Tokenizer::Diacritics
remove_diacritics(str : String)
Instance methods inherited from module Cadmium::Tokenizer::StopWords
add_stopwords_list(language : Symbol)
Constructor Detail
def self.new(preserve_apostrophe : Bool = false)
Instance Method Detail
def tokenize(string : String) : Array(String)
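The signatures above suggest the following call pattern. This is a minimal sketch, not taken from the library's own documentation. It assumes the shard is loaded with require "cadmium_tokenizer" (inferred from the shard name), and the sample string is arbitrary. This page does not describe how the input is split or what preserve_apostrophe changes, so no expected output is shown.

```crystal
# Assumption: the require path matches the shard name, cadmium_tokenizer.
require "cadmium_tokenizer"

# Constructor with its documented default (preserve_apostrophe = false).
tokenizer = Cadmium::Tokenizer::Case.new

# Same constructor, passing the documented parameter by name.
apostrophe_tokenizer = Cadmium::Tokenizer::Case.new(preserve_apostrophe: true)

# Arbitrary sample input chosen for illustration only.
text = "Don't stop believing"

# tokenize takes a String and returns Array(String), per the signature above.
tokens = tokenizer.tokenize(text)
apostrophe_tokens = apostrophe_tokenizer.tokenize(text)

# Print both results to compare the effect of the flag on this input.
puts tokens.inspect
puts apostrophe_tokens.inspect
```

Running both instances on the same input is a quick way to see what the flag changes for your own text.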