class Wakame::MeCab

Overview

Wakame::MeCab is the primary class for interacting with the MeCab library.

It can be initialized in three ways: with a complete Wakame::Options object, with a series of named arguments that are forwarded to the underlying Wakame::Options object, or with a string of option arguments in MeCab's command-line format.

require "wakame"

mecab = Wakame::MeCab.new
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => 吾輩    名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ
#    は      助詞,係助詞,*,*,*,*,は,ハ,ワ
#    猫      名詞,一般,*,*,*,*,猫,ネコ,ネコ
#    で      助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ
#    ある    助動詞,*,*,*,五段・ラ行アル,基本形,ある,アル,アル
#    。      記号,句点,*,*,*,*,。,。,。
#    名前    名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
#    は      助詞,係助詞,*,*,*,*,は,ハ,ワ
#    まだ    副詞,助詞類接続,*,*,*,*,まだ,マダ,マダ
#    無い    形容詞,自立,*,*,形容詞・アウオ段,基本形,無い,ナイ,ナイ
#    。      記号,句点,*,*,*,*,。,。,。
#    EOS

mecab.parse("吾輩は猫である。名前はまだ無い。") do |node|
  puts "#{node.surface},#{node.feature}" if !node.bos_node? && !node.eos_node?
end
# => 吾輩,名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ
#    は,助詞,係助詞,*,*,*,*,は,ハ,ワ
#    猫,名詞,一般,*,*,*,*,猫,ネコ,ネコ
#    で,助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ
#    ある,助動詞,*,*,*,五段・ラ行アル,基本形,ある,アル,アル
#    。,記号,句点,*,*,*,*,。,。,。
#    名前,名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
#    は,助詞,係助詞,*,*,*,*,は,ハ,ワ
#    まだ,副詞,助詞類接続,*,*,*,*,まだ,マダ,マダ
#    無い,形容詞,自立,*,*,形容詞・アウオ段,基本形,無い,ナイ,ナイ
#    。,記号,句点,*,*,*,*,。,。,。

# These two are equivalent
mecab = Wakame::MeCab.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => ワガハイ ハ ネコ デ アル 。 ナマエ ハ マダ ナイ 。

mecab = Wakame::MeCab.new("-F %pS%f[7]\\s -E \\0")
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => ワガハイ ハ ネコ デ アル 。 ナマエ ハ マダ ナイ 。
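
The `%f[7]` placeholder used above selects one comma-separated field of the feature string. When working with `node.feature` directly, the same selection can be done by hand. The following is a minimal sketch, assuming the IPADIC-style layout shown in the sample output (reading at index 7); `feature_field` is a hypothetical helper and not part of Wakame.

```crystal
# Hypothetical helper (not part of Wakame): returns one field of a
# comma-separated feature string, or nil when the field is missing or "*".
def feature_field(feature : String, index : Int32) : String?
  field = feature.split(',')[index]?
  field == "*" ? nil : field
end

feature = "名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ"
puts feature_field(feature, 0) # => 名詞
puts feature_field(feature, 7) # => ワガハイ

# Unknown words may carry fewer fields, so a missing index yields nil.
puts feature_field("名詞,一般,*,*,*,*,*", 7).inspect # => nil
```

Returning nil for both "*" and missing fields keeps callers from having to distinguish the two cases, which is usually what is wanted when only the reading or base form matters.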

Defined in:

wakame/wakame.cr

Constructor Detail

def self.new(option_str : String)

Creates a new MeCab instance from the given string of option arguments in the style of MeCab's command line interface.


def self.new(options : Options)

Creates a new MeCab instance with the given Wakame::Options object.


def self.new(**option_args)

Creates a new MeCab instance from the given option arguments. The arguments are first forwarded to build the underlying Wakame::Options object, which is then used to create the MeCab instance.

See Wakame::Options for all available options.
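
The three constructors can be compared side by side. This is a sketch only: it assumes that `Wakame::Options.new` accepts the same named arguments as `Wakame::MeCab.new(**option_args)`, which the description above suggests but does not spell out. The format strings are the ones used in the overview.

```crystal
require "wakame"

# Assumption: Wakame::Options.new takes the same named arguments that
# Wakame::MeCab.new(**option_args) forwards to it.
options = Wakame::Options.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")

from_options = Wakame::MeCab.new(options)
from_args = Wakame::MeCab.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")
from_string = Wakame::MeCab.new("-F %pS%f[7]\\s -E \\0")

text = "吾輩は猫である。"
out_options = from_options.parse(text)
out_args = from_args.parse(text)
out_string = from_string.parse(text)

# If the assumption holds, all three instances format the text identically.
puts out_options
puts out_args
puts out_string
```

Building the Wakame::Options object explicitly is mainly useful when the same configuration needs to be shared or inspected before creating the instance.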



Instance Method Detail

def dicts : Array(Wakame::DictionaryInfo)

def lattice : Pointer(LibMeCab::LatticeT)

def libpath

def model : Pointer(LibMeCab::ModelT)

def options : Wakame::Options

def parse(text : String, boundary_constraints : Regex | String)

Parses the given text with boundary constraints, returning the MeCab output as a single String.

Boundary constraints provide hints to MeCab on where the morpheme boundaries are located in the given text.

mecab = Wakame::MeCab.new
# Without using boundary constraints
puts mecab.parse("外国人参政権")
# => 外国    名詞,一般,*,*,*,*,外国,ガイコク,ガイコク
#    人参    名詞,一般,*,*,*,*,人参,ニンジン,ニンジン
#    政権    名詞,一般,*,*,*,*,政権,セイケン,セイケン
#    EOS

# Giving MeCab hints with boundary constraints
puts mecab.parse("外国人参政権", /外国|人/)
# => 外国    名詞,一般,*,*,*,*,外国,ガイコク,ガイコク
#    人      名詞,接尾,一般,*,*,*,人,ジン,ジン
#    参政    名詞,サ変接続,*,*,*,*,参政,サンセイ,サンセイ
#    権      名詞,接尾,一般,*,*,*,権,ケン,ケン
#    EOS

def parse(text : String, feature_constraints : Hash(String, String))

Parses the given text with feature constraints, returning the MeCab output as a single String.

Feature constraints instruct MeCab to use a specific feature for every morpheme that matches a given key. Use the morpheme String as the key and the feature String as the value. The wildcard "*" can be used as the value to let MeCab decide which feature to use.

mecab = Wakame::MeCab.new
# Without using feature constraints
puts mecab.parse("邪神ちゃんドロップキーック!")
# => 邪神    名詞,一般,*,*,*,*,邪神,ジャシン,ジャシン
#    ちゃん  名詞,接尾,人名,*,*,*,ちゃん,チャン,チャン
#    ドロップキーック        名詞,一般,*,*,*,*,*
#    !      記号,一般,*,*,*,*,!,!,!
#    EOS

# Giving MeCab hints with feature constraints
puts mecab.parse("邪神ちゃんドロップキーック!", {"邪神ちゃん" => "*", "キーック" => "*"})
# => 邪神ちゃん      名詞,一般,*,*,*,*,*
#    ドロップ        名詞,一般,*,*,*,*,ドロップ,ドロップ,ドロップ
#    キーック        名詞,一般,*,*,*,*,*
#    !      記号,一般,*,*,*,*,!,!,!
#    EOS

def parse(text : String)

Parses the given text, returning the MeCab output as a single String.


def parse(text : String, boundary_constraints : Regex | String, &block : MeCabNode -> )

Parses the given text with boundary constraints, yielding each node to the given block.

See #parse(text : String, boundary_constraints : Regex | String) for details.


def parse(text : String, feature_constraints : Hash(String, String), &block : MeCabNode -> )

Parses the given text with feature constraints, yielding each node to the given block.

See #parse(text : String, feature_constraints : Hash(String, String)) for details.


def parse(text : String, &block : MeCabNode -> )

Parses the given text, yielding each node to the given block.
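
A common use of the block form is to collect tokens into an array instead of printing them. The sketch below uses only the node methods that appear in the overview (`surface`, `bos_node?`, `eos_node?`) and assumes that `node.surface` returns a String.

```crystal
require "wakame"

mecab = Wakame::MeCab.new
text = "吾輩は猫である。"

# Collect the surface form of every real morpheme, skipping the BOS/EOS
# sentinel nodes as the overview example does.
# Assumption: node.surface returns a String.
surfaces = [] of String
mecab.parse(text) do |node|
  next if node.bos_node? || node.eos_node?
  surfaces << node.surface
end

puts surfaces.join(" | ")
```

The exact segmentation depends on the installed dictionary, so no expected output is shown here.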


def tagger : Pointer(LibMeCab::T)

def version : Pointer(UInt8)
