class Wakame::MeCab
Wakame::MeCab < Reference < Object
Overview
MeCab is the primary class for interacting with the MeCab library.
It can be initialized in one of three ways: by passing a complete Wakame::Options object, by passing a series of arguments that are forwarded to the underlying Wakame::Options object, or by passing a string of option arguments in command-line format.
require "wakame"
mecab = Wakame::MeCab.new
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => 吾輩 名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ
# は 助詞,係助詞,*,*,*,*,は,ハ,ワ
# 猫 名詞,一般,*,*,*,*,猫,ネコ,ネコ
# で 助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ
# ある 助動詞,*,*,*,五段・ラ行アル,基本形,ある,アル,アル
# 。 記号,句点,*,*,*,*,。,。,。
# 名前 名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
# は 助詞,係助詞,*,*,*,*,は,ハ,ワ
# まだ 副詞,助詞類接続,*,*,*,*,まだ,マダ,マダ
# 無い 形容詞,自立,*,*,形容詞・アウオ段,基本形,無い,ナイ,ナイ
# 。 記号,句点,*,*,*,*,。,。,。
# EOS
mecab.parse("吾輩は猫である。名前はまだ無い。") do |node|
  puts "#{node.surface},#{node.feature}" unless node.bos_node? || node.eos_node?
end
# => 吾輩,名詞,代名詞,一般,*,*,*,吾輩,ワガハイ,ワガハイ
# は,助詞,係助詞,*,*,*,*,は,ハ,ワ
# 猫,名詞,一般,*,*,*,*,猫,ネコ,ネコ
# で,助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ
# ある,助動詞,*,*,*,五段・ラ行アル,基本形,ある,アル,アル
# 。,記号,句点,*,*,*,*,。,。,。
# 名前,名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
# は,助詞,係助詞,*,*,*,*,は,ハ,ワ
# まだ,副詞,助詞類接続,*,*,*,*,まだ,マダ,マダ
# 無い,形容詞,自立,*,*,形容詞・アウオ段,基本形,無い,ナイ,ナイ
# 。,記号,句点,*,*,*,*,。,。,。
# These two are equivalent
mecab = Wakame::MeCab.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => ワガハイ ハ ネコ デ アル 。 ナマエ ハ マダ ナイ 。
mecab = Wakame::MeCab.new("-F %pS%f[7]\\s -E \\0")
puts mecab.parse("吾輩は猫である。名前はまだ無い。")
# => ワガハイ ハ ネコ デ アル 。 ナマエ ハ マダ ナイ 。
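When the block form is used, individual feature fields can also be picked out in code instead of through an output format string. The following is a minimal sketch, assuming `node.surface` and `node.feature` return a `String` and that the feature fields follow the comma-separated layout shown in the output above (field 7 being the reading). `feature_field` is a hypothetical helper and is not part of Wakame.

```crystal
require "wakame"

# Hypothetical helper (not part of Wakame): returns the comma-separated
# feature field at `index`, or nil when the field is missing or is "*".
def feature_field(feature : String, index : Int32) : String?
  field = feature.split(',')[index]?
  field == "*" ? nil : field
end

mecab = Wakame::MeCab.new
readings = [] of String

mecab.parse("吾輩は猫である。名前はまだ無い。") do |node|
  next if node.bos_node? || node.eos_node?
  # Field 7 is the reading in the output shown above; unknown words may
  # lack it, so fall back to the surface form.
  readings << (feature_field(node.feature, 7) || node.surface)
end

puts readings.join(" ")
```

The field positions depend on the dictionary in use, so the index should be checked against the actual output of the installed dictionary.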
Defined in:
wakame/wakame.cr
Constructors
- .new(option_str : String)
  Creates a new MeCab instance from the given string of option arguments in the style of MeCab's command line interface.
- .new(options : Options)
  Creates a new MeCab instance with the given Wakame::Options object.
- .new(**option_args)
  Creates a new MeCab instance from the given option arguments.
Instance Method Summary
- #dicts : Array(Wakame::DictionaryInfo)
- #lattice : Pointer(LibMeCab::LatticeT)
- #libpath
- #model : Pointer(LibMeCab::ModelT)
- #options : Wakame::Options
- #parse(text : String, boundary_constraints : Regex | String)
  Parses the given text with boundary constraints, returning the MeCab output as a single String.
- #parse(text : String, feature_constraints : Hash(String, String))
  Parses the given text with feature constraints, returning the MeCab output as a single String.
- #parse(text : String)
  Parses the given text, returning the MeCab output as a single String.
- #parse(text : String, boundary_constraints : Regex | String, &block : MeCabNode -> )
  Parses the given text with boundary constraints, yielding each node to the given block.
- #parse(text : String, feature_constraints : Hash(String, String), &block : MeCabNode -> )
  Parses the given text with feature constraints, yielding each node to the given block.
- #parse(text : String, &block : MeCabNode -> )
  Parses the given text, yielding each node to the given block.
- #tagger : Pointer(LibMeCab::T)
- #version : Pointer(UInt8)
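Since `#version` is listed as a `Pointer(UInt8)`, it has to be converted before it can be printed as text. The sketch below assumes the pointer refers to a non-null, NUL-terminated C string (as is usual for values returned by the MeCab C library) and uses Crystal's `String.new(Pointer(UInt8))` to copy it. The other accessors are only interpolated or counted, because their detailed fields are not described here.

```crystal
require "wakame"

mecab = Wakame::MeCab.new

# Assumption: #version points to a NUL-terminated C string.
# String.new(Pointer(UInt8)) copies the bytes into a Crystal String.
version = String.new(mecab.version)
puts "MeCab version: #{version}"

# #libpath and #dicts are taken from the method summary above.
puts "Library path: #{mecab.libpath}"
puts "Dictionaries loaded: #{mecab.dicts.size}"
```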
Constructor Detail
.new(option_str : String)
Creates a new MeCab instance from the given string of option arguments in the style of MeCab's command line interface.
.new(options : Options)
Creates a new MeCab instance with the given Wakame::Options object.
.new(**option_args)
Creates a new MeCab instance from the given option arguments.
These arguments are forwarded to create the underlying Wakame::Options object, which is then used to create the MeCab instance.
See Wakame::Options for all available options.
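The three constructors can be compared side by side. The following is a sketch only: it assumes that `Wakame::Options.new` accepts the same named arguments that `.new(**option_args)` forwards to it, which is implied by the description above but not shown in this reference.

```crystal
require "wakame"

# Assumption: Wakame::Options.new takes the same named arguments that
# Wakame::MeCab.new(**option_args) forwards to it.
options = Wakame::Options.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")

from_options = Wakame::MeCab.new(options)
from_args = Wakame::MeCab.new(node_format: "%pS%f[7]\\s", eos_format: "\\0")
from_string = Wakame::MeCab.new("-F %pS%f[7]\\s -E \\0")

text = "吾輩は猫である。"
puts from_options.parse(text)
puts from_args.parse(text)
puts from_string.parse(text)
```

If the assumption holds, all three instances are configured identically and should print the same line. Building the Options object explicitly is mainly useful when the same configuration is shared between several MeCab instances.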
Instance Method Detail
#parse(text : String, boundary_constraints : Regex | String)
Parses the given text with boundary constraints, returning the MeCab output as a single String.
Boundary constraints provide hints to MeCab on where the morpheme boundaries are located in the given text.
mecab = Wakame::MeCab.new
# Without using boundary constraints
puts mecab.parse("外国人参政権")
# => 外国 名詞,一般,*,*,*,*,外国,ガイコク,ガイコク
# 人参 名詞,一般,*,*,*,*,人参,ニンジン,ニンジン
# 政権 名詞,一般,*,*,*,*,政権,セイケン,セイケン
# EOS
# Giving MeCab hints with boundary constraints
puts mecab.parse("外国人参政権", /外国|人/)
# => 外国 名詞,一般,*,*,*,*,外国,ガイコク,ガイコク
# 人 名詞,接尾,一般,*,*,*,人,ジン,ジン
# 参政 名詞,サ変接続,*,*,*,*,参政,サンセイ,サンセイ
# 権 名詞,接尾,一般,*,*,*,権,ケン,ケン
# EOS
#parse(text : String, feature_constraints : Hash(String, String))
Parses the given text with feature constraints, returning the MeCab output as a single String.
Feature constraints instruct MeCab to use a specific feature for any morpheme that matches the given key. Use the morpheme String as the key and the feature String as the value. The wildcard "*" can be used as the feature to let MeCab decide which feature to use.
mecab = Wakame::MeCab.new
# Without using feature constraints
puts mecab.parse("邪神ちゃんドロップキーック!")
# => 邪神 名詞,一般,*,*,*,*,邪神,ジャシン,ジャシン
# ちゃん 名詞,接尾,人名,*,*,*,ちゃん,チャン,チャン
# ドロップキーック 名詞,一般,*,*,*,*,*
# ! 記号,一般,*,*,*,*,!,!,!
# EOS
# Giving MeCab hints with feature constraints
puts mecab.parse("邪神ちゃんドロップキーック!", {"邪神ちゃん" => "*", "キーック" => "*"})
# => 邪神ちゃん 名詞,一般,*,*,*,*,*
# ドロップ 名詞,一般,*,*,*,*,ドロップ,ドロップ,ドロップ
# キーック 名詞,一般,*,*,*,*,*
# ! 記号,一般,*,*,*,*,!,!,!
# EOS
#parse(text : String, boundary_constraints : Regex | String, &block : MeCabNode -> )
Parses the given text with boundary constraints, yielding each node to the given block.
See #parse(text : String, boundary_constraints : Regex | String) for details.
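As a rough illustration of the block form with boundary constraints, the sketch below collects the surface of each node into an array. It assumes `node.surface` returns a `String`, and it filters BOS/EOS nodes defensively in case they are yielded.

```crystal
require "wakame"

mecab = Wakame::MeCab.new
surfaces = [] of String

# The Regex marks "外国" and "人" as morpheme boundaries.
mecab.parse("外国人参政権", /外国|人/) do |node|
  next if node.bos_node? || node.eos_node?
  surfaces << node.surface
end

puts surfaces.join(" | ")
```

The resulting segmentation should mirror what the String-returning overload produces for the same constraints, although the exact split depends on the installed dictionary.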
#parse(text : String, feature_constraints : Hash(String, String), &block : MeCabNode -> )
Parses the given text with feature constraints, yielding each node to the given block.
See #parse(text : String, feature_constraints : Hash(String, String)) for details.
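The block form with feature constraints might be used as follows. This is a sketch under the assumption that `node.feature` is a comma-separated `String` whose first field is the part of speech, as in the String output shown for the non-block overload.

```crystal
require "wakame"

mecab = Wakame::MeCab.new

# "*" lets MeCab choose the feature for each constrained morpheme.
constraints = {"邪神ちゃん" => "*", "キーック" => "*"}

mecab.parse("邪神ちゃんドロップキーック!", constraints) do |node|
  next if node.bos_node? || node.eos_node?
  # Assumption: the first feature field is the part of speech.
  pos = node.feature.split(',').first?
  puts "#{node.surface}\t#{pos}"
end
```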
#parse(text : String, &block : MeCabNode -> )
Parses the given text, yielding each node to the given block.
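Because each node is yielded individually, the block form lends itself to aggregating results without re-parsing MeCab's text output. The following sketch tallies nodes by part of speech, assuming the first comma-separated field of `node.feature` holds it, as in the example output in the overview.

```crystal
require "wakame"

mecab = Wakame::MeCab.new

# A default value of 0 makes `+=` safe for keys that have not been seen yet.
counts = Hash(String, Int32).new(0)

mecab.parse("吾輩は猫である。名前はまだ無い。") do |node|
  next if node.bos_node? || node.eos_node?
  # Assumption: the first feature field is the part of speech.
  pos = node.feature.split(',').first? || "unknown"
  counts[pos] += 1
end

counts.each { |pos, n| puts "#{pos}: #{n}" }
```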