class Cadmium::Glove::Model
- Cadmium::Glove::Model
- Reference
- Object
Defined in:
glove/model.cr
Constant Summary
- BIAS_FILE = "word_biases.json"
- COOC_FILE = "cooc_matrix.json"
- CORPUS_FILE = "corpus.json"
- VEC_FILE = "word_vectors.json"
Constructors
- .new(num_components : Int32 = 30, epochs : Int32 = 25, threads : Int32 = 4, learning_rate : Float64 = 0.05, alpha : Float64 = 0.75, max_count : Int32 = 100)
  Creates a new Glove::Model instance.
Class Method Summary
- .load(dir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE, **options)
  Create a new model from an existing dataset.
Instance Method Summary
- #alpha : Float64
- #alpha=(alpha : Float64)
- #analogy_words(word1, word2, target, num = 3, accuracy = 1e-4)
  Get a word that relates to target like word1 relates to word2.
- #cooc_matrix : Apatite::Matrix(Float64)
- #cooc_matrix=(cooc_matrix : Apatite::Matrix(Float64))
- #corpus
- #corpus=(corpus : Corpus | Nil)
- #epochs : Int32
- #epochs=(epochs : Int32)
- #fit(text, **options)
  Fit a String or Glove::Corpus instance and build a co-occurrence matrix.
- #learning_rate : Float64
- #learning_rate=(learning_rate : Float64)
- #load(dir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE)
  Loads training data from existing files.
- #max_count : Int32
- #max_count=(max_count : Int32)
- #most_similar(word, num = 3)
  Get the words most similar to word.
- #num_components : Int32
- #num_components=(num_components : Int32)
- #save(outdir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE)
  Save trained data to files.
- #threads : Int32
- #threads=(threads : Int32)
- #token_index : Hash(String, Int32)
- #token_index=(token_index : Hash(String, Int32))
- #token_pairs : Array(TokenPair)
- #token_pairs=(token_pairs : Array(TokenPair))
- #train
  Train the model.
- #vector(word)
  Find the vector row of @word_vec for a given word.
- #vector_distance(word : String | Apatite::Vector)
  Calculates the cosine distance of all the words in the vocabulary against a given word.
- #visualize
  TODO: Generate a graph of the word vector matrix.
- #word_biases : Array(Float64)
- #word_biases=(word_biases : Array(Float64))
- #word_vec : Apatite::Matrix(Float64)
- #word_vec=(word_vec : Apatite::Matrix(Float64))
Constructor Detail
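.new(num_components : Int32 = 30, epochs : Int32 = 25, threads : Int32 = 4, learning_rate : Float64 = 0.05, alpha : Float64 = 0.75, max_count : Int32 = 100)
Creates a new Glove::Model instance.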
Class Method Detail
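.load(dir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE, **options)
Create a new model from an existing dataset.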
Instance Method Detail
#analogy_words(word1, word2, target, num = 3, accuracy = 1e-4)
Get a word that relates to target like word1 relates to word2.
Example:
model.analogy_words("quantum", "physics", "atom")
# => [{"electron", 0.98583}, {"energi", 0.98151}, {"photon", 0.96650}]
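This page does not say how #analogy_words computes its result. As a rough illustration only, the sketch below uses the common vector-offset approach (word2 - word1 + target, followed by a nearest-neighbour search), which may differ from what the library does. The helper names (normalize, dot, analogy) and the toy vectors are invented for this example, and the accuracy parameter is not modelled.

```crystal
# Scale a vector to unit length so that a dot product equals cosine similarity.
def normalize(v : Array(Float64)) : Array(Float64)
  norm = Math.sqrt(v.sum { |x| x * x })
  norm == 0.0 ? v : v.map { |x| x / norm }
end

# Dot product of two equal-length vectors.
def dot(a : Array(Float64), b : Array(Float64)) : Float64
  a.zip(b).sum { |pair| pair[0] * pair[1] }
end

# Hypothetical helper: rank words by closeness to (word2 - word1 + target).
# The three input words are excluded from the candidates.
def analogy(vectors : Hash(String, Array(Float64)),
            word1 : String, word2 : String, target : String, num : Int32 = 3)
  a = normalize(vectors[word1])
  b = normalize(vectors[word2])
  c = normalize(vectors[target])
  query = normalize(Array.new(a.size) { |i| b[i] - a[i] + c[i] })

  inputs = [word1, word2, target]
  candidates = vectors.reject { |word, _vec| inputs.includes?(word) }
  scored = candidates.map { |word, vec| {word, dot(normalize(vec), query)} }
  scored.sort_by { |pair| -pair[1] }.first(num)
end

# Toy 3-dimensional vectors, made up for illustration (not trained output).
vectors = {
  "man"   => [0.0, 1.0, 0.0],
  "woman" => [0.0, 0.0, 1.0],
  "king"  => [1.0, 1.0, 0.0],
  "queen" => [1.0, 0.0, 1.0],
  "apple" => [0.1, 0.1, 0.1],
}

result = analogy(vectors, "man", "woman", "king", num: 1)
puts result.first[0]
```

With these toy vectors the top-ranked word is "queen". Real trained vectors are much higher-dimensional, and scores depend on the corpus.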
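#load(dir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE)
Loads training data from existing files.
#save(outdir, corpus_file = CORPUS_FILE, cooc_file = COOC_FILE, vec_file = VEC_FILE, bias_file = BIAS_FILE)
Save trained data to files.
#vector_distance(word : String | Apatite::Vector)
Calculates the cosine distance of all the words in the vocabulary against a given word. Results are sorted in descending order.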
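To make the ranking concrete, here is a minimal, self-contained sketch of scoring a vocabulary against one query vector and sorting the scores in descending order. It uses plain arrays rather than Apatite::Vector, and the function names (cosine_similarity, rank_by_similarity) and toy vectors are assumptions made for this example, not part of the library. The sketch computes cosine similarity, where larger means closer; whether the library returns that value or a derived distance is not stated on this page.

```crystal
# Cosine similarity between two equal-length vectors.
def cosine_similarity(a : Array(Float64), b : Array(Float64)) : Float64
  dot = 0.0
  a.each_with_index { |x, i| dot += x * b[i] }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  # Treat a zero vector as unrelated to everything.
  return 0.0 if norm_a == 0.0 || norm_b == 0.0
  dot / (norm_a * norm_b)
end

# Score every word against `query` and return {word, score} pairs,
# highest score first.
def rank_by_similarity(vectors : Hash(String, Array(Float64)),
                       query : Array(Float64)) : Array(Tuple(String, Float64))
  scores = vectors.map { |word, vec| {word, cosine_similarity(vec, query)} }
  scores.sort_by { |pair| -pair[1] }
end

# Toy 2-dimensional vectors, made up for illustration (not trained output).
vectors = {
  "atom"     => [1.0, 0.0],
  "electron" => [0.9, 0.1],
  "banana"   => [0.0, 1.0],
}

ranked = rank_by_similarity(vectors, vectors["atom"])
ranked.each { |word, score| puts "#{word}: #{score.round(4)}" }
```

The query word itself comes first, followed by "electron" and then "banana", since the query is compared against the whole vocabulary, itself included.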