class Llama::Model
- Llama::Model
- Reference
- Object
Overview
Wrapper for the llama_model structure
Defined in:
llama/model.cr
llama/model/error.cr
Constructors
-
.new(path : String, n_gpu_layers : Int32 = 0, use_mmap : Bool = true, use_mlock : Bool = false, vocab_only : Bool = false)
Creates a new Model instance by loading a model from a file.
Instance Method Summary
-
#chat_template(name : String | Nil = nil) : String | Nil
Gets the default chat template for this model
-
#context(*args, **options) : Context
Creates a new Context for this model
-
#decoder_start_token : Int32
Returns the token that must be provided to the decoder to start generating output. For encoder-decoder models, returns the decoder start token; for other models, returns -1
-
#description : String
Gets a string describing the model type
-
#finalize
Frees the resources associated with this model
-
#has_decoder? : Bool
Returns whether the model contains a decoder
-
#has_encoder? : Bool
Returns whether the model contains an encoder
-
#metadata : Hash(String, String)
Gets all metadata as a hash
-
#metadata_count : Int32
Gets the number of metadata key/value pairs
-
#metadata_key_at(i : Int32) : String | Nil
Gets a metadata key name by index
-
#metadata_value(key : String) : String | Nil
Gets a metadata value as a string by key name
-
#metadata_value_at(i : Int32) : String | Nil
Gets a metadata value as a string by index
-
#model_size : UInt64
Returns the total size of all the tensors in the model in bytes
-
#n_embd : Int32
Returns the number of embedding dimensions in the model
-
#n_head : Int32
Returns the number of attention heads in the model
-
#n_layer : Int32
Returns the number of layers in the model
-
#n_params : UInt64
Returns the number of parameters in the model
-
#recurrent? : Bool
Returns whether the model is recurrent (like Mamba, RWKV, etc.)
-
#rope_freq_scale_train : Float32
Returns the model's RoPE frequency scaling factor
-
#to_unsafe : Pointer(Llama::LibLlama::LlamaModel)
Returns the raw pointer to the underlying llama_model structure
-
#vocab : Vocab
Returns the vocabulary associated with this model
Constructor Detail
.new(path : String, n_gpu_layers : Int32 = 0, use_mmap : Bool = true, use_mlock : Bool = false, vocab_only : Bool = false)
Creates a new Model instance by loading a model from a file.
Parameters:
- path: Path to the model file (.gguf format).
- n_gpu_layers: Number of layers to store in VRAM (default: 0). If 0, all layers run on the CPU.
- use_mmap: Use mmap if possible (default: true). Reduces memory usage.
- use_mlock: Force the system to keep the model in RAM (default: false). May improve performance but increases memory usage.
- vocab_only: Only load the vocabulary, no weights (default: false). Useful for inspecting the vocabulary.
Raises:
- Llama::Model::Error if the model cannot be loaded.
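A minimal usage sketch of the constructor. The `require "llama"` line and the model path are assumptions for illustration; substitute the path of an actual .gguf file:

```crystal
require "llama"

# Load a model from a GGUF file (placeholder path).
# Raises Llama::Model::Error if the file cannot be loaded.
model = Llama::Model.new("model.gguf", n_gpu_layers: 0, use_mmap: true)

puts model.description # e.g. a short string describing the model type
```

Passing `vocab_only: true` skips loading the weights, which is a cheap way to inspect a model's vocabulary or metadata.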
Instance Method Detail
#chat_template(name : String | Nil = nil) : String | Nil
Gets the default chat template for this model
Parameters:
- name: Optional template name (nil for default)
Returns:
- The chat template string, or nil if not available
#context(*args, **options) : Context
Creates a new Context for this model
This method delegates to Context.new, passing self as the model parameter and forwarding all other arguments.
Returns:
- A new Context instance
Raises:
- Llama::Context::Error if the context cannot be created
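A sketch of creating a context from a loaded model (the `model` variable is assumed to be a previously constructed `Llama::Model`; any keyword options are forwarded to `Context.new`, so consult that class's documentation for what is accepted):

```crystal
# `model` is a previously loaded Llama::Model.
# Arguments are forwarded to Llama::Context.new;
# raises Llama::Context::Error if creation fails.
context = model.context
```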
#decoder_start_token : Int32
Returns the token that must be provided to the decoder to start generating output. For encoder-decoder models, returns the decoder start token; for other models, returns -1.
#description : String
Gets a string describing the model type
Returns:
- A description of the model
#metadata : Hash(String, String)
Gets all metadata as a hash
Returns:
- A hash mapping metadata keys to values
#metadata_count : Int32
Gets the number of metadata key/value pairs
Returns:
- The number of metadata entries
#metadata_key_at(i : Int32) : String | Nil
Gets a metadata key name by index
Parameters:
- i: The index of the metadata entry
Returns:
- The key name, or nil if the index is out of bounds
#metadata_value(key : String) : String | Nil
Gets a metadata value as a string by key name
Parameters:
- key: The metadata key to look up
Returns:
- The metadata value as a string, or nil if not found
#metadata_value_at(i : Int32) : String | Nil
Gets a metadata value as a string by index
Parameters:
- i: The index of the metadata entry
Returns:
- The value as a string, or nil if the index is out of bounds
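The indexed accessors above compose into a simple enumeration loop; a sketch assuming a loaded `model`:

```crystal
# Walk metadata entries by index; both accessors return nil
# when the index is out of bounds, so guard before printing.
model.metadata_count.times do |i|
  key = model.metadata_key_at(i)
  value = model.metadata_value_at(i)
  puts "#{key} = #{value}" if key && value
end

# Or fetch everything at once as a Hash(String, String):
model.metadata.each { |k, v| puts "#{k}: #{v}" }
```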
#model_size : UInt64
Returns the total size of all the tensors in the model in bytes
Returns:
- The total size of all tensors in the model (in bytes)
#to_unsafe : Pointer(Llama::LibLlama::LlamaModel)
Returns the raw pointer to the underlying llama_model structure