class Llama::Batch

Overview

Wrapper for the llama_batch structure. Provides methods for managing batches of tokens for efficient processing.

Defined in:

llama/batch.cr
llama/batch/error.cr

Constructor Detail

def self.for_embeddings(embeddings : Array(Array(Float32)), seq_ids : Array(Int32) | Nil = nil, n_seq_max : Int32 = 8) : Batch #

Creates a batch for embeddings with optional parameters

Parameters:

  • embeddings: Array of embedding vectors
  • seq_ids: Sequence IDs to use for all embeddings (default: nil)
  • n_seq_max: Maximum number of sequence IDs per token (default: 8)

Returns:

  • A new Batch instance configured with the provided embeddings

Raises:

  • ArgumentError if embeddings array is empty or contains empty embeddings
  • Llama::Batch::Error if batch creation fails

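For illustration, a minimal sketch of building an embedding batch. The vectors and dimension (4) here are placeholders; a real batch must use the model's embedding dimension:

```crystal
# Two illustrative 4-dimensional embedding vectors (a real model
# dictates the dimension, e.g. 768 or 4096).
embeddings = [
  [0.1_f32, 0.2_f32, 0.3_f32, 0.4_f32],
  [0.5_f32, 0.6_f32, 0.7_f32, 0.8_f32],
]

# Assign all embeddings to sequence 0.
batch = Llama::Batch.for_embeddings(embeddings, seq_ids: [0])
```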
def self.for_tokens(tokens : Array(Int32), compute_logits_for_last : Bool = true, seq_ids : Array(Int32) | Nil = nil, n_seq_max : Int32 = 8) : Batch #

Creates a batch for a sequence of tokens with optional parameters

Parameters:

  • tokens: Array of token IDs
  • compute_logits_for_last: Whether to compute logits only for the last token (default: true)
  • seq_ids: Sequence IDs to use for all tokens (default: nil)
  • n_seq_max: Maximum number of sequence IDs per token (default: 8)

Returns:

  • A new Batch instance configured with the provided tokens

Raises:

  • ArgumentError if tokens array is empty
  • Llama::Batch::Error if batch creation fails

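A minimal sketch of the common generation setup, where logits are only needed for the last token. The token IDs are placeholders; real IDs come from the model's tokenizer:

```crystal
# Placeholder token IDs standing in for tokenizer output.
tokens = [1, 15043, 3186]

# Compute logits only for the final token (typical when generating).
batch = Llama::Batch.for_tokens(tokens, compute_logits_for_last: true)
```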
def self.get_one(tokens : Array(Int32)) : Batch #

Creates a new Batch for a single sequence of tokens

Parameters:

  • tokens: Array of token IDs

Returns:

  • A new Batch instance

Raises:

  • Llama::Batch::Error if the batch cannot be created

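Usage is a one-liner (token IDs are placeholders); use #for_tokens instead when you need control over logits or sequence IDs:

```crystal
batch = Llama::Batch.get_one([1, 15043, 3186])
```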
def self.new(n_tokens : Int32, embd : Int32 = 0, n_seq_max : Int32 = 8) #

Creates a new Batch instance with the specified parameters

Parameters:

  • n_tokens: Maximum number of tokens this batch can hold
  • embd: Embedding dimension (0 for token-based batch, >0 for embedding-based batch)
  • n_seq_max: Maximum number of sequence IDs per token (default: 8)

Raises:

  • ArgumentError if parameters are invalid
  • Llama::Batch::Error if the batch cannot be created

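A sketch of both modes. The capacities and the embedding dimension (768) are illustrative; the dimension must match the model when used for real:

```crystal
# Token-based batch with room for 512 tokens (embd defaults to 0).
token_batch = Llama::Batch.new(512)

# Embedding-based batch: any embd > 0 selects embedding mode.
embd_batch = Llama::Batch.new(8, embd: 768)
```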
def self.new(handle : LibLlama::LlamaBatch, owned : Bool = false, n_seq_max : Int32 = 8) #

Creates a new Batch instance from a raw llama_batch structure

Note: This constructor is intended for internal use. By default (owned: false), the batch is not owned by this wrapper and will not be freed when the wrapper is finalized.



Instance Method Detail

def add_tokens(tokens : Array(Int32), pos_offset : Int32 = 0, seq_ids : Array(Int32) | Nil = nil, compute_logits : Bool = true) #

Adds multiple tokens to the batch

Parameters:

  • tokens: Array of token IDs to add
  • pos_offset: Position offset for the tokens (default: 0)
  • seq_ids: Sequence IDs for all tokens (default: [0])
  • compute_logits: Whether to compute logits for all tokens (default: true)

Raises:

  • ArgumentError if tokens array is empty
  • IndexError if the batch doesn't have enough space
  • Llama::Batch::Error if memory allocation fails

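A minimal sketch of filling a pre-allocated batch in one call (the capacity of 32 and the token IDs are placeholders):

```crystal
# Batch with capacity for 32 tokens.
batch = Llama::Batch.new(32)

# Add three placeholder tokens starting at position 0, defaulting
# to sequence [0] and computing logits for all of them.
batch.add_tokens([1, 15043, 3186], pos_offset: 0, compute_logits: true)
```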
def finalize #

Frees the resources associated with this batch


def has_crystal_token=(value : Bool) #

Setter for has_crystal_token


def n_tokens : Int32 #

Returns the number of tokens in this batch


def set_embedding(i : Int32, embedding : Array(Float32), pos : Int32 | Nil = nil, seq_ids : Array(Int32) | Nil = nil, logits : Bool | Nil = nil) #

Sets an embedding at the specified index

Parameters:

  • i: Index in the batch
  • embedding: Array of embedding values
  • pos: Position of the embedding in the sequence (nil for auto-position)
  • seq_ids: Sequence IDs (nil for default sequence 0)
  • logits: Whether to compute logits for this embedding (nil for default)

Raises:

  • IndexError if the index is out of bounds
  • ArgumentError if the batch is not embedding-based
  • Llama::Batch::Error if memory allocation fails

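A sketch of filling an embedding-based batch index by index. The dimension (4) and values are placeholders; the batch's embd and each vector's length must match:

```crystal
# Embedding-based batch: capacity 2, embedding dimension 4.
batch = Llama::Batch.new(2, embd: 4)

# Set each slot with an explicit position in the sequence.
batch.set_embedding(0, [0.1_f32, 0.2_f32, 0.3_f32, 0.4_f32], pos: 0)
batch.set_embedding(1, [0.5_f32, 0.6_f32, 0.7_f32, 0.8_f32], pos: 1)
```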
def set_token(i : Int32, token : Int32, pos : Int32 | Nil = nil, seq_ids : Array(Int32) | Nil = nil, logits : Bool | Nil = nil) #

Sets a token at the specified index

Parameters:

  • i: Index in the batch
  • token: Token ID to set
  • pos: Position of the token in the sequence (nil for auto-position)
  • seq_ids: Sequence IDs (nil for default sequence 0)
  • logits: Whether to compute logits for this token (nil for default)

Raises:

  • IndexError if the index is out of bounds
  • Llama::Batch::Error if memory allocation fails

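A sketch of per-token control, requesting logits only for the last slot (token IDs are placeholders):

```crystal
batch = Llama::Batch.new(3)

[1, 15043, 3186].each_with_index do |token, i|
  # Explicit position; compute logits only for the final token.
  batch.set_token(i, token, pos: i, logits: i == 2)
end
```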
def to_unsafe : Llama::LibLlama::LlamaBatch #

Returns the underlying llama_batch structure for direct use with the raw C API

