class OpenAI::CompletionRequest

Defined in:

openai/api/completion.cr

Constructor Detail

def self.new(pull : JSON::PullParser) #
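
Deserializes a request from the given JSON::PullParser (the constructor used by from_json).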

def self.new(model : String, prompt : Array(String) | String = "", suffix : Nil | String = nil, max_tokens : Int32 = 16, temperature : Float64 = 1.0, top_p : Float64 = 1.0, num_completions : Int32 = 1, stream : Bool = false, logprobs : Int32 | Nil = nil, echo : Bool = false, stop : Array(String) | String | Nil = nil, presence_penalty : Float64 = 0.0, frequency_penalty : Float64 = 0.0, best_of : Int32 = 1, logit_bias : Nil | Hash(String, Float64) = nil, user : Nil | String = nil) #

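For illustration, a request might be built with the keyword constructor as sketched below; the model id is only a placeholder and the require path is assumed from the shard layout.

```crystal
require "openai" # require path assumed from the shard layout

# Build a request; parameters left out keep the defaults shown in the
# constructor signature above.
request = OpenAI::CompletionRequest.new(
  model:       "text-davinci-003", # placeholder model id
  prompt:      "Write a haiku about the sea",
  max_tokens:  64,
  temperature: 0.7
)

request.max_tokens # => 64
request.stream     # => false (the default)
```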

Instance Method Detail

def best_of : Int32 #

Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed. When used with n (#num_completions in this class), best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.


def best_of=(best_of : Int32) #

Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed. When used with n (#num_completions in this class), best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.


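For example, a sketch of a request that generates five server-side candidates and returns the two best (best_of must stay greater than num_completions; the model id is a placeholder):

```crystal
request = OpenAI::CompletionRequest.new(
  model:           "text-davinci-003", # placeholder model id
  prompt:          "Suggest a name for a coffee shop",
  num_completions: 2, # return two completions
  best_of:         5  # picked from five server-side candidates
)
```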
def echo : Bool #

Echo back the prompt in addition to the completion.


def echo=(echo : Bool) #

Echo back the prompt in addition to the completion.


def frequency_penalty : Float64 #

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.


def frequency_penalty=(frequency_penalty : Float64) #

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.


def logit_bias : Hash(String, Float64) | Nil #

Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use OpenAI's tokenizer tool (which works for both GPT-2 and GPT-3) to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.


def logit_bias=(logit_bias : Hash(String, Float64) | Nil) #

Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use OpenAI's tokenizer tool (which works for both GPT-2 and GPT-3) to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.


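Following the example above, a sketch that bans the <|endoftext|> token (id 50256); the bias values are Float64 to match the property type, and the model id is a placeholder:

```crystal
request = OpenAI::CompletionRequest.new(
  model:      "text-davinci-003", # placeholder model id
  prompt:     "Continue the story:",
  logit_bias: {"50256" => -100.0} # effectively bans <|endoftext|>
)
```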
def logprobs : Int32 | Nil #

Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.


def logprobs=(logprobs : Int32 | Nil) #

Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.


def max_tokens : Int32 #

The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length.


def max_tokens=(max_tokens : Int32) #

The maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length.


def model : String #

The ID of the model to use.


def model=(model : String) #

The ID of the model to use.


def num_completions : Int32 #
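
How many completions to generate for each prompt.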

def num_completions=(num_completions : Int32) #
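
How many completions to generate for each prompt.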

def presence_penalty : Float64 #

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.


def presence_penalty=(presence_penalty : Float64) #

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.


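As an illustration, both penalties can be nudged above zero to reduce repetition and push toward new topics (sketch; placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: "List blog post ideas about Crystal"
)
request.frequency_penalty = 0.5 # discourage repeating lines verbatim
request.presence_penalty  = 0.6 # nudge the model toward new topics
```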
def prompt : String | Array(String) #

The prompt(s) to generate completions for, encoded as a string or an array of strings. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.


def prompt=(prompt : String | Array(String)) #

The prompt(s) to generate completions for, encoded as a string or an array of strings. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.


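For example, a single request can carry several prompts as an Array(String), and the API generates completions for each element (sketch; placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: [
    "Translate to French: Good morning",
    "Translate to French: Good night"
  ]
)
```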
def stop : String | Array(String) | Nil #

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.


def stop=(stop : String | Array(String) | Nil) #

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.


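For instance, a Q&A-style prompt can be cut off before the model starts the next question (sketch; placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: "Q: What is Crystal?\nA:"
)
request.stop = ["\n", "Q:"] # stop at the end of the answer line
```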
def stream : Bool #

Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.


def stream=(stream : Bool) #

Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.


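A sketch of requesting streamed output; note from #best_of above that best_of results cannot be streamed (placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: "Tell a long story about a lighthouse",
  stream: true # partial tokens arrive as server-sent events
)
```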
def suffix : String | Nil #

The suffix that comes after a completion of inserted text.


def suffix=(suffix : String | Nil) #

The suffix that comes after a completion of inserted text.


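For example, pairing a prompt with a suffix asks the model to fill in the text between them (sketch; placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: "def fib(n)\n",
  suffix: "\nend\n" # the completion is inserted between prompt and suffix
)
```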
def temperature : Float64 #

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.


def temperature=(temperature : Float64) #

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.


def top_p : Float64 #

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.


def top_p=(top_p : Float64) #

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.


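For example, restricting sampling to the top 10% of probability mass while leaving temperature at its default (sketch; placeholder model id):

```crystal
request = OpenAI::CompletionRequest.new(
  model:  "text-davinci-003", # placeholder model id
  prompt: "Brainstorm startup ideas:",
  top_p:  0.1 # only the top 10% probability mass is considered
)
```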
def user : String | Nil #

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.


def user=(user : String | Nil) #

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.

