Implement log probabilities for chat completions #2
Log probability support was introduced across Ollama's OpenAI-compatible endpoints. Key changes include:

`api/types.go`:
- A `LogProbs` struct was defined to hold token-level log probabilities, token IDs, and top-N log probability maps.
- The `ChatResponse` and `GenerateResponse` structs were extended with an optional `*LogProbs` field.
- The `Options` struct gained `LogProbsEnabled` (bool) and `TopLogProbs` (int) fields to control logprob generation.

`llm/server.go`:
- The `completion` struct was updated to parse log probability data from the backend.
- The `Completion` function now forwards the `logprobs` and `top_logprobs` parameters to the core generation request and populates the `LogProbs` field in `CompletionResponse`.

`openai/openai.go`:
- The choice structs (`Choice`, `ChunkChoice`, `CompleteChunkChoice`) were updated to include `Logprobs`.
- The request structs (`ChatCompletionRequest`, `CompletionRequest`) now accept `logprobs` and `top_logprobs` parameters, which are then mapped to the internal `api.Options`.
- `api.LogProbs` data is converted to the OpenAI schema.

`server/routes.go`:
- `GenerateHandler` and `ChatHandler` were updated to include the `LogProbs` field in the final API responses.

This enables clients to request and receive per-token log probability information (with a configurable top-N) in both normal and streaming modes, maintaining compatibility with the OpenAI chat completions schema.