
Commit bff58fa

release: 1.5.0 (#16)
* chore: update mock server docs
* chore(internal): add request options to SSE classes
* chore(internal): make `test_proxy_environment_variables` more resilient
* chore(internal): make `test_proxy_environment_variables` more resilient to env
* feat(api): logprobs and top_logprobs in chat completions API

  Enable logprobs and top_logprobs in chat completions API
* release: 1.5.0

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
1 parent 4c155f3 commit bff58fa

File tree: 13 files changed, +133 -102 lines

.release-please-manifest.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,3 +1,3 @@
 {
-  ".": "1.4.1"
+  ".": "1.5.0"
 }
```

.stats.yml

Lines changed: 2 additions & 2 deletions
```diff
@@ -1,4 +1,4 @@
 configured_endpoints: 7
-openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/sambanova%2Fsambanova-67fb54d7a474f361008072996daf572e82688cacbb9b6e83aab3032de2e27146.yml
-openapi_spec_hash: caa3a3a58de67026c1dacf4bed4d95de
+openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/sambanova%2Fsambanova-9a3a236dd72cf19e3eca6739de243006b82148d4de3bfd2a46e7f6399fcbd658.yml
+openapi_spec_hash: 94f49750e6407334d3cfa9ab5d3c4227
 config_hash: 2daa8a392d338e14be4096c11ce139e8
```

CHANGELOG.md

Lines changed: 16 additions & 0 deletions
```diff
@@ -1,5 +1,21 @@
 # Changelog
 
+## 1.5.0 (2026-03-02)
+
+Full Changelog: [v1.4.1...v1.5.0](https://github.com/sambanova/sambanova-python/compare/v1.4.1...v1.5.0)
+
+### Features
+
+* **api:** logprobs and top_logprobs in chat completions API ([60c5a73](https://github.com/sambanova/sambanova-python/commit/60c5a73a75dfcd2de76030ea1fc768777619767b))
+
+
+### Chores
+
+* **internal:** add request options to SSE classes ([8bed909](https://github.com/sambanova/sambanova-python/commit/8bed909a64bb234723e77fc5fb8f8d0a92501ff3))
+* **internal:** make `test_proxy_environment_variables` more resilient ([8180c2e](https://github.com/sambanova/sambanova-python/commit/8180c2e6f0c54c0217883b13ef178b3b6e094881))
+* **internal:** make `test_proxy_environment_variables` more resilient to env ([47cf963](https://github.com/sambanova/sambanova-python/commit/47cf963b0eaf44b022ce08e44630256b10bf8cb4))
+* update mock server docs ([e19b34c](https://github.com/sambanova/sambanova-python/commit/e19b34ca6a27e900736fb33b8628a05b39fc6ee0))
+
 ## 1.4.1 (2026-02-13)
 
 Full Changelog: [v1.4.0...v1.4.1](https://github.com/sambanova/sambanova-python/compare/v1.4.0...v1.4.1)
```

CONTRIBUTING.md

Lines changed: 1 addition & 2 deletions
````diff
@@ -88,8 +88,7 @@ $ pip install ./path-to-wheel-file.whl
 Most tests require you to [set up a mock server](https://github.com/stoplightio/prism) against the OpenAPI spec to run the tests.
 
 ```sh
-# you will need npm installed
-$ npx prism mock path/to/your/openapi.yml
+$ ./scripts/mock
 ```
 
 ```sh
````

pyproject.toml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 [project]
 name = "sambanova"
-version = "1.4.1"
+version = "1.5.0"
 description = "The official Python library for the SambaNova API"
 dynamic = ["readme"]
 license = "Apache-2.0"
```

src/sambanova/_response.py

Lines changed: 3 additions & 0 deletions
```diff
@@ -152,6 +152,7 @@ def _parse(self, *, to: type[_T] | None = None) -> R | _T:
                 ),
                 response=self.http_response,
                 client=cast(Any, self._client),
+                options=self._options,
             ),
         )
@@ -162,6 +163,7 @@ def _parse(self, *, to: type[_T] | None = None) -> R | _T:
                 cast_to=extract_stream_chunk_type(self._stream_cls),
                 response=self.http_response,
                 client=cast(Any, self._client),
+                options=self._options,
             ),
         )
@@ -175,6 +177,7 @@ def _parse(self, *, to: type[_T] | None = None) -> R | _T:
                 cast_to=cast_to,
                 response=self.http_response,
                 client=cast(Any, self._client),
+                options=self._options,
             ),
         )
```
src/sambanova/_streaming.py

Lines changed: 8 additions & 3 deletions
```diff
@@ -4,7 +4,7 @@
 import json
 import inspect
 from types import TracebackType
-from typing import TYPE_CHECKING, Any, Generic, TypeVar, Iterator, AsyncIterator, cast
+from typing import TYPE_CHECKING, Any, Generic, TypeVar, Iterator, Optional, AsyncIterator, cast
 from typing_extensions import Self, Protocol, TypeGuard, override, get_origin, runtime_checkable
 
 import httpx
@@ -13,6 +13,7 @@
 
 if TYPE_CHECKING:
     from ._client import SambaNova, AsyncSambaNova
+    from ._models import FinalRequestOptions
 
 
 _T = TypeVar("_T")
@@ -22,7 +23,7 @@ class Stream(Generic[_T]):
     """Provides the core interface to iterate over a synchronous stream response."""
 
     response: httpx.Response
-
+    _options: Optional[FinalRequestOptions] = None
     _decoder: SSEBytesDecoder
 
     def __init__(
@@ -31,10 +32,12 @@ def __init__(
         cast_to: type[_T],
         response: httpx.Response,
         client: SambaNova,
+        options: Optional[FinalRequestOptions] = None,
     ) -> None:
         self.response = response
         self._cast_to = cast_to
         self._client = client
+        self._options = options
         self._decoder = client._make_sse_decoder()
         self._iterator = self.__stream__()
@@ -104,7 +107,7 @@ class AsyncStream(Generic[_T]):
     """Provides the core interface to iterate over an asynchronous stream response."""
 
     response: httpx.Response
-
+    _options: Optional[FinalRequestOptions] = None
     _decoder: SSEDecoder | SSEBytesDecoder
 
     def __init__(
@@ -113,10 +116,12 @@ def __init__(
         cast_to: type[_T],
         response: httpx.Response,
         client: AsyncSambaNova,
+        options: Optional[FinalRequestOptions] = None,
     ) -> None:
         self.response = response
         self._cast_to = cast_to
         self._client = client
+        self._options = options
         self._decoder = client._make_sse_decoder()
         self._iterator = self.__stream__()
```
src/sambanova/_version.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,4 +1,4 @@
 # File generated from our OpenAPI spec by Stainless. See CONTRIBUTING.md for details.
 
 __title__ = "sambanova"
-__version__ = "1.4.1" # x-release-please-version
+__version__ = "1.5.0" # x-release-please-version
```

src/sambanova/resources/chat/completions.py

Lines changed: 36 additions & 42 deletions
```diff
@@ -132,9 +132,9 @@ def create(
         logit_bias: This is not yet supported by our models. Modify the likelihood of specified
             tokens appearing in the completion.
 
-        logprobs: This is not yet supported by our models. Whether to return log probabilities of
-            the output tokens or not. If true, returns the log probabilities of each output
-            token returned in the `content` of `message`.
+        logprobs: Whether to return log probabilities of the output tokens or not. If true,
+            returns the log probabilities of each output token returned in the `content` of
+            `message`.
 
         max_completion_tokens: The maximum number of tokens that can be generated in the chat completion. The
             total length of input tokens and generated tokens is limited by the model's
@@ -200,10 +200,9 @@ def create(
             means only the first 10 tokens with higher probability are considered. Is
             recommended altering this, top_p or temperature but not more than one of these.
 
-        top_logprobs: This is not yet supported by our models. An integer between 0 and 20 specifying
-            the number of most likely tokens to return at each token position, each with an
-            associated log probability. `logprobs` must be set to `true` if this parameter
-            is used.
+        top_logprobs: An integer between 0 and 20 specifying the number of most likely tokens to
+            return at each token position, each with an associated log probability.
+            `logprobs` must be set to `true` if this parameter is used.
 
         top_p: Cumulative probability for token choices. An alternative to sampling with
             temperature, called nucleus sampling, where the model considers the results of
```

The same two docstring edits — dropping the "This is not yet supported by our models." prefix from the `logprobs` and `top_logprobs` descriptions — are repeated verbatim in the remaining sync `create` overloads (hunks `@@ -312,9 +311,9 @@`, `@@ -375,10 +374,9 @@`, `@@ -487,9 +485,9 @@`, `@@ -550,10 +548,9 @@`) and in the async `create` overloads (hunks `@@ -784,9 +781,9 @@`, `@@ -852,10 +849,9 @@`, `@@ -964,9 +960,9 @@`, `@@ -1027,10 +1023,9 @@`, `@@ -1139,9 +1134,9 @@`, `@@ -1202,10 +1197,9 @@`).
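The docstring changes above state two constraints on the newly supported parameters: `top_logprobs` must be an integer between 0 and 20, and it requires `logprobs=True`. A small sketch that encodes those constraints before a request is built (the helper name `build_logprobs_params` is my own, not part of the SDK):

```python
from typing import Any, Dict, Optional


def build_logprobs_params(
    *,
    logprobs: Optional[bool] = None,
    top_logprobs: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate the logprobs parameters per the docstring constraints above.

    - ``top_logprobs`` must be between 0 and 20 inclusive.
    - ``top_logprobs`` requires ``logprobs`` to be set to ``True``.
    """
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 20:
            raise ValueError("top_logprobs must be an integer between 0 and 20")
        if logprobs is not True:
            raise ValueError("top_logprobs requires logprobs=True")

    # Only include parameters the caller actually set.
    params: Dict[str, Any] = {}
    if logprobs is not None:
        params["logprobs"] = logprobs
    if top_logprobs is not None:
        params["top_logprobs"] = top_logprobs
    return params
```

Assuming the usual client call shape, these would then be passed straight through, e.g. `client.chat.completions.create(model=..., messages=[...], **build_logprobs_params(logprobs=True, top_logprobs=5))`, and the per-token log probabilities come back in the `content` of `message` as the docstrings describe.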
