Closed
Labels: bug (Something isn't working)
Description
Confirm this is an issue with the Python library and not an underlying OpenAI API issue
- [x] This is an issue with the Python library
Describe the bug
We've been noticing an increasing number of TPM limit errors when calling an Azure-hosted model via the library. We have a couple of retries configured, but these do not help. The reason seems to be that the Azure API recently stopped returning the `Retry-After` header on rate-limit errors and now returns `x-rate-limit-reset-tokens` instead. The library currently only knows how to handle `Retry-After`.
To Reproduce
- Force a token limit error on an Azure hosted model
- Observe the response headers. Example:
[2023-12-03 18:48:27.180] DEBUG worker_pool_8 [httpcore.http11.trace:45] receive_response_headers.complete return_value=(b'HTTP/1.1', 429, b'Too Many Requests', [(b'Content-Length', b'329'), (b'Content-Type', b'application/json'), (b'x-rate-limit-reset-tokens', b'55'), (b'apim-request-id', b'<uuid>'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'policy-id', b'DeploymentRatelimit-Token'), (b'x-ms-region', b'West US'), (b'x-ratelimit-remaining-requests', b'52'), (b'Date', b'Sun, 03 Dec 2023 18:48:27 GMT')])
The `Retry-After` header is no longer there; instead, `x-rate-limit-reset-tokens` is returned.
Code snippets
No response
OS
macOS
Python version
Python 3.12
Library version
openai 1.3.6