Confirm this is a feature request for the Python library and not the underlying OpenAI API.
- This is a feature request for the Python library
Describe the feature or improvement you're requesting
When making batch requests using LangChain with an OpenAI model, as shown in this minimal repro, it is common to hit the organizational rate limit for tokens per minute (TPM), as demonstrated in this error log.
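For concreteness, the problematic pattern looks roughly like the following (this is not the linked repro; the `langchain_openai` import path, model name, and prompts are placeholders):

```python
# Illustrative only: a large fan-out like this can exhaust the organization's
# tokens-per-minute budget and surface openai.RateLimitError (HTTP 429).
from langchain_openai import ChatOpenAI  # import path varies with LangChain version

llm = ChatOpenAI(model="gpt-4")
prompts = [f"Summarise document {i}" for i in range(500)]

# Runnable.batch fans the requests out concurrently under the hood.
results = llm.batch(prompts)
```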
Whilst limiting the concurrency of batches and introducing exponential backoff can reduce this issue downstream in LangChain, I believe there is also room for the `OpenAI#request` function in this library to handle parallel invocations more intelligently, so that batch requests are better supported regardless of whether this library, langchain, or another codebase initiates them.
In particular, I would suggest that the `SyncAPIClient` maintain queue(s) of requests and determine when enqueued requests can be run based on the `x-ratelimit-*` and `retry-after` headers of previous responses; a rough sketch follows.
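To make the idea concrete, here is a minimal sketch, independent of the client's internals: the `RateLimitAwareQueue` name, the injected `send` callable, and the single-lock serialisation are illustrative assumptions, not a proposal for the exact implementation.

```python
import re
import threading
import time
from typing import Callable

import httpx


def _parse_reset(value: str) -> float:
    # The x-ratelimit-reset-* headers carry durations such as "6m0s", "59.88s" or "120ms".
    units = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
    return sum(
        float(amount) * units[unit]
        for amount, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value)
    )


class RateLimitAwareQueue:
    """Serialises requests and delays them based on the previous response's rate-limit headers."""

    def __init__(self, send: Callable[[httpx.Request], httpx.Response]) -> None:
        self._send = send  # stand-in for whatever actually performs the HTTP call
        self._lock = threading.Lock()
        self._resume_at = 0.0  # earliest monotonic time the next request may start

    def request(self, request: httpx.Request) -> httpx.Response:
        # One in-flight request per queue keeps the bookkeeping trivial; a real
        # implementation would presumably allow bounded concurrency instead.
        with self._lock:
            delay = self._resume_at - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            response = self._send(request)
            self._note_headers(response.headers)
            return response

    def _note_headers(self, headers: httpx.Headers) -> None:
        # An explicit retry-after (sent with 429s, in seconds) takes priority;
        # otherwise pause until the token budget resets once it is exhausted.
        retry_after = headers.get("retry-after")
        if retry_after is not None:
            self._resume_at = time.monotonic() + float(retry_after)
        elif headers.get("x-ratelimit-remaining-tokens") == "0":
            reset = headers.get("x-ratelimit-reset-tokens", "0s")
            self._resume_at = time.monotonic() + _parse_reset(reset)
```

Wired in front of the transport (for example, wrapping `httpx.Client.send`), a scheduler like this would let sequential and concurrent callers share the same budget bookkeeping instead of each retrying blindly on 429s.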
Additional context
Related to #937 (comment)