Does the API support batching and load balancing?

Current capabilities include:

Load balancing: Yes, supported out of the box
Continuous batching: Yes, supported
Batch inference: Yes, supported via the Batch API
Streaming: Yes, supported

For asynchronous batch processing of large volumes of requests, see our Batch API documentation.
