Serverless
tag.
For workloads that exceed serverless rate limits or require models that are not available on serverless, you can spin up dedicated model deployments. On-demand deployments are billed by the GPU-second and are subject to capacity constraints. Enterprise accounts can purchase reserved-capacity deployments, which guarantee access to compute in exchange for a fixed commitment.
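To make GPU-second billing concrete, here is a minimal sketch of how an on-demand cost estimate could be computed. The function name `on_demand_cost` and the hourly rate are hypothetical placeholders for illustration, not actual pricing or part of any official SDK; the sketch also assumes cost scales linearly with GPU count and uptime.

```python
def on_demand_cost(gpu_count: int, seconds: float, rate_per_gpu_hour: float) -> float:
    """Estimate the cost of a deployment billed per GPU-second.

    All parameters are illustrative: rate_per_gpu_hour is a placeholder,
    not a published price.
    """
    if gpu_count < 0 or seconds < 0 or rate_per_gpu_hour < 0:
        raise ValueError("inputs must be non-negative")
    # Convert the hourly rate to a per-second rate.
    rate_per_gpu_second = rate_per_gpu_hour / 3600
    # Billing unit: one GPU running for one second.
    gpu_seconds = gpu_count * seconds
    return gpu_seconds * rate_per_gpu_second


# Hypothetical example: 2 GPUs running for 90 minutes at a placeholder
# rate of $3.60 per GPU-hour.
estimate = on_demand_cost(gpu_count=2, seconds=90 * 60, rate_per_gpu_hour=3.60)
print(f"${estimate:.2f}")
```

Because billing is per second rather than per hour, a deployment that is shut down when idle only accrues cost for the time it was actually running.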