A Router is a resource that controls how inference traffic is routed to one or more deployments. Instead of sending all requests to a single deployment, a router lets you split traffic across multiple deployments — useful for A/B testing model variants, gradually migrating traffic to a new deployment, or distributing load. Traffic is split proportionally based on the number of replicas in each deployment. For example, if a router covers two deployments — one with 3 replicas and another with 2 — the first receives 60% of traffic and the second receives 40%.Documentation Index
Fetch the complete documentation index at: https://fireworks.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
When to use a router
Stable alias for deployment replacement
If you plan to replace a deployment later (e.g., changing to a new model later), give your application the router name instead of the deployment name. You can then swap the underlying deployment without your application changing anything.A/B testing between deployments
Place multiple deployments under a single router. Traffic is automatically split by replica count, so you can control the ratio by adjusting replicas on each deployment.Gradual traffic migration
Shift traffic from an old deployment to a new one with zero downtime by scaling replicas up on the new deployment and down on the old. See the worked example below.How traffic routing works
Traffic is distributed based on replica count. Each replica across all deployments in the router receives an equal share of traffic.| Deployment | Replicas | Traffic share |
|---|---|---|
deployment-a | 3 | 60% |
deployment-b | 2 | 40% |
| Total | 5 | 100% |
Sending traffic to a router
Use the router’s name in themodel field of your API request, just like you would use a deployment name:
Routing strategy
Traffic is routed using weighted replica selection: each request is randomly assigned to a deployment, weighted by its replica count. A deployment with more replicas receives proportionally more traffic.Managing routers
Creating a router
A router requires at least one deployment.| Flag | Description |
|---|---|
--router-id | Set a specific router ID. If omitted, a random ID is generated |
--display-name | Human-readable name for the router |
--model | The model to route traffic to |
--strategy | Routing strategy. Default: weighted-random |
--public | Make the router accessible to other accounts |
Listing routers
Getting router details
Updating a router
Update the deployments, strategy, or other properties of an existing router:Deleting a router
Example: traffic migration
This example walks through migrating traffic from an existing deployment to a new one with zero downtime. Step 1 — Create a router for your existing deployment and point your application at the router alias:accounts/<ACCOUNT_ID>/routers/my-router. All traffic goes to current-deployment.
Step 2 — Create the new deployment and add it to the router:
current-deployment has 4 replicas, the split is immediately 80%/20%.
Step 3 — Shift more traffic by increasing replicas on the new deployment and decreasing the old:
new-deployment. Clean up by removing the old deployment from the router: