Creating an on-demand deployment
To create an on-demand deployment, use firectl to create a deployment for <MODEL_ID>, the model ID specified during model upload. Creating the deployment automatically deploys the base model to it.
On-demand deployments are charged by GPU-hour. See Pricing for details.
Checking whether a model is deployed
You can check the status of a model deployment by describing the model and looking at the “Deployed Model Refs” section of the output. A deployed model has an entry with State: DEPLOYED.
Alternatively, you can list all deployed models within your account by running:
firectl list deployed-models
Inference
Model identifier
After your model is successfully deployed, it is ready for inference. A model can be queried using one of the following model identifiers:
- The model and deployment names:
  accounts/<ACCOUNT_ID of model>/models/<MODEL_ID>#accounts/<ACCOUNT_ID of deployment>/deployments/<DEPLOYMENT_ID>
  e.g.
  accounts/fireworks/models/mixtral-8x7b#accounts/alice/deployments/12345678
  accounts/alice/models/custom-model#accounts/alice/deployments/12345678
- The model and deployment short-names:
  <ACCOUNT_ID of model>/<MODEL_ID>#<ACCOUNT_ID of deployment>/<DEPLOYMENT_ID>
  e.g.
  fireworks/mixtral-8x7b#alice/12345678
  alice/custom-model#alice/12345678
- The deployed model name: instead of using both the model name and the deployment name to refer to a deployed model, you can use a unique deployed model name. This name is built from a unique deployed model ID that is created upon deployment. The deployed model ID takes the form <MODEL_ID>-<AUTOGENERATED_SUFFIX> and can be viewed with “firectl list deployed-models”, e.g.
  accounts/alice/deployedModels/mixtral-8x7b-abcdef
- If you are deploying a custom model, you can also query it using the model name or model short-name, e.g.
  accounts/alice/models/custom-model
  alice/custom-model
In short-name form, the patterns are <ACCOUNT_ID>/<MODEL_ID> for a model alone and <ACCOUNT_ID>/<MODEL_ID>#<ACCOUNT_ID>/<DEPLOYMENT_ID> for a model on a specific deployment.
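The identifier formats above are plain string patterns, so they can be assembled programmatically. The following is a minimal sketch; the function names are hypothetical helpers written for illustration and are not part of any Fireworks SDK.

```python
# Hypothetical helpers that assemble the identifier formats described above.
# They only format strings; they do not check that the model or deployment exists.

def full_name(model_account, model_id, deployment_account, deployment_id):
    """Model and deployment names, joined by '#'."""
    return (
        f"accounts/{model_account}/models/{model_id}"
        f"#accounts/{deployment_account}/deployments/{deployment_id}"
    )


def short_name(model_account, model_id, deployment_account, deployment_id):
    """Model and deployment short-names, joined by '#'."""
    return f"{model_account}/{model_id}#{deployment_account}/{deployment_id}"


def deployed_model_name(account_id, deployed_model_id):
    """Deployed model name; the ID has the form <MODEL_ID>-<AUTOGENERATED_SUFFIX>."""
    return f"accounts/{account_id}/deployedModels/{deployed_model_id}"


print(full_name("fireworks", "mixtral-8x7b", "alice", "12345678"))
print(short_name("alice", "custom-model", "alice", "12345678"))
print(deployed_model_name("alice", "mixtral-8x7b-abcdef"))
```

Note that the model and the deployment may belong to different accounts, which is why each helper takes the two account IDs separately.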
Multiple deployments
Since a model may be deployed to multiple deployments, querying by model name will route to the “default” deployed model. You can see which deployed model entry is marked with Default: true by describing the model.
Querying the model
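As an illustration of querying a deployed model, the sketch below builds a completions request using only the Python standard library. The endpoint URL, the request fields (model, prompt, max_tokens), the bearer-token header and the helper names are assumptions made for this example; confirm them against the Fireworks API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint for the completions API; verify against the API reference.
COMPLETIONS_URL = "https://api.fireworks.ai/inference/v1/completions"


def build_completion_request(model, prompt, api_key, max_tokens=16):
    """Build (but do not send) a POST request for a text completion.

    `model` is any of the model identifiers accepted for a deployed model.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def send_completion(model, prompt, api_key):
    """Send the request and return the decoded JSON response."""
    request = build_completion_request(model, prompt, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example call (requires network access and a valid API key):
# send_completion("alice/custom-model#alice/12345678", "Say this is a test", api_key)
```

Building the request separately from sending it makes it possible to inspect the URL, headers and body without making a network call.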
To test the model using the completions API, send a completion request that names the model by one of the identifiers above.
Publishing a deployed model
By default, models can only be queried by the account that owns them. To make a deployed model public so anyone with a valid Fireworks API key can query it, update the deployed model with the --public flag.
You must use the deployed model ID, not the model ID. To get a list of deployed models, run:
firectl list deployed-models