2025-08-22
Supervised Fine-Tuning
We now support supervised fine-tuning with separate thinking traces for reasoning models (e.g., DeepSeek R1, GPT OSS, Qwen3 Thinking), which ensures training-inference consistency. An example record including a thinking trace is sketched below.
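A minimal JSONL sketch, shown pretty-printed for readability (each record is one line in the actual file); the exact field name Fireworks uses for the thinking trace (here `reasoning_content`) is an assumption for illustration:

```json
{"messages": [
  {"role": "user", "content": "What is 17 * 24?"},
  {"role": "assistant",
   "reasoning_content": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
   "content": "408"}
]}
```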
2025-08-10
Supervised Fine-Tuning
We now support supervised fine-tuning for Qwen3 MoE models (Qwen3 dense models are already supported) and GPT OSS models. GPT OSS fine-tuning support is currently single-turn without thinking traces.
2025-07-29
🎨 Vision-Language Model Fine-Tuning
You can now fine-tune Vision-Language Models (VLMs) on Fireworks AI using the Qwen 2.5 VL model family. This extends our Supervised Fine-Tuning V2 platform to support multimodal training with both image and text data.
Supported models:
- Qwen 2.5 VL 3B Instruct
- Qwen 2.5 VL 7B Instruct
- Qwen 2.5 VL 32B Instruct
- Qwen 2.5 VL 72B Instruct
Key features:
- Fine-tune on datasets containing both images and text in JSONL format with base64-encoded images (a sketch follows below)
- Support for up to 64K context length during training
- Built on the same Supervised Fine-Tuning V2 infrastructure as text models
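A minimal JSONL sketch of a multimodal record, assuming the OpenAI-style content-part format with a base64 data URI; the placeholder `<BASE64_IMAGE>` stands in for the encoded bytes:

```json
{"messages": [
  {"role": "user", "content": [
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,<BASE64_IMAGE>"}},
    {"type": "text", "text": "What is shown in this image?"}
  ]},
  {"role": "assistant", "content": "A red bicycle leaning against a brick wall."}
]}
```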
🔧 Build SDK: Deployment Configuration Application Requirement
The Build SDK now requires you to call `.apply()` to apply any deployment configuration to Fireworks when using `deployment_type="on-demand"` or `deployment_type="on-demand-lora"`. This change ensures explicit control over when deployments are created and helps prevent accidental deployment creation.
Key changes:
- `.apply()` is now required for on-demand and on-demand-lora deployments
- Serverless deployments do not require `.apply()` calls
- If you do not call `.apply()`, you are expected to set up the deployment through the deployment page at https://app.fireworks.ai/dashboard/deployments
- Add `llm.apply()` after creating LLM instances with `deployment_type="on-demand"` or `deployment_type="on-demand-lora"` (a sketch follows below)
- No changes needed for serverless deployments
- See the updated documentation for examples and best practices
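A minimal sketch of the new requirement, assuming the Build SDK's `LLM` class; the model and deployment names are illustrative:

```python
from fireworks import LLM

# Illustrative model name and deployment id.
llm = LLM(
    model="qwen2p5-72b-instruct",
    deployment_type="on-demand",
    id="my-on-demand-deployment",
)

# Required for on-demand and on-demand-lora deployments:
# nothing is deployed to Fireworks until .apply() is called.
llm.apply()
```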
This applies to Python SDK versions `>=0.19.14`.
2025-07-23
🚀 Bring Your Own Rollout and Reward Development for Reinforcement Learning
You can now develop your own custom rollout and reward functionality while using Fireworks to manage the training and deployment of your models. This gives you full control over your reinforcement learning workflows while leveraging Fireworks' infrastructure for model training and deployment. See the new `LLM.reinforcement_step()` method and `ReinforcementStep` class for usage examples and details; a rough sketch follows below.
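A rough sketch only: this release names `LLM.reinforcement_step()` and `ReinforcementStep`, but their signatures are not shown here, so every argument and attribute below (`messages`, `completions`, `submit_rewards`) is a hypothetical stand-in:

```python
from fireworks import LLM

llm = LLM(model="my-account/my-model",  # hypothetical model name
          deployment_type="on-demand", id="rft-demo")

def keyword_reward(completion: str) -> float:
    # Hypothetical verifiable reward: favor completions that show their reasoning.
    return 1.0 if "because" in completion.lower() else 0.0

for prompt in ["Why is the sky blue?", "Why do ships float?"]:
    # The call shape and the attributes used below are assumptions,
    # not the documented API.
    step = llm.reinforcement_step(messages=[{"role": "user", "content": prompt}])
    rewards = [keyword_reward(c) for c in step.completions]
    step.submit_rewards(rewards)
```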
2025-07-16
Supervised Fine-Tuning V2
We now support supervised fine-tuning for Llama 4 MoE models (Llama 4 Scout and Llama 4 Maverick, text only).
2025-07-10
🏗️ Build SDK `LLM` Deployment Logic Refactor
Based on early feedback from users and internal testing, we've refactored the `LLM` class deployment logic in the Build SDK to make it easier to understand.
Key changes (sketched below):
- The `id` parameter is now required when `deployment_type` is `"on-demand"`
- The `base_id` parameter is now required when `deployment_type` is `"on-demand-lora"`
- The `deployment_display_name` parameter is now optional and defaults to the filename where the LLM was instantiated
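A minimal sketch of the new required parameters, assuming the Build SDK's `LLM` class; model and deployment names are illustrative:

```python
from fireworks import LLM

# On-demand deployment of a base model: `id` is now required.
base = LLM(
    model="llama-v3p1-8b-instruct",
    deployment_type="on-demand",
    id="my-base-deployment",
)

# On-demand LoRA deployment: `base_id` is now required.
lora = LLM(
    model="my-account/my-lora-fine-tune",
    deployment_type="on-demand-lora",
    base_id="my-base-deployment",
)
```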
A new deployment is created only if a deployment with the given `id` does not exist. Otherwise, the existing deployment will be reused.
2025-07-02
🚀 Support for Responses API in Python SDK
You can now use the Responses API in the Python SDK. This is useful if you want to use the Responses API in your own applications. See the Responses API guide for usage examples and details.
2025-07-01
Support for LinkedIn authentication
You can now log in to Fireworks using your LinkedIn account. This is useful if you already have a LinkedIn account and want to use it to log in to Fireworks. To log in with LinkedIn, go to the Fireworks login page and click the “Continue with LinkedIn” button. You can also log in with LinkedIn from the CLI using the `firectl login` command.
How it works:
- Fireworks uses your LinkedIn primary email address for account identification
- You can switch between different Fireworks accounts by changing your LinkedIn primary email
- See our LinkedIn authentication FAQ for detailed instructions on managing email addresses
2025-06-30
Support for GitHub authentication
You can now log in to Fireworks using your GitHub account. This is useful if you already have a GitHub account and want to use it to log in to Fireworks. To log in with GitHub, go to the Fireworks login page and click the “Continue with GitHub” button. You can also log in with GitHub from the CLI using the `firectl login` command.
🚨 Document Inlining Deprecation
Document Inlining has been deprecated and is no longer available on the Fireworks platform. This feature allowed LLMs to process images and PDFs through the chat completions API by appending `#transform=inline` to document URLs.
Migration recommendations:
- For image processing: Use Vision Language Models (VLMs) like Qwen2.5-VL 32B Instruct (a sketch follows after this list)
- For PDF processing: Use dedicated PDF processing libraries combined with text-based LLMs
- For structured extraction: Leverage our structured responses capabilities
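A minimal sketch of the VLM migration path, assuming the OpenAI-compatible chat completions endpoint; the model ID and image URL are illustrative:

```python
import requests

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_FIREWORKS_API_KEY"},
    json={
        "model": "accounts/fireworks/models/qwen2p5-vl-32b-instruct",  # illustrative
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document page."},
                {"type": "image_url", "image_url": {"url": "https://example.com/page.png"}},
            ],
        }],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```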
2025-06-24
🎯 Build SDK: Reward-kit integration for evaluator development
The Build SDK now natively integrates with reward-kit to simplify evaluator development for Reinforcement Fine-Tuning (RFT). You can now create custom evaluators in Python with automatic dependency management and seamless deployment to Fireworks infrastructure (a sketch follows after the list below).
Key features:
- Native reward-kit integration for evaluator development
- Automatic packaging of dependencies from `pyproject.toml` or `requirements.txt`
- Local testing capabilities before deployment
- Direct integration with Fireworks datasets and evaluation jobs
- Support for third-party libraries and complex evaluation logic
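A minimal evaluator sketch, assuming reward-kit's `@reward_function` decorator and its `EvaluateResult`/`MetricResult` models; treat the exact field names and message shape as assumptions:

```python
from reward_kit import reward_function, EvaluateResult, MetricResult

@reward_function
def conciseness_reward(messages, **kwargs) -> EvaluateResult:
    """Score the assistant's final reply: shorter answers score higher."""
    reply = messages[-1]["content"]  # assumes dict-style messages
    words = len(reply.split())
    score = 1.0 if words <= 50 else max(0.0, 1.0 - (words - 50) / 200)
    return EvaluateResult(
        score=score,
        reason=f"reply is {words} words",
        metrics={"word_count": MetricResult(
            score=score, reason=f"{words} words", success=words <= 50)},
    )
```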
Added a new Responses API for advanced conversational workflows and integrations:
- Continue conversations across multiple turns using the `previous_response_id` parameter to maintain context without resending full history
- Stream responses in real time as they are generated for responsive applications
- Control response storage with the `store` parameter, choosing whether responses are retrievable by ID or ephemeral (a sketch follows below)
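A minimal sketch using the OpenAI Python client pointed at Fireworks' OpenAI-compatible endpoint; whether the Responses API is served at this exact base URL, and the model ID, are assumptions:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)

first = client.responses.create(
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",  # illustrative
    input="Give me three facts about the Moon.",
    store=True,  # make this response retrievable by ID later
)

# Continue the conversation without resending the full history.
follow_up = client.responses.create(
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",
    previous_response_id=first.id,
    input="Expand on the second fact.",
)
print(follow_up.output_text)
```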
2025-06-13
Supervised Fine-Tuning V2
Supervised Fine-Tuning V2 released.
Key features:
- Supports the Qwen 2/2.5/3 series, Phi 4, Gemma 3, the Llama 3 family, and DeepSeek V2, V3, and R1
- Longer context windows, up to the full context length of the supported models
- Multi-turn function-calling fine-tuning (a sketch follows below)
- Quantization-aware training
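A minimal JSONL sketch of a multi-turn function-calling record, assuming the OpenAI-style `tool_calls`/`tool` message format; the tool name and arguments are illustrative:

```json
{"messages": [
  {"role": "user", "content": "What's the weather in Paris?"},
  {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"}}]},
  {"role": "tool", "tool_call_id": "call_1", "content": "{\"temp_c\": 21}"},
  {"role": "assistant", "content": "It's currently 21°C in Paris."}
]}
```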
Reinforcement Fine-Tuning (RFT)
Reinforcement Fine-Tuning released. Train expert models that surpass closed-source frontier models through verifiable rewards. More details in the blog post.
2025-05-19
🚀 Easier & faster LoRA fine-tune deployments on Fireworks
You can now deploy a LoRA fine-tune with a single command and get speeds that approximately match the base model:
- Create a deployment with addons enabled: `firectl create deployment "accounts/fireworks/models/<MODEL_ID of base model>" --enable-addons`
- Then load the LoRA addon onto the deployment: `firectl load-lora <MODEL_ID> --deployment <DEPLOYMENT_ID>`
This change is for dedicated deployments with a single LoRA. You can still deploy multiple LoRAs on a deployment or deploy LoRA(s) on some Serverless models as described in the documentation.