Latest Updates: Batch API, 100,000 EU GPUs, Code Sandbox, Refuel acquisition, FLUX Kontext & more
Hi there 👋
We are back with another update! From new exclusive models to exciting product news, keep reading for all the updates to our product, research and tools.
New Models
🎨 FLUX.1 Kontext
Black Forest Labs has grown the FLUX model family! FLUX.1 Kontext [max] brings max performance and improved prompt adherence, while FLUX.1 Kontext [pro] is a unified model for fast, iterative image editing.
Both are now available on Together AI serverless and you can try FLUX.1 Kontext [pro] on Together Chat.
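If you want a feel for what calling it looks like, here is a minimal sketch using the Together Python SDK's image endpoint. The exact model slug and response fields are assumptions, so check the models page and API reference before using them.

```python
# Hedged sketch: generating an image with FLUX.1 Kontext [pro] on Together AI.
# The model slug below is an assumption; confirm it on the models page.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.images.generate(
    prompt="A cozy reading nook with warm afternoon light",
    model="black-forest-labs/FLUX.1-kontext-pro",  # assumed slug
    n=1,
)

# Depending on the requested response format, the item may expose a URL
# or base64-encoded image data (field names assumed).
print(response.data[0].url)
```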
🐋 DeepSeek R1-0528
Ranking #4 on key benchmarks thanks to its strong reasoning capabilities, the latest update to DeepSeek-R1 is live on Together AI for both serverless and dedicated deployments.
👾 Devstral Small 2505 & Magistral Small 2506
The Mistral AI team has released two new models!
Devstral Small is an agentic LLM that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
Magistral Small is a 24B model with transparent step-by-step thinking, with strong benchmarks in math, coding, and multilingual reasoning.
🔎 Mxbai Rerank Large V2
With an industry-leading 57.49 BEIR score, mxbai-rerank-large-v2 is an exciting new reranking model from Mixedbread.
It’s a great fit for RAG systems, e-commerce search, code retrieval, and multilingual applications.
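As a rough illustration of how a reranker slots into a RAG pipeline on Together AI, here is a hedged sketch using the SDK's rerank endpoint. The model slug and response field names are assumptions, so verify them against the rerank docs.

```python
# Hedged sketch: reranking retrieved passages with mxbai-rerank-large-v2.
from together import Together

client = Together()

documents = [
    "Returns are accepted within 30 days of purchase.",
    "Our headquarters are located in San Francisco.",
    "Refunds are issued to the original payment method.",
]

response = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",  # assumed slug
    query="What is the refund policy?",
    documents=documents,
    top_n=2,
)

# Each result carries the original document index and a relevance score
# (field names assumed); feed the top hits to your generation step.
for result in response.results:
    print(result.relevance_score, documents[result.index])
```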
⭐ EXAONE models
We also added support for two models from LG AI Research!
EXAONE 3.5 is a 32B English/Korean model with SOTA long-context performance. EXAONE Deep excels at math, science, and complex reasoning tasks.
Product Announcements
🚀 Batch API
The Together Batch API is here! It lets you queue large, non-urgent LLM workloads for processing during off-peak times, cutting costs by 50%.
Now available for 15 of the most popular open-source models, including DeepSeek-R1, DeepSeek-V3, Llama 4, Qwen3, and more!
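To give a rough sense of the workflow, here is a hedged sketch: write requests to a JSONL file, upload it, and queue it as a batch. The JSONL request shape, the `purpose` value, and the method names (`files.upload`, `batches.create_batch`) are assumptions based on this announcement, so check the Batch API docs for the exact interface.

```python
# Hedged sketch of a Batch API workflow on Together AI.
import json
from together import Together

client = Together()

# One request per line, each tagged with a custom_id so results can be
# matched back to their inputs (request shape assumed).
requests = [
    {
        "custom_id": f"request-{i}",
        "body": {
            "model": "deepseek-ai/DeepSeek-V3",
            "messages": [{"role": "user", "content": f"Summarize document #{i}"}],
        },
    }
    for i in range(3)
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

uploaded = client.files.upload(file="batch_requests.jsonl", purpose="batch-api")  # assumed purpose value
batch = client.batches.create_batch(uploaded.id, endpoint="/v1/chat/completions")  # assumed method name

# Poll the batch status, then download the output file once it completes.
print(batch.id, batch.status)
```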
💻 Code Sandbox & Code Interpreter
We launched Code Sandbox: customizable VM sandboxes for building full-scale development environments for AI, already powering AI-native startups like Blackbox AI and HeroUI.
For use cases that don't need a full dev environment, Code Interpreter lets you run LLM-generated Python code in a secure, isolated sandbox with a simple API call.
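For a feel of how lightweight that call is, here is a hedged sketch of running model-written Python through Code Interpreter via the SDK. The method name and response shape are assumptions, so confirm them in the Code Interpreter docs.

```python
# Hedged sketch: executing LLM-generated Python in an isolated sandbox.
from together import Together

client = Together()

result = client.code_interpreter.run(  # assumed method name
    code="import math\nprint(math.sqrt(2))",
    language="python",
)

# Print whatever the sandbox wrote to stdout (field names assumed).
for output in result.data.outputs:
    print(output.data)
```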
Company News
🌍 Bringing 100,000 GPUs to Europe
We’re expanding into Europe with our largest infra rollout yet! In partnership with Hypertec and 5C, we will deliver 100,000 NVIDIA Blackwell and future-generation GPUs totalling up to 2 gigawatts of AI-dedicated data center capacity.
Coming online soon, with priority deployments in France, the UK, Italy, and Portugal.
🔥 Refuel acquisition
Together AI acquired Refuel, bringing its technology directly into the Together AI Cloud to make it easier to build, deploy, and improve complex agents across their entire lifecycle.
We have also added Refuel LLM-2 and Refuel LLM-2 Small, two models optimized for data tasks such as classification and structured data extraction.
Featured Content & Research
⚗️ Mixture-of-Agents Alignment (MoAA)
Our recent research presents a novel distillation framework that synthesizes the collective intelligence of multiple models into a smaller yet more efficient LLM.
MoAA outperforms GPT-4o as a teacher, boosting smaller models like Llama3.1-8B to rival models 10x their size!
🔎 Yet Another Quantization Algorithm (YAQA)
Quantized LLMs are fast, but can they preserve original model behavior? Our latest research, YAQA, answers that with a resounding yes.
YAQA directly minimizes the difference to the original model during quantization, reducing KL divergence by >30% over existing quantization algorithms, which translates to state-of-the-art results on downstream tasks.
⚡ Customized Speculative Decoding
Our Turbo Research team published their findings on using a custom speculator to yield ~1.3x faster inference and ~25% lower cost relative to Together AI’s state-of-the-art base speculator.
🎥 New webinars & videos
We have some exciting new & upcoming webinars on agents and training:
🔹 Webinar: How To Build a Coding Agent From Scratch
🔹 Webinar: Optimizing Training Workloads on GPU Clusters
🔹 Learning Together Ep.1: Matryoshka Principles for Adaptive Intelligence
🍳 New guides & cookbooks
Our team has also cooked up some new practical guides!
Check out our blog on Open Data Scientist Agent, our docs on OCR with vision models & Cline + DeepSeek V3, and don’t miss our cookbooks on Data Science Agent, Together + Arcade.dev, Batch Inference API, and Code Interpreter.
New AI Apps
💸 BillSplit
In our latest open-source demo app, we show off how to do structured OCR on Together AI.
Powered by Llama 4 vision with JSON mode running on Together AI, this app lets you easily split your restaurant bill. 100% free and open source!
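For a rough idea of the structured OCR pattern behind it, here is a hedged sketch: send a receipt image to a Llama 4 vision model and constrain the output with JSON mode. The model slug, the `response_format` schema convention, and the example image URL are assumptions, so check the JSON mode docs and the BillSplit repo for the real implementation.

```python
# Hedged sketch of structured OCR: image in, schema-constrained JSON out.
import json
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",  # assumed slug
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract every line item and its price as JSON."},
                {"type": "image_url", "image_url": {"url": "https://example.com/receipt.jpg"}},  # placeholder image
            ],
        }
    ],
    # JSON mode with an attached schema (convention assumed).
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "number"},
                        },
                    },
                }
            },
        },
    },
)

print(json.loads(response.choices[0].message.content))
```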
Community Spotlight
🌐 Sahara AI
We are thrilled to see our friends at Sahara AI launch the SIWA Testnet!
We're proud to power the scalable compute behind their dev platform, so anyone, anywhere can build & run real AI.
📽️ Hedra
We also want to congratulate Hedra for their Series A!
Their vision for fast, expressive video creation is powered by custom foundation models, and we're proud to provide the fully-optimized, scalable training and inference infrastructure that grows with them (no infra headaches required).
Stay Connected!
You don’t have to wait for our newsletter! Catch up on the latest news, model releases, and more by following us on X and LinkedIn.
Always stay in the know—subscribe to the LinkedIn newsletter to receive our future updates.