Latest Updates: Batch API, 100,000 EU GPUs, Code Sandbox, Refuel acquisition, FLUX Kontext & more

Latest Updates: Batch API, 100,000 EU GPUs, Code Sandbox, Refuel acquisition, FLUX Kontext & more

Hi there 👋

We are back with another update! From new exclusive models to exciting product news, keep reading for all the updates to our product, research and tools.


New Models

🎨 FLUX.1 Kontext

Black Forest Labs has grown the FLUX model family! FLUX.1 Kontext [max] brings max performance and improved prompt adherence, while FLUX.1 Kontext [pro] is a unified model for fast, iterative image editing.

Both are now available on Together AI serverless and you can try FLUX.1 Kontext [pro] on Together Chat.

Article content

Try them out →

🐋 DeepSeek R1-0528

Ranking #4 on key benchmarks thanks to its strong reasoning capabilities, the latest update to DeepSeek-R1 is live on Together AI, both for serverless and dedicated deployments.

Try on Together Chat →

👾 Devstral Small 2505 & Magistral Small 2506

The Mistral AI team has released two new models!

Devstral Small is an agentic LLM that excels at using tools to explore codebases, editing multiple files and power software engineering agents.

Magistral Small is a 24B model with transparent step-by-step thinking, with strong benchmarks in math, coding, and multilingual reasoning.

Try Magistral Small →

🔎 Mxbai Rerank Large V2

With an industry-leading 57.49 BEIR score, mxbai-rerank-large-v2 is an exciting new rerank model by mixbread.

It’s a great fit for RAG systems, e-commerce search, code retrieval, and multilingual applications.

Try it out →

⭐ EXAONE models

We also added support for two models from LG AI Research !

EXAONE 3.5 is a 32B English/Korean model with SOTA long-context performance. EXAONE Deep excels at math, science, and complex reasoning tasks.

Try EXAONE 3.5 →


Product Announcements

🚀 Batch API

The Together Batch API is here! It allows queuing large, non-urgent LLM request workloads to be processed during off-peak times, which results in 50% cost savings.

Now available for 15 of the most popular open-source models, including DeepSeek-R1, DeepSeek-V3, Llama 4, Qwen3, and more!

Read announcement →

💻 Code Sandbox & Code Interpreter

We launched Code Sandbox: customizable VM sandboxes to build full-scale development environments for AI. Powering AI-native startups like Blackbox AI and HeroUI.

For use cases that don't need a full dev environment, Code Interpreter lets you run LLM-generated Python code in a secure, isolated sandbox with a simple API call.

Read our quickstart →


Company News

🌍 Bringing 100,000 GPUs to Europe

We’re expanding into Europe with our largest infra rollout yet! In partnership with Hypertec and 5C, we will deliver 100,000 NVIDIA Blackwell and future-generation GPUs totalling up to 2 gigawatts of AI-dedicated data center capacity.

Soon coming online with priority deployments in France, the UK, Italy, and Portugal.

Learn more →

🔥 Refuel acquisition

Together AI acquired Refuel , bringing Refuel’s technology directly into the Together AI Cloud to make it easier to build, deploy and improve the quality of complex agents over the entire lifecycle.

We have also added Refuel LLM-2 and Refuel LLM-2 Small, two models optimized for data tasks such as classification and structured data extraction.

Read announcement →


Featured Content & Research

⚗️ Mixture-of-Agents Alignment (MoAA)

Our recent research presents a novel distillation framework that synthesizes the collective intelligence of multiple models into a smaller yet more efficient LLM.

MoAA outperforms GPT-4o as a teacher, boosting smaller models like Llama3.1-8B to rival models 10x their size!

Article content

Read research →

🔎 Yet Another Quantization Algorithm (YAQA)

Quantized LLMs are fast, but can they preserve original model behavior? Our latest research, YAQA, answers that with a resounding yes.

YAQA directly minimizes the difference to the original model during quantization, reducing KL divergence by >30% over existing quantization algorithms, which translates to state-of-the-art results on downstream tasks.

Article content

Read research →

⚡ Customized Speculative Decoding

Our Turbo Research team published their findings on using a custom speculator to yield ~1.3x faster inference and ~25% lower cost relative to Together AI’s state-of-the-art base speculator.

Read research →

��� New webinars & videos

We have some exciting new & upcoming webinars on agents and training:

🔹 Webinar: How To Build a Coding Agent From Scratch

🔹 Webinar: Optimizing Training Workloads on GPU Clusters

🔹 Learning Together Ep.1:  Matryoshka Principles for Adaptive Intelligence

🍳 New guides & cookbooks

Our team has also cooked up some new practical guides!

Check out our blog on Open Data Scientist Agent, our docs on OCR with vision models & Cline + DeepSeek V3, and don’t miss our cookbooks on Data Science Agent, Together + Arcade.dev, Batch Inference API, and Code Interpreter.

Explore cookbooks →


New AI Apps

💸 BillSplit

In our latest open-source demo app, we show off how to do structured OCR on Together AI.

Powered by Llama 4 vision with JSON mode running on Together AI, this app lets you easily split your restaurant bill. 100% free and open source!

Try it out  →


Community Spotlight

🌐Sahara AI

We are thrilled to see our friends at Sahara AI launch the SIWA Testnet!

We're proud to power the scalable compute behind their dev platform, so anyone, anywhere can build & run real AI.

Check it out →

📽️ Hedra

We also want to congratulate Hedra for their Series A!

Their vision for fast, expressive video creation is powered by custom foundation models, and we're proud to provide the fully-optimized, scalable training and inference infrastructure that grows with them (no infra headaches required).


Stay Connected!

You don’t have to wait for our newsletter! Catch up on the latest news, model releases, and more by following us on X and LinkedIn.


Always stay in the know—subscribe to the LinkedIn newsletter to receive our future updates.


Abhishek Bhardwaj

Software Developer | React | Next.js | Nest.js | TypeScript

3w

Impressive

Like
Reply
Robert Horn

Director Premier Accounts

3w

Very exciting news

Like
Reply
Amahl Williams

Author | Business Advisor | Board Member | Agentic AI Leader | Cofounder

3w

Very well done

Like
Reply

To view or add a comment, sign in

More articles by Together AI

Others also viewed

Explore topics