Latest Updates: Batch API, 100,000 EU GPUs, Code Sandbox, Refuel acquisition, FLUX Kontext & more
Hi there 👋
We are back with another update! From new exclusive models to exciting product news, keep reading for all the updates to our product, research and tools.
New Models
🎨 FLUX.1 Kontext
Black Forest Labs has grown the FLUX model family! FLUX.1 Kontext [max] brings max performance and improved prompt adherence, while FLUX.1 Kontext [pro] is a unified model for fast, iterative image editing.
Both are now available on Together AI serverless and you can try FLUX.1 Kontext [pro] on Together Chat.
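If you want a feel for what calling it looks like, here is a minimal sketch using the Together Python SDK's image endpoint. The exact model slug and response fields are assumptions, so check the models page and API reference before using them.

```python
# Hedged sketch: generating an image with FLUX.1 Kontext [pro] on Together AI.
# The model slug below is an assumption; confirm it on the models page.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.images.generate(
    prompt="A cozy reading nook with warm afternoon light",
    model="black-forest-labs/FLUX.1-kontext-pro",  # assumed slug
    n=1,
)

# Depending on the requested response format, the item may expose a URL
# or base64-encoded image data (field names assumed).
print(response.data[0].url)
```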
🐋 DeepSeek R1-0528
Ranking #4 on key benchmarks thanks to its strong reasoning capabilities, the latest update to DeepSeek-R1 is live on Together AI for both serverless and dedicated deployments.
👾 Devstral Small 2505 & Magistral Small 2506
The Mistral AI team has released two new models!
Devstral Small is an agentic LLM that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
Magistral Small is a 24B model with transparent step-by-step thinking, with strong benchmarks in math, coding, and multilingual reasoning.
🔎 Mxbai Rerank Large V2
With an industry-leading 57.49 BEIR score, mxbai-rerank-large-v2 is an exciting new reranking model from Mixedbread.
It’s a great fit for RAG systems, e-commerce search, code retrieval, and multilingual applications.
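As a rough illustration of how a reranker slots into a RAG pipeline on Together AI, here is a hedged sketch using the SDK's rerank endpoint. The model slug and response field names are assumptions, so verify them against the rerank docs.

```python
# Hedged sketch: reranking retrieved passages with mxbai-rerank-large-v2.
from together import Together

client = Together()

documents = [
    "Returns are accepted within 30 days of purchase.",
    "Our headquarters are located in San Francisco.",
    "Refunds are issued to the original payment method.",
]

response = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",  # assumed slug
    query="What is the refund policy?",
    documents=documents,
    top_n=2,
)

# Each result carries the original document index and a relevance score
# (field names assumed); feed the top hits to your generation step.
for result in response.results:
    print(result.relevance_score, documents[result.index])
```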
⭐ EXAONE models
We also added support for two models from LG AI Research!
EXAONE 3.5 is a 32B English/Korean model with SOTA long-context performance. EXAONE Deep excels at math, science, and complex reasoning tasks.
Product Announcements
🚀 Batch API
The Together Batch API is here! It lets you queue large, non-urgent LLM workloads for processing during off-peak times, cutting costs by 50%.
Now available for 15 of the most popular open-source models, including DeepSeek-R1, DeepSeek-V3, Llama 4, Qwen3, and more!
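To give a rough sense of the workflow, here is a hedged sketch: write requests to a JSONL file, upload it, and queue it as a batch. The JSONL request shape, the `purpose` value, and the method names (`files.upload`, `batches.create_batch`) are assumptions based on this announcement, so check the Batch API docs for the exact interface.

```python
# Hedged sketch of a Batch API workflow on Together AI.
import json
from together import Together

client = Together()

# One request per line, each tagged with a custom_id so results can be
# matched back to their inputs (request shape assumed).
requests = [
    {
        "custom_id": f"request-{i}",
        "body": {
            "model": "deepseek-ai/DeepSeek-V3",
            "messages": [{"role": "user", "content": f"Summarize document #{i}"}],
        },
    }
    for i in range(3)
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

uploaded = client.files.upload(file="batch_requests.jsonl", purpose="batch-api")  # assumed purpose value
batch = client.batches.create_batch(uploaded.id, endpoint="/v1/chat/completions")  # assumed method name

# Poll the batch status, then download the output file once it completes.
print(batch.id, batch.status)
```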
💻 Code Sandbox & Code Interpreter
We launched Code Sandbox: customizable VM sandboxes for building full-scale development environments for AI, already powering AI-native startups like Blackbox AI and HeroUI.
For use cases that don't need a full dev environment, Code Interpreter lets you run LLM-generated Python code in a secure, isolated sandbox with a simple API call.
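For a feel of how lightweight that call is, here is a hedged sketch of running model-written Python through Code Interpreter via the SDK. The method name and response shape are assumptions, so confirm them in the Code Interpreter docs.

```python
# Hedged sketch: executing LLM-generated Python in an isolated sandbox.
from together import Together

client = Together()

result = client.code_interpreter.run(  # assumed method name
    code="import math\nprint(math.sqrt(2))",
    language="python",
)

# Print whatever the sandbox wrote to stdout (field names assumed).
for output in result.data.outputs:
    print(output.data)
```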
Company News
🌍 Bringing 100,000 GPUs to Europe
We’re expanding into Europe with our largest infra rollout yet! In partnership with Hypertec and 5C, we will deliver 100,000 NVIDIA Blackwell and future-generation GPUs totalling up to 2 gigawatts of AI-dedicated data center capacity.
Coming online soon, with priority deployments in France, the UK, Italy, and Portugal.
🔥 Refuel acquisition
Together AI acquired Refuel, bringing its technology directly into the Together AI Cloud to make it easier to build, deploy, and improve complex agents across their entire lifecycle.
We have also added Refuel LLM-2 and Refuel LLM-2 Small, two models optimized for data tasks such as classification and structured data extraction.
Featured Content & Research
⚗️ Mixture-of-Agents Alignment (MoAA)
Our recent research presents a novel distillation framework that synthesizes the collective intelligence of multiple models into a smaller yet more efficient LLM.
MoAA outperforms GPT-4o as a teacher, boosting smaller models like Llama3.1-8B to rival models 10x their size!
🔎 Yet Another Quantization Algorithm (YAQA)
Quantized LLMs are fast, but can they preserve original model behavior? Our latest research, YAQA, answers that with a resounding yes.
YAQA directly minimizes the difference to the original model during quantization, reducing KL divergence by >30% over existing quantization algorithms, which translates to state-of-the-art results on downstream tasks.
⚡ Customized Speculative Decoding
Our Turbo Research team published their findings on using a custom speculator to yield ~1.3x faster inference and ~25% lower cost relative to Together AI’s state-of-the-art base speculator.
🎥 New webinars & videos
We have some exciting new & upcoming webinars on agents and training:
🔹 Webinar: How To Build a Coding Agent From Scratch
🔹 Webinar: Optimizing Training Workloads on GPU Clusters
🔹 Learning Together Ep.1: Matryoshka Principles for Adaptive Intelligence
🍳 New guides & cookbooks
Our team has also cooked up some new practical guides!
Check out our blog on Open Data Scientist Agent, our docs on OCR with vision models & Cline + DeepSeek V3, and don’t miss our cookbooks on Data Science Agent, Together + Arcade.dev, Batch Inference API, and Code Interpreter.
New AI Apps
💸 BillSplit
In our latest open-source demo app, we show off how to do structured OCR on Together AI.
Powered by Llama 4 vision with JSON mode running on Together AI, this app lets you easily split your restaurant bill. 100% free and open source!
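For a rough idea of the structured OCR pattern behind it, here is a hedged sketch: send a receipt image to a Llama 4 vision model and constrain the output with JSON mode. The model slug, the `response_format` schema convention, and the example image URL are assumptions, so check the JSON mode docs and the BillSplit repo for the real implementation.

```python
# Hedged sketch of structured OCR: image in, schema-constrained JSON out.
import json
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",  # assumed slug
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract every line item and its price as JSON."},
                {"type": "image_url", "image_url": {"url": "https://example.com/receipt.jpg"}},  # placeholder image
            ],
        }
    ],
    # JSON mode with an attached schema (convention assumed).
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "number"},
                        },
                    },
                }
            },
        },
    },
)

print(json.loads(response.choices[0].message.content))
```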
Community Spotlight
🌐 Sahara AI
We are thrilled to see our friends at Sahara AI launch the SIWA Testnet!
We're proud to power the scalable compute behind their dev platform, so anyone, anywhere can build & run real AI.
📽️ Hedra
We also want to congratulate Hedra for their Series A!
Their vision for fast, expressive video creation is powered by custom foundation models, and we're proud to provide the fully-optimized, scalable training and inference infrastructure that grows with them (no infra headaches required).
Stay Connected!
You don’t have to wait for our newsletter! Catch up on the latest news, model releases, and more by following us on X and LinkedIn.
Always stay in the know—subscribe to the LinkedIn newsletter to receive our future updates.