An example of why I think current LLMs are enough to change lots of work even if they don’t get better, once we start integrating them with other systems

1mo

GPT-4 (now obsolete) went from 30% accuracy to 87% accuracy in clinical oncology decisions when acting as an agent with access to medical tools and reference material. Paper: https://lnkd.in/gwQAcAfX

44 Comments

Barada Sahu

CEO @mason | AI for hyper-personalized storefronts

1mo

The frontier for agents is no longer tools but memory; statelessness only takes you so far. Tools will expand what models can do, but they will also need to be adapted for the model world: more context around usage, long-tail APIs, verification simulators.

3 Reactions

Jill Galloway

CPO | AI Obsessed | Efficiency & Quality Are My Love Language

1mo

I agree. How it’s implemented always matters more than the tech itself.

1 Reaction

Steve Mordue

Semi-Pro Vibe Coder 🤖. Creator of RapidStart Apps 🚀. Co-Founder of Digital Labor Factory 🚀. US citizen living on a mountain in Brazil 🍹.

1mo

Agree, sadly the first paragraph went right over my head, so I'll take your word for it 😀

Lynne Thomson

Decoding AI: What every leader should know about AI

1mo

I've been saying for a while that GenAI is useful when it is baked into the software and systems people already use vs. startups' obsession with disruption. Great to see Ethan Mollick expanding on how that works for AI agents.

2 Reactions

Aki Kakko

Founder Alphanome.AI - AI Research Lab & Venture Studio

1mo

Soon the Era of Self-Driving.company is here.

TradeFlock

1mo

Ethan Mollick, thank you for sharing this insightful example of LLMs' potential impact! 🌟

Patrick McFadden

Founder, Thinking OS™ | The Governance Layer Above Systems, Agents & AI | Governing What Should Move — Not Just What Can

1mo

Ethan Mollick, this is the part the headlines will miss. Everyone will point to the 87% accuracy. But the real unlock wasn’t the model. It was the scaffolding around the model. The accuracy didn’t improve because GPT-4 got smarter. It improved because it got governed.

• The agent was routed through protocol-based steps.
• Its decisions were gated by tool checks, not guesswork.
• And every response was traceable to a logic layer beyond the LLM.

This is Thinking OS™ territory. Because intelligence without interpretive control is just high-confidence entropy. In high-stakes domains like clinical oncology, performance gains don’t come from better output. They come from decision constraint, escalation friction, and upstream adjudication, before the model gets to answer anything at all. The agent didn’t become brilliant. It just stopped hallucinating.
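To make the scaffolding idea in this comment concrete, here is a minimal sketch of a protocol-routed, tool-gated loop with a trace of every step. It is an illustration only, not the pipeline from the linked paper or any product; `Step`, `run_protocol`, `lookup_reference`, `read_measurement`, and `draft_answer` are hypothetical names, and `draft_answer` is a plain function standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    tool: Callable[[dict], dict]   # hypothetical deterministic tool call
    check: Callable[[dict], bool]  # gate that must pass before continuing


def run_protocol(case: dict, steps: list, draft_answer: Callable) -> tuple:
    """Run each step in order; stop and escalate when a gate fails."""
    trace = []      # every tool output is recorded, so the result is auditable
    evidence = {}
    for step in steps:
        output = step.tool(case)
        passed = step.check(output)
        trace.append({"step": step.name, "output": output, "passed": passed})
        if not passed:
            # The model is never asked to answer without verified inputs.
            return {"status": "escalate", "failed_step": step.name, "answer": None}, trace
        evidence[step.name] = output
    return {"status": "ok", "answer": draft_answer(case, evidence)}, trace


# Toy tools with placeholder data (assumptions for illustration only).
REFERENCES = {"case-001": "reference note A"}


def lookup_reference(case: dict) -> dict:
    text = REFERENCES.get(case["id"])
    return {"found": text is not None, "text": text}


def read_measurement(case: dict) -> dict:
    return {"value": case.get("measurement")}


STEPS = [
    Step("reference", lookup_reference, lambda out: out["found"]),
    Step("measurement", read_measurement, lambda out: out["value"] is not None),
]


def draft_answer(case: dict, evidence: dict) -> str:
    # Stand-in for the model: it only sees evidence that passed the gates.
    ref = evidence["reference"]["text"]
    value = evidence["measurement"]["value"]
    return f"Answer for {case['id']} based on {ref} and measurement {value}"


result, trace = run_protocol({"id": "case-001", "measurement": 12}, STEPS, draft_answer)
missing, missing_trace = run_protocol({"id": "case-999"}, STEPS, draft_answer)
```

In this sketch the second call escalates at the first gate because no reference exists for that case, which is the "decision constraint before the model answers" behavior the comment describes.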

6 Reactions

Ville Murtonen

Human-centric Leader | AI Delivery Lead | Senior Advisor

1mo

On the contrary, I think their use will decline as people realize that deterministic approaches work much better, and at lower cost, than AI.

Matt Brophy

AI Ethicist | Associate Professor of Philosophy | AI Committee Chair for HBS at High Point University | Speaker for AI in Higher Ed

1mo

I wonder when it will become official malpractice for an oncologist not to be using AI for co-intelligence!

4 Reactions

Demetri Panici

Founder @ Rise Productive | Content Creator & Agency Owner

1mo

That oncology accuracy jump from 30% to 87% shows how LLMs get game-changing results through smart system integration, not just bigger models. What other fields do you think are about to see similar accuracy wins through better tool integration?


Ethan Mollick’s Post
