I have been thinking a lot about memory in the world of AI and agents. I think that's going to be one of the next frontiers for advancement -- and differentiation.

It started simply enough. ChatGPT (and others) started allowing you to follow up with questions or prompts, and it would know what you were talking about. Example:

First: "Write a limerick about vector search"
Then: "How about one about Graph RAG?"

Feels super-natural to be able to do that, because we're so used to it in our human conversations. But LLMs are naturally stateless (they sort of "wake up" with each prompt, do what the prompt asks, and then wait for the next one). It's like the movie Groundhog Day.

Of course, that was super-annoying, so the chat apps decided to add super-simple "memory". The way they did this is to put prior prompts/outputs into the context window, so the LLM knew not just what you were asking now, but what you had asked for before, and what its responses were. This reminds me of the movie "Memento" -- we're basically writing some of our "history" down into the context window, so we can remember it and resume where we left off.

That's just the beginning though. We've gotten smarter about how we manage memory, how to work around the limitations of context window size, and how to support "long term" persistent memory (not just for this session).

I think ChatGPT is ahead in this game, and people don't fully appreciate (yet) how big of a moat that could be. When ChatGPT, Claude, Grok and others are roughly the same, what's going to differentiate the quality of the output is going to be less about the quality of the underlying LLM and more about the quality of the memory. How well does it know you? How well can it bring back relevant memories to inform the output?

Already, I find myself drifting towards ChatGPT more and more because it *feels* like it knows me better than Claude. This is a self-reinforcing loop. The more I use ChatGPT (vs. Claude and others), the better it knows me.

As a consumer, what I'm hoping emerges is some sort of open standard so that memory is "portable" across systems in a secure, managed way. That way, *I* manage my memory (and choose which system I use to manage it), and then connect my memory to whichever AI systems I use.

We could start with an open protocol/standard (similar to MCP for tool calling) for implementing memory. That way, there's a "standard" way to do it, and the AI apps/agents/clients are not all reinventing the wheel. I think that's going to be a massive opportunity for whoever pulls it off.

I'm working on a piece of this in Agent.ai, where we have shared memory, controlled by the user, across the network of agents that run on the platform (there are 1,700+ public agents now).

DISCLOSURES: I'm an investor in OpenAI, Perplexity, Mem0, LangChain and others I can't mention yet. I also own the OpenMem .com domain if someone has conviction and capital to go build something in this space.
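For the curious, here is a minimal sketch of the "Memento"-style memory described above: the model itself stays stateless, and "memory" is just the prior turns we resend in the context window on every call. `call_llm` is a placeholder for whatever chat-completion API you use, and the turn cap is a crude stand-in for a real token budget.

```python
from typing import Callable

Message = dict[str, str]  # {"role": "user" | "assistant" | "system", "content": "..."}


class ContextWindowMemory:
    """Replay prior turns into the context window so a stateless LLM can follow up."""

    def __init__(self, call_llm: Callable[[list[Message]], str], max_turns: int = 20):
        self.call_llm = call_llm        # placeholder for any chat-completion API
        self.max_turns = max_turns      # crude stand-in for a token budget
        self.history: list[Message] = []

    def ask(self, prompt: str) -> str:
        self.history.append({"role": "user", "content": prompt})
        # The LLM wakes up fresh each time; it only "remembers" what we resend here.
        reply = self.call_llm(self.history[-self.max_turns :])
        self.history.append({"role": "assistant", "content": reply})
        return reply
```

With this in place, the follow-up "How about one about Graph RAG?" only works because the limerick exchange is still sitting in `history` when the next call goes out.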
This will be a game changer for AI: an 'open standard so that memory is "portable" across systems in a secure, managed way.'
chatgpt got that “i remember your coffee order” energy while others still asking for your name
I agree that memory is a concept that is a place for differentiation and a new area for innovation. In human terms, there are lots of different accepted forms and definitions of memory, each type with its own strengths and uses. I think the same is happening now with AI memory: different designs for different use cases and applications, based on what needs to be remembered, for how long, and how portable it needs to be. I think some repeatable designs and architectures are emerging and will start to set at least some standard design patterns, maybe even reusable open-source code, that work for some standard use cases.
After looking at the memory list a few times, I turn memory off. It remembers tons of irrelevant facts and old approaches that aren't relevant anymore. AI makes enough mistakes, and I don't want it remembering anything except what I want it to remember, which is a tiny bit of the overall context. Most of the benefit of an LLM is that I can reliably start a new chat with no bleed-through of what it thinks it knows. I shotgun ideas, restart threads, try many approaches, select a candidate or two, work until it gets stupid, and then I clear memory, feed it the important bits of the context, and continue. What I do use is memory files, managed explicitly and visible to me. With Claude I create and store artifacts when I'm happy with a thread.
I think memory has to be an enterprise asset, like databases. It can have access control, but it cannot live in the private workspace of an agent. I hope context engineering will mature and we will know how to retrieve the right chunks of memory into the LLM context just in time for the next action. It might even have different levels of transience; I think ultimately everything will be persisted and masked or retrieved on demand.
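One way the "right chunks, just in time" idea above could look in practice, as a rough sketch only: `embed` is a placeholder for any embedding model, access control is reduced to a simple allow-list check, and nothing here reflects a specific product's implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0 denominator falls back to 1)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm or 1.0)


def retrieve_memory(task: str, memories: list[dict], user: str, embed, top_k: int = 3) -> list[str]:
    """Pick only the few stored memory chunks this user may see that best match the task."""
    task_vec = embed(task)
    allowed = [m for m in memories if user in m["allowed_users"]]          # access control gate
    ranked = sorted(allowed, key=lambda m: cosine(task_vec, m["vector"]), reverse=True)
    return [m["text"] for m in ranked[:top_k]]                             # only these reach the context
```

Everything else stays persisted in the shared store; the agent's context only ever sees the small, permitted slice retrieved for the action at hand.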
Dharmesh Shah - trust is an important question here. What does the *end user* controlling memory actually look like? Do people really care about local-first? Or is governance, not storage, what matters to win trust?
In ChatGPT I do see the effect of increasing memory across different conversations, and I'm still undecided whether it always works on totally different topics and conversations, but it is interesting. Vertical agents and applications will be able to use memory better, IMO, in direct, result-focused cases.
With ChatGPT, context and memory are often what don't work well, because they are trying so hard, it seems to me. ChatGPT has more memory loss than some other LLMs I am using. It just loses things even when you try to be very clear that it should 'remember' something. And in other situations you don't want to work with what ChatGPT already knows or remembers about you, because you want answers in a different context. ChatGPT also has a big problem with that; it seems to be unable to forget some things that it is storing in memory.
Dharmesh Shah Appreciate how clearly you outlined the evolution - from stateless prompts to persistent context to platform-native memory. What most miss is what you're already circling: memory isn't just a retrieval or fidelity problem. It's a *governance threshold*. The moment agents begin accumulating long-range context across roles, time, and tools, you're no longer managing history. You're managing *judgment continuity* across systems that were never designed to govern what should persist, replicate, or activate under pressure. Portable memory is an opportunity. But *portable constraint* will be the infrastructure that determines which memories should never reactivate and why. Once systems start thinking with memory, someone has to govern what they remember *on purpose* - not just what they can recall. That layer is here. Some of us are already enforcing it upstream.
But what if the user wants to start a "fresh" session? Memory needs to be manageable, not auto-saving everything. AI should be able to recognize that certain context "might" be repeated, then ask the user: "Do you want to add this to your chat memory? Yes | No". And the same way we have temporary sessions, memory should be a toggle per chat.