OpenAI’s former Head of Research, Bob McGrew

Jun/2025

Video: https://youtu.be/z_-nLK4Ps1Q
Sequoia Capital: The Breakthroughs Needed for AGI Have Already Been Made: OpenAI Former Research Head Bob McGrew
Speaker: OpenAI’s former Head of Research, Bob McGrew
Transcribed by: OpenAI Whisper via MacWhisper, formatted by Gemini 2.5 Pro.
Date: 17/Jun/2025

Bob: I think what’s really changed is that now that you have LLMs, you have this language interface to the robot so that now you can describe the tasks much more cheaply. And you have really strong vision encoders that are tied into that intelligence. So that gives the robots really a head start at doing generic tasks. We spent years solving one specific problem, teaching a robot to manipulate a Rubik’s cube. And now a company like, let’s say, Physical Intelligence can spend months solving a huge variety of problems like laundry folding, cardboard, and packing egg crates. And that’s something that they can only have because they’re building on top of existing frontier models and, you know, the entire tech and research stack that we’ve built over the last 10 years.

Interviewer: Welcome to Training Data. Today we’re excited to welcome Bob McGrew, former Chief Research Officer at OpenAI, for a fascinating look behind the scenes of frontier AI development. Bob talks about the trifecta we have in AI—pre-training, post-training, and reasoning—and explains why we may have already discovered all the fundamental concepts needed for AGI. You’ll learn why he thinks agents will be priced at the cost of compute, hence eroding traditional economic moats, and why even proprietary data will become less valuable when infinitely patient AI can recreate alternatives. Plus, Bob shares his contrarian take on where startup opportunities really lie, and why robotics is finally having its moment after years of being too early. Enjoy the show. Bob, thank you so much for joining us today.

Bob: Oh, it’s great to be here.

Interviewer: We’re at a really interesting time in AI development. We have a beautiful new trifecta: pre-training, post-training, and reasoning. Can you help us unpack what else is left? What alpha is there left in each?

Bob: So I think we’re going to continue to see capabilities increase. It’s going to continue to feel like it’s felt: super fast, super exciting over the last even five years. And I think it’s going to continue feeling like that. There’s not a wall here. But what is going to be different is that 2025 is going to be the year of reasoning. So it makes a lot of sense. Reasoning is a new technique. When you have a new technique, you know, there’s often an overhang of compute, of data, of algorithmic efficiency improvements that you can make. And so that’s something that, you know, if you look at just the incredible progress that we saw from the o1 preview back in September, and then six months later, we got to o3 in April. And then, you know, at the same time, we also see the diffusion of reasoning from OpenAI, who had been working on that for years, out to Google, DeepSeek, and Anthropic, again, just in a few months. And so this is really the right place where every lab is going to focus for the year.

And just as a sort of fun example of how low-hanging the fruit is right now: if you look at the most interesting difference between o1-preview and o3, it’s that o1-preview is not able to use tools, while o3 can use tools as part of its chain of thought. And this is pretty obvious, right? When we were training o1, we knew that this was a thing that we wanted to do, but it was difficult to implement. It took time. And so, you know, that took six months to get done and released. The next step on reasoning is going to be a lot less obvious than that. It’s going to be a lot harder. And so, you know, as reasoning continues to mature, we’re going to see the overhang get eaten up and it’s going to start being slower and slower to make progress.
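To make "tools inside the chain of thought" concrete, here is a minimal sketch of the control flow only, with a scripted stand-in for the model. The `Step` type, `model_step`, and the `calculator` tool are hypothetical names invented for illustration; nothing here reflects OpenAI's actual training or inference stack.

```python
# Minimal, hypothetical sketch of a reasoning loop that interleaves "thoughts" with tool
# calls. model_step() is a scripted stand-in for a reasoning model that decides, at each
# step, whether to think, call a tool, or answer.

from dataclasses import dataclass


@dataclass
class Step:
    kind: str          # "think", "tool", or "answer"
    content: str       # thought text, tool expression, or final answer
    tool: str = ""     # tool name when kind == "tool"


def calculator(expression: str) -> str:
    """Toy tool: evaluate a plain arithmetic expression."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "error: unsupported characters"
    return str(eval(expression))  # acceptable for a demo with a whitelisted character set


TOOLS = {"calculator": calculator}


def model_step(transcript: list[str]) -> Step:
    """Scripted stand-in for the model's next chain-of-thought step."""
    script = [
        Step("think", "I need 48127 * 90241; I'll use the calculator rather than guess."),
        Step("tool", "48127 * 90241", tool="calculator"),
        Step("answer", "48127 * 90241 = <last tool result>"),
    ]
    # Count how many steps the "model" has already produced.
    produced = sum(1 for line in transcript if line.startswith(("THINK", "TOOL", "ANSWER")))
    return script[min(produced, len(script) - 1)]


def run(question: str, max_steps: int = 8) -> str:
    transcript = [f"QUESTION {question}"]
    last_result = ""
    for _ in range(max_steps):
        step = model_step(transcript)
        if step.kind == "think":
            transcript.append(f"THINK {step.content}")
        elif step.kind == "tool":
            last_result = TOOLS[step.tool](step.content)
            transcript.append(f"TOOL {step.tool}({step.content}) -> {last_result}")
        else:  # answer
            answer = step.content.replace("<last tool result>", last_result)
            transcript.append(f"ANSWER {answer}")
            return answer
    return "no answer within step budget"


if __name__ == "__main__":
    print(run("What is 48127 * 90241?"))
```

The only point is the loop structure: the chain of thought is a transcript the model both writes and reads, and tool outputs land inside that transcript rather than after the final answer.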

Interviewer: You said there’s not a wall. I think there’s this meme in the Twittersphere right now that pre-training is hitting a wall. Can you say more about that dynamic?

Bob: Yeah. And I think that’s great. That’s a great question here because pre-training is not going away. But what we’re seeing out of pre-training is that we are at the place where it’s working really well and we’re hitting diminishing returns. And so, you know, diminishing returns are baked in because the intelligence of a model is log-linear in the amount of compute that you’re using to train it, which means that you have to have exponential increases in compute to get each increment in intelligence. When you pre-train a model, that’s a giant training run. It takes all of your data center for a period of months. And when you go to pre-train the next model, you can’t really do it on the same data center. You can rely a little bit on algorithmic efficiency to make it better. But fundamentally, you have to wait until you get a new data center. And that’s not something you can do in six months, the way you can with improvements in reasoning right now. That’s something that takes years.
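Restating the log-linear claim in symbols makes the data-center constraint explicit. This is just notation for the sentence above, not a fitted scaling law:

```latex
% Illustrative restatement of the log-linear scaling claim:
% capability I as a function of pre-training compute C
I(C) \approx a + b \log C
% so a fixed capability gain \Delta I requires a constant multiplicative jump in compute:
\Delta I = b \log\!\frac{C_{2}}{C_{1}}
\quad\Longrightarrow\quad
\frac{C_{2}}{C_{1}} = e^{\Delta I / b}
```

Each equal step in capability multiplies the required training compute by a constant factor, which is why the binding constraint becomes the next data center rather than a quick algorithmic tweak.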

So that doesn’t mean pre-training is useless, though, because the real lever for pre-training in 2025 is improving architectures. So even though you’re working on reasoning, you want to improve pre-training so that you can have better inference time efficiency or so that you can have longer context or better use of the context. And when you’re doing that, you have to start back from the beginning, do pre-training on this new architecture, and then go through the whole reasoning process again. So that’s the role of pre-training now. It’s still important. It’s just doing something different in the pipeline.

Interviewer: Can you help us unpack what’s left in post-training?

Bob: Yeah. So post-training is pretty interesting because both pre-training and reasoning are about increasing intelligence. And there’s a very clear scaling law that you get where you put in more compute and you get out increasing intelligence. And post-training isn’t like that. Post-training is about model personality. And, you know, intelligence is sort of a “thin” problem, right? If you can get better at it, it turns out to be very generalizable and it applies to everything. So you can work on math and you find that it makes you better at legal reasoning. But model personality is a “thick” problem. You actually need a lot of human effort to think about what makes a good personality. How do I want this agent to act? And it’s much more of a training process, like you would go through over many years of interacting with people. And now it’s a very hard research problem to take that specification for what the agent is and turn it into an actual appealing personality. But when you think about post-training, I think about people like Joanne Jang at OpenAI or Amanda Askell at Anthropic, who really spend a lot of time crafting these model personalities. And they’re not research practitioners, right? They’re product managers, or they’re people with a very deep understanding of human nature.

Interviewer: And are there more legs to the stool?

Bob: Well, okay. So I’m going to say something potentially controversial. And I think actually there aren’t. So I think if you go forward to 2030 or if you go forward to 2035 and you look back and you say, what were the fundamental concepts that you needed in order to create more and more intelligence? Maybe that’s AGI. Maybe it’s something different at that point. I think you’re going to come up with the idea of language models with Transformers, the idea of scaling the pre-training on those language models—so GPT-1 and GPT-2, basically—and then the idea of reasoning. And sort of woven throughout that, increasing more and more multimodal capabilities. And I think even in 2035, we’re not going to see any new trends beyond those.

And the reason I think this is if you go back to 2020, so GPT-3 has just been trained. You know, imagine yourself sitting, we’re at OpenAI, we haven’t released this thing, but we know something epochal has happened. And, you know, Dario Amodei, Ilya Sutskever, Alec Radford, you know, we’re all sitting there in the room looking at this thing. And it was fairly obvious internally what the roadmap was. We knew that at this point, going from GPT-3 to GPT-4 by increasing pre-training was absolutely critical. We could see that we needed to increase multimodality, ultimately ending in a model that could use a computer. We were beginning to make experiments with test-time compute. And in 2021, after the Anthropic people left, we really started developing the idea of reasoning at OpenAI.

And it’s funny, actually: sometimes my friends ask me, after Anthropic released its computer use model, “Did you see that coming?” And I was like, “Well, we were working on that together back before they left.” One of the people that did that project went to Anthropic and the other one went to OpenAI and developed Operator. And it just took many years before the multimodality had matured enough to get to that point that was obvious to us way back then. And so that’s why I think, from here on out, we’re going to see very important scaling. There’s very important development and refinement of these ideas. That is extremely hard. It takes a lot of brainpower. It’s not going to be easy. But I think if we look back from 2035, we’re not going to see anything new and fundamental. I think I’m right. I kind of hope I’m wrong. It would be a lot more fun if I’m wrong. But I think we’ll have to see.

Interviewer: That’s a hot take. I’m glad we have it on the record.

Bob: Yeah. We’ll see in 2035.

Interviewer: That’s amazing. I’m curious about reasoning. It seems to me that OpenAI really leaned big behind this paradigm, probably before the others. And now everybody has reasoning models. What did you see in reasoning that caused you to lean so far in so quickly?

Bob: Well, I mean, effectively, it really was sort of this missing piece where with pre-training, the model has an intuitive sense of how to answer the question. But if I ask you to multiply two five-digit numbers, this is something that’s completely within your capability. But if I asked you to do it right now, you wouldn’t be able to do it because it is just a natural human capability to be able to think about something before we answer, to have a scratchpad, to be able to work through a problem. And that is something that, you know, the initial models, even GPT-3, really didn’t have. And so we began to see, you know, glimmers of this publicly, things like, you know, ‘thinking step-by-step.’ And the idea of having a chain of thought that you could train, that the model would learn itself how to guide a chain of thought, not just be guided by cloning from publicly available data on how humans think. That was very powerful. And we knew that it would be more powerful than pre-training because, in fact, your thoughts are inside your head; they are not something that the model has access to. And so almost all the data that’s out there is actually something that’s just the final process. But you don’t get to see that chain of thought. And so the model had to figure it out for itself. That’s why reasoning mattered.
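The multiplication example is a good picture of what a scratchpad buys. A toy script (mine, purely illustrative) that writes the intermediate steps out explicitly:

```python
# Illustration of the "scratchpad" idea from the conversation: the intermediate steps of
# long multiplication are easy to produce but rarely written down anywhere, which is the
# same reason chains of thought are largely absent from web-scale training data.

def long_multiply_with_scratchpad(a: int, b: int) -> int:
    """Multiply two integers the way a person would on paper, printing each step."""
    scratchpad = []
    total = 0
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        partial = a * digit * (10 ** place)
        total += partial
        scratchpad.append(f"{a} x {digit} x 10^{place} = {partial}")
    scratchpad.append(f"sum of partial products = {total}")
    print("\n".join(scratchpad))
    return total


if __name__ == "__main__":
    result = long_multiply_with_scratchpad(48127, 90241)
    assert result == 48127 * 90241  # the visible steps and the final answer agree
```

The web is full of the final line and nearly empty of the lines above it, which is why models had to learn to generate their own intermediate steps rather than clone ours.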

Interviewer: You alluded to earlier that we probably still have to uncover more things in reasoning. Do you think we have a good sense of what those things are today? Or are we earlier in that R&D stage?

Bob: I think at this point with reasoning, if you are at the coalface, then you’re seeing a lot of ideas and refinements of things that you can do. I think we’ve gone past the point where if you’re on the outside, if you’re not at a frontier lab, you’re probably not seeing them anymore. And, you know, this is the same situation we saw where at one point academic labs could make huge amounts of progress. And then later, you know, I would begin to see academic papers and I’d think, “Oh, they rediscovered this thing that we found a long time ago.” And so, you know, now the level of effort that’s being put into this, I think, is actually quite intense. So there are definitely things to be discovered, but they’re not sort of simple ideas that you and I could talk about.

Interviewer: Cool. Switching gears a little bit. You tweeted recently about agents. I think a very, very interesting take, that agents will be incredibly powerful, but priced at the cost of compute due to competition. If that’s the case, where do you see the opportunities in new startups and companies that are now building agents?

Bob: Yeah. So, I mean, I think the thing about agents is people think, “Well, you know, I’m going to go develop an agent.” And they look at how much the job is worth out there by a human. So, you know, you want to develop an AI lawyer and you think lawyers get paid a lot of money. So I’ll be able to develop an AI lawyer and I’m going to be able to charge huge amounts of money for it.

Interviewer: Tens of thousands of dollars a month.

Bob: Exactly. Exactly. Right. But the reason lawyers are expensive is because their time is scarce. Because there are only so many people who have undergone that training. But by the time you’ve made an AI model out of it, well, now there’s effectively an infinite number of lawyers. And so it’s not scarce at all. And maybe you, with your AI lawyer startup, will be able to have a lead over other people. But it’s the same frontier model underneath. And some other startup can come in and compete that away. And so we should expect to see it priced at some opportunity cost over the cost of compute. Because you’ve changed the supply: you now have a lot more supply, effectively infinite supply, of the highest-capability intelligence in whatever domain you’re in.

And on the one hand, there’s a story where they say, “Oh, this is bad because startups can’t make money.” But this is actually the future we want, right? We want ‘services that don’t require people’ to be extremely cheap. You want everyone to have access to a lawyer. What you want to be expensive and scarce are things that are actually about personal relationships. So, you know, maybe we won’t be asking the human lawyers to write contracts because agents will be doing that for us. But we’ll be asking them for sort of deep advice on, you know, how legal challenges affect the, you know, detailed challenges that I’m facing in my business. And I think that’s the world we want to live in.

Interviewer: Do you think application companies will make any money selling agents, though? Like, where would you tell us to invest?

Bob: Yes and no. So just to back up for a second, people often talk about where does the value accrue in the stack, right? Is it at the model layer? Is it at the application layer? And if you look at the model layer, it’s very competitive. Every company has a frontier model. Some of the frontier models can do things other frontier models can’t. But by and large, they’re all really very good. And if you’re an enterprise, you can swap them out very easily. And beyond the frontier, all the models that are answering the bulk of the questions are distilled. They’re very competitive. And so this isn’t a very good business to be in when you consider the cost of training the models.

So what’s the point of training models in the first place? It’s to give you an option. It’s to give the frontier labs an option on the valuable places in the application layer that are coming up. So, you know, ChatGPT, that’s a great business, right? There’s a lot of competition over that. I think probably it’s too late to replace ChatGPT. Maybe not. You’d have to do something very different. Coding is another place where all the frontier labs are focused right now. I think you can compete with the frontier labs, but you want to do something that’s different, something that involves more than just, you know, you talking to your computer, you doing some sort of personal productivity task on your computer. Something that involves other people, something that involves an enterprise. I think that the moats that you have for your business are going to be the same moats they always were. There’ll be network effects, brand, economies of scale. And so you want to find an agent that allows you to have those network effects, not just something that would be high-priced out in the world.

Interviewer: Are there particular domains that you think are maybe outside of the scope of what Frontier Labs want to innovate in and build in, that you think are interesting and that you’ve been mulling about? We’ve got scientists, lawyers, research analysts, agentic software engineers. What other domains have you been thinking about?

Bob: So personally, I’m very interested in robotics because I think robotics is something, I wouldn’t actually say it’s off the roadmap of the frontier labs right now, but I think it’s something that’s far enough away that to me it feels like where AI was a few years ago. And so I think this is a very good time to be a company like Figure or a company like Physical Intelligence or, you know, to start a new robotics company, maybe not one that’s competing with those two, but somebody that’s doing, you know, something different, something on its own. I think it’s, you know, at the end stages of being a research challenge and, you know, a matter of, you know, months or a small number of years away from being commercialized. So I think that’s really fun.

Interviewer: Why now? What do you think has changed? Like OpenAI famously had a robotics effort for a long time. What do you think has changed?

Bob: Well, you know, so in between Palantir and OpenAI, I actually wanted to start a robotics company myself. And I got to the point of teaching a robot to play checkers from vision back in 2016.

Interviewer: Wow.

Bob: Yeah, it was very, very cool.

Interviewer: It could pick up the checkers pieces?

Bob: It could pick up the checkers pieces and it could move them to a different place on the board.

Interviewer: Nice.

Bob: And my conclusion from this was that it was very fun and super cool and really far away from any form of commercialization. And when we pursued robotics at OpenAI, we didn’t pursue it for commercial motives. It was really a demonstration of the power of machine learning, and some of the ideas we had there later played into large language models. But I think what’s really changed is that now that you have LLMs, you have this language interface to the robot so that now you can describe the tasks much more cheaply. And you have really strong vision encoders, you know, that are tied into that intelligence. So that gives the robots really a head start at doing generic forms of tasks. So we spent years solving one specific problem, teaching a robot to manipulate a Rubik’s cube, and now a company like, let’s say, Physical Intelligence can spend months solving a huge variety of problems like laundry folding and cardboard and packing egg crates. And that’s something that they can only have because they’re building on top of existing frontier models and the entire tech and research stack that we’ve built over the last 10 years.

Interviewer: Yeah. I’m going to go back to this point you had on where the value is. And I really liked your framing that the foundation models kind of have an option on whichever parts of the application stack they want to own. How much of the application market do you think the foundation models will win?

Bob: I think I would look at this in a slightly different direction, which is if you’re a startup, where is it safe to play? And where is it that you’re going to get steamrolled by the frontier labs? And so I think the areas that I think are safe to play in are areas where you have to understand something very deeply outside the model. And so I think a lot of enterprise really has this flavor. So for example, Palantir’s AIP actually really fits this, where it’s not a model company, but it’s something that sits outside the model that interacts with the rest of the business. There’s another company I’m an investor in and an advisor to, called Distill, that builds AI systems that allow a business to sort of extract the context from within the business, feed that to the models, and then use that to make decisions. And so, you know, these are things that the frontier labs don’t want to do. The frontier labs see business problems as, “How do I train a model to do something new?” Yeah. And if you look at all these enterprises, each one of those is a very small problem. It’s not worth OpenAI’s or Anthropic’s time to train a model specifically for each one of them. If you flip the problem and you think about what is the system that goes around the models and how do I use the models to sort of ease the context in and get the outputs out, then suddenly that’s one problem. And I think it’s a big opportunity.
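As a purely hypothetical sketch of that "system around the model" pattern: pull context from sources the model has never seen, pack it into a prompt, and route the answer into a decision with an escalation path. `call_model`, `ContextItem`, and the policy snippets are invented for illustration; this is not Distill's or Palantir's actual design.

```python
# Hypothetical sketch of the "system around the model" pattern: the model is a commodity
# component, and the work is in extracting business context, feeding it in, and routing
# the output to a decision. call_model() is a placeholder, not a real vendor API.

from dataclasses import dataclass


@dataclass
class ContextItem:
    source: str   # e.g. "interview:ops-lead", "crm-export", "policy-doc"
    text: str


def call_model(prompt: str) -> str:
    """Placeholder for a frontier-model API call; returns a canned answer for the demo."""
    return "APPROVE: order fits the expedite policy; margin impact under threshold."


def build_prompt(question: str, context: list[ContextItem], budget_chars: int = 4000) -> str:
    """Pack context items into the prompt, in order, within a character budget."""
    lines, used = [], 0
    for item in context:
        chunk = f"[{item.source}] {item.text}"
        if used + len(chunk) > budget_chars:
            break
        lines.append(chunk)
        used += len(chunk)
    return ("Context:\n" + "\n".join(lines)
            + f"\n\nDecision needed: {question}\nAnswer APPROVE or ESCALATE with a reason.")


def decide(question: str, context: list[ContextItem]) -> tuple[str, str]:
    """Return (decision, rationale); anything that isn't a clear APPROVE goes to a person."""
    answer = call_model(build_prompt(question, context))
    decision = "APPROVE" if answer.upper().startswith("APPROVE") else "ESCALATE"
    return decision, answer


if __name__ == "__main__":
    ctx = [
        ContextItem("interview:ops-lead", "We expedite orders under $5k for repeat customers."),
        ContextItem("crm-export", "Customer #8812: 14 prior orders, no disputes."),
    ]
    print(decide("Should order #55120 ($3,200, customer #8812) be expedited?", ctx))
```

The model call is the most replaceable line in the file; the context plumbing and the escalation rule are where the enterprise-specific work lives.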

Interviewer: What are the specific use cases and problems that Distill and Palantir’s effort solve for those enterprise companies?

Bob: So a lot of times right now, what you see is you’re trying to automate some existing piece of work. And the easy cases are where that piece of work is in a regulated industry. And you’re working on something like healthcare. Maybe you’re interacting with insurance companies. And you have a workflow that is extremely scripted where the company cares a lot about fidelity to that workflow. And that doesn’t mean you can just say, “Hey, AI, go read the clinical guidelines and make these decisions.” But with a process of transformation, you can get it to the point where the AI can do that. And that’s sort of the low-hanging fruit.

And then the next level up, though, is that if you imagine you’re working on something that isn’t a regulated industry or that isn’t extremely scripted, and you want to automate some labor-intensive process, well, the first thing you have to do is make that legible. And if you go to someone and you ask them to describe their job, a lot of times, you know, their manager doesn’t know what they do. They don’t even really know what they do. They can give you examples, but they can’t say, like, “This is the workflow that I follow,” because in practice, they don’t follow a single workflow. And so I think that is what a lot of these problems look like. And, for example, that’s actually what Distill does: work with companies, help them take the data they have, interview the people with AI, systematize all of that, and have it be something that an AI model can actually execute.

Interviewer: That’s really interesting. Huh. So this is also somewhat related to this other question I wanted to ask you about proprietary data. I was surprised to actually see you tweet this, but I’m very intrigued by this question that you posed, which was, how valuable will your proprietary data be compared to what your competitors’ infinitely smart, infinitely patient agents can estimate from public data? Can you unpack that for us a little bit?

Bob: Yeah. So, you know, a starting point for this is a few years ago, there was a lot of interest in training industry vertical-specific models, you know, that finance companies would say, “We’ve got all of this data that no one else has. And we’re going to train a finance model on top of GPTs or on top of Llama. And it’s going to be so much better.” And actually, all of those were worse than the next generation of GPT. Because the power of intelligence and the ability to synthesize new information was bigger than the power of sort of memorizing the old information that you have.

So that’s, I think, what this theme looked like a couple of years ago. But, you know, fast forward a year or two. You know, now the story is I have all of this proprietary data. I’ve accumulated it over years. And in some sense, for a lot of times, if that data is teaching the model a skill or if it’s meant to teach the model a skill, that data is sort of embodied labor. Someone worked through all these case studies. Someone called all of these customers and found out all of this information. Well, that embodied labor is now free. AI can do all those things. And so now there’s an opportunity. You can have AI call all those customers, do a big survey, find out what they know. You can have AI work through all the case studies, just a lot of chats with o3, right? And then now you’ve replicated that proprietary data, but without needing all of that work.

Interviewer: How do you square that with the value of real-world proprietary data? Say something like what Cursor gets from its developer community constantly or Tesla Autopilot over the last handful of years.

Bob: So I think those are in the middle because they’re really huge amounts of data. I think there are challenges sometimes to training on the data that you get from your users. A lot of times, you know, one problem is that if you train on data and the model memorizes data about a specific person, maybe that leaks out into the next person. So that’s a real challenge to using these kinds of proprietary data. I think there is a kind of real-world proprietary data that’s very useful, which is very specific data about very specific customers that they trust you to use on their behalf. So to give an example, my financial advisor knows a lot about me. She knows my entire portfolio, the kinds of objectives that I have, my risk tolerance, right? And she uses all of that information to give me a better outcome, which is what is the next asset I should buy. And she doesn’t do that… like the data doesn’t make her a better financial advisor. It doesn’t teach her skills, but it allows her an opportunity to use the skills she already has. And so that’s the place where I think proprietary data is really useful.

Interviewer: I want to switch gears a bit to coding. It feels like, you know, software engineering has just gone through this fast takeoff moment. And, you know, just judging from the pace of how quickly things are changing, you know, I think there’s at least a certain subset of the market that thinks, you know, the superintelligence takeoff probability is a lot higher than it used to be, than folks thought it was, just given how quickly coding has taken off. What’s your view of what’s happened in the coding space?

Bob: So I think, you know, on the one hand, coding has taken off very quickly. On the other hand, way back in January 2020, as soon as we saw GPT-3, we launched a project to train GPT-3 how to code. And so when you look at an exponential curve, the progress is actually the same the whole time. But the impact of that progress can become very nonlinear when it passes a threshold. And that’s what’s happened with coding in the last couple of years. And so my take on where coding will go is that you’re going to continue to see a mix of coding with the user in an IDE, you know, traditional Cursor-style work, and coding in the background as an agent, you know, something like Devin-style work. And, you know, this is going to continue for a long time. That is, you know, a year or two, which maybe is a long time in AI adoption.

Interviewer: That’s forever in AI years.

Bob: But, you know, think about something like “vibe-coding,” right? The story you hear with vibe-coding is that if you have a PM and they want to create a demonstration project, I think you’re going to see PMs vibe-coding really cool prototypes, really cool demos that they can use to get user feedback. But then those things are going to get thrown away and they’re going to get rebuilt with professional software engineers. Because if you are given a code base that you don’t understand, this is a classic software engineering question: is that a liability or is it an asset? And the classic answer is that it’s a liability. You have to maintain this thing. You don’t know how it works. No one knows how it works. The answer is it’s actually cheaper to rewrite it from scratch. And so we don’t yet have a way that we’re comfortable with agents being the ones that understand the code base right now. I think the liability has gone down, but it’s still net a liability. You need humans to do the design, to understand the code base at a high level so that when something breaks, when the project itself becomes too complicated for the AI to understand, you can have a human do a problem decomposition and break it down into problems that are small enough for the AI.

Interviewer: What do you think happens after that one or two years, though?

Bob: Oh, I don’t know. We’re going to have to find out.

Interviewer: I love your bifurcation, though, of, you know, on one side, agentic software engineers that handle tasks autonomously in the background, and on the other side, human programmers who code in an IDE with the help of AI. I don’t think that most of the mainstream actually believe that, realize that. Can you maybe unpack that for us a little bit? What would the agentic software engineers who handle these tasks autonomously handle? And then where do you see this other end of the spectrum go? Do they collide at some point? Do you think they remain separate things over time in the long term?

Bob: I think it is already a spectrum where, you know, the things that your agentic software engineers can do, you can say, “Well, fix a bug, do a refactor,” something that you know requires relatively little taste and has a clear outcome. Another great use case I’ve heard is, “Translate software from COBOL into Python.” Right? It’s very clear when you’ve done this correctly, but it’s a lot of work, it’s very boring, and you can’t get smart people who want to work on this and do a good job on it. On the flip side, there is work that requires a lot of taste, taste in how it’s implemented, where there will be non-obvious consequences to how the implementation works. Maybe there are non-obvious performance consequences. Maybe there are non-obvious consequences in how the user interface is going to evolve, and therefore how that needs to change the abstractions deeper in the system. Those are places where right now we have no alternative but to have humans do that work. And I do think this is very interesting. Is there a way, you know, is there a sufficiently detailed spec or a sufficiently detailed architecture diagram that the agents can be writing for us, so that when you take work from one agent and you put it into another agent, which could just be the same agent the next day with a different context window, it’s able to actually make progress on the code base? So these are the kinds of questions I want to see the answers to over the next couple of years.
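One way to picture that handoff question: persist a spec alongside the code so that "the same agent the next day with a different context window" can reload it and continue. A hypothetical sketch with a stubbed agent; the file name, fields, and COBOL example tasks are invented, and this is not any lab's or product's actual protocol.

```python
# Hypothetical sketch of handing work between coding agents via a persistent spec file.
# Whatever an agent learns or decides gets written back into the spec, so a fresh context
# window (tomorrow's run, or a different agent) can pick up where it left off.
# run_agent() is a stub standing in for a real coding agent.

import json
from pathlib import Path

SPEC_PATH = Path("agent_spec.json")


def load_spec() -> dict:
    if SPEC_PATH.exists():
        return json.loads(SPEC_PATH.read_text())
    # Initial spec: goal, constraints the humans care about, and an empty work log.
    return {
        "goal": "Translate the billing module from COBOL to Python",
        "constraints": ["outputs must match the COBOL binary on the regression fixtures"],
        "done": [],
        "next": ["inventory COBOL programs", "port copybook record layouts", "port BILL-CALC"],
        "decisions": [],
    }


def run_agent(spec: dict) -> dict:
    """Stub agent: 'does' the next task and records what a future agent must know."""
    if not spec["next"]:
        return spec
    task = spec["next"].pop(0)
    spec["done"].append(task)
    spec["decisions"].append(f"{task}: kept field names from the COBOL copybooks for traceability")
    return spec


def handoff_cycle(runs: int = 3) -> None:
    for _ in range(runs):
        spec = load_spec()          # fresh "context window": only the spec survives
        spec = run_agent(spec)
        SPEC_PATH.write_text(json.dumps(spec, indent=2))


if __name__ == "__main__":
    handoff_cycle()
    print(json.dumps(load_spec(), indent=2))
```

Whether a file like this can carry enough taste to survive the non-obvious consequences Bob describes is exactly the open question.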

Interviewer: I love it. It’s exactly what we’re working on at Reflection.

Bob: Perfect.

Interviewer: Why are they called “Member of the Technical Staff?”

Bob: That’s a great, yeah, that’s a great question. So for a long time, this was true even before I joined OpenAI, by the way. I believe this was Greg Brockman’s idea. But we really wanted there not to be a distinction between engineers and researchers. If you look at a classic lab, a place like Google Brain, for example, where a lot of the people who started OpenAI came from, at the time and maybe still today there was a big differentiation between whether you had a PhD and you were a researcher or whether you were a software engineer and you did data, you did implementation. And it was bad because the researchers didn’t feel like they could get their hands dirty writing data code or writing implementation code. And you can’t understand the systems aspects of your research unless you’re writing the code.

If you think about what makes Alec Radford the genius researcher that he is, it’s that each time he does something, it’s that he looked very closely at the data. And he thought, “What are the possibilities of this data?” He wrote his own data scraping code from the very beginning. And so if you want to have someone who really understands the full stack, I think Paul Graham has this great analogy to painting where the resistance of the medium dictates the kind of painting that you’re able to make. Research is very much like that. It’s very much an artistic endeavor, and researchers themselves are artists and should act like artists. And so by not having that distinction, just by calling everyone Member of the Technical Staff, we were able to have a much more level playing field. And later that really came in handy when we had people who didn’t have PhDs. Many of the great researchers at OpenAI—you know, Aditya Ramesh, Alec Radford—you know, many of these people don’t have PhDs, and in fact learned their trade by working at OpenAI.

Interviewer: That’s a great answer. It’s a random throwaway question. I love that answer. So at a recent event, Sam Altman left us with some interesting fodder, which was how different generations use ChatGPT. He said if you’re old, you tend to use it as a Google replacement. If you’re in your 20s and 30s, you use ChatGPT as a life coach or a life advisor. And if you’re in high school or younger, then you’re using it as your operating system. How do you see people use ChatGPT around you? How do you have your kids use ChatGPT?

Bob: Yeah. So look, think about that operating system comment for a second. At the very highest level, the total addressable market for ChatGPT is every user intent that requires thought or action that you don’t want to do yourself. Anything that you wish got done, but you didn’t have to do is something that you might want to use AI for. And so there’s, I mean, if you think about that, there’s a version of that that feels very scary, right? It’s like people don’t do anything for themselves anymore. There’s a de-skilling. No one learns how to do hard things. We’re all just zombies watching our VR headsets, you know, watching movies.

But I don’t think that’s actually what people want out of AI. And I don’t just mean that this isn’t the world we want to live in; I think that’s true. But it’s also not what I want out of my relationship with AI, and it’s not what I see people doing now. And partly, this is because the technology for ChatGPT as an operating system isn’t actually there yet. Pretty famously, you cannot use ChatGPT to control your iPhone. But it’s also not what people want. And so I see this with my son. He’s eight years old. He’s been using ChatGPT from a pretty young age. I used to ask him to test the models before they were released. And he always gave pretty good feedback, actually. And he spends a lot of time with ChatGPT. He knows it is not his friend. It is not his companion. It is an expert, someone he can talk to, to explain things to him. And if you are eight years old, having someone who can explain things to you correctly, in great detail, and with a lot of patience is a very valuable thing.

And so he has curiosity, he has enthusiasms. And one day he decided he wanted to be a coin collector. And so he collected all the coins in the house, sorted through all the ones that were from before 1970, went to ChatGPT, started typing and just asked, took pictures and just asked questions about every single one of the coins from before 1970. And he’s, you know, “What’s this worth? Well, what would make this worth more? You know, how can I test what… What is a mint mark?” You know, all these different questions. And if you think about this, this is something, you know, when I was a kid, I probably could have learned this. Like maybe there were books, there were magazines, maybe I could have looked at an encyclopedia. But all of this is just so accessible now. And it’s accessible to an eight-year-old. And so when we went on vacation, we took him to a coin shop. And the staff at the coin shop were just shocked at how much this eight-year-old knew. And the very detailed questions, he was saying, “Show me all your coins. No, I don’t want that one. I want one that has a San Francisco mint mark. I want one from this year. This is the year that they were all made out of silver.” And the coin shop owner was just very surprised. He doesn’t deal with kids that have that level of detail, at least not until now. And so this is, I think, actually what we want out of AI is that AI should make you an expert at the things you want to do. And it should remove the burden of doing the things, the boring things that you don’t want to have to do.

Interviewer: Yeah. On the topic of the next generation, how else are you preparing that next generation for all the capabilities to come in AI?

Bob: I think this is a super, super tough question. If you think about any particular field, you know, should you teach your son how to code, right? Like I think about my eight-year-old, you know, my daughter is writing essays. My eldest son is really excited about math. All of those things are going to be automated. And so it’s clearly not some specific skill that you have to teach them. I think there are really two things that I want my kids to understand. The first is the process of learning and figuring things out. So that’s the value in the math and the essay writing and the coding. It’s sort of this process of being able to do, of learning to learn. The second thing is having ideas and projects and the belief that you can do it and the ability to use whatever tools are at your disposal to figure it out. So this is agency, right? And so that’s where I think that’s the right way to have kids use AI right now.

And there’s always a trade-off. I’m often very torn between… So my eight-year-old uses ChatGPT for a lot of things, but I don’t let him use it to code because he’s trying to learn to code. And if he sees that he can just use it to code, then it’s going to be very hard for him to do that work to get all the way there. I don’t let my other kids use it to do their school assignments, of course. Why would you do that? But I want them to have those basics. And then once they have the basics, once they understand things one level down, to then be able to use it to extend their capabilities.

And here’s another fun story about my eight-year-old. Last week, he decided he wanted to build a project where the grandparents who are coming to visit could push a button and it would ring a buzzer in a different room and he could go bring them breakfast in bed. And he, uh, asked ChatGPT for help. It said, “Okay, you need jumper wires, you need, you know, two Arduino boards,” and, you know, just gave a sort of list of things. And he asked a lot of questions, you know, “How is this going to work?” He asked it to give us a list of Amazon links for us to buy. I reviewed this, made sure he wouldn’t get electrocuted, and bought the items on Amazon for him. And now we’re putting it together. And my approach in this is I’m going to let him put together everything he can. I’m going to install the software because, you know, his computer’s locked down. He can’t install software. And this is going to be his project.

Interviewer: That’s amazing.

Bob: Who could have done that? None of us could have done that at eight years old. And he has learned so much in doing this. It’s not just that he outsourced it all to ChatGPT. Now he understands what Arduino is. He understands what the circuit board is. What happens when I hit this pin? Why is this pin named GPIO1? These are all things, I mean, I don’t know the answers to these things either. So it’s really just a huge help that ChatGPT is able to do all these things.

Interviewer: That’s amazing. Sparking curiosity and then agency. I love it.

Bob: And it’s also just the time to impact. And that just feeds more and more curiosity and agency.

Interviewer: Yeah, that’s right.

Bob: I mean, if you think back, you know, “Well, you want to do this project. Well, here’s a book on Arduino. And, you know, you’re going to have to write the code yourself. And, you know, what circuit boards am I supposed to use? I don’t even know how to do that.” You know, probably this project just dies on the vine. And, you know, there’s a truism in education theory that when someone asks a question, that’s the time when they’re ready to learn the thing that they’re asking the question about. And so you want to, you know, it’s worth going off script to answer someone’s question because you’re doing a huge service to them in teaching them that thing right then. And that’s, you know, now you have that. You have the ability to get your questions answered on demand at the right time for you when you are mentally ready to do it. Not when maybe you’re tired and you’re in school and you’re thinking about all sorts of other things. Just right then when you actually want to know the answer. And I think that’s hugely powerful.

Interviewer: So how else are you using AI in your daily life? ChatGPT, deep research, I’m sure, Howie AI for scheduling, maybe autopilot. What else?

Bob: So, yeah. So I pretty much exclusively use o3 at this point. Once you use a good model, I think it’s very hard to go back. I think I could probably use Gemini 1.5. I hear it’s really good. But of course, as we’ve talked about, if it’s good enough, why switch? And I use deep research about five times a week. And, you know, it’s hugely helpful. And I think even one time that it saves you a few hours of doing work sort of repays the cost.

Interviewer: Absolutely makes sense. What do you use deep research for?

Bob: It’s a mix. One answer is I’m batting around something with my kids. And it’s a question that no one has ever asked before, probably. And I want to know the answer. For example, what happens when you compress wood? You know, it starts off, it’s elastic compression, and then it starts deforming. And then you go a little further, and it becomes diamond. And then you go a little further from that, and it becomes a black hole. But actually, there are like a dozen steps. And so that’s a really fun topic to dive into. And just, you know, this is the kind of thing that would have been an XKCD comic 15 years ago and would have taken him weeks to figure out. And now you can get an answer just in a few seconds. Also, I use it when I’m thinking about a new domain or a new startup opportunity. Well, if I was interested in robotics, tell me everything there is to know about a particular company or about a particular market.

Interviewer: That’s our daily life.

Bob: Yeah. Yeah, yeah, yeah.

Interviewer: Any other new products?

Bob: Well, like you mentioned, I use an AI assistant for scheduling, which is great. I mean, I’m solo right now. I could hire an assistant, but it’s just actually more fun to do things myself. But calendaring, it’s really boring. And it’s just very nice and very pleasant to be able to CC an AI agent and have it do the calendaring for me.

Interviewer: I’d love to hear a little bit about managing OpenAI. The research org, it’s such a collection of insanely smart individuals, creative, I’m sure. And, you know, the feedback we have on you is exceptional in terms of, you know, what a fair and what a great manager and leader you’ve been for the organization. I guess what have been some of your lessons leading an organization like that?

Bob: So this sounds sort of boring, but like the core thing that you have to do as a manager is you have to really care about the people you’re managing. And this maybe isn’t relevant a lot of the time. A lot of the time as a manager, your day-to-day job, you’re coordinating, you’re helping people understand things, and loyalty doesn’t really matter that much. But there comes a time as a manager when you have to ask someone to do something hard. Early on in your career, this is when you have to ask someone to come in and work on Sunday when they’d rather be playing basketball. But later in their career, it’s working with someone and you have to ask them to give up a project they really care about and give it to someone else, or share credit for a research breakthrough that they know they could get to by themselves, but that a team of people—not just this one talented person, but two very talented people or three very talented people working together—could get done even faster.

And one thing I learned from working with Alex Karp at Palantir is that very talented people have superpowers, but they also have debilitating weaknesses. And for people who are at the very edge of these capabilities, they often don’t even understand what their weaknesses are. But it’s extremely apparent to everyone around them. And for me as a manager, it’s something that I could see very easily. And at this level of capability, when people fail, it’s almost always a form of self-destruction.

Interviewer: Wow.

Bob: That there’s a choice that they could have made. And I don’t mean little failures. I don’t mean like, “Oh, I had a bad day.” I mean, when someone makes a career-altering choice in a bad way, it’s almost always a matter of self-destruction. Because they had to do something that was very difficult for them. They had to confront something that was extremely scary for them to do, something that to everyone else is kind of obviously the right answer. It’s obviously the right thing for the company. But it’s emotionally extremely hard for them. And going back to being a manager: if people know that you’re in it for yourself, when you tell them to do something, they won’t trust you. But if they know that you are doing what’s best for them, then when you tell them to do that thing that is super hard and extremely scary for them, sometimes you can help them across the chasm. And you can solve the problem and prevent them from doing something really stupid and end up with something that works out really well. And I hold this bar even for firing people. For me, whenever I am talking to someone, I have to be talking to them, giving them advice, helping them do the thing that is best for them and for the company. Even if you’re firing someone, if they’re not going to succeed in this role and I have invested enough time to be sure that they won’t succeed in this role, then it is in their own best interests for me to tell them that they’re not succeeding and give them the opportunity to find somewhere else. Loyalty in the end is the thing that I think unlocks all of the other things that you want in management.

Interviewer: I really, really love that. There was a nuance there that you said in the middle around working with a ton of high-performing individuals who are really excited about a particular research direction that they want to break through. They know they can get there, potentially by themselves, essentially with one or two others. They all have a good dose of confidence, maybe sometimes ego. How do you actually get them or convince them to embrace that effort of working together to get there?

Bob: Yeah, it’s very hard. And I think this is actually one of the things that’s very different about a research lab from an engineering culture. Because in an engineering culture, it’s sort of an assumption that we’re all working together, we’re all building one product. But research often comes out of academia, which has this very negative culture of, you know, it’s a PI, you know, it’s his team, who’s going to be the first author, who’s going to be the last author, none of the other people in the middle matter. And we struggled with this a lot. And I don’t think there is any one answer. One thing we tried, which worked well for a time, we published some papers where we actually had OpenAI be the first author so that there wouldn’t be the fight over who’s the first author. That was, you know, one technique. It didn’t always, you know, we couldn’t always do that. It didn’t always make sense. But, you know, in the end, the key is really when you work with people, you understand there’s something they want and you have to find a way to give them the thing they want. And let them do the thing they want to do, the art that they’re trying to create, while also letting all the other people do that and having it all add up to one big whole and just spend time over and over again, making sure you’re solving that problem.

Interviewer: Security, I know, is an interesting topic to you. In an increasingly agentic world, what kinds of security issues do you think we should be aware of? And where do you see potential opportunities?

Bob: When I think about how AI impacts security, for me, the first-order effect is that it is much easier to do offensive work than it was previously. And so the number of threats has gone up. The time to execute on a threat has gone down. And so that then pushes the defense to be much more agentic. So there’s a company I’m an investor in, it’s called Outtake. I met the team. They’re a group of ex-Palantir folks. And we also ended up using them very successfully at OpenAI. And what they’ve done is that they have made an agentic stack for doing cybersecurity that uses very little human input. And I think, you know, right now we’re at a place where the models can actually do all of these things. You know, if there’s something that a human could do that’s sort of one of these bulk operations, if you can’t make the model do it, that’s your fault. It’s not the model’s fault. But the barrier then is that businesses and organizations aren’t set up to do this. They have to go change their business processes in order to make this happen. And so I think that’s an opportunity for startups, similar to, you know, as big as the shift from web to mobile: disrupting the existing businesses, because it may be faster for you to replicate their technology and their distribution than for them to be able to change the way they operate to get rid of or reduce the number of humans they need.
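As a loose illustration of what "agentic" defense over bulk operations can look like: a model (stubbed below) classifies each alert, routine ones are closed automatically, and only the ambiguous or risky ones reach a human queue. All names and rules are invented; this is not Outtake's product or any vendor's real pipeline.

```python
# Hypothetical sketch of agentic security triage for bulk operations: a stubbed model
# enriches and classifies each alert, routine ones are closed automatically, and only
# the ambiguous or high-risk ones reach a human review queue.

from dataclasses import dataclass


@dataclass
class Alert:
    alert_id: str
    summary: str


def classify_with_model(alert: Alert) -> tuple[str, str]:
    """Stub for a model call; returns (verdict, rationale). Verdicts: benign / suspicious."""
    if "impossible travel" in alert.summary or "credential" in alert.summary:
        return "suspicious", "sign-in pattern inconsistent with the user's history"
    return "benign", "matches known-good automation"


def triage(alerts: list[Alert]) -> dict[str, list[str]]:
    queues = {"auto_closed": [], "human_review": []}
    for alert in alerts:
        verdict, rationale = classify_with_model(alert)
        if verdict == "benign":
            queues["auto_closed"].append(f"{alert.alert_id}: {rationale}")
        else:
            queues["human_review"].append(f"{alert.alert_id} ({verdict}): {rationale}")
    return queues


if __name__ == "__main__":
    sample = [
        Alert("A-1001", "scheduled backup job logged in from build server"),
        Alert("A-1002", "impossible travel: same credential used from two continents in an hour"),
    ]
    for queue, items in triage(sample).items():
        print(queue, items, sep=": ")
```

The code is deliberately trivial; Bob's point is that the hard part is changing the surrounding business process, not the model call.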

Interviewer: Awesome. Bob, thank you so much for joining us. This has been a pleasure to have you here.


This page last updated: 18/Jun/2025. https://lifearchitect.ai/bob/