GPT-5

👋 Hi, I’m Alan. I advise government and enterprise on post-2020 AI like OpenAI’s upcoming GPT-5 and Google’s ongoing Pathways models. You definitely want to keep up with the AI revolution this year. My paid subscribers (DeepMind, Microsoft, Google, Stripe, Samsung…) receive bleeding-edge and exclusive insights on AI as it happens.
Get The Memo.

Summary

Organization: OpenAI
Model name: GPT-5
Internal/project name:
Model type: Multimodal
Parameter count: Estimate: 2T-5T (2,000B-5,000B)
Based on (see the sketch after this summary):
i. a 2.5× increase in GPU count (10,000 ➜ 25,000 NVIDIA A100 GPUs, with some H100s)
ii. a doubling of training time (~3-5 months ➜ ~6-10 months)
Dataset size (tokens): Estimate: 20T (20,000B)
Training data end date: Estimate: Dec/2022
Convergence date: Estimate: Dec/2023
Release date (public): Estimate: Mar/2024
Paper:
Playground:
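
As a rough sanity check, the sketch below reconstructs that parameter-count range from the two multipliers listed in the summary. Every input is this page’s own estimate or a clearly labelled assumption; in particular, the ~1T-parameter GPT-4 baseline is a widely repeated rumour, not a confirmed figure.

```python
# Back-of-envelope reconstruction of the 2T-5T parameter estimate above.
# All inputs are this page's estimates; nothing here is a confirmed OpenAI figure.

gpu_factor = 25_000 / 10_000   # 2.5x more GPUs (A100s, with some H100s)
time_factor = 8 / 4            # ~2x longer training (~3-5 months -> ~6-10 months, midpoints)
compute_factor = gpu_factor * time_factor
print(f"Rough training-compute multiplier vs GPT-4: ~{compute_factor:.0f}x")   # ~5x

# ASSUMPTION (not from OpenAI): GPT-4 at roughly 1T parameters, with parameter
# count growing somewhere between 2x and the full ~5x compute multiplier.
gpt4_params_trillions = 1.0
low, high = gpt4_params_trillions * 2, gpt4_params_trillions * compute_factor
print(f"Implied GPT-5 range: ~{low:.0f}T to {high:.0f}T parameters")            # 2T-5T
```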

GPT-5 Updates

29/Mar/2023: ‘i have been told that gpt5 is scheduled to complete training this december and that openai expects it to achieve agi. which means we will all hotly debate as to whether it actually achieves agi. which means it will.’

23/Mar/2023: Microsoft paper on GPT-4 and early artificial general intelligence (‘Sparks of Artificial General Intelligence’): https://arxiv.org/abs/2303.12712

20/Mar/2023: OpenAI paper on GPT and employment: ‘We investigate the potential implications of Generative Pre-trained Transformer (GPT) models and related technologies on the U.S. labor market.’ https://arxiv.org/abs/2303.10130

13/Feb/2023: Morgan Stanley research note:

We think that GPT 5 is currently being trained on 25k GPUs – $225 mm or so of NVIDIA hardware…

The current version of the model, GPT-5, will be trained in the same facility—announced in 2020 [May/2020, Microsoft], the supercomputer designed specifically for OpenAI has 285k CPU cores, 10k GPU cards, and 400 Gb/s connectivity for each GPU server; our understanding is that there has been substantial expansion since then. From our conversation, GPT-5 is being trained on about 25k GPUs, mostly A100s, and it takes multiple months; that’s about $225m of NVIDIA hardware, but importantly this is not the only use, and many of the same GPUs were used to train GPT-3 and GPT-4…

We also would expect the number of large language models under development to remain relatively small. IF the training hardware for GPT-5 is $225m worth of NVIDIA hardware, that’s close to $1b of overall hardware investment; that isn’t something that will be undertaken lightly. We see large language models at a similar scale being developed at every hyperscaler, and at multiple startups.
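
The note’s headline numbers can be cross-checked with some quick arithmetic. In the sketch below, the per-GPU price and the GPU share of total spend are back-calculated from the quoted figures; they are not stated in the note itself.

```python
# Sanity-check of the Morgan Stanley figures quoted above.
# Derived values (price per GPU, GPU share of spend) are back-calculated, not stated in the note.

gpu_count = 25_000              # "GPT-5 is being trained on about 25k GPUs"
nvidia_hardware_usd = 225e6     # "$225 mm or so of NVIDIA hardware"
total_investment_usd = 1e9      # "close to $1b of overall hardware investment"

price_per_gpu = nvidia_hardware_usd / gpu_count
gpu_share = nvidia_hardware_usd / total_investment_usd

print(f"Implied price per GPU: ~${price_per_gpu:,.0f}")            # ~$9,000, plausible for an A100
print(f"GPU cards as share of total hardware spend: ~{gpu_share:.1%}")  # ~22.5%
```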

Datacenter location

Dataset

OpenAI President Greg Brockman (Oct/2022):

…there’s no human who’s been able to consume 40TB of text [≈20T tokens, probably trained to ≈1T parameters in line with Chinchilla scaling laws]
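
The bracketed conversion is straightforward to reproduce. The sketch below assumes roughly 2 bytes per token, which is the ratio implied by the 40TB ≈ 20T-token figure (common heuristics for English text are closer to 3-4 bytes per token, which would give a smaller count), and applies the Chinchilla heuristic of about 20 training tokens per parameter (Hoffmann et al., 2022).

```python
# Reproducing the bracketed note: 40TB of text -> ~20T tokens -> ~1T parameters.
# bytes_per_token = 2 is the ratio implied by the page's own figures; common
# heuristics for English text are closer to 3-4 bytes per token.

dataset_bytes = 40e12            # "40TB of text" (Greg Brockman, Oct/2022)
bytes_per_token = 2              # ASSUMPTION implied by the page's ~20T-token figure
tokens = dataset_bytes / bytes_per_token

# Chinchilla scaling heuristic (Hoffmann et al., 2022): ~20 training tokens per parameter.
tokens_per_parameter = 20
compute_optimal_params = tokens / tokens_per_parameter

print(f"Tokens: ~{tokens / 1e12:.0f}T")                                             # ~20T
print(f"Chinchilla-optimal size: ~{compute_optimal_params / 1e12:.0f}T parameters")  # ~1T
```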



Get The Memo

by Dr Alan D. Thompson · Be inside the lightning-fast AI revolution.
Thousands of paid subscribers. Readers from Microsoft, Tesla, Google AI...
Artificial intelligence that matters, as it happens, in plain English.
Get The Memo.

Dr Alan D. Thompson is an AI expert and consultant, advising Fortune 500s and governments on post-2020 large language models. His work on artificial intelligence has been featured at NYU, with Microsoft AI and Google AI teams, at the University of Oxford’s 2021 debate on AI Ethics, and in the Leta AI (GPT-3) experiments viewed more than 2.5 million times. A contributor to the fields of human intelligence and peak performance, he has held positions as chairman for Mensa International, consultant to GE and Warner Bros, and memberships with the IEEE and IET. He is open to consulting and advisory on major AI projects with intergovernmental organizations and enterprise.

This page last updated: 1/Apr/2023. https://lifearchitect.ai/gpt-5/