Grok

Alan’s monthly analysis, The Memo, advises the majority of Fortune 500s, informs government policy, and in Sep/2024 was used by Apple as their primary source for model sizes in their new model paper and viz. It is a Substack bestseller in 142 countries:
Get The Memo.


 

Summary

Organization: xAI
Model names: Grok-0 33B (Aug/2023); Grok-1 314B MoE (Nov/2023); Grok-2 (Aug/2024); Grok-3 (Dec/2024)
Internal/project name:
Model type: Multimodal (text, vision)
Parameter count: Grok-2 ≈ 600B parameters (Alan’s estimate)
Dataset size (tokens): Grok-2 ≈ 15T tokens (Alan’s estimate)
Training time (total): 25,000+ H100s… See working, with sources.
Release date (public): 2023-2024
Paper:
Playground: https://x.com/i/grok
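
The parameter and token counts above are Alan’s estimates, so any compute figure derived from them is indicative only. As a rough cross-check, here is a minimal back-of-envelope sketch applying the standard C ≈ 6·N·D training-compute approximation (treating the 600B estimate as a dense-equivalent parameter count) and converting the result to wall-clock time on the 25,000+ H100s listed above. The per-GPU peak throughput is the published H100 SXM dense BF16 spec; the utilisation figure is an assumption, not an xAI disclosure.

```python
# Back-of-envelope training-compute sketch for Grok-2, using the ESTIMATES above.
# C ≈ 6·N·D is the standard dense-training approximation; all hardware numbers
# below are assumptions for illustration, not xAI figures.

N_PARAMS = 600e9           # assumed parameter count (Alan's estimate)
D_TOKENS = 15e12           # assumed training tokens (Alan's estimate)

train_flops = 6 * N_PARAMS * D_TOKENS          # ≈ 5.4e25 FLOPs

H100_PEAK_FLOPS = 989e12   # H100 SXM dense BF16 peak, ~989 TFLOPS (published spec)
MFU = 0.35                 # assumed model FLOPs utilisation (~30-40% is typical at scale)
GPU_COUNT = 25_000         # "25,000+ H100s" from the summary above

effective_flops_per_s = GPU_COUNT * H100_PEAK_FLOPS * MFU
train_seconds = train_flops / effective_flops_per_s

print(f"Total training compute: {train_flops:.2e} FLOPs")
print(f"Wall-clock estimate on {GPU_COUNT:,} H100s at {MFU:.0%} MFU: "
      f"{train_seconds / 86400:.0f} days")
```

Under these assumptions the run comes out to roughly 5e25 FLOPs and on the order of 70 days; changing the utilisation or GPU count shifts the wall-clock figure proportionally.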
   

2024 optimal LLM highlights

Download source (PDF)
Permissions: Yes, you can use these visualizations anywhere, please leave the citation intact.

Grok Updates

14/Aug/2024: Grok-2 achieves an MMLU-Pro score of 75.5 (SOTA). For comparison, Claude 3.5 Sonnet scores 72.83 on MMLU-Pro.

Aug/2024: “Grok-2 has been tested on the LMSYS leaderboard under the name “sus-column-r.” At the time of this blog post, it is outperforming both Claude 3.5 Sonnet and GPT-4-Turbo.” [Alan: the word ‘grok’ was coined by Heinlein, and Sixth Column is also by Heinlein: https://en.wikipedia.org/wiki/Sixth_Column]

22/Jul/2024: 100,000 H100s. “xAI team, X team, Nvidia & supporting companies getting Memphis Supercluster training started at ~4:20am local time [today, 22/Jul/2024]. With 100k liquid-cooled H100s on a single RDMA fabric, it’s the most powerful AI training cluster in the world!” & “training the world’s most powerful AI [Grok-3] by every metric by December this year [2024]” (Tweet)
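
For a sense of scale, the sketch below multiplies the 100,000 GPUs quoted above by the published H100 SXM dense BF16 peak (~989 TFLOPS per GPU); the utilisation figure is an assumption for illustration only, not an xAI disclosure.

```python
# Rough scale of the 100,000-H100 Memphis Supercluster described above.
# Per-GPU peak throughput is the published H100 SXM dense BF16 spec; the
# utilisation figure is an assumption for illustration.

H100_PEAK_FLOPS = 989e12      # dense BF16, ~989 TFLOPS per GPU
CLUSTER_GPUS = 100_000
MFU = 0.35                    # assumed model FLOPs utilisation

peak_cluster_flops = CLUSTER_GPUS * H100_PEAK_FLOPS       # ≈ 9.9e19 FLOP/s
monthly_compute = peak_cluster_flops * MFU * 30 * 86400   # FLOPs per 30 days at 35% MFU

print(f"Aggregate peak BF16 throughput: {peak_cluster_flops:.2e} FLOP/s")
print(f"Compute available per 30 days at {MFU:.0%} MFU: {monthly_compute:.2e} FLOPs")
```

At roughly 1e20 FLOP/s of peak BF16 throughput, this is about four times the 25,000-GPU figure in the summary table above.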

22/Mar/2023: ‘Pause AI training so xAI can catch up!’ Letter signed by xAI founder and CEO: “we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.” (FLI)

9/Mar/2023: xAI founded with 12 founding members + Musk:

  1. Igor Babuschkin: ex-DeepMind, OpenAI, CERN.
  2. Manuel Kroiss: ex-DeepMind, Google.
  3. Dr Yuhuai (Tony) Wu: ex-Google, Stanford, University of Toronto, ex-intern at DeepMind, OpenAI.
  4. Dr Christian Szegedy: ex-Google (over a decade).
  5. Prof Jimmy Ba: ex-University of Toronto, advised by Prof Geoffrey Hinton.
  6. Toby Pohlen: ex-Google, Microsoft.
  7. Ross Nordeen: ex-Tesla AI.
  8. Kyle Kosic: ex-OpenAI, OnScale, Wells Fargo.
  9. Greg Yang: ex-Microsoft Research, Harvard.
  10. Dr Guodong Zhang: ex-DeepMind, ex-intern Google Brain, Microsoft Research.
  11. Dr Zihang Dai: ex-Google, Tsinghua.
  12. Dr Dan Hendrycks (advisor): UC Berkeley, Center for AI Safety, ex-intern DeepMind.

(— The Memo 17/Jul/2023)

Models Table

Summary of current models: View the full data (Google sheets)

Dataset

TBA

All dataset reports by LifeArchitect.ai (most recent at top):
Aug/2024: What's in GPT-5?
Jul/2024: Argonne National Laboratory AuroraGPT (page)
Sep/2023: Google DeepMind Gemini: A general specialist
Mar/2022: What's in my AI? (GPT-1, GPT-2, GPT-3, MT-NLG, Chinchilla...)

Timeline to Grok

9/Mar/2023: xAI founded.
22/Mar/2023: xAI founder and CEO disingenuously calls for pause on training all AI frontier models worldwide.
18/Aug/2023: (Five months after pause letter…) Grok-0 33B announced.
3/Nov/2023: (Seven months after pause letter…) Grok-1 314B announced.
7/Dec/2023: Grok-1 available on X.
17/Mar/2024: Grok-1 314B released on GitHub.
28/Mar/2024: Grok-1.5 314B announced.
12/Apr/2024: Grok-1.5V announced.
5/May/2024: Grok-1.5 314B available on X.
13/Aug/2024: Grok-2 released.
Dec/2024: Grok-3 expected.

AI Race

Download source (PDF)
Permissions: Yes, you can use these visualizations anywhere, please leave the citation intact.


AGI

Read more about Alan’s conservative countdown to AGI


Get The Memo

by Dr Alan D. Thompson · Be inside the lightning-fast AI revolution.
Bestseller. 10,000+ readers from 142 countries. Microsoft, Tesla, Google...
Artificial intelligence that matters, as it happens, in plain English.
Get The Memo.

Dr Alan D. Thompson is an AI expert and consultant, advising Fortune 500s and governments on post-2020 large language models. His work on artificial intelligence has been featured at NYU, with Microsoft AI and Google AI teams, at the University of Oxford’s 2021 debate on AI Ethics, and in the Leta AI (GPT-3) experiments viewed more than 4.5 million times. A contributor to the fields of human intelligence and peak performance, he has held positions as chairman for Mensa International, consultant to GE and Warner Bros, and memberships with the IEEE and IET. Technical highlights.

This page last updated: 16/Aug/2024. https://lifearchitect.ai/grok/