Models Table is the definitive LLM data reference trusted by MIT,¹ Harvard,² Apple,³ Microsoft,⁴ and more…†

Models Table Rankings

2026 frontier AI models + highlights

Download source (PDF)
Permissions: Yes, you can use these visualizations anywhere, please leave the citation intact.

Older bubbles viz

2025Q1

Download source (PDF)

Oct/2024

Download source (PDF)

Feb/2024

Nov/2023

Download source (PDF)

Mar/2023

Download source (PDF)

Apr/2022

Download source (PDF)

More (Chinese models, Data ratio, Output of the intelligence explosion)

Data dictionary

Model (Text)
Name of the large language model. Sometimes uses filename syntax.

Lab (Text)
Name of the organization or group responsible for training or publishing the model. Sometimes lists a consortium such as “International”. Color highlights popular lab names.

Playground (URI)
URI pointing to a playground for the model, or to the Hugging Face repository hosting its weights.

Parameters (B) (Float)
Total number of parameters (weights) in the model. Counts total weights for Dense models, and total weights (not just active weights) for MoE models.

Tokens trained (B) (Integer)
Total number of tokens (sub-words) used to train the model end-to-end, taking into account reported dataset, epochs, pretraining, and fine-tuning tokens.

Ratio Tokens:Params (Ratio)
Number of tokens trained per parameter. Chinchilla scaling ≥ 20:1. Color highlights RED=0–7, ORANGE=8–16, GREEN=17–499, DARK GREEN=500–9999.
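The ratio and its color bands can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, both inputs are assumed to be in billions (as in the two columns above), and fractional ratios are assumed to fall into the band of their integer part, since the source lists whole-number ranges only.

```python
def tokens_per_param(tokens_b: float, params_b: float) -> float:
    """Tokens trained per parameter; both inputs in billions."""
    if params_b <= 0:
        raise ValueError("params_b must be positive")
    return tokens_b / params_b


def ratio_band(ratio: float) -> str:
    """Map a Tokens:Params ratio to the highlight band in the data dictionary.

    Assumption: a fractional ratio belongs to the band of its integer part
    (for example, 7.9 is treated as RED).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    if ratio < 8:
        return "RED"         # 0-7
    if ratio < 17:
        return "ORANGE"      # 8-16
    if ratio < 500:
        return "GREEN"       # 17-499
    if ratio < 10000:
        return "DARK GREEN"  # 500-9999
    return "UNBANDED"        # above the ranges listed in the source


# Illustrative values only: 70B parameters trained on 1400B tokens.
example_ratio = tokens_per_param(tokens_b=1400, params_b=70)
print(example_ratio, ratio_band(example_ratio))
```

With these illustrative inputs the ratio is 20:1, which meets the Chinchilla threshold quoted above and lands in the GREEN band.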

ALScore (Float)
Quick and dirty rating of the model’s power. The formula is: √(Parameters × Tokens) ÷ 300, with both values in billions as in the columns above. Any ALScore ≥ 1.0 indicated a powerful model as of mid-2023. Color highlighting is centered on 15.
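As a rough sketch, the ALScore formula can be written as follows. It assumes that the division by 300 is applied after the square root and that both inputs are in billions, matching the Parameters (B) and Tokens trained (B) columns; the function name is hypothetical.

```python
import math


def alscore(params_b: float, tokens_b: float) -> float:
    """ALScore = sqrt(parameters x tokens) / 300, with inputs in billions.

    Assumption: the division by 300 happens outside the square root.
    """
    if params_b < 0 or tokens_b < 0:
        raise ValueError("inputs must be non-negative")
    return math.sqrt(params_b * tokens_b) / 300


# Illustrative values only: 300B parameters trained on 300B tokens.
print(alscore(300, 300))  # -> 1.0
```

Under this reading, a score of exactly 1.0 corresponds to the product of parameters and tokens (both in billions) reaching 90,000.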

MMLU (Float)
Benchmark score 0–100 on Massive Multitask Language Understanding, released Sep/2020: MMLU paper.

MMLU-Pro (Float)
Benchmark score 0–100 on Massive Multitask Language Understanding Pro, released Jun/2024: MMLU-Pro paper.

GPQA (Float)
Benchmark score 0–100 on Google-Proof Q&A, released Nov/2023: GPQA paper.

HLE (Float)
Benchmark score 0–100 on Humanity’s Last Exam, released Jan/2025: HLE paper.

Training dataset (Text)
Rough guide of major datasets used to train the model. Note increasing use of synthetic data from 2023.

Announced (Date)
Date as month/year. All data sorted by this column descending.

Public? (Symbol)
Ternary: GREEN=publicly accessible (weights, API, playground…), YELLOW=video or scripted demo only, RED=held in lab and never released.

Paper / Repo (URI)
URI pointing to official paper, technical note, or model card. Sometimes shows link to GitHub repository.

Arch (Text)
Architecture: Dense versus Mixture of Experts (MoE).

Tags (Dropdown)
Reasoning=Reasoning model (binary). SOTA=State-of-the-Art model at launch (binary).

Notes (Text)
Any further comments or useful highlights.
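For readers who load the sheet into code, the data dictionary above could be mirrored by a small record type. The following is a minimal sketch, not the sheet's actual schema: the class and attribute names are hypothetical, only a subset of columns is shown, and the Announced value is assumed to look like "Oct/2024" (English month abbreviation, slash, four-digit year).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple

# Assumption: "Announced" uses English three-letter month abbreviations.
MONTHS = {m: i for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


class Public(Enum):
    """The ternary "Public?" column."""
    GREEN = "publicly accessible (weights, API, playground)"
    YELLOW = "video or scripted demo only"
    RED = "held in lab and never released"


@dataclass
class ModelRow:
    """Hypothetical record for one row; attribute names are illustrative."""
    model: str
    lab: str
    announced: str                          # e.g. "Oct/2024"
    parameters_b: Optional[float] = None    # total parameters, in billions
    tokens_trained_b: Optional[int] = None  # training tokens, in billions
    public: Optional[Public] = None
    arch: Optional[str] = None              # "Dense" or "MoE"
    notes: str = ""


def announced_key(row: ModelRow) -> Tuple[int, int]:
    """Return (year, month) so rows can be ordered chronologically."""
    month, year = row.announced.split("/")
    return (int(year), MONTHS[month])


def sort_newest_first(rows: List[ModelRow]) -> List[ModelRow]:
    """Mirror the sheet's ordering: Announced column, descending."""
    return sorted(rows, key=announced_key, reverse=True)


# Placeholder rows, not taken from the table.
rows = [
    ModelRow("example-a", "Lab A", "Mar/2023", public=Public.GREEN),
    ModelRow("example-b", "Lab B", "Oct/2024", arch="MoE"),
    ModelRow("example-c", "Lab C", "Feb/2024", arch="Dense"),
]
print([r.model for r in sort_newest_first(rows)])
```

The month lookup table is used instead of locale-dependent date parsing so that the sort behaves the same on any system.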

List of models shown in the Models Table (text for indexing)

Models Table, list of 800+ large language models, as used by all major AI labs, including:

2026
Google DeepMind Gemma 4 12B, Microsoft Aion-1.0-Plan (14B), Microsoft Aion-1.0-Instruct (2B), Microsoft MAI-Code-1-Flash (30B), Microsoft MAI-Thinking-1 (1000B), Independent KeyLM-75M-Instruct, NVIDIA Cosmos 3 Super (64B), JetBrains Mellum2-12B-A2.5B-Thinking, Alibaba Qwen3.7-Plus (480B), NVIDIA Nemotron 3 Ultra (550B), MiniMax MiniMax-M3 (1000B), StepFun Step 3.7 Flash (198B), Liquid AI LFM2.5-8B-A1B, Anthropic Claude Opus 4.8 (5000B), Biohub ESMC 6B, OpenBMB MiniCPM5-1B, NVIDIA Gated DeltaNet-2 (1.3B), Cohere Command A+ (218B), Alibaba Qwen3.7-Max (2000B), Sapient Intelligence HRM-Text-1B, NVIDIA Nemotron-Labs-Diffusion-14B, Google DeepMind Gemini 3.5 Flash (500B), Zyphra ZAYA1-8B-Diffusion-Preview, Shanghai AI Laboratory/SenseTime Intern-S2-Preview (35B), Inclusion AI Ring-2.6-1T, Thinking Machines Lab TML-Interaction-Small (276B), Cactus-Compute Needle (0.026B), NVIDIA NVIDIA-Nemotron-Labs-3-Elastic-30B-A3B, Zyphra ZAYA1-74B-Preview, Subquadratic SubQ 1M-Preview, Multiverse Computing Llama 3.1 8B + CUA (quantum), Zyphra ZAYA1-8B, Genesis AI GENE-26.5 (30B), OpenAI GPT-5.5 Instant (300B), IBM MAMMAL (0.458B), Baidu ERNIE-5.1-Preview (800B), IBM Granite-4.1-30B, Mistral Mistral Medium 3.5 (128B), NVIDIA Nemotron 3 Nano Omni (30B), Poolside Laguna XS.2 (33B), Poolside Laguna M.1 (225B), DeepSeek-AI DeepSeek-V4-Pro (1600B), Independent talkie-1930-13b, Tencent Hy3 preview (295B), Inclusion AI Ling-2.6-1T, OpenAI GPT-5.5 (3000B), Independent Marul V7 (0.258B), Alibaba Qwen3.6-27B, Xiaomi MiMo-V2.5-Pro (1020B), Inclusion AI Ling-2.6-Flash (104B), IBM Granite-4.1-8B, Independent OpenMythos (0.77B), Moonshot AI Kimi K2.6 (1000B), Alibaba Qwen3.6-Max-Preview (2000B), Alibaba Qwen3.6-35B-A3B, xAI Grok 4.3 (500B), Anthropic Claude Opus 4.7 (5000B), OpenAI GPT-Rosalind (3000B), OpenAI GPT-5.4-Cyber (3000B), Alibaba Marco-Mini (17.3B), LG EXAONE 4.5 (33B), Meta AI Muse Spark (70B), TokenAI Horus 1.0 4B, PrismML Ternary Bonsai 8B, Anthropic Claude Mythos Preview (10000B), 
Alibaba Qwen3-4B (In-Place TTT), Google DeepMind Gemma 4 31B, Generalist GEN-1 (30B), Alibaba Qwen3.6-Plus (1000B), Arcee AI Trinity-Large-Thinking (400B), Independent Machina Mirabilis (3.3B), PrismML 1-bit Bonsai 8B, Liquid AI LFM2.5-350M, H Company Holo3-122B-A10B, Alibaba Qwen3.5-Omni-Plus (35B), Independent mr_chatterbox (0.34B), Z.AI GLM-5.1 (744B), NVIDIA Nemotron-Cascade-2-30B-A3B, Xiaomi MiMo-V2-Pro (1000B), UToronto MDM-Prime-v2 (1.1B), CMU Mamba-3 (1.5B), MiniMax MiniMax-M2.7 (230B), H Company Holotron-12B, Allen AI OLMo Hybrid (7B), Mistral Mistral Small 4 (119B), MiroMindAI MiroThinker-H1 (235B), 1Covenant Covenant-72B, NVIDIA Nemotron 3 Super (120B), Sarvam AI Sarvam 105B, MIT FINGERS-7B, OpenAI GPT-5.4 (3000B), YuanLabAI Yuan3.0-Ultra (1515B), OpenAI GPT-5.3 Instant (300B), Google STATIC (3B), Quiver Arrow 1.0 (32B), Alibaba Qwen3.5-27B, Liquid AI LFM2-24B-A2B, Inception Labs Mercury 2 (180B), Google DeepMind Gemini 3.1 Pro (3000B), Zyphra ZUNA (0.38B), xAI Grok 4.2 (500B), Prime Intellect INTELLECT-3.1 (106B), Anthropic Claude Sonnet 4.6 (1000B), Cohere Tiny Aya (3.35B), NVIDIA gpt-oss-puzzle-88B, Alibaba Qwen3.5-397B-A17B, JD Open Source JoyAI-LLM Flash (48B), Inclusion AI Ring-2.5-1T, Inclusion AI Ling-2.5-1T, MiniMax MiniMax-M2.5 (230B), Z.AI GLM-5 (744B), Nanbeige Nanbeige4.1-3B, Alibaba RynnBrain-30B-A3B, Anthropic Claude Opus 4.6 (5000B), Shanghai AI Laboratory/SenseTime Intern-S1-Pro (1000B), StepFun Step 3.5 Flash (196B), Independent Assistant_Pepe_8B, Arcee AI Trinity-Large (400B), Allen AI SERA (32B), Moonshot AI Kimi K2.5 (1000B), Z.AI GLM-4.7-Flash (30B), Google DeepMind MedGemma 1.5 4B, Microsoft FrogBoss (32B), NVIDIA EDEN (28B), Baichuan Baichuan-M3 (235B), DeepSeek-AI Engram (39.5B), Stanford SleepFM (0.091B), Independent TimeCapsuleLLM-v2-1800-1875 (1.2B), AI21 Jamba2 (52B), Liquid AI LFM2.5 (1.2B), MiroMindAI MiroThinker v1.5 (235B), TII Falcon-H1R (7B)

2025
DeepSeek-AI mHC 27B, MiniMax MiniMax-M2.1 (229B), IQuestLab IQuest-Coder-V1 (40B), SK Hynix A.X K1 (519B), LG K-EXAONE (236B), UZH Ranke-4B, Tencent WeDLM (8B), Upstage AI SOLAR Open (102B), Z.AI GLM-4.7 (355B), NVIDIA NitroGen (0.493B), Xiaomi MiMo-V2-Flash (309B), Google DeepMind FunctionGemma (0.27B), Google DeepMind T5Gemma 2 (4B), Google DeepMind Gemini 3 Flash (200B), NVIDIA Nemotron 3 Nano 30B-A3B, Allen AI Bolmo (7B), Consortium EuroLLM-22B, Inclusion AI LLaDA2.0 Flash (103B), OpenAI GPT-5.2 (3000B), ServiceNow Apriel-1.6-15B-Thinker, Motif-Technologies Motif 2 12.7B, Mistral Devstral 2 (123B), Nanbeige LLM Lab Nanbeige4-3B-Base, Tencent HY 2.0 (406B), MBZUAI K2-V2 (70B), Arcee AI Trinity-Mini (26B), Amazon Nova 2 Pro (200B), Mistral Mistral Large 3 (675B), DeepSeek-AI DeepSeek-V3.2-Speciale (685B), Zyphra ZAYA1-base (8.3B), DeepSeek-AI DeepSeek-Math-V2 (685B), NVIDIA Orchestrator-8B, Prime Intellect INTELLECT-3 (106B), Microsoft Fara-7B, Anthropic Claude Opus 4.5 (5000B), NVIDIA Nemotron Elastic (12B), Tencent GeoVista (7B), Allen AI OLMo 3 (32B), Google DeepMind Gemini 3 Pro (3000B), xAI Grok 4.1 (3000B), PleIAs Baguettotron (0.321B), Baidu ERNIE-5.0-Preview-1022 (2400B), OpenAI GPT-5.1 (3000B), NVIDIA TiDAR (8B), NVIDIA SONIC (0.04B), Tsinghua JustRL-Nemotron-1.5B, Baidu ERNIE-4.5-VL-28B-A3B-Thinking, Google DeepMind HOPE (1.3B), Moonshot AI Kimi K2 Thinking (1000B), Inclusion AI Ling-1T, Generalist GEN-0 (10B), ByteDance Ouro (2.6B), Wechat CALM (1.82B), Moonshot AI Kimi-Linear (48B), MiniMax MiniMax-M2 (230B), Cambridge/LBNL MACE-MH-1 (0.025B), DeepSeek-AI DeepSeek-OCR (3B), Microsoft UserLM-8b, Salesforce CoDA (1.7B), Samsung TRM (0.007B), IBM Granite-4.0 Small (32B), Z.AI GLM-4.6 (355B), Inclusion AI Ring-1T-preview, Anthropic Claude Sonnet 4.5 (1000B), Google DeepMind Gemini Robotics 1.5 (200B), Google DeepMind Gemini Robotics-ER 1.5 (30B), Google TimesFM-ICF (0.2B), Alibaba Qwen3-Max (1000B), Alibaba Qwen3-Omni (30B), DeepSeek-AI 
DeepSeek-V3.1-Terminus (685B), Perceptron Isaac 0.1 (2B), xAI Grok 4 Fast (3000B), Google DeepMind VaultGemma (1B), Alibaba Qwen3-Next-80B-A3B, MBZUAI K2-Think (32B), JHU mmBERT (0.307B), Baidu ERNIE X1.1 (424B), Baidu ERNIE-4.5-21B-A3B-Thinking, Kuaishou Klear-46B-A2.5B, Tilde AI TildeOpen-30b, Alibaba Qwen3-Max-Preview (1000B), Moonshot AI Kimi K2-Instruct-0905 (1000B), ETH Zürich Apertus (70B), Meituan LongCat-Flash (560B), Baichuan Baichuan-M2 (32B), Microsoft MAI-1-preview (500B), xAI grok-code-fast-1 (800B), Nous Research Hermes 4 (405B), NVIDIA Jet-Nemotron-4B, DeepSeek-AI DeepSeek-V3.1-Base (685B), NVIDIA Nemotron Nano 2 (12.31B), Google DeepMind Gemma 3 270M, OpenAI GPT-5 (3000B), OpenAI gpt-oss-120b, OpenAI gpt-oss-20b, Anthropic Claude Opus 4.1 (5000B), Z.AI GLM-4.5 (355B), China Telecom Artificial Intelligence Research Institute T1 (115B), Shanghai AI Laboratory/SenseTime Intern-S1 (235B), StepFun Step 3 (321B), Alibaba Qwen3-235B-A22B-Thinking-2507, Kuaishou KAT-V1-200B, Kuaishou KAT-V1-40B, Alibaba Qwen3-Coder-480B-A35B-Instruct, Alibaba Qwen3-235B-A22B-Instruct-2507, Allen AI FlexOlmo (37B), LG EXAONE 4.0 (32B), Moonshot AI Kimi K2 (1000B), Reka AI Reka Flash 3.1 (21B), Mistral Devstral Medium (50B), xAI Grok 4 (3000B), Microsoft Phi-4-mini-flash-reasoning (3.8B), Google DeepMind T5Gemma (9B), Google DeepMind MedGemma 1 27B, TNG R1T2 Chimera (685B), Consortium Spectra 1.1 (3.6B), Apple DiffuCoder (7B), Tencent Hunyuan-A13B (80B), Inception Labs Mercury (90B), Microsoft Mu (0.33B), Google DeepMind Gemini Robotics On-Device (20B), ICONNAI ICONN-1 (88B), MiniMax MiniMax-M1 (456B), Mistral Magistral Medium (50B), EleutherAI Comma v0.1-2T, Xiaohongshu/RedNote dots.llm1 (142B), Google DeepMind Gemini 2.5 Pro 06-05 (400B), Xiaomi MiMo-7B-RL-0530, Google DeepMind DeepTransformers (1.3B), Google DeepMind Atlas (1.3B), DeepSeek-AI DeepSeek-R1-0528 (685B), Fractal Analytics Fathom-R1-14B, Alibaba QwenLong-L1-32B, Anthropic Claude Opus 4 (6000B), TII Falcon-H1 
(34B), Google DeepMind Gemini Diffusion (40B), Google DeepMind Gemma 3n (4B), Alibaba ParScale (4.7B), OpenAI codex-1 (600B), TII Falcon-Edge (3B), Windsurf SWE-1 (50B), Prime Intellect INTELLECT-2 (32B), Huawei Pangu Ultra MoE (718B), Mistral Mistral Medium 3 (50B), IBM Granite-4.0-Tiny-Preview (7B), Amazon Nova Premier (470B), Microsoft Phi-4-reasoning-plus (14B), IBM Bamba-9B-v2, Alibaba Qwen3-235B-A22B, Alibaba Qwen3-0.6B, Baidu ERNIE X1 Turbo (200B), Baidu ERNIE 4.5 Turbo (200B), Microsoft MAI-DS-R1 (685B), Google DeepMind Gemini 2.5 Flash Preview (80B), OpenAI o4-mini (200B), OpenAI o3 (600B), Microsoft BitNet b1.58 2B4T (2B), IBM Granite 3.3 8B Instruct, Zhipu AI (Tsinghua) GLM-4-0414 (32B), AI Singapore SEA-LION v3.5 70B R, OpenAI GPT-4.1 (300B), Google DeepMind DolphinGemma (0.4B), ServiceNow Apriel-5B, ByteDance Seed-Thinking-v1.5 (200B), Huawei Dream 7B, NVIDIA UltraLong-8B, Together Deepcoder-14B-Preview, Huawei Pangu Ultra (135B), NVIDIA Nemotron-H-56B-Base, NVIDIA Llama-3.1-Nemotron-Ultra-253B, Meta AI Llama 4 Behemoth (2000B), Meta AI Llama 4 Maverick (400B), Meta AI Llama 4 Scout (109B), Google DeepMind Sec-Gemini v1 (400B), DeepSeek-AI DeepSeek-GRM-27B, Featherless AI Qwerky-72B, Deep Cogito Cogito 70B, Google DeepMind Agentic-Tx (200B), Google DeepMind TxGemma (27B), Google DeepMind Gemini 2.5 Pro Preview (400B), DeepSeek-AI DeepSeek-V3 0324 (685B), NVIDIA Llama-3.3-Nemotron-Super-49B-v1, LG EXAONE Deep (32B), Mistral Mistral Small 3.1 (24B), Baidu ERNIE 4.5 (424B), Baidu X1 (424B), Allen AI OLMo 2 32B, Cohere Command A (111B), Google DeepMind Gemini Robotics (200B), Google DeepMind Gemini Robotics-ER (30B), Google DeepMind Gemma 3 (27B), Reka AI Reka Flash 3 (21B), Alibaba QwQ-32B, AI21 Jamba 1.6 (398B), AMD Instella-3B, Alibaba Babel-83B, IBM Granite-3.2-8B-Instruct, Cohere C4AI Command R7B Arabic (7B), OpenAI GPT-4.5 (4500B), Tencent Hunyuan T1 (389B), Tencent Hunyuan Turbo S (389B), Microsoft Phi-4-multimodal (5.6B), Microsoft Phi-4-mini 
(3.8B), Inception Labs Mercury Coder Small (40B), Alibaba QwQ-Max-Preview (325B), Anthropic Claude 3.7 Sonnet (400B), Moonshot AI Moonlight (16B), Figure S2 (7B), Figure S1 (0.08B), Baichuan Baichuan-M1-14B, Arc Institute Evo 2 (40B), Perplexity R1 1776 (685B), xAI Grok-3 (3000B), Mistral Mistral Saba (24B), Barcelona Supercomputing Center Salamandra (40B), Nous Research DeepHermes 3 Preview (8B), Shanghai AI Laboratory/SenseTime OREAL-32B, Google DeepMind Gemini 2.0 Pro (200B), Stanford s1-32B, OpenAI o3-mini (70B), Mistral Mistral Small 3 (24B), Allen AI Llama-3.1-Tulu-3-405B, Alibaba Qwen2.5-Max (325B), SambaNova EvaByte (6.5B), ByteDance UI-TARS-72B, ByteDance Doubao-1.5-pro (200B), Moonshot AI Kimi k1.5 (500B), DeepSeek-AI DeepSeek-R1 (685B), OpenAI GPT-4b, Kyutai Helium-1 (2B), Shanghai AI Laboratory/SenseTime InternLM3 (8B), MiniMax MiniMax-Text-01 (456B), Berkeley Sky-T1-32B-Preview, NVIDIA Cosmos Nemotron 34B, NVIDIA Cosmos 1.0 (14B), Prime Intellect METAGENE-1 (7B), Rubik’s AI Sonus-1 Reasoning (405B)

2024
Renmin YuLan-Mini (2.4B), DeepSeek-AI DeepSeek-V3 (685B), LinkedIn EON-8B, OpenAI o3-preview (600B), RWKV RWKV-7 Goose (0.4B), International ModernBERT (0.395B), IBM Granite 3.1 8B, IBM Bamba-9B, OpenAI o1-2024-12-17 (200B), TII Falcon 3 (10B), Cohere Command R7B (7B), Cohere Maya (8B), Meta AI BLT (8B), Meta AI Large Concept Model (7B), Microsoft Phi-4 (14B), Google DeepMind Gemini 2.0 Flash exp (30B), International Moxin-7B, Cerebras 1T, Shanghai AI Laboratory/SenseTime InternVL 2.5 (78B), Meta AI Llama 3.3 (70B), LG EXAONE-3.5 (32B), Ruliad Deepthought-8B, SAIL Sailor2 (20B), PleIAs Pleias 1.0 (3B), OpenAI o1 (200B), Amazon Nova Pro (90B), Consortium EuroLLM (9B), Nous Research DisTrO 15B, Prime Intellect INTELLECT-1 (10B), Alibaba QwQ-32B-Preview, OpenGPT-X Teuken-7B, Allen AI OLMo 2 (13B), CMU Bi-Mamba (2.7B), Moonshot AI k0-math (100B), Alibaba Marco-o1 (7B), Allen AI TÜLU 3 (70B), OpenAI gpt-4o-2024-11-20 (200B), DeepSeek-AI DeepSeek-R1-Lite (67B), XiaoduoAI Xmodel-LM (1.1B), Mistral Pixtral Large (124B), Fireworks f1 (405B), Alibaba Qwen2.5-Coder (32.5B), TensorOpera Fox-1 (1.6B), Tencent Hunyuan-Large (389B), AI Singapore SEA-LIONv3 (9.24B), AMD AMD OLMo (1B), Hugging Face SmolLM2 (1.7B), Cohere Aya-Expanse-32B, Anthropic Claude 3.5 Sonnet (new) (400B), IBM Granite 3.0 8B, IBM Granite-3.0-3B-A800M-Instruct, aiXcoder aiXcoder-7B, NVIDIA Llama-3.1-Nemotron-70B, Mistral Ministral 8B, 01-ai Yi-Lightning (200B), Zyphra Zamba2-7B, NVIDIA nGPT (1B), Inflection AI Inflection-3 Pi (3.0) (1200B), Inflection AI Inflection-3 Productivity (3.0) (1200B), Liquid AI LFM-40B, Salesforce SFR-LLaMA-3.1-70B-Judge, BAAI Emu3 (8B), NVIDIA NVLM 1.0 (72B), China Telecom Artificial Intelligence Research Institute Unnamed 1T, China Telecom Artificial Intelligence Research Institute TeleChat2-115B, AMD AMD-Llama-135m, Meta AI Llama 3.2 90B, Meta AI Llama 3.2 3B, Allen AI Molmo (72B), Google DeepMind Gemini-1.5-Pro-002 (200B), Alibaba Qwen2.5 (72B), Microsoft GRIN MoE (60B), Google 
DeepMind Data-Gemma (27B), OpenAI o1-preview (200B), Jina AI Reader-LM (1.54B), Mistral Pixtral-12b-240910, DeepSeek-AI DeepSeek-V2.5 (236B), 01-ai Yi-Coder (9B), Allen AI OLMoE-1B-7B, Consortium PLLuM (20B), Salesforce xLAM (141B), Magic LTM-2-mini (20B), Cartesia Rene (1.3B), Google DeepMind Gemini 1.5 Flash-8B, Aleph Alpha Pharia-1-LLM-7B, Stanford TTT-Linear (1.3B), AI21 Jamba 1.5 (398B), Microsoft phi-3.5-MoE (42B), Microsoft phi-3.5-mini (3.8B), NVIDIA Minitron-4B, Sarvam AI sarvam-2b, xAI Grok-2 (400B), LG EXAONE 3.0 (7.8B), TII Falcon Mamba 7B, Writer Palmyra-Med-70B, Writer Palmyra-Fin-70B, Zyphra Zamba2-small (2.7B), NVIDIA Minitron-8B, Mistral Mistral Large 2 (123B), Meta AI Llama 3.1 405B, OpenAI GPT-4o mini (8B), Mistral NeMo (12B), Mistral Codestral Mamba (7B), Mistral Mathstral (7B), Microsoft SpreadsheetLLM (1760B), Consortium Spectra (3.9B), DeepL next-gen (7B), Hugging Face SmolLM (1.7B), Vectara Mockingbird (9B), Google DeepMind FLAMe (24B), StepFun Step-2 (1000B), H2O.ai H2O-Danube3-4B, Microsoft Causal Axioms (0.067B), SenseTime SenseNova 5.5 (600B), Kyutai Helium 7B, Shanghai AI Laboratory/SenseTime InternLM2.5 (20B), BAAI Tele-FLM-1T, Renmin YuLan-Base-12B, Baidu ERNIE 4.0 Turbo (200B), Google DeepMind Gemma 2 (27B), OpenAI CriticGPT (3B), Apple 4M-21, EvolutionaryScale ESM3 (98B), Huawei PanGu 5.0 Super (1000B), Anthropic Claude 3.5 Sonnet (400B), DeepSeek-AI DeepSeek-Coder-V2 (236B), International DCLM-Baseline 7B 2.6T, NVIDIA Nemotron-4-340B, Apple Apple On-Device model Jun/2024 (3.04B), UCSC MatMul-Free LM (2.7B), Galileo Luna (0.44B), Alibaba Qwen2 (72B), Alibaba Qwen2-57B-A14B, Kunlun Tech Skywork MoE 16x13B (146B), CMU Mamba-2 (2.7B), International MAP-Neo (7B), LLM360 K2 (65B), Mistral Codestral (22B), Cohere Aya-23-35B, 01-ai Yi-XLarge (2000B), 01-ai Yi-Large (1000B), Meta AI Chameleon (34B), Google DeepMind LearnLM (1500B), Cerebras Sparse Llama 7B, Google DeepMind Gemini 1.5 Flash (8B), OpenAI GPT-4o (200B), TII Falcon 2 11B, 
Fujitsu Fugaku-LLM (13B), 01-ai Yi 1.5 34B, Microsoft YOCO (3B), DeepSeek-AI DeepSeek-V2 (236B), Independent ChuXin (1.6B), RWKV RWKV-v6 Finch (7.63B), ELLIS xLSTM (2.7B), IBM Granite Code (34B), Alibaba Qwen-Max (300B), Google DeepMind Med-Gemini-L 1.0 (200B), Microsoft TinyStories (0.033B), BAAI Tele-FLM (52B), Alibaba Qwen-1.5 110B, Snowflake AI Research Arctic (480B), SenseTime SenseNova 5.0 (600B), Apple OpenELM (3.04B), Microsoft phi-3-medium (14B), Microsoft phi-3-mini (3.8B), Meta AI Llama 3 70B, Zyphra Zamba 7B, Amazon HLAT (7B), Hugging Face Idefics2 (8.4B), Reka AI Reka Core (300B), Microsoft WizardLM-2-8x22B (141B), EleutherAI Pile-T5 (11B), Hugging Face Zephyr 141B-A35B, Cohere Rerank 3 (104B), OpenAI gpt-4-turbo-2024-04-09 (70B), Tsinghua MiniCPM-2.4B, Apple Ferret-UI (13B), Mistral mixtral-8x22b (141B), SAIL Sailor (7B), MIT JetMoE-8B, Tsinghua Eurus (70B), Cohere Command-R+ (104B), Silo AI Viking (33B), Nous Research OLMo-Bitnet-1B, International Aurora-M (15.5B), Apple ReALM-3B, Alibaba Qwen1.5-MoE-A2.7B, xAI Grok-1.5 (180B), AI21 Jamba 1 (52B), MosaicML DBRX (132B), Stability AI Stable Code Instruct 3B, Sakana AI EvoLLM-JP (10B), Rakuten Group RakutenAI-7B, Independent Parakeet (0.378B), RWKV RWKV-v5 EagleX (7.52B), Apple MM1 (30B), Covariant RFM-1 (8B), Cohere Command-R (35B), DeepSeek-AI DeepSeek-VL (7B), Fudan University AnyGPT (7B), Stability AI Stable Beluga 2.5 (70B), Inflection AI Inflection-2.5 (1200B), SRIBD/CUHK Apollo (7B), Anthropic Claude 3 Opus (2500B), NVIDIA Nemotron-4 15B, Unbabel TowerLLM (7B), Google DeepMind Hawk (7B), Google DeepMind Griffin (14B), Microsoft BitNet b1.58 (70B), SambaNova Samba-1 (1400B), Cohere Aya-101 (13B), Hugging Face Cosmo-1B, Silo AI Poro (34.2B), ServiceNow StarCoder 2 (15B), ByteDance 530B, ByteDance 175B, Mistral Mistral Small (7B), Mistral Mistral Large (300B), Reliance Hanooman (40B), Apple Ask (20B), Reka AI Reka Edge (7B), Reka AI Reka Flash (21B), Google DeepMind Gemma (7B), Google DeepMind 
Gemini 1.5 Pro (200B), Alibaba Qwen-1.5 72B, Meta AI MobileLLM (1B), BRAIN GOODY-2 (70B), ChatDB Natural-SQL-7B, AI Singapore Sea-Lion (7.5B), Google TimesFM (0.2B), Allen AI OLMo (7B), NVIDIA Audio Flamingo (1B), Cerebras FLOR-6.3B, AIWaves.cn Weaver (34B), Mistral miqu 70b, iFlyTek iFlytekSpark-13B, iFlyTek Xinghuo 3.5 (Spark) (200B), Apple MGIE (7B), Meta AI CodeLlama-70B, RWKV RWKV-v5 Eagle 7B, LMU MaLA-500 (10B), Cornell MambaByte (0.972B), DeepSeek-AI DeepSeek-Coder (33B), Tencent FuseLLM (7B), Adept Fuyu-Heavy (120B), OrionStar Orion-14B, Shanghai AI Laboratory/SenseTime InternLM2 (20B), Zhipu AI (Tsinghua) GLM-4 (200B), DeepSeek-AI DeepSeekMoE (16B), DeepSeek-AI DeepSeek (67B), Tencent LLaMA Pro (8.3B), Writer Palmyra X (72B), SUTD/Independent TinyLlama (1.1B), JPMorgan DocLLM (7B)

2023
Cambridge MACE-MP-0 (0.00469B), Allen AI Unified-IO 2 (7B), Microsoft WaveCoder-DS-6.7B, Huawei YunShan (7B), Huawei PanGu-Pi (7B), Wenge YAYI 2 (30B), BAAI Emu2 (37B), Google DeepMind MedLM (340B), Upstage AI SOLAR-10.7B, Deci DeciLM-7B, Mistral Mistral-medium (180B), Mistral mixtral-8x7b-32kseqlen (46.7B), Together StripedHyena 7B, Nexusflow.ai NexusRaven-V2 13B, Google DeepMind Gemini Ultra 1.0 (1500B), CMU Mamba (2.8B), Berkeley/JHU LVM-3B, Alibaba SeaLLM-13b, Perplexity pplx-70b-online, Meta AI SeamlessM4T-Large v2 (2.3B), Google DeepMind Q-Transformer (0.06B), IEIT Yuan 2.0 (102.6B), EPFL MEDITRON (70B), Microsoft Transformers-Arithmetic (0.1B), Berkeley Starling-7B, Inflection AI Inflection-2 (1200B), Anthropic Claude 2.1 (130B), Allen AI TÜLU 2 (70B), NVIDIA Nemotron-3 22B, NVIDIA Nemotron-2 43B, Microsoft Orca 2 (13B), Microsoft Phi-2 (2.7B), Microsoft Florence-2 (0.771B), Google DeepMind Mirasol3B (3B), NTU OtterHD-8B, Samsung Gauss (7B), xAI Grok-1 (314B), xAI Grok-0 (33B), 01-ai Yi-34B, OpenAI GPT-4 Turbo (70B), Google DeepMind MatFormer (0.85B), Kunlun Tech Skywork-13B, Moonshot AI Kimi Chat (100B), Jina AI jina-embeddings-v2 (0.435B), Adept Fuyu (8B), Baidu ERNIE 4.0 (1000B), Hugging Face Zephyr (7.3B), Google DeepMind PaLI-3 (5B), NVIDIA Retro 48B, Apple Ferret (13B), XLANG Lab Lemur (70B), KAUST/Shenzhen AceGPT (13B), Reka AI Yasa-1 (21B), Google DeepMind RT-X (55B), Waymo MotionLM (0.09B), Wayve GAIA-1 (9B), Alibaba Qwen (72B), Meta AI Llama 2 Long (70B), Hessian AI/LAION LeoLM (13B), Mistral Mistral 7B, Microsoft Kosmos-2.5 (1.3B), Baichuan Baichuan 2 (13B), ThirdAI BOLT2.5B, Deci DeciLM (5.7B), IBM MoLM (8B), NUS NExT-GPT (7B), Microsoft Phi-1.5 (1.3B), Apple UniLM (0.034B), Adept Persimmon-8B, BAAI FLM-101B, TII Falcon 180B, Tencent Hunyuan (100B), Independent phi-CTNL (0.1B), IBM Granite (13B), Inception AI Jais (13B), Meta AI Code Llama 34B, Hugging Face IDEFICS (80B), NVIDIA Raven (11B), AzaleAI DukunLM (13B), Microsoft WizardLM 70B, Boston 
University Platypus (70B), Stability AI Japanese StableLM Alpha 7B, Stability AI Stable Code 3B, Stanford Med-Flamingo (8.3B), LightOn Alfred-40B-0723, Together LLaMA-2-7B-32K, Google DeepMind Med-PaLM M (562B), Cerebras BTLM-3B-8K, Stability AI Stable Beluga 2 (70B), Stability AI Stable Beluga 1 (65B), Shanghai AI Laboratory/CUHK Meta-Transformer (2B), Meta AI Llama 2 (70B), (Undisclosed) WormGPT (6B), Anthropic Claude 2 (130B), IDEAS/DeepMind LongLLaMA (7B), Tsinghua xTrimoPGLM (100B), Salesforce XGen (7B), 360 cn Zhinao (Intellectual Brain) (100B), Reka AI Yasa (7B), Microsoft Kosmos-2 (1.6B), Google AudioPaLM (8B), Inflection AI Inflection-1 (120B), Microsoft Phi-1 (1.3B), Shanghai AI Laboratory/SenseTime InternLM (104B), Meta AI BlenderBot 3x (175B), Microsoft Orca (13B), ETH Zürich PassGPT (0.124B), Google DeepMind DIDACT (5B), Magic LTM-1 (7B), OpenAI GPT-4 MathMix (1760B), Cambridge/Tencent PandaGPT (13B), TII Falcon (40B), Refact 202305-refact2b-mqa-lion (1.6B), UW Guanaco (65B), Meta AI LIMA (65B), Asus/TWS Formosa (FFM) (176B), Salesforce CodeT5+ (16B), Google PaLM 2 (340B), ServiceNow StarCoder (15.5B), MosaicML MPT (7B), Inflection AI Pi (60B), NVIDIA GPT-2B-001, Amazon Titan (200B), Microsoft WizardLM 7B, MosaicML MPT (1.3B), Stability AI StableLM (7B), Databricks Dolly 2.0 (12B), EleutherAI Pythia (12B), Berkeley Koala-13B, Character.ai C1.2 (20B), Bloomberg BloombergGPT (50B), LAION OpenFlamingo-9B, Nomic GPT4All-LoRa (7B), Cerebras Cerebras-GPT (13B), Huawei PanGu-Sigma (1085B), Google CoLT5 (5.3B), Google DeepMind Med-PaLM 2 (340B), OpenAI GPT-4 Classic (1760B), Stanford Alpaca (7B), AI21 Jurassic-2 (178B), Together GPT-NeoX-Chat-Base-20B, Microsoft Kosmos-1 (1.6B), Meta AI LLaMA-65B, Fudan University MOSS (16B), Writer Palmyra (20B), Aleph Alpha Luminous Supreme Control (70B), Meta AI Toolformer+Atlas 11B+NLLB 54B, Amazon Multimodal-CoT (0.738B), Microsoft FLAME (0.06B)

2022
Google DeepMind Med-PaLM 1 (540B), Meta AI OPT-IML (175B), Anthropic RL-CAI (52B), Baidu ERNIE-Code (0.56B), Google RT-1 (0.035B), OpenAI ChatGPT (gpt-3.5-turbo) (20B), OpenAI text-davinci-003 (175B), Together GPT-JT (6B), RWKV RWKV-4 (14B), Meta AI Galactica (120B), DeepMind SED (0.42B), BigScience mT0 (13B), BigScience BLOOMZ (176B), Microsoft PACT (1B), Google Flan-T5 (11B), Google Flan-PaLM (540B), Google U-PaLM (540B), NVIDIA VIMA (0.2B), Tsinghua OpenChat (13B), Wechat WeLM (10B), Tsinghua CodeGeeX (13B), DeepMind Sparrow (70B), Google PaLI (17B), NVIDIA NeMo Megatron-GPT 20B, Microsoft Z-Code++ (0.71B), Meta AI Atlas (11B), Meta AI BlenderBot 3 (175B), Tsinghua GLM-130B, Amazon AlexaTM 20B, OpenAI 6.9B FIM, Google ‘monorepo-Transformer’ (0.5B), Huawei PanGu-Coder (2.6B), Meta AI NLLB (54.5B), AI21 J-1 RBG (178B), BigScience BLOOM (tr11-176B-ml), Google Minerva (540B), Microsoft GODEL-XL (2.7B), Yandex YaLM 100B, Allen AI Unified-IO (2.8B), Google LIMoE (5.6B), Independent GPT-4chan (6B), Stanford Diffusion-LM (0.3B), Google UL2 20B, DeepMind Gato (Cat) (1.2B), Google LaMDA 2 (137B), Meta AI OPT-175B, Allen AI Tk-Instruct (11B), Meta AI InCoder (6.7B), TII NOOR (10B), Sber mGPT (13B), Google PaLM-Coder (540B), Google PaLM (540B), Meta AI SeeKeR (2.7B), Salesforce CodeGen (16B), LightOn VLM-4 (10B), DeepMind Chinchilla (70B), EleutherAI GPT-NeoX-20B, DeepMind Perceiver AR (1B), Meta AI CM3 (13B)

2021
Baidu ERNIE 3.0 Titan (260B), Meta AI XGLM (7.5B), Meta AI Fairseq (1100B), DeepMind Gopher (280B), Google GLaM (1200B), Anthropic Anthropic-LM 52B, DeepMind RETRO (7.5B), Aleph Alpha Luminous (70B), Microsoft DeBERTaV3 (1.5B), Google BERT-480 (480B), Google BERT-200 (200B), Coteries Cedille FR-Boris (6B), Microsoft/NVIDIA MT-NLG (530B), Google FLAN (137B), Cohere Command xlarge (52.4B), Baidu PLATO-XL (11B), Allen AI Macaw (11B), Salesforce CodeT5 (0.7B), OpenAI Codex (12B), AI21 Jurassic-1 (178B), Meta AI BlenderBot 2.0 (9.4B), EleutherAI GPT-J (6B), Google LaMDA (137B), Huawei/Sberbank ruGPT-3 (1.3B), Google Switch Transformer (1571B)

2020
Google BIGBIRD-ETC large (0.345B), OpenAI GPT-3 (175B), Meta AI Megatron-11B, American Express Transformer++ (0.212B), Google Meena (2.6B)

2019
Google T5 (11B), NVIDIA Megatron-LM 8.3B, Meta AI RoBERTa (0.355B), OpenAI GPT-2 (1.5B)

2018
Google BERT (0.34B), OpenAI GPT-1 (0.117B), Allen AI ELMo (0.094B), Fast.ai ULMFiT (0.034B)

2017
Google Transformer (big) (0.213B), Google Transformer (base) (0.065B)

The sheet also shows a set of Chinese models, including:

Baidu Wenxin Yiyi, iFLYTEK Sibichi, Dachang Data Mooc, Huawei Cloud Daoyi Tianwen, Chongqing University MOSS, Zhixin Technology ChatGLM, Qingmang Qingmang, Qingmang+Guangcone, Qingmang-Wang, Intengine Daoyi Tianwen, Q&A Track Mountain University Bense, Shell BELLE, Baichuan Intelligence baichuan, OpenBMB CPM, Intengine Yingjie: Qingyuan, OpenMEDLab, Yunhezhi Shanhai, Beijing North University TechGPT, Zhizhongwen Shenzhen Jiwei, Lü Ying, Chinese Academy of Sciences Enhanced Dal Liu, Ideal Technology TigerBot, IDEA Research Institute Xiaozhe Technology MindBot, Shanghai Jiao Tong University K2, Baiyulan, 360 Zhineng, Yijian, Duxiaoman Qianyan, Doctoral Engineering Technology Research Institute ProactiveHealthGPT, Heihei, Huru SoulChat, Wenzi Technology Anima, Peking University Law Artificial Intelligence Research Institute ChatLaw, Xiangde Technology Co., Ltd. Muyuan, Horgos MiniMax, Tencent Cloud Tencent, Race Technology+Chongqing Replay Network Race Type XPT, Institute of Computing Technology, Chinese Academy of Sciences Baima, Beijing Language University Bangbang, SenseTime Ririxin, National Supercomputing Center in Tianjin Tianjin Tianyuan, Guoke Technology No Weight, Saisen, Race Technology+Tianjin University Haihe·Mint, Bian Sheng Electronic LightGPT, Telecom Zhike Xingyin, Xiamen Yunji Xiamen YunGPT, Zhizhuyan Jingshi, TAL MathGPT, Shugan Space Great Wall, Ideal Technology Dadao Dao, Huisheng Intelligence Zhixin, China Internet Zhigong, Chuangye Black Horse Tianqi, Together Technology Bowen, NetEase Youdao Yuchuan, NetEase Youdao Wangyan, Weiding Tianji, Zhihu Zhihu Zhihu, Yixing Network Science Uni-talk, Luwen Education Luwen, Zhongke Chuangda Magic Cube Rubik, Tencent Pao Pao, Douyin Vision Dou Tian, Leyan Technology Leyan, Didi Intelligence Xianxiang, Zhizi Engine Metaverse, Douyin Technology Douyin, Microhuan Intelligence Ronggu, Evernote Elephant GPT, Hummingbird Unity Hummingbird, Universe Leap Grace, Aomen Nuomen Kang Jianuo, Shuzu Technology SocialGPT, 
Cloud from Technology Congrong, Dianke Daxiao Xiao Ke, Agricultural Bank of China Xiaomi ChatABC, Tencent Fusion Tianlai AllMe, Taijiu Cloud Ensespers FFM, Yiyi Technology medGPT, Chaos Science MindGPT, Lingjing Multi-AI Dongni, Changhong IT Changhong Totem, Child King KidsGPT, Zhongke Wendao Daoyi, Didi Technology Lanzi, JD Jixing, ChatJD, Zhizuan Intelligence Huajun, H3C Baitian Cloud House, Tencent Blue Whale Tencent Brain·Brain Sea, Ushi Technology Huimu, China Unicom Yuxiang, Meituan Technology Dahuangfeng, Zitian Power Technology Darwin, Really Smart Zhao Bin, Jiadu Technology Jiadu Zhiyin, Smart Environment Research Institute Smart, Xinyun Research Institute Science EmoGPT, EduChat, Yandao Intelligent ArynGPT, Tencent WAI, Northwestern Polytechnical University Huawei Technology Ziguang·Observation, Singularity Intelligent Singularity OpenAPI, Lenovo Technology Lenovo, Shanghai University of Science and Technology DoctorGLM, Xuannengao Zhimei Couple System, Hong Kong University of Science and Technology Robin, Shengang Communication Source, China Mobile Datian, China Telecom TeleChat, Rongyun Cloud Fanke, Yuntian Lifly Tianshu, Smart Technology CityGPT.

Reasoning Models 2024Q3–2025Q1

1. Mertens, M., & Thompson, N. (2026). Is there ‘secret sauce’ in large language model development? MIT. https://futuretech.mit.edu/publication/is-there-secret-sauce-in-large-language-model-development
2. Zhang, H., Jin, J., Syrgkanis, V., & Kakade, S. (2026). Prescriptive scaling reveals the evolution of language model capabilities. Harvard University; Stanford University. https://arxiv.org/abs/2602.15327
3. Fu, Y., Anantha, R., Vashisht, P., Cheng, J., & Littwin, E. (2024). UI-JEPA: Towards active perception of user intent through onscreen user activity. Apple; Stanford University. https://machinelearning.apple.com/research/ui-intent
4. Ben Abacha, A., Yim, W., Fu, Y., Sun, Z., Yetisgen, M., Xia, F., & Lin, T. (2024). MEDEC: A benchmark for medical error detection and correction in clinical notes. Microsoft; University of Washington. https://doi.org/10.18653/v1/2025.findings-acl.1159