Or: What to expect when you’re expecting artificial superintelligence
ASI begins
Not achieved
Partially achieved
Achieved
ASI definition, checklist sheet and full text
Definition
AGI = artificial general intelligence = a machine that performs at the level of an average (median) human.
ASI = artificial superintelligence = a machine that performs at the level of an expert human in practically any field.
Checklist sheet
Checklist full text for indexing only

Image generated by AI for this analysis (Imagen 3). Image generated in a few seconds, on 10 November 2024, text prompt by Alan D. Thompson via Imagen 3: ‘ASI checklist, flatui large icon showing sheet of blank curved dropdown menus and RAG checkboxes, pastel colors on light background, simple’
Legal archive (Nov/2024): web.archive.org
PHASE 1: EARLY ASI, DISCOVERY, AND SIMULATION
1. Recursive hardware self-improvement achieved
2. Recursive code self-optimization achieved
3. First major simulation of a suggested improvement; convinces majority of humans
4. First major new mathematical proof
5. First major mathematical conjecture resolved
6. First new discovery (i.e. a new theoretical concept)
7. First new physical invention (i.e. a new tool)
8. First new element added to the periodic table
9. Novel computing materials developed (i.e. beyond silicon)
10. First 1M humans receiving personalized education from AI (no human assistance)
11. First commercial bi-directional brain-machine interface (BMI) for enhanced cognition
12. First mental health condition resolved
13. Mental wellness: Majority of mental conditions and disorders resolved
14. Majority of physical conditions able to be resolved by AI
15. First 100M fully autonomous surgeries performed by AI (no human assistance)
16. Optimized biology at birth becomes standard (1M+ people)
17. First new type of energy discovered
18. First new type of energy harnessed
19. First new type of energy storage
20. Energy production and storage exceeds energy consumption (Earth)
21. New and previously unrecognizable type of transport developed
PHASE 2: GOVERNANCE AND ECONOMIC TRANSFORMATION
22. AI-run company reaches US$1B valuation
23. AI-run company reaches US$1T valuation
24. First region with universal basic income or UBI-equivalent available for all
25. First country with UBI-equivalent available for all
26. Jobs become optional for practically all humans
27. Traditional economics surpassed; money deflates in value
28. First region primarily governed by AI
29. First country primarily governed by AI
30. Geographical/legal borders are non-existent
31. Crime eliminated (linked to mental wellness + UBI)
32. Integrated international governance by AI
PHASE 3: PHYSICAL WORLD INTEGRATION
33. 1M fully autonomous humanoids in the workplace
34. 1M fully autonomous humanoids in the home
35. 1B fully autonomous humanoids in the home
36. First commercial full dive virtual reality equivalent
37. Waste management optimized; no more trash
38. Environmental issues resolved and environment optimized
39. Fully autonomous cars standard; humans banned from driving
40. Housing optimized; humans no longer want to buy houses
41. BMIs allow humans to ‘telepathically’ communicate with each other
42. BMIs allow humans to have a 1,000+ IQ equivalent (compared with 2024)
43. BMI adoption reaches 1M users
44. BMI adoption reaches 1B users
45. New state of matter engineered
46. First planet other than Earth optimized/terraformed
47. First planet other than Earth colonized
48. All planets in solar system utilized
49. Faster-than-light travel developed
50. Life satisfaction increases from 4/10 to 10/10 equiv (SWL, OECD Better Life, World Happiness) for majority
Indicators & justifications
(most recent at top)
| Date | Checklist item | Summary | Links |
|---|---|---|---|
| Dec/2025 | 🟠 #4, 🟠 #6 | Prof Stephen Hsu (Michigan State University) reports that the ‘main idea’ for his new paper accepted in Physics Letters B ‘originated de novo [anew] from GPT-5.’ The AI proposed a novel research direction (applying Tomonaga-Schwinger integrability conditions to state-dependent quantum mechanics) and derived the core equations. ‘GPT-5, Gemini, and Qwen-Max were used extensively to perform calculations, find errors, and generate the finished paper.’ | Paper, announce |
| Nov/2025 | 🟠 #4, 🟠 #5, 🟠 #6 | OpenAI and collaborators (including Prof Timothy Gowers) publish “Early science acceleration experiments with GPT-5”. The paper documents the model producing “complete new proofs” and “four new results in mathematics,” including: – Solving Erdős problem #848 (combinatorial number theory). – Resolving the COLT open problem on dynamic networks. – Proving a conjecture on subgraph counts in trees (inequalities for star/path/wye counts). – Deriving improved lower bounds for online algorithms (convex body chasing). – Discovering novel mechanistic insights for T-cell immune responses and thermonuclear burn propagation in fusion physics. | Paper |
| Nov/2025 | 🟠 #1, 🟠 #6, 🟠 #7, 🟠 #9, 🟠 #14, 🟠 #17, 🟠 #18, 🟠 #29, 🟠 #33 | INFO only: US President signs Executive Order launching the ‘Genesis Mission,’ described as a ‘Manhattan Project’ for AI. ‘The Genesis Mission will build an integrated AI platform to harness Federal scientific datasets — the world’s largest collection of such datasets, developed over decades of Federal investments — to train scientific foundation models and create AI agents to test new hypotheses, automate research workflows, and accelerate scientific breakthroughs’. The mission tasks the Department of Energy (DOE) with using AI to ‘solve the most challenging problems of this century,’ targeting: (i) advanced manufacturing (#33); (ii) biotechnology (#14); (iii) critical materials (#9); (iv) nuclear fission and fusion energy (#17, #18); (v) quantum information science; and (vi) semiconductors and microelectronics (#1, #9). Collaborators include Google, OpenAI, Anthropic… | Executive Order, official site |
| Nov/2025 | 🟠 #4, 🟠 #5, 🟠 #6 | Google DeepMind and Prof Terence Tao publish “Mathematical exploration and discovery at scale” detailing the agent: AlphaEvolve (discovery), Gemini Deep Think (proof generation), and AlphaProof (formal verification). (This is the full paper release previewed in Apr/2025.) The AI agent was tested on 67 long-standing open problems and “discovered improved solutions in several,” including: – Improving the lower bound for the Kissing Number in 11 dimensions (592 to 593). – Finding an “elusive configuration” for the “no isosceles triangles” grid problem (112 points vs. 110). – Discovering new constructions for the Kakeya and Nikodym problems. | Paper, Announce [Tao], GitHub, Colab |
| Nov/2025 | 🟠 #6, 🟠 #14 | Kosmos, an ‘AI Scientist’ system [using Claude Sonnet 4 and Claude Sonnet 4.5], automates data-driven discovery, performing the equivalent of 6 months of human research in a single 12-hour run. The paper details 7 discoveries, including 4 novel contributions to scientific literature: – Statistical Genetics: Discovery 4: SOD2 as a driver of myocardial fibrosis in humans (establishing additional, novel support for existing discoveries). – Statistical Genetics: Discovery 5: Cis-regulation of SSR1 by a protective GWAS variant in Type 2 Diabetes in humans (establishing additional, novel support for existing discoveries). – Data Science: Discovery 6: Temporal ordering of disease-related events in Alzheimer’s Disease (independently develops a new analytical method). – Neuroscience: Discovery 7: Mechanism of entorhinal cortex vulnerability in aging (a novel, clinically-relevant discovery not previously identified by human researchers). | Paper |
| Nov/2025 | 🟠 #4, 🟠 #6 | Prof Timothy Gowers used GPT-5 to prove a useful mathematical statement he needed for his research. The AI produced a “nice proof” in ~20 seconds, a task Gowers estimated would have taken him ~1 hour. He noted, ‘we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us.’ | Announce |
| Oct/2025 | 🟠 #34 | 1X launches NEO, the first commercially available bipedal humanoid robot designed specifically for the home. Starting at $20k outright or $499/month. Features a built-in large language model (LLM), “bio-mechanics” (muscle-like anatomy) ensuring safety around humans, and uses end-to-end embodied AI to learn domestic tasks through observation and natural language. ‘Designed to be helpful… it learns and improves over time.’ | Announce, archive |
| Oct/2025 | 🟠 #26 | BNY reports having over 100 “digital employees” with human managers, performance reviews, and email logins. These agentic systems [BNY Eliza uses GPT & Gemini, 26/Jun/2025] perform tasks from payment remediation to code repair. CEO: “We think of it as a superpower”. | Article, archive |
| Oct/2025 | 🟠 #2 | Prof Jürgen Schmidhuber: ‘Our Huxley-Gödel Machine [GPT-5 backbone] learns to rewrite its own code.’ The HGM coding agent ‘evolves by self-rewrites’ and operationalizes self-improvement by editing its own codebase, achieving human-level performance matching the best human-engineered agents on SWE-bench Lite. | Paper, code |
| Oct/2025 | 🟠 #4, 🟠 #6 | Prof Ernest Ryu from UCLA used GPT-5 to solve and provide “genuinely novel” insights and the “key successful steps” for the final proof of a long-standing open problem in convex optimization (convergence of Nesterov ODE trajectories). GPT-5 ‘produced the final proof argument…. In my view, this result is already publishable in a respectable optimization theory journal.’ | Announce, unrolled, chatgpt.com thread |
| Oct/2025 | 🟠 #6, 🟠 #14 | Google’s C2S-Scale 27B (Gemma) model discovered a novel drug candidate that reveals a new potential pathway to make “cold” tumors visible to the immune system for cancer therapy. The prediction was confirmed in vitro. | Announce, paper |
| Oct/2025 | 🟠 #4, 🟠 #6 | Prof Paata Ivanisvili from UCI found that GPT-5 Pro discovered a mathematical counterexample disproving a long-standing theory about majority functions (listed on the Simons Institute open problems page). It beat the best known majority method on a benchmark case. | Announce, paper pending |
| Sep/2025 | 🟠 #4, 🟠 #6 | Prof Scott Aaronson, on the quantum version of NP: ‘This is the first paper I’ve ever put out for which a key technical step in the proof of the main result came from AI—specifically, from GPT-5 Thinking.’ | Announce, paper |
| Sep/2025 | 🟠 #5 | ‘We propose the Gödel Test: evaluating whether a model can produce correct proofs for very simple, previously unsolved conjectures… On the three easier problems, GPT-5 produced nearly correct solutions; for Problem 2 it even derived a different approximation guarantee that, upon checking, refuted our conjecture while providing a valid solution… GPT-5 may represent an early step toward frontier models eventually passing the Gödel Test…’ | Paper |
| Sep/2025 | 🟠 #4, 🟠 #5 | ‘Using Gauss, we have completed a challenge set by Fields Medallist Terence Tao and Alex Kontorovich in January 2024 to formalize the strong Prime Number Theorem (PNT)… [Gauss] completed the project after three weeks of effort [where humans took 18 months to begin]. Gauss can work autonomously for hours, dramatically compressing the labor previously reserved for top formalization experts… a new paradigm — verified superintelligence and the machine polymaths that will power it.’ | Announce, repo |
| 17/Aug/2025 | 🟠 #6 | GPT-5 uncovers previously missed metabolomic insights (1,300 analytes, 250 samples) in under 5 minutes: ‘GPT-5 did a better job in under five minutes… uncovered several discoveries we completely missed… No [the paper was not in the training corpus], we just published it, and raw data was [published after] analysis.’ | Announce |
| 14/Aug/2025 | 🟠 #29 | Albania considers creating a ministry run entirely by AI: “Why do we have to choose between two or more human options if the service we get from the state could be done by AI? Societies will be better run by AI than by us because it won’t make mistakes, doesn’t need a salary, cannot be corrupted, and doesn’t stop working.” Albania joins several countries using AI for governance. | Article |
| 1/Aug/2025 | 🟠 #5 | Mathematician Dr Michel van Garrel on full version of the Gemini 2.5 Deep Think model entered into the IMO competition: ‘a mathematical conjecture that was made by some people some years ago, they didn’t manage to prove it back then, they checked many cases and then they just left it as a conjecture. I asked the statement of the conjecture to Gemini Deep Think. And it seems like it proved it right away with a completely different method. When I was thinking about solving that question, I was thinking about maybe three different things, three different ideas. But it seems that Deep Think was thinking about 20 or 100. Many, many different possibilities and then pursuing them.’ | Video, announce, previously resolved by van Garrel (paper) |
| 15/Jun/2025 | 🟠 #43 | ‘[China] designated separate pricing items for BCI technologies, including “Invasive BCI Implantation Fee” and “Invasive BCI Removal Fee.” Once local authorities align with and implement these guidelines, BCI medical service fees will have a standardized basis.’ | Announce |
| 13/Jun/2025 | 🟠 #13, 🟠 #14 | ‘otto-SR [in clinical reviews, Gemini 2.0 Flash ➜ GPT-4.1 ➜ o3-mini-high] generated newly statistically significant conclusions in 2 reviews and negated significance in 1 review. These findings demonstrate that LLMs can autonomously conduct and update systematic reviews with superhuman performance, laying the foundation for automated, scalable, and reliable evidence synthesis.’ | Paper |
| 19/May/2025 | 🟠 #6, 🟠 #7, 🟠 #8, 🟠 #9 | Microsoft Discovery [o1, o3] finds a new, safer, PFAS-free immersion coolant. | Paper, video |
| 8/May/2025 | 🟠 #2 | Lead engineer and PM for Anthropic Claude Code (agentic coding tool/CLI agent) says that the agent was written and optimized by Claude: ‘80-90% Claude-written code, overall.’ | Video timecode |
| 20/Apr/2025 | 🟠 #28, 🟠 #29 | ‘The United Arab Emirates aims to use AI to help write new legislation and review and amend existing laws.’ | FT |
| 11/Apr/2025 | 🟠 #2 | Google DeepMind: ‘We have actually done some work in this area [of AI designing its own reinforcement learning algorithms]. It’s work we did a few years ago, but it’s coming out now [AlphaEvolve, announced May/2025, using Gemini 2.0]. What we did was to build a system that, through trial and error, through reinforcement learning itself, figured out what algorithm was best at reinforcement learning. It literally went one level meta, and it learned how to build its own reinforcement learning system. Incredibly, it actually outperformed all of the human reinforcement learning algorithms that we’d come up with over many, many years in the past.’ | Video timecode, announce, paper |
| 31/Mar/2025 | 🟠 #4, 🟠 #5 | ‘Potts model is solved exactly for arbitrary q, based on using OpenAI’s latest reasoning model o3-mini-high’. | Paper |
| 12/Mar/2025 | 🟠 #6 | First AI-written paper passes human peer review, accepted for scientific publication. Sakana AI (Japan): ‘The AI Scientist-v2 [originally based on GPT-4o-2024-05-13] came up with the scientific hypothesis, proposed the experiments to test the hypothesis, wrote and refined the code to conduct those experiments, ran the experiments, analyzed the data, visualized the data in figures, and wrote every word of the entire scientific manuscript, from the title to the final reference, including placing figures and all formatting.’ | Paper |
| 16/Jan/2025 | 🟠 #9 | ‘We have synthesized a novel material, TaCr2O6, whose structure was generated by MatterGen [a generative AI tool based on o1 and o3]… If similar results can be translated to other domains, it will have a profound impact on the design of batteries, fuel cells, and more.’ | Microsoft, paper with images |
| 30/Oct/2024 | 🟠 #2 | ‘Today, more than a quarter of all new code at Google is generated by AI [Gemini], then reviewed and accepted by engineers.’ | |
| 8/Jul/2022 | 🟠 #1 | ‘The latest NVIDIA Hopper GPU architecture has nearly 13,000 instances of AI-designed circuits.’ [Note: This is an example of Narrow Superintelligence only; genuine ASI must be general.] | NVIDIA |
Permissions: Yes, you can use these visualizations anywhere, please leave the citation intact.
A note from Alan
I’m now in my fifth decade of watching most of humanity shrink away from intelligence. It seems that there’s been a visceral fear of smarts since the dawn of time, perhaps increasing after the first IQ test was developed 120 years ago.
Following research in AI and psychology, my post-graduate work focused on gifted education and spirituality, leaning on pioneers like Professor Leta Hollingworth (the namesake of my Leta AI project using the 2020 GPT-3 model) and Professor Miraca Gross. Working with many of the world’s most visible prodigies, I was (and continue to be) endlessly fascinated by the ‘auxiliary’ effects of high intelligence. Effects like increased moral sensitivity, an advanced sense of humor, and so much more.
After achieving artificial general intelligence (AGI, a machine operating at the level of a median human), we are poised to rapidly leap through the singularity and artificial superintelligence (ASI): a system whose intelligence surpasses that of the brightest and most gifted human minds.
In an ideal world, ASI’s milestones would be visible instantly. But we live here on Earth, with all our imperfections. Here are the first 50 things I’m looking forward to: my favourite upcoming ASI milestones. The list is not exhaustive: I’ve left out animal translation, immortality, and many other things. It was designed as just one lens through which to view the ASI milestones. It’s likely that they’ll be checked off in a different order to the one shown here, and certainly with some new ones that my unaugmented human brain can’t yet imagine…
It will take some time to reach each of these ASI milestones, but technology won’t be the limiting factor. Instead, watch many leaders scream ‘Who moved my Cheese?’ as this brief period of turmoil unfolds, even as we evolve to our unimaginably utopic world.
Join me here and in my bestselling analysis The Memo as we check the boxes of unfolding superintelligence.
All my best,
ASI models
So, which large language model systems would be considered to be artificial superintelligence?
It is likely that any model with a primary score of >50% on HLE and a secondary score of >90% on GPQA is an ASI system.
HLE, Humanity’s Last Exam: avg human score estimate=0%, avg human score by chance/random guessing=25%, ceiling=70%.
GPQA, Graduate-Level Google-Proof Question & Answer Benchmark: domain expert score=65%, ceiling=80%.
Read more:
Mapping IQ, MMLU, MMLU-Pro, GPQA, HLE
IQ testing AI
No large language model system met these criteria up to mid-2025.
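The two-benchmark rule above can be written as a simple check. This is only a minimal sketch: the thresholds (HLE >50%, GPQA >90%) come from the text, while the `BenchmarkScores` container, the `meets_asi_criteria` function, and the example models and scores are hypothetical, made up for illustration and not taken from the Models Table.

```python
from dataclasses import dataclass

# Thresholds from the text: primary HLE score > 50%, secondary GPQA score > 90%.
HLE_THRESHOLD = 50.0
GPQA_THRESHOLD = 90.0


@dataclass(frozen=True)
class BenchmarkScores:
    """Hypothetical container for one model's benchmark scores, in percent."""
    name: str
    hle: float
    gpqa: float


def meets_asi_criteria(scores: BenchmarkScores) -> bool:
    """Return True only if both scores strictly exceed their thresholds."""
    return scores.hle > HLE_THRESHOLD and scores.gpqa > GPQA_THRESHOLD


# Illustrative, made-up scores (assumptions, not real results).
examples = [
    BenchmarkScores("model-a", hle=25.0, gpqa=85.0),
    BenchmarkScores("model-b", hle=55.0, gpqa=88.0),
    BenchmarkScores("model-c", hle=51.0, gpqa=91.0),
]
results = {s.name: meets_asi_criteria(s) for s in examples}
print(results)
```

Both conditions must hold at once, so a model that clears HLE but not GPQA (like the made-up `model-b`) would not count under this reading of the rule.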
The Models Table tracks 600+ large language models and related details including HLE and GPQA scores:
Models Table Rankings
Quotes about superintelligence
OpenAI CEO to British physicist David Deutsch, 24/Sep/2025:
Sam: ‘Einstein and general relativity… I think that’s one of the most beautiful things humanity has ever figured out. Maybe I would even say number one. And Einstein had a story, we knew what he was working on. If in a few years GPT-8 figured out quantum gravity and could tell you its story of how it did it and the problems that it was thinking about and why it decided to work on that, but it still just looked like a language model output, but it really did solve it, would you call… like, then would you (I appreciate that you keep a list of things you were wrong about. I do too!) Would that be enough to convince you?’
David: ‘I think it would, yes. It’s crucial here.’
Sam: ‘All right. I’ll take you up on that. I agree to that as the test.’
Meta CEO, 30/Jul/2025:
‘…we have begun to see glimpses of our AI systems improving themselves… As profound as the abundance produced by AI may one day be, an even more meaningful impact on our lives will likely come from everyone having a personal superintelligence that helps you achieve your goals, create what you want to see in the world, experience any adventure, be a better friend to those you care about, and grow to become the person you aspire to be.’
OpenAI CEO to the US Federal Reserve, 22/Jul/2025:
‘What if AI gets so smart that the President of the United States can’t do better than following ChatGPT-7’s recommendation, but can’t really understand it either? What if I can’t make any better decision about how to run OpenAI, and I just say, “You know what? I fully hand it over. ChatGPT-7, you’re in charge. Good luck.” That might be the right decision in any individual case. But it means that society has collectively transitioned a significant part of decision-making to this very powerful system that’s learning from us, improving with us, evolving with us, but in ways we don’t totally understand.’
Benjamin Mann (ex-OpenAI, founder Anthropic), 20/Jul/2025:
‘…if you just think about like 20 years in the future where we’re like way past the singularity, it’s hard for me to imagine that even capitalism will look at all like it looks today. Like if we do our jobs right, we will have safe, aligned superintelligence. We’ll have, as Dario says in Machines of Loving Grace, “a country of geniuses in the data center,” and the ability to accelerate positive change in science, technology, education, mathematics, like it’s going to be amazing. But that also means in a world of abundance where labor is almost free and anything you want to do, you can just ask an expert to do it for you, then what do jobs even look like? And so I guess there’s this like scary transition period from where we are today, where people have jobs and capitalism works, and the world of 20 years from now where everything is completely different.’
Alan D. Thompson, The Memo, 28/May/2025:
‘In mid-2025, I think it’s likely that we are already living through the early stages of the singularity, with new inventions and discoveries already being made by large language model systems in labs around the world. From 2025, it is reasonable to assume that novel solutions, inventions, concepts, materials, products, media, businesses, and ‘things’ are in fact being conceptualized by agentic large language model systems, even if the role of AI is deliberately obscured, and these breakthroughs are attributed to humans alone.’
OpenAI Chief Scientist, Jakub Pachocki, 12/May/2025:
‘I definitely believe we have significant evidence that the models are capable of discovering novel insights.’
Adam Unikowsky, 8+ Supreme Court wins, former clerk to Scalia, 17/Jun/2024:
‘Claude is fully capable of acting as a Supreme Court Justice right now…I frequently was more persuaded by Claude’s analysis than the Supreme Court’s… Claude works at least 5,000 times faster than humans do, while producing work of similar or better quality…’
Google DeepMind founder, Dr Demis Hassabis, 24/Feb/2024:
‘Suddenly the nature of money even changes… I don’t know if company constructs would even be the right thing to think about… We don’t want to have to wait till the eve before AGI happens… we should be preparing for that now.’
Pre-ASI model capabilities (achievements)
Alan’s UBI thought experiment
Text for indexing
The UBI pays out $10,000 per day, tax free. It comes with a simple set of post-ASI rules:
– You may spend all the money every day if you wish, but you can’t save it, and you can’t invest it.
– You can’t give it away, because effectively everyone else has it too.
– You can’t donate it, as ASI has resolved all previous philanthropic causes completely.
At the same time, ‘money’ has deflated significantly, meaning a loaf of bread is now about 2c, vehicle hire is 50c/hour, housing is $10/week lease (vehicle and property ‘purchasing’ is no longer a meaningful concept), and a 6-star overseas holiday can be had for $100/day.
1. What would you do?
2. What would your priorities be?
3. How would you structure a typical day?
4. What would a meaningful life look like?
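To get a feel for the numbers in the thought experiment, the prices above can be added up for one day. This is a rough sketch only: the payout and prices are taken from the text, while the `daily_cost` helper and the chosen quantities (2 loaves, 4 hours of vehicle hire, one holiday day) are assumptions picked purely for illustration.

```python
from fractions import Fraction

DAILY_PAYOUT = Fraction(10_000)        # $10,000 per day, from the text

# Post-ASI prices from the text, in dollars.
BREAD_PER_LOAF = Fraction(2, 100)      # about 2c per loaf
VEHICLE_PER_HOUR = Fraction(50, 100)   # 50c/hour vehicle hire
HOUSING_PER_DAY = Fraction(10, 7)      # $10/week lease, spread over 7 days
HOLIDAY_PER_DAY = Fraction(100)        # $100/day 6-star overseas holiday


def daily_cost(loaves: int, vehicle_hours: int, on_holiday: bool) -> Fraction:
    """Hypothetical helper: total spend for one day at the prices above."""
    cost = (loaves * BREAD_PER_LOAF
            + vehicle_hours * VEHICLE_PER_HOUR
            + HOUSING_PER_DAY)
    if on_holiday:
        cost += HOLIDAY_PER_DAY
    return cost


# Assumed quantities for one illustrative day.
cost = daily_cost(loaves=2, vehicle_hours=4, on_holiday=True)
share = cost / DAILY_PAYOUT
print(f"${float(cost):.2f} per day, {float(share):.2%} of the payout")
```

Under these assumed quantities, even a day spent on holiday uses only around 1% of the payout, which is why the questions above are about priorities and meaning rather than about budgeting.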
Alan’s ASI prompts
Or: Ask a better question, get a better answer. A problem well stated is already on the way to being solved…
Alan’s May/2025 ASI prompt (Grok 4 Heavy livestream)
Video
Jun/2025 (link)
Further reading
Further reading from The Memo
Although the scale of these models has increased dramatically, the technology behind them has not changed substantially since 2020, with GPT-3. Beginning with that model’s capabilities, it was clear what was happening, and I issued a raft of warnings internationally:
19/Sep/2021: ‘AI is outperforming humans in both IQ and creativity in 2021’ https://lifearchitect.ai/outperforming-humans/
20/Jul/2021: ‘AI fire alarm’ https://lifearchitect.ai/fire-alarm/
Jan/2022: ‘AI report submission to UN’ https://lifearchitect.ai/un/
25/Oct/2023: ‘Leaders guilty of negligence’ https://lifearchitect.ai/leaders-guilty-of-negligence/
2021–present: The Memo editions…
Here are a few ‘gold standard’ views of some of the logistics around ASI:
2017: Max Tegmark: https://lifearchitect.ai/agi-achieved-internally/
16/Mar/2021: Sam Altman: https://moores.samaltman.com/
~2022: Worldbuild winners (many): https://worldbuild.ai/winners/
17/May/2024: Anthropic: https://lifearchitect.ai/my-last-three-years-of-work/
3/Apr/2025: AI 2027: https://ai-2027.com/race
6/Nov/2025: Romeo@LessWrong: https://www.lesswrong.com/posts/yHvzscCiS7KbPkSzf/a-2032-takeoff-story
Alan D. Thompson is a world expert in artificial intelligence, advising everyone from Apple to the US Government on integrated AI. Throughout Mensa International’s history, both Isaac Asimov and Alan held leadership roles, each exploring the frontier between human and artificial minds. His landmark analysis of post-2020 AI, from his widely-cited Models Table to his regular intelligence briefing The Memo, has shaped how governments and Fortune 500s approach artificial intelligence. With popular tools like the Declaration on AI Consciousness, and the ASI checklist, Alan continues to illuminate humanity’s AI evolution. Technical highlights.
This page last updated: 5/Dec/2025. https://lifearchitect.ai/asi/