GPT-4: OpenAI. (PDF) Note: TBA.

Transformers (DeepMind): Phuong & Hutter. (2022). Formal Algorithms for Transformers. (PDF)

Gato (DeepMind): Reed et al. (2022). A Generalist Agent. (PDF)

Connecting LLMs to robots (Google): Ahn et al. (2022). Do As I Can, Not As I Say: Grounding Language In Robotic Affordances. (PDF)

Chinchilla scaling (DeepMind): Hoffmann et al. (2022). Training Compute-Optimal Large Language Models. (PDF)

PaLM: Pathways Language Model (Google Research): Chowdhery et al. (2022). PaLM: Scaling Language Modeling with Pathways. (PDF)

Google Pathways: An Exploration of the Pathways Architecture from PaLM to Parti

GPT-NeoX-20B (EleutherAI): Black et al. (2022). GPT-NeoX-20B: An Open-Source Autoregressive Language Model. (PDF)

MT-NLG (Microsoft/NVIDIA): Smith et al. (2022). Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model. (PDF)

LaMDA (Google): Thoppilan et al. (2022). LaMDA: Language Models for Dialog Applications. (PDF)

Fairseq (Meta AI): Artetxe et al. (2021). Efficient Large Scale Language Modeling with Mixtures of Experts. (PDF)

Gopher (DeepMind): Rae et al. (2021). Scaling Language Models: Methods, Analysis & Insights from Training Gopher. (PDF)

Yuan 1.0 (Inspur AI): Wu et al. (2021). Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning. (PDF)

Macaw (Allen AI/AI2): Tafjord & Clark. (2021). General-Purpose Question-Answering with Macaw. (PDF)

Jurassic-1 (AI21 Israel): Lieber et al. (2021). Jurassic-1: Technical Details and Evaluation. (PDF)

Blenderbot 2.0 (Facebook): Komeili et al. (2021). Internet-Augmented Dialogue Generation. (PDF)

Wudao 2.0 (BAAI): Zou & Tang et al. (2021). Controllable Generation from Pre-trained Language Models via Inverse Prompting. (Note: As of July 2021, this is the latest Wudao 2.0 paper, showing an extract of WDC-Text. Full paper TBA.) (PDF)

Wudao 1.0 (BAAI): Yuan & Tang et al. (2021). WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models. (PDF)

PanGu Alpha (Huawei): Zeng et al. (2021). PanGu-Alpha: Large-scale autoregressive pretrained Chinese language models with auto-parallel computation. (PDF)

The Pile v1 (EleutherAI): Gao et al. (2020). The Pile: An 800GB Dataset of Diverse Text for Language Modeling. EleutherAI. (PDF)

What’s in my AI? A Comprehensive Analysis of Datasets Used to Train GPT-1, GPT-2, GPT-3, GPT-NeoX-20B, Megatron-11B, MT-NLG, and Gopher

Common Crawl: Dodge et al. (2021). Documenting the English Colossal Clean Crawled Corpus. (PDF)

GPT-3 (OpenAI): Brown et al. (2020). Language Models are Few-Shot Learners. OpenAI. (PDF)
Note: This is the comprehensive 22/Jul/2020 arXiv preprint (75 pages, 6.8MB) with all sections and appendices. The final version released for the NeurIPS camera-ready deadline on 22/Oct/2020 (25 pages, 1.3MB; some sections removed, no appendices) is less comprehensive.

GPT-2 (OpenAI): Radford et al. (2019). Language Models are Unsupervised Multitask Learners. OpenAI. (PDF)

GPT-1 (OpenAI): Radford et al. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI. (PDF)

RoBERTa (Meta AI): Liu et al. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. (PDF)

BERT (Google): Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (PDF)

Fine-tuning (Howard): Howard & Ruder. (2018). Universal Language Model Fine-tuning for Text Classification. (PDF)

Transformer (Google): Vaswani et al. (2017). Attention Is All You Need. Google. (PDF)

AI Winter caused by scientists: Olazaran. (1996). A Sociological Study of the Official History of the Perceptrons Controversy. (PDF)

The Turing Test: Turing, A. M. (1950). Computing Machinery and Intelligence. Mind 59: 433-460. (PDF)

First steps in AI by Turing: Turing, A. M. (1941-1948).
Guinness, R. (2018). What is Artificial Intelligence? Part 2

Re-discovering ‘Intelligent machinery’ by Alan Turing

(1948). ‘Intelligent machinery’ by Alan Turing, prepared/typed by ‘Gabriel’


OpenAI 2022: Johnson, S. (2022). A.I. Is Mastering Language. Should We Trust What It Says? The New York Times Magazine. (PDF)

Inside OpenAI and Neuralink offices: Hao, K. (2020). The messy, secretive reality behind OpenAI’s bid to save the world. MIT Technology Review. (PDF)

Ethics and data quality guidance

Nick Bostrom: (2022). Propositions Concerning Digital Minds and Society. (PDF)

Aleph Alpha: Andrulis, J. (2022). Ethics and bias in generalizable AI. (PDF)

Societal impacts (Anthropic): Ganguli et al. (2022). Predictability and Surprise in Large Generative Models.
(See also Anthropic’s work on reverse engineering transformer language models)

Foundation models (GPT-3, Wudao 2.0… 211-page report with 114 authors via Stanford AI): Bommasani et al. (2021). On the Opportunities and Risks of Foundation Models. (PDF, large: 16MB)

Parrots: Bender et al. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (Note: Banned by Google.) (PDF)

GPT-3 quality: Strickland, E. (2021). OpenAI’s GPT-3 Speaks! (Kindly Disregard Toxic Language). IEEE. (PDF)

GPT-J quality: HN discussion (2021). A discussion about GPT-J, Books3 creation, and the exclusion of datasets like Literotica and the US Congressional Record… (PDF) (Original HN link)

GPT-4Chan: “…some dark corners of the web like 4Chan that are already sometimes unfortunately part of the pre-training of these large language models (maybe to try to remove them/mitigate them?).” – Clem Delangue, co-founder and CEO at Hugging Face.

See also my 2021 paper: Integrated AI: Dataset quality vs quantity via bonum (GPT-4 and beyond).

Sam Altman (2021). Moore’s Law for Everything. (PDF)

OpenAI to USPTO: AI IP (2019). Comment Regarding Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation. (PDF)

GPT-2 Ethics: Solaiman, I., et al. (2019). Release Strategies and the Social Impacts of Language Models. (PDF)

Animal rights (as potential guidance for AI rights): Cambridge. (2012). The Cambridge Declaration on Consciousness (CDC). (PDF)

Intergovernmental and governmental guidance

AI Ethics: UN/UNESCO (2021). Recommendation on the Ethics of Artificial Intelligence. (PDF)

AI + the UN Sustainable Development Goals: 2030Vision (2019). AI & The Sustainable Development Goals: The state of play. (PDF)

AI Ethics: WHO (2021). Ethics and governance of artificial intelligence for health. (PDF)

AI Ethics: European Commission (2019). Ethics Guidelines for Trustworthy AI. (PDF)

Australian Govt AI: (2021). Australia’s AI Action Plan: June 2021. (External PDF)

International AI Strategies: The team at hosts the most comprehensive list of all international AI strategies, from Australia to Vietnam. (External link)


Spinning Up: OpenAI (2022). Spinning Up Documentation Release. (Note: This is a study guide for learning about deep reinforcement learning.) (PDF)

Wudao usage agreement: BAAI (2021). Data Usage Agreement of Zhiyuan Enlightenment Platform. (Note: Translated to English.) (PDF)

