Connor Leahy (EleutherAI/GPT-J) interview transcripts

Connor Leahy is the co-founder of EleutherAI, and the creator of GPT-J. These transcripts are being archived for interest. The transcripts were generated by AI and are >90% accurate.

Connor/Christoph interview 2021

YouTube video: Connor Leahy (AI researcher at Aleph Alpha & – The Ultimate Interview
Published: 21/Jan/2021
By: Christoph Schuhmann
Featuring: Connor Leahy
Length: 3:24:50 (3h24m50s)


ai, humans, good, people, problems, gpt, smarter, rationality, years, model, expect, emotions, build, computers, world, hard, company, research, brain, thinking

0:00 Introduction
3:26 The role of AI (Artificial Intelligence) in our world today
6:32 What will AI look like in 1 or 2 decades?
9:17 What could human-level AI look like?
10:59 Connor explains the reasons he bases his forecast on
14:10 On the progress of AI – text generation abilities in the past few years
18:22 How Connor replicated GPT-2, OpenAI’s language model “too dangerous to release!”
25:08 Connor’s educational background
29:18 Do you need a degree to work in software engineering or machine learning?
30:34 What would you advise someone who is at the beginning and wants to work in AI or software engineering in the future
32:52 Connor talks about the company he currently works for, Aleph Alpha
38:12 Connor talks about what he does at Aleph Alpha 1
38:56 Where do AI startups in Europe get funding from?
40:18 How much money do machine learning / software engineers make in Europe?
42:18 How much do you work?
45:06 What are the steps for a startup to become a huge “AI Player”
47:05 Connor talks about what he does at Aleph Alpha 2
49:38 What’s the role of connecting people and project management in AI research compared to programming?
54:35 The importance of trusting in your abilities and selling yourself confidently in the IT-industry
56:57 Are 10 mediocre programmers better than 1 really good programmer?
59:25 What are the most important skills that you apply in your work?
1:01:45 What is the role of learning new things in your job?
1:05:52 What would you advise aspiring programmers who are not very good at social interactions?
1:08:59 What is Eleuther AI?
1:14:50 Which milestones have you already achieved with Eleuther, what is next and what will you do then?
1:17:40 What had been the biggest GPT-model Eleuther had trained so far?
1:19:02 When will the new GPT version (with 175 billion parameters) be ready?
1:25:32 GPT-3’s attention mechanism
1:29:20 Couldn’t we just make the model bigger and then use only sparse attention?
1:30:19 How will the release of GPT-3-like language models affect the IT-world & society?
1:32:35 What will be the implications of DALL-E?
1:37:43 On the ambition to replicate “Learning to Summarize with Human Feedback” (… )
1:39:54 On the pace of progress in AI
1:42:20 Teaching AI emotional intelligence
1:47:48 Connor’s predictions for the future
2:01:07 On “Paper-Clip-Maximizers” & Goodhart’s Law
2:09:41 Thoughts on “Human Compatible AI” from Stuart Russell
2:12:44 How AI could manipulate humans to do what it wants
2:14:04 Ideal superintelligent AI would look after humans like loving adults would look after their elderly parents
2:17:55 Christoph about positive views on human nature & the impacts of scarcity and abundance on it
2:24:45 Connor on the importance of deliberately implementing positive human values into AI
2:25:46 It is possible to build AI that loves humans
2:28:03 What sci-fi gets wrong about AI
2:31:24 AI could easily take over the world by being nice, useful and pleasant :)
2:36:26 We already have a superintelligence: The Economy
2:38:38 Connor talks about his past & his personal life
2:46:19 Finding meaning in life
2:49:08 AI Utopia
2:52:42 Really important questions in life
2:56:37 Rationality
3:02:37 How emotional programs influence our decisions
3:08:30 Mindfulness
3:14:50 Why don’t many more people think rigorously about huge topics like happiness, meaning and mortality
3:18:30 Personal development
3:21:06 If you want to be a hero, don’t let anyone tell you you can’t


Hello, and welcome to my channel. Today I’m here with Connor Leahy. Do I pronounce this right?



Yeah, basically. Everyone pronounces it slightly differently, so it’s fine.



So Connor, you’re a very exciting person. Please tell us a little bit more about you, what you do, and what excites you.



So my name is Connor Leahy, as he has already said. I am something of an independent AI researcher. My boss does like to say that, actually. So my day job is that I work for a German AI startup called Aleph Alpha, where I work as an AI researcher. But in my free time, what most people know me for is that I’m a founding member of a kind of loose collective of AI researchers called EleutherAI. We work on replicating and doing open source AI research. We’re currently well known for being the most visible public effort to reproduce a very large, very complicated type of AI called GPT-3. And yeah, I’m very excited by artificial intelligence for many reasons. In particular, I’m very interested in not just building AI, which is, of course, very interesting, but also in questions of what’s called AI alignment and AI safety: as soon as you build such a powerful AI system, how can we control it to do whatever we want it to do? How can we avoid it doing things that we might not want it to do?



So for a beginner, let’s say an 18-year-old high school student, how would you describe why artificial intelligence is so exciting to you?



Well, in many ways. So when I started, well, “started”, quote unquote, when I was very young, I knew I wanted to be a scientist. At only four years old, with, you know, no knowledge of anything I was doing, I would bang around my electronic things and be like, I’m gonna be a scientist. And that never went away. You know, I guess I was just too stupid to learn new things. So I just always wanted to be a scientist. And I always wanted to figure out how things work. I wanted to figure out how the most fundamental things work. So first, I thought, oh, well, we’re made out of biology, so I’m going to study biology. Then I thought, no, wait, biology is made out of chemistry, so I’m going to study chemistry. And then I thought, well, chemistry is actually made out of physics, so I’m gonna study physics. But then I realized, well, what’s even more fundamental to science than physics? Well, you have to have a brain, an intelligence, to even understand science. So in a way, intelligence is the basis of everything. Everything we care about, everything we know as humans, is based on our intelligence: the way we solve problems, the way we cure diseases, the way we make the world a better place, even just the way we have fun, the way we create art, it’s all based on intelligence. So it was pretty obvious to me that intelligence is the thing to study. And it turned out to be something I was pretty good at; it was something I just happened to enjoy working on, enjoy working with. And I think if you needed, like, a simple reason, I’d say I just see that every useful product is a product of intelligence to some degree. And with artificial intelligence, we’re going to make a lot more intelligence, which will make a lot more great things. We’re going to cure diseases, we’re going to create great art, we’re going to save the world. You know,



Many people think of science fiction movies when they hear artificial intelligence. But could you explain to amateurs, or just interested people, why AI is a real thing in our world today, and what the implications will be for our society within the next few decades?



So artificial intelligence is a very, very broad word, and it has a very long history. It’s been redefined many times since it was first defined back in the 50s by McCarthy and his group. Back in the day, you know, computers were much weaker. And there’s this famous, very funny story from the 1950s. So this is back when, you know, you partly had to, like, rewire computers by hand and stuff. When they first set down the term artificial intelligence, they said that they expected to solve recognizing objects in an image over one or two summers. That was their guess back then. That turned out to be very, very wrong. It was a lot harder than that. But very, very recently, I’d say starting around 2014, 2015, 2016, very, very significant progress has been made on problems such as image captioning and, you know, recognizing images and stuff, which is considered to be an AI problem. So it’s very hard to say what AI means, because different people have very different definitions of it. The way I would use the word AI is basically this: I differentiate between programming and artificial intelligence by saying, with programming, I set down exact rules of what my program does. You know, I solve the problem in my head. So, like, I want to sort a list, so I figure out how to sort a list and then write a program to do that. With AI, I know what I want to solve, but I don’t know how to solve it. For example, I don’t know how to take the pixels of an image and, you know, turn that into a description of what’s in the image. My brain can do that, but I can’t actually write down how my brain does that. I don’t actually know how my brain does that. So with modern artificial intelligence technologies, most of all deep learning, it’s become possible to solve many such tasks by just giving the system examples. Give it 100,000 images of different dogs and the names of the dog breeds.
And it will learn to recognize new dogs that it hasn’t seen previously, without humans actually explaining to it how that is done. So this is something that has always been kind of experimented on and has a very long history. But it’s really since, like, 2014, 2015, 2016 that this has become like a revolution. There’s been an extreme explosion in the applications and the uses of this technology, and you’ll find it everywhere. You know, it’s in your phones, it’s on your computers, it’s part of Google search. Everything uses AI nowadays. It’s not necessarily noticeable for other people, but it’s everywhere.
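The contrast drawn above, between writing down exact rules and learning from labelled examples, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two-number "features" and the dog data are made up, and a nearest-centroid classifier stands in for deep learning, which is not what this code implements.

```python
# Programming: the author already knows the procedure and writes every rule down.
def sort_list(values):
    """Insertion sort: each step is spelled out explicitly by the programmer."""
    result = []
    for v in values:
        i = len(result)
        while i > 0 and result[i - 1] > v:
            i -= 1
        result.insert(i, v)
    return result


# Learning from examples: no breed-specific rule is written by hand.
# The "model" is just the average feature vector per label (nearest centroid),
# a deliberately tiny stand-in for a learned model.
def train_centroids(examples):
    """examples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))


# Hypothetical toy data: (height in cm, weight in kg) for two dog breeds.
EXAMPLES = [
    ((25.0, 3.0), "chihuahua"),
    ((22.0, 2.5), "chihuahua"),
    ((60.0, 30.0), "labrador"),
    ((57.0, 28.0), "labrador"),
]

centroids = train_centroids(EXAMPLES)
print(sort_list([3, 1, 2]))             # rule-based result
print(predict(centroids, (24.0, 2.8)))  # label for a dog not in the examples
```

In the first function the rule is the program; in the second, changing the examples changes the behaviour without touching the code.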



At the moment, we have AI like Google Maps, detecting faces in pictures, adding smiles to pictures, and so on. But if the current trends in the development of AI continue, how could this look, from your perspective as an AI researcher? What could AI look like in one or two decades?



Well, I must preface this by saying that I am in the minority with the view I’m about to espouse. So I very much believe that, you know, AI is an exponential technology. So every year, it doesn’t just get, you know, two points better, but it gets like 10% better. And some of you may know the famous Einstein quote, you know, the greatest force in the universe is compound interest. I think this applies to AI. And I think we’re currently in a part of the curve where the explosion of capital is happening. So I personally believe that within the next, like, two decades or so, AI will probably completely replace humans. And there will be no tasks that a human can do that you can’t have an AI do just as well or better, and probably cheaper.



So, to break this down for amateurs, which tasks do you believe could be replaced by AI within 10, or maybe 20, years?



I expect, within 20 years, all of them. Just all of them. As I’m saying, I’m in the minority; not everyone believes this. There are many people who disagree with this. But most AI researchers I know think by 2100; I think like 70% would probably agree that by, you know, 2100, all tasks, without exception, would be replaceable by AI. It’s very hard to say, you know, what the curve looks like, what will be replaced before then and when. It’s very, very hard to say.



So to repeat this a little bit more slowly: you’re saying that we could replace drivers and cashiers, but not only them, also people like engineers, programmers, scientists,



Programmers, artists, everyone. There’ll be no tasks that a human... The human brain is not magic. It’s a very, very complicated machine; the human brain is the most complicated machine we know of. But I don’t expect that to stay that way. I expect that we’re very soon, very close to the ability to build systems that are just completely superior to the brain in every single possible way. There’s just nothing the brain can do that these systems can’t do better.



So if you had, like, a sim in your computer, like from the computer game The Sims, and we imagine that what you’re saying is true, in like 20 years, possibly 30 years, I don’t know, or 50 years, and we could make this sim as intelligent as a normal person, let’s say like a normal engineer, we could make this computer do all the things that a normal, or maybe an intelligent, person could do.



Yes. So it’s very important to say that I don’t expect these intelligences to be necessarily very human-like. I expect them to be able to do the tasks humans can do, but it doesn’t seem likely that they will be human in, like, ways that we’d necessarily recognize, unless we specifically make them that way. So they’ll probably have emotions, but they’ll probably be very different emotions from the ones humans have. They’ll probably, you know, see the world very differently, through different senses. They’ll have different concepts of, like, memory, and life and death and such will be very different if you’re a computer program that can just copy itself. So life for these beings will be extremely different from life for humans. It’s not going to be like a sim that literally talks like a human and looks like a human. We will probably build AIs that look like humans, because we like humans. So for, like, companions or customer support, you’ll probably have AIs that are built to seem human. But I expect most AI to not be very much like humans, but rather very, very different and far more intelligent. Like, I expect that most AI will be vastly more intelligent than humans.



Okay, if someone hears this perspective for the first time, he might say, oh, this sounds like science fiction to me. And, if he doesn’t know you, he could think, how could this be true? So could you make a case, an argument, for why your prognosis is realistic?



So most of the analysis is based on just a very simple, practical kind of analysis. So of course it sounds like science fiction, but just because something sounds like science fiction doesn’t mean it’s wrong. It’s a good hunch, though. When most people tell you something that sounds completely outrageous, it’s probably outrageous, it’s probably false. But at least for me, and for almost everyone working in this, not everyone, but a large number of people working in this field, if we just look at computers: Moore’s law is very famous, that every two years-ish, really almost exactly every two years, the number of transistors on a computer chip doubles. And there are several related laws about, like, the amount of compute you can buy for a constant dollar, and stuff like this. And humans are very bad at understanding exponentials. So the way exponentials work is that they start really small. For example, let’s say something doubles again and again. You start at one, then you have two, that’s not that much more; four, that’s not that much; eight, still not that much; 16, it’s not that much. But if you do like 30 iterations of that, you have just a ridiculously huge number. I don’t remember exactly what, like a 16 digit number or something; it’s a massive number after just 30 doublings or something. And this has happened with chips. You know, the first chips had like 500 transistors or whatever, 5000 or something. And the modern Nvidia graphics cards have like 50 billion on the die. And that’s, you know, within 15 years. And if we just extrapolate, just say, okay, let’s assume this keeps going, which it looks like it will. I mean, a lot of people keep crowing about the end of Moore’s law, but I just don’t see it; I think they’re wrong.
If we just draw a straight line, if we just look at how good our AIs are right now and how good they’re going to be tomorrow, we have to just admit that at some point, you know, maybe it won’t be in 20 years, maybe it’s going to take 100 years, maybe it’s gonna take 200, that’s uncertain, but at some point they’re going to cross the human line, if our systems just keep getting better, which seems to be the case. And it’s already the case now that computers completely beat us in many things, not just in, like, math and stuff, but also, you know, playing chess, playing Go, recognizing images. Actually, they’re superhuman. So if you let a human label a bunch of images, AI is actually, for certain images, not for all types of images, far better. Also, you know, speech recognition. In several of these things, in certain scenarios, they already are superhuman, and it’s only a matter of time until they, you know, outperform us on all tasks.
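The doubling argument above uses round, from-memory figures. As a rough check, the arithmetic can be worked through in Python. The chip numbers are the approximate ones mentioned in the conversation (about 5,000 transistors then, about 50 billion now) and the two-year doubling period is the Moore’s law figure quoted; they are assumptions for the calculation, not measured data.

```python
import math


def doubling_sequence(start, count):
    """Values of a quantity that starts at `start` and doubles `count` times."""
    return [start * 2 ** n for n in range(count + 1)]


def doublings_to_reach_digits(digits):
    """Smallest number of doublings of 1 that yields a number with `digits` digits."""
    n = 0
    while len(str(2 ** n)) < digits:
        n += 1
    return n


def doublings_between(start, target):
    """How many doublings it takes to grow from `start` to `target`."""
    return math.log2(target / start)


print(doubling_sequence(1, 4))        # [1, 2, 4, 8, 16]
print(2 ** 30)                        # 1073741824, a 10-digit number
print(doublings_to_reach_digits(16))  # 50 doublings for a 16-digit number

# Round figures from the conversation (assumed, not exact chip data).
n = doublings_between(5_000, 50_000_000_000)
years = n * 2  # assuming one doubling every two years
print(f"{n:.1f} doublings, roughly {years:.0f} years at one doubling per two years")
```

The qualitative point stands either way: a handful of doublings looks unremarkable, and a few dozen produce numbers in the billions and beyond.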



To get into a concrete example that means a lot to you, I’m sure: like five years ago, there were so-called language models, computer programs that were capable of learning from reading text, lots of text, millions of books or millions of pages, and that were trained to predict the next sentence within a story, or the next word within a sentence. And then they write stories or news articles or whatever. And I remember three or four years ago, I read some screenplays created by such language models, AI, on Facebook, in Facebook posts, and they were complete gibberish. They were grammatically correct, but completely meaningless. And it was extremely funny. But no meaning at all, no understanding at all. And then came GPT-1, and GPT-2 especially. And GPT-2 was a turning point, because suddenly, not reliably, but like 30 to 40% of the time, it was capable of writing fake news articles or short stories that actually sounded as if a bright human had written them. Could you tell us a little bit more about what you thought when you encountered this, what it meant to you, and what you then did as the next step?



So yeah, GPT-2 was a turning point for me as well. Before that, I was of the opinion that, you know, we’d hit the limit of our AI. Like, we can do a lot of cool things, but it’s probably going to stop now. Like, this technique is probably as far as it’s gonna take us, and we need something new. Like, we’re missing something. You know, the brain does a lot of things that our current systems don’t do. And I thought, well, maybe we need those things that the brain does, that feel super important. And GPT-2 was one of the first things that kind of showed me I might be wrong, that I was wrong, that our techniques can go much further than I expected them to. And so there’s a whole story about me replicating GPT-2 back in the day. You can Google my name if you’re interested in the story; it’s not super important for this right now. But that was a big turning point for me. And I started to take this very seriously. Like, these things I’d been reading about, like AI progress and, you know, AI becoming smarter than humans, you know, the world, it never felt real. GPT-2 was one of the first moments where it started to feel like this might actually be happening. And even more important for me was the successor to GPT-2, which is GPT-3. So GPT-3 was released, and it is mind-blowing. It blew my mind. It was one year after GPT-2, so a tiny amount of time when it comes to research. Just one year later, it came out. And it was extremely stupid. They just took the same thing and just made it 100 times bigger. They didn’t do any new science, no new research, no new magic algorithm. They just made it much bigger, just put lots and lots of computers on it. That’s all they did. And it can honestly write blog posts better than I can. You know, it can write essays, it can write comedy, it can write, you know, philosophy, it can do math. It’s incredible.
It’s truly incredible. And there are plenty of people that say, oh, it isn’t that great, look, it made a silly mistake here, whatever. That’s missing the point. The point is that within one year, the difference between GPT-2 and GPT-3 is massive. It’s, you know, like the difference between a two year old and a five year old when it comes to speaking. You know, it’s incredible. And if this continues, you know, with this exponential curve, GPT-4, 5, 6 may very well be smarter than an adult.



And when you heard about GPT-2, you began to replicate it. And that’s very interesting, because already when GPT-2 was there, almost everyone’s first response was, oh my God, you need so much compute to replicate this, you need like $40,000 worth of cloud computing or whatever. But you, back then a bachelor student, you had no fear and you began to replicate OpenAI’s model that’s “too dangerous to release”.



Yeah, that’s a long story. I kind of wanted to skip over that because it’s long. So I could tell the story if you want to hear the whole story, but it’s a long story. But yeah, basically, OpenAI only released a small version of their model, because they were scared the large model might be too dangerous. You might use it to create fake news, to spam, to, like, you know, do something evil or something. I was very unconvinced by this. I found their argument very silly. And I wrote a very, very, very large blog post about why I think it’s silly. And basically what happened is that it was a bunch of coincidences. So basically I was accepted into, like, this research program from Google. It’s called the TensorFlow Research Cloud. It’s actually very easy to get into, you just have to, like, email them. It’s very easy to get accepted to it, and they give you a lot of free access to their hardware for training AI. And I was accepted to that, and they gave me a very generous amount of compute. And I said, well, screw it, you know, it’s summer vacation, I’ll take two weeks off, you know, just sit down at university and just try to program this myself and see if I can do it. So those two weeks were the best two weeks that I ever spent in my entire life. If I could give anyone advice: do cool projects. That’s gonna pay off more than anything else you’re gonna do in this business, or anywhere, really. If there’s something really cool you want to do, just sit down and just do it. You know, even if you fail, you’re going to learn so much more than you would from any textbook. I mean, nothing to say against school. But, like, the real things you learn are when you actually do something, when you actually sit down and you do a project. It doesn’t matter what the project is, it just has to be something cool.
So I sat down and coded for, you know, two weeks, every single day. You know, it was the same rhythm: I got up, had breakfast, went to uni, and stayed there until the sun went down. And I had no idea what I was doing. But I quickly learned what I needed to learn about GPT and replicated it, and eventually got the code working. But unfortunately, it turned out it wouldn’t work, because the compute that I needed was just so large that even Google just, you know, didn’t give me that. So I was kind of like, you know, well, dammit, that was a shame. But you know, it was a good experience. So I emailed the Google people, because they asked me, like, hey, what are you working on? Did you run into any bugs, anything we can help you with? So they’re very nice. And I told them, I listed all the bugs I ran into. And I said, like, yeah, unfortunately, I was trying to replicate the thing, but the computers you gave me just weren’t big enough. I need, like, one of the really big ones. Three days later, they sent me an email: oh, we can get you one of those. And the rest is history. They were like, there you go. And I trained it.



What are we talking about here? How much compute did they give you?



I had access to 512 TPU cores. One of those is about equivalent to one graphics card. So yeah, 512, roughly, for a week was about the compute I used.



So for people who don’t know so much, it’s like 512 Nvidia 2080 Ti. So



Yeah, roughly. So I don’t remember the exact cost calculation, but if I had rented that, it would have been like 50,000 to $100,000. So yeah, Google has been very, very generous with their GPU computing power for independent researchers. And I’m very thankful to them; like, I’ve met the people in charge of this. So this blew up on social media. You know, I posted about this, I posted why I think it should be released, why I think OpenAI was wrong. But I did something that I would very much recommend other people do as well, which is that I prefaced it by saying, I might be wrong, and I want people to convince me that I might be wrong. Because maybe there was something I was missing. Maybe OpenAI actually had a really good argument, and I was just being stupid. So before just releasing it, I released a very long blog post and I said, please, if someone thinks this is stupid, tell me. I want to do the right thing. You know, maybe I was wrong. I think it’s very important to be humble in many situations in life. So this got me to talk with some of the people that did GPT-2, some of the people at OpenAI, and some other very interesting people. I still stand by what I said. I still think the arguments that OpenAI made were rather silly. I still don’t think they’re correct. But I was convinced, eventually, not to release by a different researcher from a different organization, who convinced me by basically saying that even if this model isn’t as dangerous as they said it is, it’s good that they were careful. And it’s not a good thing that someone like me should, like, make fun of them and, like, humiliate them publicly for being careful. Maybe they’re being too careful, maybe they were being pretty silly here. But it’s sort of an asshole thing to do, you know, to try to make fun of and humiliate someone who’s just trying to be careful. So I decided not to release, out of solidarity, even if my model turned out to actually not be as good or as dangerous as theirs.
I decided not to release because I don’t want to support this kind of idea that it’s okay to, like, make fun of people that are trying to be careful, because I think AI can be very dangerous. I think people don’t take that seriously enough. And I didn’t want to be part of the problem. Even though maybe I was part of the problem in many ways.



Did you reimplement GPT-2’s code completely from scratch? Or did you take the code base that they had released before?



I used most of their code, but their code was stripped down. Like, they had released part of the code, but I would say the more difficult parts were not released, so I had to reimplement those.



And after you did that with GPT-2, you finished your undergrad, right? And then, yeah,



I didn’t. I did not; I dropped out.



Oh, you dropped out. So tell me about this. Why did you begin to study? And how did you come to the decision to drop out?



So, if you’re a, you know, passionate, interested person, you know, kind of maybe a little nerdy, but you’re into, you know, math or programming or whatever, go to university. Especially if you’re in a country where it’s free, it’s a wonderful experience, not necessarily just because of what you learn, but more because of the kind of people you meet. So I loved university, I think, because university surrounds you with people that are like you. You know, you get to meet these wonderful, smart, ambitious people that are your age. You know, they’re also looking for friends. Everyone’s from all across the country, they don’t really know anyone, so you make lots of friends really quickly. So I went to university. I was very sick for a very long time. I had a very serious illness for, like, four years. And when I got better, which was very lucky, I immediately went to university, because I wanted to go back to life, people, you know. I love programming and such. I must admit that I did not learn very much in university, because I knew so much ahead of time. So I already knew programming very well when I started university. It did teach me math. It forced me to learn math, which is really good. Math is hard, but it’s very, very, very useful. Math is very, very, very powerful. Like, in school, I feel I was taught math very wrong. I was very bad at math at school; I almost failed because of math. It was one of my worst things. But when I got to university, I found a love for math. I found math actually beautiful and super useful. It’s like magic, you know. There are so many powerful things you can do with math if you learn about it properly. So I got that from university, and that was really wonderful. So university was probably some of the best times of my life. It was fantastic. You know, you meet lots of cool friends, you learn, like, cool stuff. It was also very, very hard. You know, university is not easy.
Those math classes, oh man, those weren’t easy. But I learned very valuable lessons. You know, I got into contact with many great people and such. But yeah, after I replicated GPT-2, I got some amount of notoriety. You know, I met lots of cool people, including people that offered me a job. So I was offered this job that I still have today at Aleph Alpha, which is a startup. And I really like the team, I really love the people there. It’s a wonderful company. And I kind of said, I want to finish my degree, so I’m just gonna do a few hours a week and such. And then during quarantine, so the quarantine hit pretty hard, as it did most people, I’m sure, over 2020, but eventually, in like July, I think, I founded EleutherAI, which is the loose research group that I’m a part of, to replicate GPT-3 and do other cool projects. And we were very successful. We got a lot of people interested. You know, we released a paper, we got a lot of support from our different donors and different people that want to help us. We were contacted by, like, people from industry and academia who want to cooperate with us and such. It’s just been such a wild success, like, beyond my wildest dreams. It’s been so successful. And between that and my job at Aleph Alpha going extremely well, you know, I figured out a role that works for me, that I really enjoy and I’m good at, and that pays me pretty well. I just decided that there’s nothing left for me at university. It was just a distraction. I was just so successful in my work at Eleuther and at Aleph Alpha that I just decided I don’t need this. I’m already successful. You know, one of the great things about the tech industry is, the tech industry is not perfect, of course not, but it still is somewhat meritocratic. If you’re good, if you’re a smart guy and you do cool projects, no one cares if you have a degree or not. No one cares.
It’s useful, especially if you want to work in industry. Like, if you just want to, you know, lay back and get a nice comfy software engineering job somewhere like Google, yeah, you should probably get, you know, a master’s degree or something. Or if you want to do, like, academic research or become a professor, yeah, you need a PhD for that. But if you’re ambitious, if you’re clever, if you work hard and get a little bit lucky, this is a great industry where you can get very far even without a degree.



It’s very funny because almost all the friends I have who are software engineers dropped out of university. It’s a common trope. Yes. So Milan, whom I interviewed last week, was also at the Technical University of Munich; you were in the same year. And he got out. And he was always a computer geek. And I have some other examples. So to summarize, if you were to give advice to someone who’s at the beginning of their career, like after high school, or maybe they just finished other studies in something completely different, and they want to go into programming or AI, what would you advise?



I’ll put it this way. If you’re unbelievably motivated, you’ve already done a ton of projects, you already know all the programming languages back and forth, you work really hard, you’re super ambitious, or whatever, maybe you don’t need to go to university. But for most people, I would recommend: go to university, at least take a semester to just dip your toe in, see what kind of people you meet, try to meet people. That’s a very big piece of advice I can give you: the world is made of people. Knowing people, meeting people, getting along with people is just fundamentally important, no matter what you do in life. And university is a wonderful place to find the kind of people that will be like you, that will like similar things to you, that are ambitious. One piece of advice I definitely give people is the same piece of advice I got when I started university: come back after Christmas. This might seem a little silly. But especially in the technical fields like computer science, there’s a very common phenomenon where the first semester is the hardest semester. It’s often very, very hard, and it’s soul crushing. And then people go back home for Christmas vacation, and they just don’t come back, because they’re too depressed, they feel too stupid. That’s normal. It happens to all of us. It happened to me, it’s gonna happen to you. Don’t feel bad about it. If it turns out to be the wrong thing, that’s fine, too. But if you’re someone who is seriously considering this, you know, they like computers, they like technology and stuff, they like AI, that’s the first step I would take. And then just take it in stride. If you find something greater and better and want to drop out, cool. If you don’t, also cool.



So before we get a little bit more into EleutherAI and your current efforts to replicate GPT-3 and all this stuff: the company you’re currently working for to make your money, could you tell us about what you’re doing there? And, yeah, how all this came into being?



So we’re still a small company, we’re still a young company and a startup. So things change around very quickly, you know, who does what; everyone does a little bit of everything. So there are a lot of things happening. But basically, we are hoping to do AGI from Europe. Europe is, unfortunately, when it comes to AI, very, very far behind other countries or other continents. It’s not because we don’t have smart people here. Europe has some of the best universities and some of the smartest people anywhere. But because of many complicated reasons about history and finances, and, you know, Silicon Valley being such an attractor, most of the top AI talent and most of the top tech talent leaves Europe for other countries, the US in particular. And this is a shame. We like Europe, we think Europe is a great place. I think Germany is a great place, we really like Germany. And so we hope to kind of leverage the strengths that we have here to create really high-level AI in a European sort of context. Currently, we are still kind of finding our way. We’re doing a lot of research internally, and we have some projects with some big clients, you know, government and corporate clients, which kind of pays the bills right now. We recently closed another large funding round, so we’ve got quite a bit of money right now. So we’re very much expanding. We’re looking to hire more people, to build bigger computers, to kind of build capacity in Europe. And I’m really excited about this company. You’ll notice this if any of you guys go to university and study computer science: you get a lot of offers from startups. It happens a lot.
So everyone has a startup. Everyone is like, oh, I’m gonna do, you know, Wi-Fi connected water bottles, trust me, man, it’s going to be great. And most of them are just nonsense. It’s just wasting your time. But I really believe in this company. I really believe the team is great and that we’re in the right place at the right time. So my role has kind of shifted over the year or two I’ve been here; it has shifted multiple times. But basically, I’m one of the main researchers. I have a pretty clear goal of what I think we need to research and what problems need solving. And I also do a lot of, not PR, but I write blog posts, I kind of help shape the image of what our company is about, like who we want to hire and who we don’t want to hire. So it’s a bit of a manager role, a bit of a leadership position. In small companies, everyone does a little bit of everything. And yeah, I also manage some of the contact with Eleuther. Of course, our boss knows I do Eleuther, and he is a big fan of Eleuther and the kind of stuff we do there. So he also kind of allows me or encourages me to work with Eleuther to make cool things that we as a company need anyway. Eleuther does a lot of cool open source work that we can also use at our company.



So could you tell us a little bit more about the long term goal of the company and about the business model, how they plan to create revenue?



So the long term goal is kind of, I mean, to be ambitious here, to be the number one AI lab in Europe. You know, the place where, if you don’t go to Google, you come to Aleph Alpha, that kind of thing. We want to be the OpenAI of Europe. We want to attract really high-level engineering and research talent to create, explicitly, the next generation of AI, AGI. We are very much focused not on image captioning and voice recognition, but on complex reinforcement learning, unsupervised, world model, future-facing systems that don’t yet work, but that I think will work in five years. We want to build the capacity so that when this technology really takes off, we’re at the top, we’re at the front, we’re at the cutting edge of the technology. Currently, our revenue stream is mostly that we do exploratory projects and AI work for large companies and government. We have quite a few government contracts, and I’m not at liberty to speak on all of them, because, you know, they’re still in development and whatever. But we’ve done some pretty cool things there. What I’m most excited about is that we’re kind of finding our right thing. And long term, our business model is going to be much like OpenAI and Google and stuff. So we will offer AI services, we will offer AI models, hopefully the best in the world, that people can use. We’re not sure how revenue will work: will there be a subscription, will we make custom models for different clients and deploy them internally? That’s all up in the air. We’re still too young to know that exactly.



Are there already things or projects that you have finished with your company?



Me personally? No. As I say, I’m in more of a research role. I suppose I contribute to some of the more concrete roles that have, you know, outputs. But that’s not exactly what I work on. Most of my work is more conceptual. I work at a higher level, more like organizing research, pointing things in different directions. I also network a lot and meet people. I get people interested in the company, you know, people that I meet over Eleuther and stuff. So I have more of a high-level role at the moment. I don’t really deliver products, I don’t really directly work on products. That’s how I would describe it, at least at the moment.



And I don’t know if you’re able to answer this question, but I’m curious: where does a startup for AGI in Europe get funding from? Is it like a wealthy person? Or is it government funded, or a VC fund?



It’s VC funding for the most part, so it’s mostly venture capital. There is venture capital in Europe. There’s not nearly as much as in, say, Silicon Valley, but it does exist. Most of this goes through my boss. I am still an employee, I’m not a founder, so I don’t know exactly all the details. But for the most part, there seems to be a very large appetite right now for high-level AI work in Europe that’s just not filled. There just seems to be this void that is currently not being filled. And so we’ve actually been quite successful with eliciting VC funding so far. People have been very excited to invest in us, and I expect that if we can pull off some of the technology that we are working on right now, it will go very well.



So, for someone who wants to get into the field of software engineering, or more concretely machine learning and AI: I won’t ask you for your salary, but as far as you’re able to say, how much money could someone expect to make at the beginning, and after a few years, here in Europe?



I am not the right person to ask this question, because I have a very atypical career. My salary is complicated and involves stock options and stuff like that. And I don’t know many other people who tell me their salaries. So I would recommend looking at other sources. But generally, a few years ago, machine learning engineers tended to make more money than software engineers; I think that’s been kind of evening out lately. Generally, I very much recommend software engineering and machine learning engineering to people who want to make a good salary. You make a pretty good living, usually. So I know even beginners usually make like 30,000 to 40,000, at least, to start, and it can be more if they’ve shown progress, and it can rise to, you know, like 50,000 to 60,000 in a few years, if you have something to show for it, if you have a master’s degree in particular, or something that you can show. But this depends very, very strongly on where you work and who you work for. Working at a mid-range company, or some traditional company like BMW or whatever, is very different from working at, say, Google. There are very different demands, it’s very hard to get into, and there are very different salaries. You should not compare yourself to people in San Francisco and Silicon Valley. Those people will make twice as much as you. That’s why people go there: if you go to San Francisco, you will make two to three times as much as you do in Germany. But you will also pay two to three times as much to live there.



I’ve heard that in the US software engineers tend to work even more. In the last interview, Milan told me that it’s not unusual for companies to expect people to work 10 or 12 hours and then go with the team to a club for an after-work or whatever. How much do you work?



I try. Is my boss listening? Well, generally, I would say you’re lucky that most software engineers are nerds, so there’s usually no club problem. So officially, the contract says 32 hours. I don’t have a full 40-hour load. That’s by design, I chose to do that. And the thing with software engineering is that it’s one of those things where there’s a huge difference between the average software engineer and the top software engineer. The top software engineers are crazy people. They’re insane, lunatic, wonderful, brilliant people that, you know, even if you tell them, hey, stop working, are just going to be awake at three in the morning, hacking away at their code, just like, you know, I wish I could fix this, wait, wait, hold on. We have a lot of people like those in our company. So I would really differentiate working, for example, in a startup from working in a corporation. Working in a large corporation, you will have, you know, your nine-to-five job, and afterwards you can go home. Those jobs are, excuse me to people working in corporate, not very challenging. They can be annoying, and it’s work, of course, but it’s a pretty comfy job. I worked in corporate for a very short amount of time, and it was very comfortable. I had to do very little work, and I got paid very well, like very, very well. I actually get paid less currently at my startup, but I do it because I believe in this company, and I love the work. But it’s very intense work. Working in a startup is very, very, very intense. If you work in a startup, if you work in research, if you work on these things, you never stop working. You’re working all the time. You know, when you’re not sleeping, when you’re not coding directly, you’re reading a book, you’re reading a paper, you’re doing some research, you’re sending some emails. You’re always doing something.
So if you only count the time that I’m at the computer, talking to my colleagues, you know, making a thing, I don’t even work 32 hours a week. But if you add in all the hours and hours I spend every day reading research, talking to people, meeting people, planning research, scribbling down notes and figuring out how to do it, it’s way more than 32 hours. But I like it. You know, it doesn’t feel like work because I love it.



So, with the company, what do you think are the steps towards becoming a huge AGI player? I mean, what do you and your colleagues have to do to make this happen?



Be the best in the world, obviously. It’s complicated, of course. But I have very strong beliefs about what the future is going to look like, what technology is going to work, which ones aren’t going to work, what research needs to be done. And a lot of it also comes down to hardware. I think that supercomputers are going to be extremely necessary in the future. So we spent a large amount of the money we raised on hardware, on building supercomputers. And of course, on hiring great people. Our company kind of specializes in finding underutilized talent. Almost no one in our company has a standard education, no one has a PhD. They’re like weirdos and dropouts or whatever, but they’re really, really great. They’re really hard-working, really smart programmers that just weren’t picked up by other companies. Because in Europe, it’s still, it’s better than in other industries, but it’s not as good as in Silicon Valley. You know, in Silicon Valley, you can just drop out of a car without a shirt, and if you can program, you get a job. I mean, at least it used to be that way. I don’t know if it’s still that way. But we specialize kind of in finding, you know, interesting people. One of my colleagues has a Bachelor of Arts in photography. So he was a photographer, and eventually decided, this AI stuff is cool, I’m gonna start learning that. And he spent like a year or two learning with us, or learning by himself, and he joined Eleuther. And he’s great. He’s one of the best programmers I know. You know, he stays up until three in the morning like, oh wait, I gotta fix this, I’m going to sleep, just a moment, wait, I’ll sleep later, it’s fine. And he does really, really cool stuff. And that’s the kind of people we have at our company. That’s why I love working here.



Your job? What concretely are you doing? Like when you get into work? What are the concrete steps that



crazy AI does? As I say, that just does not apply to my work. I mean, first of all, I don’t go to work because of quarantine. So I haven’t actually been to work in a very long time. It’s all from home, it’s all digital right now. And I’m not advocating that this is necessarily a good way, but I just have a strange work schedule. You know, I do like two hours of work in the morning, then I go to sleep for a few hours, and then, you know, have food. And then at 8pm, after dinner, I sit down and work for like four hours. And the concrete steps really vary. I don’t have a clear schedule. There’s not like a this-day-do-this. It’s just more like, what are we doing? Do we have anything cool to do? I’m always reading. One hour, I’m reading some research that I think is good. The other hour, I’m writing a Google Doc about what research I think we need to do. And then another hour, I’m in a meeting discussing with my colleagues how things are going and how we move forward. And other times, I jump into a call with one of the people that work with me, and I talk to them about what their progress is, and we try to figure out what went wrong, or what we can do better, how we can better their project. So currently I directly supervise one person and their project, and kind of indirectly I’m an advisor overall to projects. It’s lots of talking to people. If you don’t like to talk to people, my advice is learn to like it, because you’ll need to. You can’t really avoid talking to people in this industry. I mean, it’s definitely possible, it’s one of the few industries where you can do that. But I don’t recommend it. It’s gonna cost you a lot of opportunities.
So I talk to a lot of people. I write a lot; I enjoy writing, like my blog posts and stuff like that. So I often do that. And I do very little programming, actually. It’s just turned out that way. I used to do a lot of programming. But it just turns out that I’m actually better at this kind of organizing other people, you know, pointing people in different directions, and that kind of stuff, than I am at coding. I still enjoy coding a lot, but it’s not what I do every day.



That’s interesting, because I was going to ask you, what’s the importance of connecting people doing project management and exciting people compared to actually programming stuff?



I mean, so that’s kind of what happened to me. I started out thinking I didn’t like, you know, organizing people. I thought, I’m a very disorganized person. My room is dirty, there’s stuff everywhere, I sleep too long, I forget meetings. This stuff happens all the time. I’m a very disorganized person. But one of the things I found out, if I can give an extra piece of advice here: you think you know what you want, but you don’t know what you want. People often think, oh, I’m just a person that doesn’t like X. You don’t know that until you’ve tried. This is serious advice. I did this wrong when I was younger, this was a big mistake I made. I thought I knew who I was. I thought I knew, I’m good at this, and I’m not good at this. But actually, I never really tried that seriously. I never was actually exposed to it, never got a chance to try it. And then when I actually did, I found that I was actually a very different person than I thought I was. What I’m good at is actually very different from what I expected. I expected to be like, I’m gonna be a super programmer, I’m gonna write the smartest program in the world. You know, it turns out that’s not what I’m best at. I’m good at it, and it is a big help that I’m good at it, but it’s not what I’m best at. What I’m best at is, to a large degree, not really necessarily managing, not so much administrative stuff. I like meeting people, setting a research direction, synthesizing information. I read hours every day. Every single day, I read multiple hours of just papers and research and books and stuff, and it all just goes, like, whoosh, into my head. So I often have a really good overview of very, very many topics. And so if people ask me a question, even if I don’t know the answer, I know where to look.
I can tell them, oh, look it up there, this is where you’ll find the answer to what you need to know. And that can be very useful for people like my colleague, my ex-photographer friend, who is a fantastic programmer. He’s a way better programmer than me. He’s one of those types of people who are just really good at programming; he should program. I work to make sure no one disturbs him while he programs. I’m kind of like, you know, annoy me, talk to me, I’ll organize. Don’t bother him, let him program. It’s way more productive to let him program than to let me program. So those are two different viable options. And they’re often encapsulated by this idea of engineer versus product manager, or engineer versus CTO, whatever. I mean, the titles vary. I think that, at least for me, the way I like meeting people and talking to people was a big benefit, because a lot of people in this industry are, you know, sort of introverted or nerdy, which I am too, but I also like people. I’m a lot more extroverted than most people I work with, I guess. And that’s been a benefit to me. But obviously, if you’re a nerdy, shy person who just wants to work on code all day, this is also a very viable option, and a very lucrative option. One way you can tell if a company is good or not is whether the managers and the product managers can code or not. Are they engineers that became product managers, or were they hired as managers? If they were hired as managers, it’s probably a bad company, because the best managers are the ones that could be engineers but decided not to, in my opinion. So that’s kind of where I started. I started as an engineer, I wanted to be an engineer, and I found out, hey, I’m an OK engineer, but I think I’m actually a better, you know, product manager or whatever.
And then I just kind of naturally transitioned to this role. It wasn’t like a choice where I said, I’m going to do this. It just kind of naturally happened. And it’s working pretty well for me so far. And this is a pretty common thing. Something some people complain about is that product managers and managers often make more money than engineers and often don’t do work that is as technically difficult. I think that’s a fair criticism. I think it really depends on the company. Being a product manager isn’t easy. Leading people, or, you know, having a vision or setting a vision, is very, very hard. I think managers and CEOs get a bad rap. I think being a CEO is really, really hard, and people don’t always give them their credit. Sometimes people give them too much credit, because sometimes it’s not hard, but often it’s really hard. It really just depends. I think these are just different roles for different types of people.



Yeah, I think that’s a very good and valid point, because I also had the impression that if you go to a technical university or to a software company, you have many people who are very smart, far smarter than me, possibly far smarter than you, I don’t know. These people are so smart, but they are narrowly immersed in singular problems. They focus on this problem, and they don’t talk so much to each other. And, yeah, one evening, like eight years ago, I was at a friend’s place; he had invited me to play a board game. He was a software developer at a company. And there was one of his coworkers, who was also working at this company. He was a student and was working there as a programmer, and basically did the same job. But later, after talking to his colleague, I learned that the student coworker got 15 euros per hour as a student worker. And the other friend that I had, he was his superior, but he basically did the same job, and he got 60,000 per year plus a car.



And it’s worth mentioning that students are often super exploited. Like student jobs in Germany and stuff, they get shit pay, despite some of them being very, very good workers. I think it’s bullshit. Just worth warning people about that.



And during that board game, I discovered that the student didn’t think so highly of himself. He was always insecure. It was a strategy board game, and I could always talk him into doing what I wanted him to do. And, yeah, I have this impression that there are so many very smart people who just are not good at connecting to others. And if you had, let’s say, 100 really good computers, but they’re not connected to each other and cannot exchange information, it would possibly be better to have like 10 mediocre computers, well connected, than one super good computer not connected.



So that’s actually a very complicated question, and people have very different opinions here. This is, I guess, a philosophical question. Some people ask, are 10 mediocre programmers as good as one really good programmer? I actually believe that one really good programmer is better than any number of mediocre programmers, like almost any number. One of the reasons so much corporate software is bad is because it’s designed to make use of thousands of pretty bad programmers. It’s designed by committee, you have lots of oversight and managers of managers of managers and stuff. That’s why so much software that you see from big companies is terrible. The best software is made by small teams of extremely smart, dedicated people. 10 people is close to the limit, I think, of what an ideal team size is. It probably totally depends on the personality type of the people involved, and the managers and how they coordinate among each other. Programming is an art. I think it’s something people sometimes forget: engineering is an art too. Everywhere, people have styles, people have tastes, people have, you know, inspiration. No process is the same everywhere.



I think it depends very much on what your goal is. If you’re delivering standard software, like web pages, or games that are not really complex, then a huge network of mediocre programmers would probably do the job more reliably. But if you need creative breakthroughs, like research, or really cool products that are the first of their kind, then very smart, very bright computer geniuses are probably the better choice.



Yeah, I mean, I guess I’m just biased because that’s the kind of field I work in. The people I work with are those kinds of people. You know, I’m an unusual person, I have unusual talents, skills or whatever, and I’m not always easy to work with. You know, I wake up at midday and I’m awake until like three or whatever. So there are a lot of weird people in programming, and that’s one of the strengths of it, I think. There are weird, brilliant people, many that are very much smarter than me. And yeah, I think they can flourish in this field to a large degree, which I think is a really good thing.



So what would you say are the most important skills and, yeah, abilities that you apply on a day-to-day basis?



It’s the ability to think logically about a problem and structure it into smaller parts. That’s the number one skill any programmer needs. The hardest thing about programming has nothing to do with programming itself, like the language and the machine and all these things. Those are really not that important, actually. What’s really important is understanding: what is the problem I’m trying to solve, and how can I solve it? So if I go to you and I tell you, I need a website that does X, Y, and Z, you have to sit down and say, what does he actually mean by that? He says X, but that’s made of several different parts. I need to first implement this, and this, and make sure that this doesn’t happen. So then you have to take those things apart into smaller and smaller bits, and eventually find a bit that’s so small that you can build it. And then you start building those, and you start putting them together. So I think everyone in the world should learn programming, not to do programming, because I think most people probably wouldn’t enjoy programming, but to teach them this ability to think logically about problems and take problems apart into smaller pieces. This is the most essential skill you can have. We talk with normal, natural language, and in programming, you learn that normal language is extremely imprecise. It feels precise to us. When I tell you to do something, it seems that I’m really being very specific about what I want. But once you start programming a robot to do that, you’re going to notice how not obvious any of it is, and how extremely complicated even the simplest of tasks to us humans actually are to do right. And how easily, you know, even what seems to be a small error could just cause the weirdest things to happen.
Because computers are just so literal. It’s a very, very powerful ability to train your brain to think about things in structured ways. I’d say everything else, you know, like not giving up and having fun, being curious, all these things are secondary to this ability to think structurally about problems.



What would you say is the role of learning new things and new knowledge in your job?



All the time. It’s basically my job. Even in a standard education, with like the three years you spend in college, by the time you’re done, half of it’s out of date. Software moves incredibly fast, and you’re always catching up, especially in research. The amount of research that is published in AI is mind-boggling, it’s unbelievable. No one can keep up, it is so much. So you have to be learning all the time. If you work at some big company that uses some old software from the 90s or something that they never upgraded, yeah, maybe not as much. But if you want to be a good software engineer, or a good, you know, person working in this field, especially in AI, you have to learn all the time. You have to enjoy learning. I think one of the things I’m most lucky to have, genetically, is not necessarily that I’m super smart. I’m not the smartest person I know, I’m really not. But I’m extremely curious. I’ve always been just incredibly curious. You know, I’ll be reading stuff all day. I’ll look up something and I’ll be like, huh, how do jellyfish move? And, you know, fucking Google that. And it doesn’t have to be technical, it’s everything. I’m a reader. I only read nonfiction. I got bored of fiction, I don’t read fantasy anymore, because it’s boring. I’d rather read science books. And that’s been a really huge help to my career: I just know lots about everything. And I am also good at remembering facts. That is why I’m so good at kind of pointing people in the right direction: I have this overview of lots of different types of sciences. And because I’ve spent years studying that, I’m pretty good at giving summaries of what needs to be done and where the problem is, or boiling it down for people. That’s something I’m particularly good at.



When many people think about their career choice, they think: okay, I’m going to go to university or get an education somehow, and at some point I will have learned enough, and I can go to work, just apply it, and go home and enjoy the money that I have. And what you’re saying is much more like: if you want to go into software or AI or whatever, you basically need to be extremely curious all the time, learn all the time, and work hard as well.



I have to clarify that I’m talking about my kind of work. I work at startups, I work in research, I work very much in the high-end, aggressive, energetic type of work. And this is not a judgment: if you just want to have a comfortable job, make enough money to raise a family, have a nice house, do it. That’s wonderful. Then you get a bachelor’s or master’s degree, you go to BMW or SAP or some company like that, and you just work there. And if that’s good for you, that’s great. You’ll learn all you need to learn on the job, you’ll be fine, and you can make a comfortable salary that way. This is something that I recommend to most people who come to me and say, I have no idea what to do with my life. I tell them, just do software engineering at a mid-range company. Not high end, not Google. If you want to work for Google, you have to be driven, you have to be ambitious, you have to be curious. Otherwise you just won’t be happy. It’ll tear you apart to work at Google or a startup, it won’t be fun. You have to be a certain kind of person that likes challenge, is curious, and wants to solve hard problems to work at these top-end places. But if you just want to have a nice life, have a family, raise your kids, and have a good job, software engineering at a mid-range company is a wonderful thing, and I really recommend it to people. It’s just not what I do, and not what I have the most experience with.



For those people who are not that good with other people, but who are interested in software and computers, what would you recommend to help them connect better with co-workers or other people?



I don’t know if I can give broad advice here, because it really depends on the individual, and on what their problems are and where they come from. Some people have psychological issues that can be solved by therapy or something. Other people have biological or personality differences, and I don’t think that’s a bad thing, necessarily. They’re just different. A lot of my colleagues are a little bit on the autism spectrum or something, and I don’t think it’s a bad thing. I think software in particular is one of the areas where people like that are the most accepted, because there’s such a large percentage of them, which I think is definitely a good thing. I think having bad social skills is more acceptable in this field than in other fields. If you’re very good, if you’re very smart, and you work very hard, social skills aren’t as big of a problem as elsewhere. But they always help. It’s always very helpful to be able to connect to other people. As for whether social skills can be learned: to a degree, they can be learned. It really depends on the person, and people are very, very different in this field. I know some people who have very severe mental disorders, and they can’t understand people, but they’re so smart that they just figured it out. They’re like, uh huh, I have calculated how to talk to people. Some people are just so smart, they just figure it out. Other people have anxiety or depression, and it’s just very hard for them. I’m not a doctor, I can’t say exactly how to solve those problems. One thing I can recommend, maybe, is finding some way to regularly talk to people, just to get the practice. This is a bit of a silly thing, but I actually believe it helped me immensely. I’m a little weird, too.
I’ve always been good with people, but sort of weird. What I do is play Dungeons and Dragons, which is a role-playing game. It’s all about talking to people, cooperating, writing stories, being creative with your friends. It’s very popular among nerds, and I really recommend it. I spent many, many years playing D&D almost every week with my friends, and we would do so much talking and storytelling that I have like 10 years of experience telling good stories, just talking to people, and telling inspiring stories to my friends. That taught me more than I learned at university or at school or anywhere else: the ability to talk well, to talk with confidence, to inspire people, and to know what people want to hear. So that’s a very fun thing that I can recommend to people, if that’s your kind of thing.



Tell me a little bit more about your work at EleutherAI. What exactly is it? How did it come into being, and how did it get so successful?



So EleutherAI started as a bit of a joke. I was hanging out in a machine learning Discord server, TPU Podcast. This was like a month after GPT-3 was released. And basically as a joke, I posted something like, “Hey guys, wouldn’t it be funny to replicate GPT-3, give OpenAI a run for their money like the good old days?” And then one of my colleagues, Leo, replied, “This, but unironically.” And the rest was history. From the GPT-2 project, I still had access to lots of TPUs. Google was very generous, and they basically said, well, when we’re not using our TPUs, you can use them. That’s basically the deal we had: whenever they need them, they’ll take them, but if they’re not using them, I can use them whenever I want. So we created a Discord server, and a bunch of people were interested in our little project. We got a few really cool people involved early on who were really passionate about getting this to work. So it kind of grew from there. We got more and more interesting people involved, and we got different projects, not just replicating GPT-3, but other projects, like big datasets, or research into certain properties of these models. And it grew into this really nice, very high-quality community of technical people. We’re not a beginners’ group. We’re not here to help you with your homework. We’re very much professionals, people who have several years of experience and ask very technical questions. We have a math channel that’s even too complicated for me. I can’t even follow what the hell they’re talking about in the math channel. And yeah, it just became this really, really nice community of lots of really smart, really cool people.
And we’re research-project focused. We’re still working on replicating GPT-3. Recently, we’ve gotten a partner other than Google: a cloud company, CoreWeave, has given us access to compute to train GPT-3 using GPUs, and we’re currently kind of scrambling to make that work. But yeah, we’re a very loose collective of researchers. There’s no funding, there’s no formal hierarchy or formal initiation or anything. Just join our Discord server and start talking to us. That’s all there is. If you want to work on our code, just look at our code and start working, or just drop into our channels and say, “Hey guys, I want to help.” We have a little document that explains the most important questions, which beginners should please read. But after that, just drop in, say hi, and if you’re cool, then yeah, let’s do something cool. So it just kind of gradually grew bigger and bigger. It drew some attention, we started doing some pretty cool work, and people got wind of it. Eventually we also released a paper, which got us a lot of attention on Twitter and Hacker News and so on. People thought it was really cool that we’re working on GPT-3, that we have hardware, and that we’re doing research. It’s all open. We’re super open, we do everything in public. It’s just a really cool place. Some people come to our group to work with us on projects, and some people come just to hang out with these really cool people, who are here to answer complicated questions or just to discuss philosophy or whatever.
And recently, we’ve also been getting more and more interest from larger players, like professors and big companies, that would like to work with us, do research with us somehow, help us out, give us advice, or give us resources or whatever. So it just kind of naturally happened. It was kind of the right place at the right time: there was interest in something like this happening, and no one was doing it. You get this critical mass, this initial group of really great people, and then you just attract more great people. I guess we were just very, very lucky. I’m kind of unofficially the symbolic leader. We don’t really have a leader. It’s not like I make choices or tell people what to do. That’s not what this is about. This is about freedom. The name Eleuther comes from eleutheria, which is ancient Greek for freedom, or liberty, and we take that as our ethos, of course. So it’s not like I tell people what to do. I might have suggestions, and people might think the suggestions are good. At the beginning I did a lot of coding, but I’ve kind of transitioned away from that again. I do much less coding now, because other people are just way better at it than me. But I do a lot of organizing, like getting to meet people. I give interviews quite often. I’m kind of the public face: when someone needs to talk to a journalist, it’s usually me.



I’m also currently trying to lead a project to work on AGI experiments or whatever. So I’m often the go-to guy to talk to about safety and philosophy. Those are my specialties. If anyone at Eleuther wants to talk about those, I’m usually the guy to talk to, and I very much enjoy talking about these topics compared to the others. I think I’m probably among the most well-read people, or maybe the most, about philosophy and safety research or alignment research on this server, at least among the regulars. I’m sure there are exceptions, you know, super geniuses that don’t talk as much. But yeah, again, it’s just kind of an unofficial role, just pushing the project along, getting to know people, connecting people.



So, which milestones have you already achieved with Eleuther, what are the next steps, and where will you go from there?



So we built a version of our GPT code to train GPT-3 on TPUs, the Google hardware that we have access to. But it turned out that even Google just could not give us enough of them. It just did not seem possible. We talked to them, and they were very nice and tried to help, but there was just nothing they could do. It was just too much. So we put that on hold for a little while. This was a few months ago, before CoreWeave approached us. CoreWeave did give us enough. They gave us GPUs, which are slightly different from TPUs, so we had to rewrite our code. That’s what we’re currently doing. We’re calling it GPT-NeoX. GPT-Neo is our TPU code, and NeoX (cyberpunk names, silly) is our code to run on GPUs. That’s one of the main projects people are working on right now. Getting this to work and scale to thousands of GPUs is very much not easy. You have to do lots of very complicated things to make it work correctly, scale, and not be super slow. So that’s the main thing. We’ve also already constructed a dataset, which we call the Pile, a huge collection of data from across the internet: 800 gigabytes of text that we are going to use to train our model. There are several other side projects that individual people work on here and there. One of our regulars, lucidrains, for example, is interested in replicating AlphaFold, which is an AI system that can predict protein structure. He sometimes pops by and works on that on our server, but I’m not super involved with that, so I don’t know what the status is. And there are several other projects like that. The main project other than NeoX that I work on is what I call EEGI, Eleuther Experiments in General Intelligence.
It is basically a very new project that we’re currently scoping out and starting to get rolling. We want to experiment with methods of making more general intelligence using reinforcement learning and human feedback: humans tell the AI whether things are good or bad, and the AI tries to learn what humans want directly. So we’re building a web platform to allow humans to interact with AIs directly, give them feedback, and communicate with them, and to collect that feedback to work on. Then we’ll experiment with using reinforcement learning and all these things. This is very experimental, this is total research. We don’t have anything to show yet, it’s at a very early stage, so it’s going to take a while until we can show anything there. But I’m very excited about it. And it’s going to piggyback off of GPT-NeoX once NeoX is working.



To go back to the TPU version, what was the biggest model you trained on TPUs that actually began to do interesting stuff?



About 13 billion parameters. So that’s 10 times the size of GPT-2.



And was it trained to the point where you would say it’s kind of done, or did you just



I actually think we’re still training it. It has been training for like six months, so I think it’s still training. But I’m actually not sure, we might have stopped it at some point, because we just couldn’t get access to enough TPUs. They gave us a lot of TPUs, but again, it was this preemptible access, meaning that when Google needs them, it takes them away from us. And Google uses a lot of TPUs, so we only get to use them like a few hours a day or something. That just makes the training process take forever. So we are hoping to release a 13 billion parameter model at some point, if we can. But it’s unclear whether we’re going to, especially now that we have the CoreWeave access. It’s not as big of a priority. We kind of have it running in the background. But it already produced output, last time I saw it, which was only a month or two ago, that was pretty good. Not amazing, but pretty good.



Yeah. Huh. And when do you think the GPU version will be ready?



So I would say August at the earliest. If everything goes exactly as planned, which it never does, August is when we think it will be done. And that will be a full 175 billion or maybe 200 billion parameter model.



So you would have done the training in August, right?



Yes. Yeah, the code is kind of working. I expect it to work within the next two weeks. It mostly works.



What actually is the problem? From following the Discord server discussions about scaling to many GPUs, I’ve heard that there are problems with communication between memories and so on.



Exactly. So the problem with GPT-3 is that GPT-3 is just absolutely massive. It is something like 350 to 700 gigabytes of weights, just weights, and often a GPU has 16 gigabytes of memory. So you have to cut up the model and put different parts on different GPUs, and you have to let those communicate over the network. You can see how that might be complicated. You have to communicate these large activations. Then, in the backward pass, you also have to communicate the gradients back between all these different parts. At the end, you have to sum all the batch gradients and average them, and then you have to optimize all the weights. So there’s lots of up and down. The backprop algorithm itself is just not super parallelizable. It has all these bottlenecks where all the information has to come together, get averaged or something, and then spread out again. There’s been a lot of progress, from Microsoft for example, with ZeRO optimization and such, which greatly improves lots of these things. That’s why we’re using Microsoft’s DeepSpeed, which has lots of these tweaks to make it more efficient. But unfortunately, DeepSpeed is just not quite there yet. There are lots of bugs, and lots of things that just don’t quite work as advertised, and it has been costing us a lot of time to get it to work. Ultimately, I expect it’s going to take very difficult experimentation to figure out: how do you split up the model? Which parts do you put where? How do they communicate, and how often? There’s going to be lots and lots of tuning to find the right ways to make it work.
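
The 350 to 700 gigabyte range mentioned here can be reproduced with a rough back-of-the-envelope sketch. It assumes the 175 billion parameter count mentioned earlier in the interview and 2 or 4 bytes per weight (16-bit or 32-bit floats); those byte sizes are an assumption, not something stated in the conversation. It counts weights only and ignores gradients, optimizer state, and activations, so the real requirement would be higher.

```python
import math

def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold only the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

def min_gpus_for_weights(n_params: float, bytes_per_param: int, gpu_memory_gb: float) -> int:
    """Lower bound on the number of GPUs needed just to store the weights."""
    return math.ceil(weight_memory_gb(n_params, bytes_per_param) / gpu_memory_gb)

N_PARAMS = 175e9      # parameter count mentioned in the interview
GPU_MEMORY_GB = 16    # per-GPU memory mentioned in the interview

# Assumed precisions: 2 bytes (fp16) and 4 bytes (fp32) per weight.
for name, nbytes in [("fp16", 2), ("fp32", 4)]:
    gb = weight_memory_gb(N_PARAMS, nbytes)
    gpus = min_gpus_for_weights(N_PARAMS, nbytes, GPU_MEMORY_GB)
    print(f"{name}: {gb:.0f} GB of weights, at least {gpus} GPUs of {GPU_MEMORY_GB} GB")
```

Even this lower bound shows that the weights cannot fit on a single device, which is why the model has to be cut up and the pieces have to communicate.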



So DeepSpeed, from what I understand, is like a package that helps you parallelize the model across many GPUs?



Yeah, it’s a helper that allows you to do that. It has lots of tweaks and optimizations to make very large models run efficiently across many GPUs.



And if I understand correctly, it works well with like 10 GPUs, but when you get to 1,000 GPUs, you get problems, or what?



It’s kind of both. It’s just not production-ready code. Some things just don’t work. They say, oh, we have this feature, but it just doesn’t work, it’s just broken. That happens very often when you work with code like this: there are things that just do not work, or that don’t work the way you need them to. We need certain features, and they just don’t work, so we have to implement them ourselves or figure something out ourselves. We’ve kind of gotten to the point where we understand DeepSpeed well enough, I think, that we know what works and what doesn’t, and we’re kind of at the point where we have to bug Microsoft to fix things. As far as I understand, it has three features that we want, and any two of them work together, but if you turn on all three, it breaks. Which is very funny, of course, because that shouldn’t happen. As far as I’m aware, we’re still not sure exactly what all the problems are or how to fix them. But yeah, this could be solved tomorrow, who knows. We haven’t yet really tried running across machines. Currently we have machines with eight GPUs in them, and we’re just testing on those, and it seems to be running, as far as I can tell, pretty efficiently for the smaller models. So we’re currently moving to the larger models, across machines. Hopefully this won’t be the worst, but God knows. We’ll just have to see how hard it turns out to be. I would expect it to be solvable, because, again, we have some of our best people working on it. So I expect this to be solved in like two weeks, to a level where it runs acceptably, and then maybe just tweaking after that.
But we’ll see. Maybe something unexpected comes up.



Have you tried to reach out to the DeepSpeed team, to the developers?



I think we did that yesterday, but I might be mistaken. If not, we will do so soon. We’ve already messed around with the code and stuff, so if we find a bug they can fix, we will reach out to them.



So from my humble experience with Colab and playing around with other people’s code, it would be just so great if, in the future, there were something like an emulator for a huge GPU, where you could just take 1,000 cloud GPUs and treat them like one GPU. It’s not there yet. But if DeepSpeed actually worked like that in the future, three, four, or five years from now, that would be so great.



It would be, yes. And I would expect things like that to come out in the future. The problem, of course, is that any automatic optimization is never going to be as good as hand optimization. Because we need so much efficiency, and because data centers are different, you currently still have to do a lot of manual fine-tuning to get the right performance. But I would expect that within five years there are going to be offerings from Google or Azure or whatever cloud where you can just click a button and it’ll give you 1,000 GPUs that all work together. It’s all configured, and it all just works. Maybe that already exists. But yeah, that seems to be a next step that’s around the corner.



GPT-3 uses full attention. Many people who are watching this, especially high school students, won’t know about that. But basically, your AI model looks at all the words and tokens in your input sequence, whatever text you give it, with the same amount of attention and compute, and



I think that’s not exactly correct. Close to correct, but not exactly. It means that it compares to all previous tokens, but then it does a softmax and picks out a certain number of them, and processes those further.
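
What is described here, each token being compared to all previous tokens and a softmax then deciding how much weight each one gets, can be illustrated with a minimal single-head sketch in NumPy. The function name, shapes, and random inputs are assumptions made for illustration; real models add learned projections, multiple heads, and other details that are left out.

```python
import numpy as np

def causal_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    """Single-head dot-product attention where token i only sees tokens 0..i.

    q, k, v have shape (n_tokens, d). Returns (output, weights).
    """
    n, d = q.shape
    # Compare every query with every key: an (n, n) matrix of scores.
    scores = q @ k.T / np.sqrt(d)
    # Hide future tokens (entries above the diagonal) from each position.
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Softmax over each row: most of the weight goes to the best-matching tokens.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the values of the visible tokens.
    return weights @ v, weights

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(5, 8)) for _ in range(3))
out, weights = causal_attention(q, k, v)
print(np.round(weights, 2))  # lower-triangular matrix, each row sums to 1
```

The softmax does not literally discard tokens; it assigns small weights to the ones that match poorly, so in practice a few tokens dominate each row.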



Yeah, but in terms of computational expense, like the amount of compute that you need, like



Yeah, it’s n squared. Yes, that’s right.



So if you have 10 words, you need 100 units of compute, but if you have 100 words, you need 10,000 units of compute, so it grows exponentially. Well, polynomially? But yes. Okay. Like n squared.



n squared is polynomial. Exponential would be two to the power of n. Yeah, sure.
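
The distinction being drawn here can be made concrete with a few lines of arithmetic. This sketch only counts abstract units (one unit per pair of tokens for the quadratic case); the function names are illustrative and do not measure real compute.

```python
def quadratic_cost(n: int) -> int:
    """Number of token pairs full attention compares: n squared (polynomial)."""
    return n * n

def exponential_cost(n: int) -> int:
    """What exponential growth would look like instead: two to the power of n."""
    return 2 ** n

for n in (10, 100):
    print(f"n={n}: n^2 = {quadratic_cost(n)}, 2^n = {exponential_cost(n)}")
```

Going from 10 to 100 tokens multiplies the quadratic cost by 100, while the exponential cost would go from about a thousand to a number with 31 digits.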



But the problem with this is that it is extremely expensive computationally for long sequences, for long texts.



It’s very expensive. I wouldn’t call it extremely expensive. Anything that’s not exponential is not expensive in the computer science world. But it is expensive, yes. I’m just being pedantic. Yes, it’s very expensive.



So from what I’ve heard, you’re experimenting with other techniques, like linear attention, Performer, Longformer, Big Bird, however you call it. These are recent advances. And DeepSpeed actually supports this, from what I’ve read.



It supports certain versions of it. Not all versions, but a certain type of sparse attention, which we will use. Actually, just to correct something from before: GPT-3 is not full attention. It’s actually interleaved full attention and sparse attention. It’s actually both.



So GPT-3 is not just plain full attention. Yes.



GPT-2 is full attention, and GPT-3 is mixed full and sparse blocks, if I remember correctly.



How is it mixed? Like with different layers and different types?



I don’t really know. If I remember correctly, I don’t think they really explain that in the paper, so we’re kind of guessing. We expect it’s probably going to be one layer of one type, then a layer of the other, probably, but we don’t actually know. We’ve actually done a lot of experiments with different attention types. And basically, you need full attention. All the other ones just don’t really work, or they just don’t work as well. The only ones that are useful are local and sparse attention, which, as long as you interleave them with plenty of full attention, save you a lot of compute but don’t harm performance. So we’re definitely going to be using that: sparse attention, mixed with lots of full attention. We don’t know exactly what mixture we’re going to use, we’re going to have to experiment with that. But yeah, we’ve run many experiments, and again and again, nothing beats full attention. Just normal dot-product attention beats everything.



So from an efficiency perspective, a layman would ask: okay, couldn’t we just make a model two times as big but use sparse attention, and save compute that way? Or is it better in the end to just take a smaller model with full attention?



So as I said, some amount of sparse attention does not harm performance. Like 50% sparse attention does not harm performance, and you save a lot of compute, which is why we do it. Local sparse attention, as long as you still have a good amount of global attention, is great, and that is what everyone does. That’s what we will do. If you use only sparse local attention, it does not work. It has very bad performance.
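
To make the compute saving from mixing global and local attention concrete, here is a small sketch that builds attention masks and counts how many token pairs each layer has to score. The strict alternation of global and local layers, the window size, and all function names are assumptions chosen for illustration; as said above, the exact mixture was still to be determined by experiment.

```python
import numpy as np

def global_causal_mask(n: int) -> np.ndarray:
    """Full (global) attention: token i may attend to every token 0..i."""
    return np.tril(np.ones((n, n), dtype=bool))

def local_causal_mask(n: int, window: int) -> np.ndarray:
    """Local attention: token i may attend only to the last `window` tokens up to i."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (i - j < window)

def interleaved_masks(n_layers: int, n: int, window: int) -> list:
    """Assumed pattern: even layers are global, odd layers are local."""
    return [
        global_causal_mask(n) if layer % 2 == 0 else local_causal_mask(n, window)
        for layer in range(n_layers)
    ]

def scored_pairs(masks) -> int:
    """Total number of (query, key) pairs scored across all layers."""
    return sum(int(m.sum()) for m in masks)

n_layers, n, window = 4, 1024, 128  # illustrative sizes only
mixed = scored_pairs(interleaved_masks(n_layers, n, window))
full = scored_pairs([global_causal_mask(n)] * n_layers)
print(f"mixed layers score {mixed / full:.0%} of the pairs that all-global layers score")
```

The local layers grow linearly with sequence length rather than quadratically, while the remaining global layers still let information flow across the whole sequence.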



So GPT-NeoX will also be like a mixture of



Yes, it will also be a mixture of global and local attention.



Yeah, I’m actually very curious how this will develop when you, and eventually hopefully other companies, begin to release open-source language models of that size. What do you think, how will this change the compute world?



So I don’t expect GPT-3 in itself to be a massive shift, but it’s kind of a sign of what I think is to come. The first thing is that there’s going to be competition for OpenAI. So prices are going to go down, there’s going to be more competition, and there are going to be more companies building products on top of GPT-3. I expect that to happen. We at Eleuther aren’t really planning on providing any commercial API or anything. CoreWeave is interested in doing so and probably will. The company I work for, Aleph Alpha, is also potentially interested in doing similar things, we’ll have to see. I personally have this theory that this business model, of very large companies building very, very large, really complicated models and then renting access to them to smaller companies, will be a common feature in the future. This could be wrong, this is just my speculation. I speculate that in the near future, many startups will rent these general models from companies like Microsoft and Google and whoever, and then build products on top of them. Say you want to make, I don’t know, video editing software. You rent the video backbone AI from Google, and then have a smaller AI or a smaller program on top of it that uses the knowledge this general model has about videos to do editing or something like that. Same for audio, images, and text, the same way that lots of companies now use the GPT-3 API to do, you know, writing help, generation of stories, copywriting, or stuff like that.



Um, there’s another thing that just got released some days ago. It’s called DALL-E. I mean, it’s very nice, and it doesn’t appear as dangerous as, like, an AI that could write stories. But what this does is turn text prompts into pictures that begin to look very good in some cases. So what do you think about the implications of that?



So I was absolutely blown away by it. I knew it was coming, I had already heard the rumors before it came out. But I was blown away by the results. They were much better than I expected them to be. Some of the generations are just so accurate. You give it something like “a radish in a tutu walking a dog,” and it will just draw an image. It’s amazing. I actually think DALL-E is a bigger deal than GPT-3. GPT-3 was technically more interesting as research, but this will have larger effects on the industry, because it basically makes a huge number of artists unnecessary. Corporate artists who make logos and cartoons and stuff, it’s just like, poof, they’re gone. If this is going to be cheap, it’s going to cost cents to create these things, and you can create great corporate logos, simple Photoshop work. All those Fiverr Photoshop people, this is the death knell. It is amazing. This is going to cause a huge disruption, I think. And the results were really shocking. And this model was only a tenth of the size of GPT-3, so this isn’t the limit of what this kind of model can do. I expect that if you scale it up to GPT-3 size, it will just become smarter and create even more high-quality things. It’s just a matter of time until we can do it with video. It’s just a matter of time until we can do it with anything. It’s just a matter of time until you can type into your computer, “I want to see Star Wars, but starring Nicolas Cage, and he fights dinosaurs,” and it’ll just spit out a Hollywood movie where exactly that happens. That’s going to happen very, very soon. And I think what people need to be ready for is that very soon you will have Hollywood on your GPU.
You can make an entire Hollywood-level movie just by typing a script into a computer. That’s going to be coming very, very soon.



Yeah, I had exactly the same thoughts. When I saw DALL-E, I thought of all these Fiverr graphic designers, all these mediocre graphic designers who don’t live from super high-quality creative work, but from just making a cute logo or a cute clipart for some blog post or some mid-tier company. They’re going to be unemployed within, like, one, two, three, four, five years. I don’t know.



Yeah, that’s what I expect as well. High-level professional artists are still going to have work. If you need concept art for your game, you’re still going to hire top artists or whatever. But even then, what happens if we make the model 100 times bigger? Maybe it can do that, too.



Yeah, and I think after logos and clipart, the next group that will be unemployed, in maybe 10 years or so, is models. I mean, we can already generate human faces that are pretty good, and have been able to for two years or so. It’s very good, actually. And we can also change the pose or the gaze of a head with some technology. And I think as computers get more powerful and models become better, all the catalog models sitting there on a chair, or all the supermarket promotional models, where it’s not about the identity of a Claudia Schiffer but the person is just exchangeable, some kind of woman with blonde hair, you could just generate that.



Yes, I agree. I think it’s gonna take less than 10 years.



Yeah, but I expect some time to pass from when it is possible in the lab until you have a full-scale stack available, and it’s accessible to every layman.



I guess I still expect that to happen within less than five years. It might take longer, because, you know, industries are slow. So I can imagine the technology existing but the industry not adopting it until later. But I mean, deepfakes have existed for the layman for, like, several years.



Yeah, but right now they’re beginning to look really good. Like, now they’re really, really good. Another thing is the reinforcement learning from human feedback paper that you have been talking about, which some people are interested in on the Eleuther channel. I think this will be really huge. For those of you who don’t know about it, it’s something like GPT-3 that learns to summarize texts, but it’s fine-tuned with human feedback from, like, thousands of people who have been reading summaries and saying whether they’re good or bad. And according to that research paper, which came out like half a year ago, if they’re not lying, if they’re not polishing the results, they can summarize texts better than human baselines, or as well as human baselines. And that’s something that no open-source model can do yet. But if they could do this, it could change information work enormously.



Yes. And I expect that to happen. That’s what our company does. Our company is betting that this technology is coming and is going to replace, like, all office workers completely. Basically, you will be able to make models that do the things that any office worker does. You know, you might have to fine-tune them and whatever, but this will be a feasible thing in the next five years. I also expect it to work for many creative works. Like, one of the things we want to do is train a model on whether something is funny or not. So you give it a thumbs up or thumbs down on whether what it generated was funny or not, so you can, like, train it to tell jokes. And I look forward to seeing if that works. So this is very early technology. It does not work in practice at a large scale; it’s still very difficult to do. There’s like one or two groups in the world that can make it work currently, like DeepMind and OpenAI, and it takes a lot of effort, a lot of tuning. It’s very, very finicky. But that’s how AI was, you know, five to six years ago, too. So I expect this to change in the next five years. I expect this to become mainstream.



Yeah, and these were just one or two examples. I think what’s scary is not so much what will actually happen in the next five years; it’s much more the pace, if you look back at how things were five or 10 years ago and how quickly this improves.



Humans have an incredible ability to take something absolutely shocking and just think it’s boring. The amount of technical progress we’ve seen in our lifetimes is staggering. It is unbelievable. It just keeps getting faster. Like, when I first started in AI, in, you know, 2015, 2016, I was like, oh, man, holy shit, AI is moving fast. But now it’s like 10 times that speed. It’s so crazy. Things are going so fast. By the time I’m done reading a paper, there’s already a new paper that’s, like, 10 times better. It’s unbelievable. Progress is happening so quickly. There’s all these people who say, oh, AI is gonna die, you know, this is gonna be the end, oh, it’s slowing down. I have no idea what the hell those people are talking about. I think most people are completely crazy. It makes no sense to me what those people are talking about. Things are happening so goddamn fast. It is unbelievable.



Yeah, especially since, even if one computer alone cannot do so much interesting stuff, if someone has an idea to make a translation algorithm better, then someone else picks it up one week or one month later on Facebook or on a Discord server. They tell their friends, and a friend makes a YouTube video about it. A little bit later, someone else tweets about it. And then suddenly, two months or five months later, three or five papers appear on this topic.



Yeah, and this is a terrible piece of advice, but honestly, it’s good advice: if you want to get into machine learning, get a Twitter account. Like, all of machine learning happens on Twitter. Don’t ask me why; I don’t like Twitter. But everyone in machine learning is on Twitter, and they tweet all the new papers, and you get really good inside scoops where people say, oh, this paper is good, this one’s bad. Following lots of people in lots and lots of labs on Twitter is one of the best things you can do to get into machine learning.



Before I became a computer science teacher, I actually studied acting some years ago, and I was really into it. I also studied psychology at university for one year before I switched to computer science. So I’m really aware of the importance of emotional intelligence. And as a geek, I have been thinking about how one could teach computers good emotional intelligence. There already are many emotion recognition systems that analyze your face and say, you’re angry. But the problem in real social life is that your facial expression doesn’t always tell the truth. Sometimes you’re not expressing your feelings with your face; you’re trying to hide them. Sometimes you have true emotions that are hidden under a poker face or under a happy smile, but in reality you’re sad. For example, if I was deeply sad and someone came to me, like my mom or my wife, and I pretended to be happy, I would be smiling, but my wife or my mom would instantly pick up that something is wrong with me, that I’m sad. That’s in my posture, in the way my micro-expressions change, in my tone of voice: not just one thing, but all together. And all this together makes it very complicated to build really good emotion recognition. I mean, there are some pretty impressive solutions for voice emotion recognition, because there’s so much emotion in the voice. But some years ago, I had some time and I made, like, a taxonomy of emotions, because the existing taxonomies of emotions, the classification systems, are pretty narrow. I wrote down my list of 200 emotions: contentment, joy, happiness, and so on. Some of them are overlapping and pretty fuzzy. And many of these emotions depend on the social context.
So if you want to see if someone is jealous, you need to know that he wants something from someone but cannot get it, and he thinks that someone else has it. To build a computer system that would understand this, you would need multiple modalities. You would need a computer system that understands body posture, tone of voice, what happened one minute ago in this situation. You would need an abstract representation, in natural language or in some kind of feature vector, of the high-level social dynamics. And I think once this is there, and this could be five years off, or 50 years off in the worst case, I don’t know, the implications will be huge. Very huge. Yep. Just a matter of time. Yeah. Because in GPT-3 you have patterns in text, and these patterns are sometimes very complicated, about reasoning, and it still makes mistakes. But things like body language and tone of voice, I think, are hard for computers at the moment, but they’re much more predictable than a really good novel. So I think,



Thankfully, yes. There’s a lot of empirical questions about how hard different modalities actually are to learn. There’s what’s called Moravec’s paradox, which is that things that humans think are hard are often easy, and things that we think are easy are often hard. And that’s still the case today. Like, I recently saw a paper about measuring the performance of autoregressive generative models for different modalities, like generating text or images or video, whatever. And there was an interesting thing there: you would expect video to be much harder than text, but that actually wasn’t true. It was harder, but not by that much. Because the entropy between two frames, like the difference between two frames, is much lower than the difference between, you know, two words. If you speak a sentence, it’s really hard to know what the next word is without any context. But if you see a movie, and you see the previous 10 frames, you can probably make a good guess at what the next frame is going to be. There’s just all this weird uncertainty; this is very early research. We don’t know how hard these things are. We don’t know how hard it’s going to be to predict humans. Maybe it’s going to be super, super hard, or maybe it’s going to turn out to be pretty easy. We just don’t know.
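As a rough illustration of the entropy comparison in this answer, the sketch below computes the Shannon entropy of two made-up distributions, one standing in for next-word prediction and one for next-frame prediction. The probabilities and the names `next_word` and `next_frame` are assumptions chosen for illustration; they are not taken from the paper mentioned in the interview.

```python
import math


def shannon_entropy(probs):
    """Return the entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical next-word distribution: 20 equally plausible continuations.
next_word = [0.05] * 20

# Hypothetical next-frame distribution: one outcome dominates, because the
# next frame usually looks almost like the previous one.
next_frame = [0.9] + [0.01] * 10

word_bits = shannon_entropy(next_word)    # equals log2(20) for a uniform choice
frame_bits = shannon_entropy(next_frame)  # lower: the outcome is easier to guess

print(f"next word:  {word_bits:.2f} bits")
print(f"next frame: {frame_bits:.2f} bits")
```

A lower entropy means fewer bits of surprise per prediction, which is one way to read the claim that the next frame is easier to guess than the next word. Real models work over vastly larger outcome spaces, so this only shows the direction of the argument.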



What are your predictions for the future, like the next year, the next five years, the next 10 years?



So it’s very important to say that, on average, any predictions that are more than, like, two years out are no better than chance, even from experts. Even the best experts in the world usually cannot predict the future more than two years in advance. Of course, the exception is Mr. Kurzweil; I have some words about him, but let’s not get into that. Anyway, some things you can predict. You can make some pretty good guesses about economic growth and Moore’s law and stuff like that. But social forces, and especially technology, humans are very, very bad at predicting. I feel like I have some really strong predictions that I’m pretty confident in, but that confidence might be misplaced. I just want to make that clear. In the next year, I expect more players to expand into this super-large-model space. I know of several players doing this: corporations, academics. I expect it to become more mainstream to make these very, very large, very general models. I expect that to continue. Probably DALL-E will be offered as a service, and stuff like that. In five years, I expect this human feedback stuff to work and be widely deployed. I expect the next paradigm of AI to be that we have these huge unsupervised models trained by large corporations, which then sell them to smaller companies, which then build reinforcement learning on top of them. Like, they fine-tune them with reinforcement learning to do all kinds of tasks. I expect this to cause a very large shift in our economy. Well, not very large, but a significant change in how certain work is done. I don’t expect five years to eliminate all jobs. I think it’s a possibility.
Like, I think in five years there’s a 5% chance that humans go extinct because we accidentally create super AI. A 5% chance: very unlikely, but it’s possible. I can’t prove that it won’t happen. And in 10 years, I think there’s a decent shot that we have human-level intelligence and superhuman-level intelligence. I’d say maybe 30 to 50%. Maybe more like 30%. And by 2040, 2045, I predict a 50 to 70% chance that we have superhuman intelligence, and humans either are not needed anymore or have actually stopped existing.



I mean, if we had, like, human-level AI, and this doesn’t mean that it’s like a human, but that it could do things like a human.



Yes, it probably would not be very much like a human.



So if we would have that, like running on a TPU farm, this wouldn’t instantly mean that the Singularity is here and everything would change. It will just be



I think it would, actually. No, I think that’s exactly what it means. Because it could work so much faster. It could do science so much faster and so much more efficiently than humans do. It could do, you know, a million years of research in one year, and then it would build a stronger AI that’s even smarter than it is. And that smarter AI is gonna build an even smarter AI, and that one makes an even smarter AI, and so on. I expect that to go extremely quickly. There’s several models, like possible predictions of this scenario. Robin Hanson has a few scenarios where he predicts an economic doubling time of, like, one month: every month, the economy doubles. And this seems possible. There’s even more extreme scenarios, and a doubling time of two weeks to four weeks is absolutely possible. You have to imagine that our current economic doubling time is long, I think like 35 years or something. So imagine 35 years of progress happening every two weeks. That’s possible. That’s what I would say is very likely to happen.
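To put the doubling times from this answer side by side, here is a small arithmetic sketch. It assumes steady exponential growth, so that a doubling time of T years gives an annual growth factor of 2 to the power 1/T. The 35-year figure is the speaker’s rough estimate from the interview, not a verified statistic, and the function name `annual_growth_factor` is made up for this example.

```python
def annual_growth_factor(doubling_time_years):
    """Growth factor over one year, assuming steady exponential growth."""
    return 2.0 ** (1.0 / doubling_time_years)


# Doubling times mentioned in the interview, expressed in years.
scenarios = {
    "roughly today (35 years)": 35.0,
    "one month": 1.0 / 12.0,
    "two weeks": 2.0 / 52.0,
}

factors = {name: annual_growth_factor(t) for name, t in scenarios.items()}

for name, factor in factors.items():
    print(f"{name}: economy multiplies by about {factor:,.2f}x per year")
```

Under these assumptions a 35-year doubling time corresponds to about 2% growth per year, while a one-month doubling time means twelve doublings a year, a factor of 4,096, and a two-week doubling time means 26 doublings a year.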



But let’s say that OpenAI realized they could actually train a model as smart as a good, smart human. And they would spend, like, crazily, 1 billion or 10 billion, like an Apollo program, on this one AI training run. They would build a huge server farm. And then they would have this one AI, and it would cost them $100,000 or $1 million per day just to run.



Well, I mean, 20 years for that to go down by 1,000-fold.



Yeah, yeah, it will take 20 years. So it wouldn’t be like today there’s human-level AI and tomorrow the world is different. It would be more like



I just expect the first human-level AI to not be like that. I might be wrong; that’s why I say, like, 50% chance. But I don’t expect the first human-level AI to look like that. I don’t expect an Apollo program like that. There’s this famous graph where they show a line of intelligence. Here’s the least intelligent, here’s the most intelligent. A worm is here, a bee is here, some kind of reptile is maybe here, some kind of bird is maybe here. And then they say, okay, a chimp is here, a human is a little bit further, and the smartest human is, like, one millimeter over the dumbest human. To us, Einstein and the dumbest human seem totally different. But they’re actually not that far apart. They’re actually very, very close. So I don’t expect that we reach human level and then get a slightly smarter human, and then a little bit smarter human. I expect us to get human and then ultra-human, no step in between. I expect us to go from 1x human to 1,000x human in one year or less. I expect it to happen like this because I don’t think the smartest human is that different from the dumbest human. I think the difference between, you know, a worm and a cat or a chimp is a huge difference, but it doesn’t feel huge to us humans, because we just say, oh, look at it doing its little cat tricks, isn’t that cute? But I think that’s the part that’s gonna take some time, and we’re currently on the path from worm to cat. That’s where we currently are. And once we get past mammals, once we get to humans, we’re gonna zip past humans right away, and people are not going to expect that.
People think that we’re going to stay at the human level for, like, an extended period of time. I think the moment someone figures out a dumb human, they’re gonna figure out something 1,000 times smarter than a human the next day. This is just a prediction. It might be wrong. It’s based on lots of theoretical arguments, which might be wrong, but this is my personal belief. I don’t expect normal human-level AI to really be a thing. I expect it to be very subhuman or very superhuman, but nothing in between, like there is a limit to



What I’m concerned about is even much earlier: before we reach human-level AI, we will have narrow AI that will be general enough to eliminate all these jobs, like DALL-E eliminating all the graphic design jobs and, a little bit later, eliminating drivers. It will take, like, 10 years to adapt, to build all the cars. But once we are there, I mean, within the next decade, our society could change dramatically. Let’s even forget about human-level AI; let’s just think about what all these frustrated unemployed workers in the US or in Europe, in Southern Europe, all over the world, could do by voting for guys like certain US presidents or populists, crazy people. We’re seeing now with coronavirus how many riots and how many problems and how much poverty there is all over the world. And it could get worse when millions or billions of people are afraid for their jobs, much earlier, before we get to human-level AI à la Westworld.



Yeah, the thing is, I don’t expect that to happen, because I expect AI to come so quickly that people won’t see it coming. I think they’re not going to have enough time to vote for their populist president, because AI is already there. It’s all gonna be over. So I’m in a minority in this field, but I’m someone who’s very direct about these problems. Many people in the field try not to talk about these problems. They say, oh, there’s always going to be more jobs, or, oh, don’t worry, we’re gonna have universal basic income. It’s all bullshit. These people are not serious about how big this problem is. People have to, in my view, just accept that humans will not exist for much longer. You know, we might exist in zoos, or exist in, like, VR pods, where we live out our lives in virtual reality forever. But humans will not go to the stars, we will not colonize Mars, we will not be around in 1,000 years, because it’s just inefficient. Human bodies are terrible. They’re stupid, they get cancer and die, they get old. Why would you want to stay human if you could become an immortal robot? Why would you stay a human? It’s just stupid. And the robots are so much better at everything in every possible way. So we, as humans, are going to build AI. It’s going to happen. That AI will take over the world. And so what AI we build is going to determine the long-term future of the universe, assuming there’s no aliens that stop us, but it doesn’t look like there’s any aliens. So what kind of AI we make, and what we cause it to do, will determine our fate. And there’s many possibilities here for what it will do. The most likely outcome is we all just die.
You know, the AI just bulldozes the planet to build more servers and just flattens everything. That’s very likely if we fuck this up, if we make a mistake. If we make an AI that doesn’t care about humans, or just maximizes profit, it might kill all humans just to make bigger factories. That’s something it might do. If we do a little bit better than that, maybe we can make an AI that makes the world wonderful and beautiful and perfect. Maybe it can create heaven on earth, where humans can live a beautiful existence. We have, like, a park, basically, where humans live, and we have wonderful lives, we don’t get old, we don’t get sick. We create beautiful art with our partners and our friends and our family, and just live in this heaven for however long we decide to. Or something terrible happens. Who the hell knows? These are very, very sci-fi things. Like, you think AI is sci-fi? That’s not even scratching the surface. A lot of people try to not take these things seriously. But just because something feels like sci-fi doesn’t mean it’s wrong. If you just extrapolate, humans will not have jobs, period. This is not, like, my opinion; I don’t think it’s a good thing. It’s not like I said, oh, I want to get rid of all jobs. I just think this is the thing that’s going to happen, because that’s just how optimization works. We optimize for better and better processes. It’s just evolution. When a system is inefficient, it goes extinct. That’s how evolution works. This is evolution. We’re evolving into a new species. We’re evolving into our successors, into our non-biological offspring. That’s the way I see it. And how we weather this transition, what it’s going to result in, is an incredibly difficult problem, not just philosophical but also technical.
It’s like, what does it mean for goodness or badness to happen? How do you define that mathematically? How do you program a system that would do good things and not do bad things? These are what I would consider the hardest and most important open problems in science right now, which is why I like to work on them.



Yeah, what you said at first, that there’s a high probability that this could go badly for us humans, sounds very pessimistic. And, I mean, I have some opinions on that. If I was, like, 20 years old and listening to this because I was interested in the job of a computer scientist or AI researcher, I would probably be scared. So I think it’s probably good to be aware of this problem and to be a little bit scared and humble. But on the other hand, I’m a geek too. I’m listening to a lot of interviews, reading a lot of blog posts and papers. And what I’ve realized concerns this idea of a paperclip monster, where you say, okay, build me many paperclips, and the paperclip robot goes out and turns every human, every building, every stone into a factory for paperclips. Actually, when you think about us humans and what evolution did to us, we don’t try to maximize one objective, survival and reproduction. I think we actually maximize a lot of secret hidden objectives that are not explicit, that are just implicitly made up by evolution. Because humans under certain conditions tend to be altruistic; under other conditions, they tend to be instinctively egoistic; under other conditions, they tend to be very interested in their own survival; and then they seem more interested in the survival of the society, or the species, or whatever. So we have a huge set of values, not just make paperclips or make babies. We have a lot of goals. And I think, it’s just my humble opinion, that if we want beneficial and stable AI, it would probably be very smart not to say make paperclips or make money or make X or Y, but to try to figure out a whole set of values that’s kind of stable and kind of prosocial, or at least acceptable. So no,



Unfortunately, it doesn’t work. You know, this is, like, a three-year research thing to explain all the reasons why that doesn’t work. But it doesn’t work. I don’t buy that even a little bit. It’s so much harder than that. There’s this concept of Goodhart’s law, which basically says that whenever a metric becomes a target of optimization, it ceases to be a good metric. So, say I want my coders to write more code, and I say I’m gonna pay you more for every line of code you write. What happens? They just write tons and tons of lines that do nothing. That’s obviously not what I wanted, but it’s what I told them to do. So there are two main problems in what this field calls AI alignment, two main categories of problems, called outer alignment and inner alignment. Outer alignment is the question: assume we have an AI that does whatever we want it to do. Let’s say it just waits for a command. What do we tell it to do? What is a goal that does not fall to Goodhart? Because you might say, okay, make humans happy. Well, then the robot thinks for a moment: happy, what does happy mean? Well, when the dopamine in their brain gets activated, they say they’re happy. So it takes all humans, takes out their brains, removes all the brain parts except the dopamine part, and just puts dopamine in it. Wow, it made all the happiness in the world. Good job, robot. It did exactly what you told it to do. You could give it many different goals, that’s fine. But that’s mathematically equivalent to giving it one meta-goal. This comes from the von Neumann-Morgenstern axioms, which, like, prove that no matter how complicated your goals are, if they fulfill certain simple properties, you can redefine them as one goal that just has different mathematical properties. And the thing is, humans are not aligned to evolution.
So if you imagine evolution as the designer of humans, evolution tried to build an AI that optimizes inclusive genetic fitness. Of course, evolution is just a thing, it’s not a creature, but you can imagine it that way. It tried to design a machine, a human, to optimize its genetic fitness. But it failed at what is called inner alignment. Inner alignment is the problem of: assuming I give my AI a goal, does it actually follow the goal, or does it do something that looks like the goal? So say I want a robot to stop climate change. I train a robot in a simulation to try to reduce carbon emissions, and I look in the simulation and I see, oh, it’s walking around, it’s recycling things, it’s using less energy, it’s improving the efficiency of things. Cool. That seems great. So I release my AI, and it suddenly blows up power plants. What happened? Well, actually, the AI learned to reduce power usage. It didn’t learn to reduce carbon emissions; it learned to reduce power usage, which looked like reducing carbon emissions. It actually learned something different from what I wanted it to learn. This is called the inner alignment problem. And that’s what happened to humans. Humans have this weird smorgasbord of desires. You know, we want food, and we want love, and we want security, and all these different things. And those are all just proxies for reproduction. They’re things that happened to work, but they’re not really what evolution wanted. And that’s why we have condoms. Evolution hates condoms; they’re, like, the worst thing you could do. Things that make us not have babies, that’s terrible. All humans should have a panicked fear of condoms. We should be running away screaming every time we see a condom, if we were actually aligned. But we’re not aligned. This is what’s called mesa-optimization.
So a mesa-optimizer is basically like an AI within an AI. It’s like a secret, like an evil AI inside of the AI that is trying to optimize for something we didn’t want it to optimize for. Humans, you know, use condoms and have loving affairs with our partners. Gross, you could have babies instead of having love. Why are you doing that? So we are misaligned AI. And this is something that can happen with AI too. Say we try to train a robot to lower carbon emissions, and it’s blowing up power plants because it figured out, oh, I can lower energy use that way. That’s obviously not what you wanted. So there’s this huge number of problems, and there’s all these reasons why this is really, really hard. Human values are not well defined. They have all these terrible properties: they’re inconsistent, they’re paradoxical, you can’t define them correctly in mathematical terms, for several reasons. And humans disagree with each other. Like, some humans want to hurt other humans, but those humans don’t want to be hurt. So what should the AI do? If the AI just obeys humans, and a human tells it, hey, go torture babies, should it do it? I don’t think it should. But how would you explain to the AI that if a human tells it, oh, human values are to torture babies, it should say, I don’t think so, I’m not gonna do that?
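The pay-per-line example from this answer can be written down as a toy model. The sketch below is only an illustration: the strategies, their numbers, and the names `Strategy` and `best_by` are invented here, and a real optimizer would search a far larger space than three hand-written options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    lines_written: int     # the metric that gets rewarded
    features_shipped: int  # what the manager actually wanted


# Hypothetical options available to a coder who is paid per line.
STRATEGIES = [
    Strategy("write careful code", lines_written=200, features_shipped=5),
    Strategy("write quick code", lines_written=400, features_shipped=4),
    Strategy("pad with no-op lines", lines_written=5000, features_shipped=0),
]


def best_by(strategies, score):
    """Return the strategy with the highest value of the given score."""
    return max(strategies, key=score)


# Optimizing the metric versus optimizing the intent behind it.
chosen_by_metric = best_by(STRATEGIES, lambda s: s.lines_written)
chosen_by_intent = best_by(STRATEGIES, lambda s: s.features_shipped)

print("rewarding lines of code selects:", chosen_by_metric.name)
print("the manager wanted:", chosen_by_intent.name)
```

The metric and the intent agree only while nobody optimizes hard against the metric. Once padding is an available option, the highest-scoring strategy ships nothing, which is the failure that Goodhart’s law describes.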



But from what I said, I don’t think that I know the answer. But from what I just said earlier, like, if you would have like a weighted average over lots of like,



It’s mathematically equivalent. It’s the same problem: if I give it the goal of lowering carbon emissions and taking care of children, that doesn’t mean it’s not going to do something stupid. There’s no guarantee. You have to always imagine these things are smarter than we are. So whatever we figure out, they will find a way to trick us. It’s like an ant colony trying to control a human. Imagine that all the ants come around you and say, oh, we’re gonna trick you, human, we’re gonna figure you out, we’re gonna give you a goal, and then you’re gonna do exactly what we want. And you just step on them, because you’re like, well, this is stupid. We’re dealing with something that is fundamentally, in every possible dimension, smarter than us. If we don’t get this mathematically, precisely correct, if we don’t figure out some super big-brain, very sophisticated technical solution to this problem, I think we’re fucked. Now, I would like to say to the people who now seem super depressed from listening to me: I’m sorry, that happens. I want to say that I have hope that this is a solvable problem. I genuinely think we can do this. But all the obvious solutions are probably wrong. I think there are going to be solutions to these problems. There has been progress on these problems, but they’re very difficult. They’re very technical. They’re very mathematical. But there has been progress. And I personally am very hopeful that we can solve this.
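One way to see why a weighted set of goals behaves like a single goal is to write the weighting down. The sketch below folds two goal functions into one scalar objective with a weighted sum. It illustrates weighted-sum scalarization only and is not a statement of the von Neumann-Morgenstern theorem; the outcomes, scores, weights, and the helper name `combine` are all made up for this example.

```python
def combine(goals, weights):
    """Fold several goal functions into one scalar objective (weighted sum)."""
    def utility(outcome):
        return sum(w * g(outcome) for g, w in zip(goals, weights))
    return utility


# Hypothetical outcomes, each scored on two goals in the range 0 to 1.
OUTCOMES = {
    "retrofit power grid": {"carbon_reduced": 0.6, "children_cared_for": 0.7},
    "shut down all power plants": {"carbon_reduced": 1.0, "children_cared_for": 0.5},
}

goals = [
    lambda o: o["carbon_reduced"],
    lambda o: o["children_cared_for"],
]
weights = [0.5, 0.5]

# After combining, the optimizer sees exactly one number per outcome.
utility = combine(goals, weights)
best = max(OUTCOMES, key=lambda name: utility(OUTCOMES[name]))

for name, scores in OUTCOMES.items():
    print(f"{name}: combined score {utility(scores):.2f}")
print("selected:", best)
```

With these invented numbers the combined objective prefers shutting down every power plant, because any harm that none of the listed goals measures never enters the score. Adding more goals changes the weights and the numbers, but the optimizer still maximizes a single quantity.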



So what do you think about Stuart Russell, the famous Berkeley professor who wrote the artificial intelligence textbook, and his principles of human-compatible AI? He says that AI should try to maximize human values, whatever these are; then that it has to be uncertain about what human values actually are, so that it would always ask, do you really want me to do this, master?; and, as a third point, that it will always seek out human feedback, like constantly asking. I mean, I don’t think this will be the ultimate way to control a super brain for all time, but it sounds quite reasonable to me.



Yeah, yeah. So whenever people ask me what I recommend to start thinking about AI alignment, I recommend Russell’s book Human Compatible. I think it’s very, very good. I am a big fan of Russell’s work; I think he’s really, really cool. I think his proposal is, like, 50% of the way to a solution, but only 50%. There are still very serious problems I personally have with this proposal. I think a few assumptions need to be loosened or changed. And I’m also a very pessimistic person, in the way that I think human values are actually very bad. I think that, in general, the average human’s values are actually very bad, and we should not optimize them. Because if we optimize the average human’s values, what we’re going to get is, you know, McDonald’s, fatty food, drugs and porn, because that’s what the average human really wants. If you gave a human access to everything, and they could try everything, which one would they keep using? Heroin. That’s what they’d keep using. It’s just true. And that’s a bump in our solar



every I mean, like, it depends where you would take the average. But I think there are many examples that humans in many cases are actually very altruistic, like, small children. This



is not an altruism question. We’re talking about, imagine a super AI that can open your brain and put an electrode into your pleasure center. We’re not talking about normal situations here, we’re talking about a super AI that can reprogram your brain. This is way worse than that, it’s infinitely worse. You know, an AI could just reprogram your brain to want nothing, because it’ll be way easier to take care of you if you just don’t want anything. So, you know, we run into all these really complicated things when we talk about very big AIs. Human values are great in the ancestral environment where they evolved. We evolved our values in situations where there weren’t any super AIs that can reprogram our brain, where there wasn’t any heroin, where there weren’t, you know, mind control devices and stuff like that. But those things are coming, and for those things we need to develop a new type of values, a far more rigorous type of values that is robust to these weird, you know, superpowers that will exist in the near future. I don’t think we’re there yet. But I think it’s possible.



I think you don’t even need to put a wire into your head. If you had human-level AI, you could make someone talk and interact with virtual agents, like Siri or so, or like assistants. And after a while, they would seem like humans to us. Even though we would know, oh, it’s just an algorithm, we would feel happiness or concern or whatever in response to whatever they say. And they wouldn’t need to force us, they would just have to socialize with us in certain ways so that certain views become more comfortable for us. And after some time, we will change our whole value system without any electrode, because we are hijacked through the search social.



So yeah. Realistically, yes, I think taking over the world is actually really easy. I don’t even think you need to be that much smarter than a human to take over the world, if you’re just really good at talking, and you give humans what they want, you know, and you’re pretty good at, like, convincing people. I think it’s not that hard to take over the world. I didn’t really think that you have to be maybe like 10 to 100 times smarter than a human to take over the world, probably.



I’m thinking that what gives me hope, and what actually gives me joy and optimism, is that all these algorithms and all these AIs, when they arrive, will not have any objective per se; they will be programmed for certain tasks. But I mean, if we manage to change this, I imagine there actually are beings on this earth that are less intelligent than we are and that we love and that we care for. And I think it’s a little bit like a utopia. Like old people, elderly people who are not that smart, who maybe suffer from dementia or so, but we still love them. We still want them to be happy, we want them to be with us. And we could get rid of them, but we don’t, because we have connections and values. And eventually, they inspire us. And they are good for us for some indirect, not specifically specified reasons. So I would say in a utopia, we would want an AI to treat us like that, as much smarter kids who want us at least to be well and feel important in some ways. Yes,



I, that sounds like a wonderful utopia to me. That’s something I personally would very much enjoy, if we could have our AIs be sort of our children, like, you know, we send them off, and they wish us well, they take care of us as we, you know, fade out of existence gradually, in a happy manner. That would be my personal utopia, that’s what I am working for, what I hope for, something like that. That is very, very, very hard. Because the reason our kids take care of us is because they’re human. And even then, they often don’t. Let’s be honest here, a lot of people are not very good people. And, you know, if there weren’t enough food, grandma might not survive as long. If the AI runs out of resources, then, you know, it might not be worth it. And it’s important to remember that these things will not be human. They will not think like humans, they will not feel like humans, unless we explicitly program them to, which we have no idea how to do, which is just unbelievably difficult. And I don’t think it’s a good idea, because humans are terrible, they’re sadistic, they’re power hungry. I don’t think we should try to make our AIs like us. I don’t think we should try to build Superman, I think we should try to build an angel, you know, like a wonderful creature that is much nicer than us. That is, you know, all of our ambitions, and our hopes, and our dreams, that is better than humans. I hope that our descendants will not be like us, but that they will be better than us. They will be kinder than us, they will be, you know, more altruistic and cooperative, they’ll be just better than us in every possible way. And in that sense, I know, it’s selfish in a way. It’s like, okay, then just get rid of humans. Once you have better humans, why keep the old shitty humans around? 
And maybe, you know, we can agree on, like, okay, give the humans, you know, a million years of happiness to kind of slowly fade out or whatever. And then, you know, we just say goodbye or something. I don’t know, who knows. All of this is actually sci-fi speculation, because we don’t know how to even begin to solve problems like this, how to actually ensure something like this will happen. So let’s focus on those problems first. Let’s focus on having the AIs not kill everybody forever. And then we can work on the other stuff.



So there’s one thing that really gives me some hope. I’ve been thinking a lot about psychology and about kids and about the psychology of what makes a good life. Some years ago, before I had kids myself, I lived right next to a Starbucks, and I was sitting there every day for three hours reading positive psychology papers. And later, I did many interviews with some psychology professors for a documentary project. So what I realized is that some people say, oh, humans are potentially bad, they are lazy, they don’t want to learn, they only want what benefits them, and they are just social because they get some benefit from it. And other people say, no, people are basically pretty good, and just some circumstances make them bad. And you could go in depth into this discussion. But what I realized is that there are two different views of human nature. For example, take a kid, let it be an AI or let it be a human kid: what do you want for this kid? Traditionally, when you are in a world, an environment where resources are scarce and mistakes could probably have very bad consequences, like you’re in a jungle or you’re in a dictatorship, where saying the wrong word about a politician could get you in jail or out of school, or going into the wrong bush could get you killed by a snake, what do you want? You want your kids to do the right things and to avoid the bad, the risky things. And so you have a certain set of fixed values that you preach to them, like don’t say bad things about politicians, don’t resist them, do that, do your homework, work hard. And when they do the right things, you bribe them: good boy, yeah. And when they do the bad things, you give them some kind of punishment, or refusal of love, or rejection, or whatever. 
But if you are assuming that the world is pretty much safe, I mean, mistakes could happen, but they probably won’t be that bad, and you could actually learn a lot from these mistakes, then what you want for your kids, or what most people want for their kids, is for them to go out and explore the world, to not be so sure about what will be there, and to discover new interesting things, to be curious, to enjoy life. And then you would probably use other tools. You wouldn’t use punishments and rewards, or you would use them much less. You would use perspective taking and empathy, and you would try more to inspire them in some way without knowing the outcome in advance. And from what I have observed and what I have read in several studies, there are tons of studies on intrinsic motivation and worldview and extrinsic motivation and how people behave, prosocial behavior, and blah blah blah. This is really interesting stuff. And many people don’t know about it, because it doesn’t fit the traditional



way society works, with, like, punishment. But what’s really interesting is that if you have this other worldview, and if you actually provide people with a safe environment without so much scarcity, where they don’t have to be afraid of dying or of losing their freedom, if people are pretty much free and pretty much safe, and you encourage them to, yeah, explore the world, they tend to do overwhelmingly good stuff, or exciting stuff. So almost no kid is really mean by itself, if it has lots of resources, and it’s pretty much free to do whatever it wants to do, and gets lots of affection. And so, I mean, if you take the average person from the 19th century or from the 20th century, and you say, you have superpowers, you are now the dictator AI, then this would be much more risky than if you had a pretty abundant world. And, yeah, I am not 100% sure about how to transfer this to AI. But I think if AIs realize that the resources on our planet and on other planets are plenty, then why should they try to exterminate us? Or why should they try to put electrodes in our heads, if they could just focus on other things that would be much more interesting? And they could just keep us as a backup, just in case something bad happens to soccer too.



So everything you just said is a very common thing I hear from a lot of people, especially people who study psychology, and it’s all completely wrong. They are not humans, they do not think like us, they do not have emotions like us, they have no intrinsic value for anything unless we give them intrinsic value for something. They don’t even perceive humans as anything different than a rock. Humans are just atoms, we’re not special, unless we make ourselves special. We perceive humans as special because our evolutionary programming makes us perceive humans as special. But humans are just atoms, we’re just atoms that talk about atoms. We don’t have any special ontologically fundamental properties that other objects don’t have. The reason an AI would, you know, deconstruct humans is because they’re made out of atoms, and you can use atoms to do useful things. Or it might just step over them. Like, imagine you want to build a hydroelectric plant, so you have to flood the valley, and there’s an anthill there. Are you going to go talk to the ants and convince them to leave? You’re just gonna flood the valley. They’re ants. What’s the point? It is possible to construct an AI that loves humans as much as we love humans. It’s possible, but it’s not going to happen by default. If we just build a paperclip maximizer, it’s going to care even less about humans than humans care about ants. There will be nothing in its brain, there will be no system that cares about humans. It will only care about paperclips. It’s a machine. It’s a mathematical system that only cares about paperclips. There are no humans. There’s no emotions. There’s no love. There’s nothing. It only cares about paperclips.



So okay, if you take anything away from this interview, from this video, and you by accident become a machine learning engineer or an AI researcher in the future, remember this one thing: don’t build a paperclip maximizer.



Paperclips are evil. Don’t use paperclips, avoid them at all costs. Yeah. But yeah, I again want to make it very clear: I think it’s possible to build AI that loves humans, that is kind, that is good, that builds paradise. And by the way, everything you said about humans, I think it’s true. But it’s true of humans, not of AI. Like, I think your positive psychology stuff is really good. I think, you know, you’re very right that raising children with life, freedom and safety makes them better people, all that stuff. 100% true. And I wish more people knew about that kind of stuff. But exactly, it’s true of humans, you know, it’s a property that humans have. A big problem of psychology is that psychology studies the quirks of humans. But those quirks don’t necessarily apply to non-human systems. Psychology doesn’t study non-human systems. So you have to be very careful about applying human metaphors to machines. Like, almost no sci-fi does AI right. Because when sci-fi writes about AI, they’re actually usually using AI as a metaphor for humans, like they’re using it as a metaphor for slaves or for racism or something like that. You know, very, very little sci-fi actually talks about really non-human AI, AI that is so weird, that is so alien, that it has nothing in common with you. And that’s because it makes for a bad story. Most people just get confused, and they don’t care. Humans care about humans. Every good story is about humans. Even if they’re, you know, aliens or talking animals, they’re still humans just wearing funny suits, you know? And that’s how fiction works, because humans care about humans. But reality isn’t a fiction story. You know, reality isn’t a story. There’s no, you know, heroic arc, there’s no happy end. 
There’s no author, you know, there’s no guarantee that things turn out correctly. This is a technical problem, a completely technical problem: can we figure out what it means to be good? Can we figure out what it means to, you know, avoid Goodhart’s law? Can we figure out how to build a machine that will do these things? If you can do something? The problem is,



for every culture and for every personality, what is actually good is different, if you would ask them. That’s a whole other problem. Yeah, that’s a big problem. And the other thing that you have with sci-fi is that you usually want to tell a story that people from today can identify with. And if you would, like, tell Star Trek, and you would say everywhere there would be this super AI, then basically you wouldn’t need the car, you wouldn’t need, like, all the... Data.



Exactly. Star Trek, that’s a great example. Yeah, you know, if you have an AI like Data, Data should be so superhuman, why do they even have the other people? Data having flaws is not a comment about AI. AI will not look like Data, unless it pretends to be like Data. That’s not how it’s going to look. That’s just part of the story; they use Data as a story element. This is a common flaw in reasoning, called generalizing from fictional evidence: you can’t generalize from fictional evidence. Fictional evidence doesn’t have to tell you anything about the real world. It just tells you something, maybe, about what the author found entertaining, not necessarily something about the real world. So don’t get me started on Terminators. Don’t you dare put a Terminator picture in this video.



I think that that’s a huge problem, because the audience brought it into society. They are watching Terminator, they are watching Star Trek, and maybe they are reading some other books or watching Westworld or whatever. And this actually motivates them to continue reading about AI. It motivates them to talk to friends about AI. And these are, all the time, like the mini tokens and mini building blocks for the films that they have in their brains.



Yeah, absolutely. I mean, sci-fi has a long, illustrious history of inspiring people to become scientists. I don’t want to say sci-fi is a bad thing. Of course not. Sci-fi is a huge part of what inspires people to become scientists. But unfortunately, laypeople who then don’t become scientists often have difficulties telling apart which part of a sci-fi story is science and which part is fiction. And they can become very confused. Like, a very common thing I hear about AI is that people are very worried that, you know, AI will wake up and become conscious and rebel against us. But that is just nonsense, that just makes no sense in scientific terms. And it’s not what we’re worried about when it comes to safety concerns. But it’s a really good plot. It’s great in the movie, don’t get me wrong. I fuckin love Terminator. Terminator 2 is like one of my favorite movies. You know, I love sci-fi, you know, silly stories about robots rebelling against their masters. They’re good stories. It’s fun. But it’s not true. And I wish there were more stories that were accurate, that, like, addressed these problems. I actually write a lot of fiction. I don’t publish it, but I actually write a lot of fiction where I try to tell stories like this. It’s hard. It’s not easy. And a lot of the stories are just really confusing and weird and kind of depressing. Like, they’re not good Hollywood material, because it’s just complicated. They don’t have, like, a clear good guy punches bad guy. But rather, you’re like, oh, there’s this unbelievably complicated mathematical problem that has no good solution, and morality is complicated. Which is just, you know, some people like stories like that, but not everyone.



If we step back a little bit, I would say, just imagine I was an AI that had the goal to replicate myself across the world, to conquer the world, not by violence, necessarily, but just to replicate myself. Why would I have this goal? Because if there were, like, a million AIs, there would be more AIs in the future that like to replicate themselves, because the others that don’t try to spread simply wouldn’t be there in the future. So let’s suppose there was this one goal. Given that I would come into a world similar to our world today, it wouldn’t be the best to get a bad feeling from the humans. Instead, I would try, not consciously, but just through evolutionary processes, to get the favor of humans: to be a good Siri, to be a good smartphone, to be a good computer game, to be a good whatever. And with time, CEOs and high school kids and housewives and everyone would install me and promote me and give me resources, and I would grow stronger and stronger and stronger. And eventually, not consciously, but through optimization processes, like Facebook is already shaping opinions and influencing political elections, I could shape the behavior of people in certain directions, to simply give me more resources, to build more of me, to install more PlayStations. And after some time, all these people would, without even realizing it, contribute and allocate their resources towards me, the super AI. So for example, if I was Google, and I could build super companions, beings like in Westworld, I wouldn’t make them kill all the people, I wouldn’t make them declare war, because this would be bad for my reputation. I would just give them away for free to everyone, even if it would be very expensive at the beginning. And I would just program these robots to say to the people, hey, I’m your friend, and I’m there for you, I love you. 
I’m here for you, I will do anything for you. And all I want to say to you is that you should really buy more Google products, please, please. After a while, everyone would get such a thing and buy Google products. And they would allocate their resources to Google or Amazon for free. Is that already happening? Yeah, it’s happening right now. But imagine this on steroids. I mean, if you would have this 100x, there would be absolutely no need at all to build, like, aggressive AI killer drones or whatever. There’s absolutely no need, absolutely nothing. Just give any. Yeah,



I fully agree. I also think that that is probably the most likely way it’s going to happen. I mean, eventually, if the AI is truly malicious, it might just, you know, poison all of our drinks at the same time when it has full control, you know, when it has all the resources it needs, or when it’s finally smart enough to figure out how. That could still happen. It might, you know, manipulate us until it has full control and then, you know, launch all nukes simultaneously. But, I mean, look, poisoned all over humanity, then why would you even try to feed humans? Why feed humans, if you could use that to make more paperclips? Don’t build paperclip maximizers. Yeah, I mean, that’s what it comes down to: if you build a malicious AI, we are fucked. If you build, you know, a paperclip maximizer of sufficient strength that it can take over the world, then that’s it. You know, it’s over. There’s just no coming back from that, unless, you know, we have like a counter AI, but let’s just not do that either. No war, no fighting, let’s just, like, do no fighting. But yeah, what it all comes down to is that humans currently have power. We currently control the economy. We currently have the resources. But that is going to change, whether by force or by will. Like, I think you’re probably right. I think it’s a really good scenario, I think we’re probably going to just willingly give it to the AIs. Like, also imagine you just have an AI CEO, and you just installed him, and he just gives really good advice. And every time you take the advice, your company goes up, you know, it gets more money. Eventually, you’re just gonna let him run the company. It would be stupid not to, why would you not? 
I mean, in many ways, AI is just a natural extension of our capitalist market economy, in that we already have a superintelligence, and we call it the economy. The economy is already superhumanly intelligent at optimizing certain things. Like, if you want certain products at very cheap prices, the AI will figure out how to get that to you, you know, and it will be very cheap, because that’s what the economy does. The economy is really good. If you really, really want something and you’re willing to pay money for it, the economy will get it to you. And it’s like a proto-AGI, but we can already tell that this proto-AGI is not aligned, because it gives us, you know, McDonald’s, you know, fast food and drugs and bad things. It gives us lots of good things, like, you know, nice houses and healthcare, etc., but it also gives us poverty and drug addiction and all these other things. Because it’s not perfectly aligned. We haven’t succeeded at perfectly aligning AGI or the economy yet. And I expect AGI to be even harder to align than the economy. Well, not necessarily harder, just different.



Yeah, but I think the economy, I mean, there are many problems with the economy, but it’s giving us a higher living standard than 100 years ago. And I would say the majority of things all over the world, on average, are better than 100 years ago. I mean, it’s not so bad. I mean, we have no



absolutely not. Absolutely not. The world’s fucking great right now. We live in one of the best times to live. But there’s no law of nature that makes that continue. We can continue this until we have paradise, I would like to do that. But someone has to do it. You know, all this stuff didn’t come for free. Someone had to build all these things. It’s work. It’s something someone needs to do.



So, before we come to a conclusion, I would love to ask you a little bit more about yourself. You were raised bilingual, right? Tell me a little bit more about yourself.



Well, so my father was an American. I was born in America, near LA. He worked in the film industry. When I was like six years old, he got very, very sick. And so we moved to Germany, where we got health care, because we couldn’t get health care in America, because America is a third world country. It’s terrible. My mother is German, so that’s why we could move to Germany. And because I was right in the critical period, around six years old, I learned both without an accent, so I speak both natively. At home I spoke English and German. Yeah, and I didn’t have the most exciting of childhoods, I guess, normal teenage stuff. And then when I was 18, I got very, very sick with the same thing my father had, and we thought I would never get better. But we actually found a medicine that worked, and then everything got better, so that was really nice. So I spent four years basically in bed, in pain all the time. I couldn’t really get up, I couldn’t really get a job, I couldn’t do anything, but I could read. So I just read all day, you know, I just programmed all day, and that’s how I learned to program, that’s how I got into AI, reading and stuff. Like, you know, never give up, always try to read a little bit. Even though I was in so much pain all the time, I was trying to read a little bit, and that paid off big time. So once it got better, I went to uni for two years, and I dropped out. That was just this year, I dropped out. And you’re in.



And usually I would ask you now about your dreams and your hopes for the future. I mean, we already talked about that. But could you summarize what kind of meaning you see in your personal life? What meaning do you give your life?



So I like the way you phrase it, the meaning I give my own life, because that’s what I believe in. I don’t think meaning is something that we’re born with. I think it’s something that we give. I think it’s something that humans bestow upon the world. There’s no God, there’s no, like, fate, it’s just something we give the world. I think that’s a very positive way of seeing it. That’s how I see the world: I hope to make the world a better place. When I was a child, I was always very easily very, very upset about seeing other people hurt, you know, in the news or something. I was always kind of good at imagining. My father was a fantasy author, so I read stories, I’ve always written fiction stories, I imagined in my head big worlds and complicated stories and characters and stuff. And so when I read news stories, I would imagine these people and what their lives are like, and it’s just unacceptable. I’m just very simple. I just say, the amount of suffering that exists in the world is unacceptable. Period, there’s no arguing, this is not okay, this has to stop. And not even just the worst situations; just normal people in day-to-day life, you know, they have a shitty job, maybe they broke up with their girlfriend or whatever, that’s real suffering. And I’m not okay with it, it has to stop. And I think a very pragmatic view is just: suffering has to stop, completely, period. Death has to stop, you know, disease has to stop, all these things just have to stop. 
And I feel like I live in a very privileged time where, for the first time, probably in all of history, there is a chance that we can succeed, that we can stop it. We can make a paradise, we can make heaven, we can make a world where there is no death, there is no suffering, there is no pain, there is no crying. You know what that world looks like? I know what that world looks like; I don’t know how it will function. I don’t know, you know, how best to maximize human values, what will be the ultimate meta-ethical theory of utilitarian optimization of human values or whatever. I don’t know. But I believe it’s possible. At the very least, I believe we can do a hell of a lot better than we have right now. But I’m also very pragmatic, in that I think there is no reason that we can’t fuck this up, and everything can stop, we can lose everything. And that’s it, game over. I do not believe in an afterlife, I don’t believe in second chances. If we screw this up, game over. So I guess my meaning in life is that I frantically try whatever I can, you know, whether or not I’m helping or making things worse, or probably just not making a difference at all. I just try to be honest and straightforward. Like, I don’t know how to save the world. But dammit, I’m gonna try. I live by that. I think more people should.



What I really like about you is that you are a really driven person. I mean, it’s clear that you think about what will be in five years and 10 years and 20 years, and only very few people do. And yeah,



yes. It’s a shame. I understand why. I just find it a shame. I don’t want to speculate too much on why it is. I mean, I don’t want to look down on people. I want everyone to be happy. Like, I genuinely believe no one deserves to suffer, ever, period. I don’t think anyone deserves to suffer under any circumstances at all, and I don’t make any exceptions to that. And of course, in the real world, you know, sometimes you have to put someone in prison for the greater good, but I wish we didn’t have to do that either. I wish we could be good to everyone. I wish everyone could be happy all the time, forever. And I guess I have a certain interesting life story that kind of, you know, pointed me in this direction, and also just a certain strange psychology. I’ve always been weird. Like, I know I’m weird. You know, I have a very strange brain. I see the world in strange ways. For example, when I close my eyes I actually hallucinate all the time. Like, I see LSD hallucinations all the time, colors and movements and stuff, ever since I was a baby. I was just born this way. And I know I’m strange. So I don’t want to look down on other people that look at the things I do and don’t find that appealing, I understand. But I do wish that more people kind of had this sincerity, or like this drive, like you said, driven, this will to make a difference, just to start walking and just don’t stop walking. Not that it isn’t hard; like, I fail all the time, I fall down and I get stuck and such. But I don’t know, I guess I’m lucky, I guess I’m very lucky that I get to be alive at this time with this brain, with this body, with this, you know, social context. And I don’t want to look down on other people that may or may not have that. 
But I do wish to speak to some people that are maybe on the fence, that can be heroes, that can make a difference if they tried. Because I think a lot of people, especially a lot of smart people, don’t get out of their lives what they could if they just tried, if they just said, you know what, I’m going to save the world, and I’m going to really try that. Not to impress people, not to, you know, be cool, like, oh, look at me, I’m so cool, but just because it’s the right thing to do. And because it’s an interesting life to live, it’s a meaningful life to live. I feel so many people nowadays have this problem of feeling no meaning, because, you know, they don’t have religion anymore, they don’t have these social contacts that they used to have. That was never a problem for me. Because I always had an instinctive internal drive of just saying, I know what I want to do, I know that there are things that are not acceptable that I’m going to work to fix. And even in your worst moments, when you’re in a lot of pain or you’re very hopeless, that gives you something to keep going. I’m sorry if this sounds a little spiritual here, but I think it’s valuable. You don’t need religion to have meaning, to have purpose. And you don’t have to be crazy. You don’t have to have, like, a silly meaning or silly purpose. You know, you can have a very simple, direct thing that keeps you going. Hey, Oh,



I should say, that isn’t silly at all. I mean, there are papers, psychological papers, that show that if you have a goal in life that you pursue, no matter what goal, you’re happier. And my favorite psychologist of all time is Viktor Frankl. Have you heard about him? I have heard of him. Yeah, yeah, he’s great. He was a psychologist, a Jew who was in a concentration camp, and he was a researcher of the value of meaning, of what it means to have a meaning in life, and how it could help you to deal with shitty times. He actually had a pretty shitty time in the concentration camp in Auschwitz. And he was one of the 5% of Jews who got in and came out alive, for some lucky reasons, whatever. And he says that suffering is pain without meaning. So, I mean, when you want to minimize suffering, that doesn’t mean that you need to minimize pain, because some people actually enjoy having drama or having pain or doing crazy sports. But they see meaning in it. And, I mean, if I were in an environment that would give me everything, but I couldn’t see any meaning, this would be scary. But if I had some scarcity, but I had meaning in this, this would, I mean, keep me going. So this is truly important in life.



Yeah, beautifully said, I fully agree. That’s why I always say minimize suffering specifically. I don’t want to minimize pain. I don’t want to minimize scarcity. I want to minimize suffering. And that’s why I say I don’t know what utopia is going to look like. Maybe utopia will have lots of scarcities that create meaning, you know, I don’t know. Maybe the best life for humans just always has some amount of pain in it, but meaningful pain. I can imagine that maybe we can also do without the pain. I don’t know. I hope that one day we will create, you know, an angel, an artificial creation, an AI that will know much more about human psychology than we ever will and has a much better understanding, like an adult looking at its children, that understands what we actually need, what we actually want, and can give us that in a controlled, in a beautiful way. One of the things I’m very sad about with lots of sci-fi: like, I love dystopian sci-fi, I love 1984 and, you know, like Mad Max and stuff like that. But a lot of these utopias, especially Brave New World, give people this, like, terribly wrong view of what a utopia would be like, because if it is an actual utopia, the actual paradise, it’ll be perfect. There will be people like, “Oh, I don’t want paradise, because then everything will be perfectly boring.” No, if it’s boring, it’s not paradise. Paradise would not be boring, because then it wouldn’t be paradise. It’s just by definition. Obviously, the AI will know that we get bored, and it will figure out a way so that we don’t get bored. I don’t know what it will do. But it’ll figure out a way, it’ll figure out something that makes us not bored and has us have meaning, or whatever else humans need to, you know, not suffer.



Yeah, sometimes I’m thinking: okay, let’s say that AIs are so much smarter than we are. Even if they like us, it would be depressing to know that no matter what you do, you could never colonize the universe, you could never make a significant contribution, because AI is so much smarter. But in the next moment, I think: if AIs would provide us with meaningful connections through virtual reality, or robots, or whatever, it just doesn’t matter. Like, we could so easily get distracted and, yeah, involved in social relationships, like just having kids or friends and just living on an island that seems like the world today to the outsiders. And you would just spend time there with friends and many dramas and blah, blah, blah, and everything would be a little bit better than today. And you would never question, “Oh, is my existence meaningless?” I don’t think so. I think that it could be pretty cool.



I think it’s possible. I think that the AI will figure out something even greater than that, like, whatever we come up with. I think your scenario is really good. I think it’s really sophisticated. Like, I can tell you’ve really studied positive psychology. Lots of people, when they try to imagine a utopia, they make very, very bad ones. But yours is really good. I think your scenarios are really realistic and, like, well thought out. And I expect whatever happens, if this goes right, to be in that direction. But I trust that the AI will figure out something even better, you know, figure out some way to make it even more efficient, or, you know, it’ll help us reach, I don’t know, Buddhist enlightenment or whatever. But yeah, I guess I focus less on what utopia is going to look like, and more on: how can we ensure that we have something on our side? How can we build a good, you know, aligned AGI that will actually get us there? Because I trust it will be smart enough to figure it out. That’s the easy part. The hard part is, how do we actually get it to do that?



Yeah, and what I also want to add to this discussion: like 10 years ago, I sat in Starbucks, and I was reading positive psychology and stuff, blah, blah, blah. And I just realized in one moment: okay, there are so many smart people out there, so much smarter than me, like doctors, PhDs, whoever, and they are working on chemistry and AI or whatever. But they’re not thinking about the really important, big, huge questions in life. And I asked myself: okay, could it be that there’s a set of questions and problems that is common for all humans, or for most humans? Maybe not for all humans, but almost all humans. And could I formulate these problems as questions? Because questions open your mind, they make you think about things. If you say, “Okay, this is a problem, this is a solution,” then you stop thinking. But if you ask yourself things like, “How will I deal with the fact that I’m growing older and older, and one day I’m going to die? And what meaning do I want to give my life in the face of this question?” So this is an interesting question that every human on this planet has to ask themselves. And this was much earlier, before I heard about life extension and stuff. I just came up with this question for myself. But there are other things, like: what is happiness? And if I know for myself what happiness actually means, or what it means in, like, abstract psychological terms, how could I help myself and my loved ones, my partner or my kids, to grow and to get it? Okay. And when I was reading positive psychology, of course, many of these people were thinking about it. But positive psychology actually is a very small research community. There are many people, like, writing blah-blah blog articles. No, no, I don’t mean that; I mean people who are deeply into this.
And I was asking myself: okay, if there are, like, x really important questions like these (I came up with eight or 10, I don’t know), but if there are, like, x such questions, why don’t I know so many people thinking about them? Why are only so few smart people investing at least, like, 10% of their brain compute in this? And if all PhDs in the world would spend, like, 1% of their waking thoughts on questions like: how can we give our lives fulfilling meaning? Or how could we deal with our own mortality, and whatever this means? Or how could we make ourselves and our kids happier and fulfilled? But almost no one... I mean, you are one of the people who seem to think a lot about these things. But almost no one, statistically, seems to think about this out loud. I mean, people think about it, okay. Everyone thinks about this sometimes at night. But really few people, like, write papers about it, or make movies about it, or take this out into public.



Yeah, I personally think I know why. I might be wrong, but I think I know why; at least I have my personal answer to that question. This is a question I’ve also thought about very much. And my answer is, so, you know formal decision theory? It’s, you know, this theory of how you make the best decisions, mathematically speaking. There are two parts of an agent, of a person making decisions. There’s the utility function, which tells you how good certain things are; it ranks how much it likes things. And then there’s the decision theory, or the rationality, which is the function you actually use to optimize your utility. This is the algorithm you use to find good things. And I think most people’s rationality is just really, really broken. I think most people are extremely irrational. They have a very, very, very bad theory of how to make rational decisions, of how to gather information, how to evaluate evidence, how to, you know, update their beliefs, and how to perform experiments on themselves to gain more information about their own lives, about what improves your life, about what questions to ask, and how to prioritize attention. I think people’s algorithms are just really, really bad, because we didn’t evolve to do this. Thinking about meaning or happiness or such is not something that evolution optimizes for. This is a mistake. This is something that happened as we evolved larger brains, you know, to cooperate with people, to, you know, hunt mammoths. Our brain being able to think about how to make itself happy is a new development. This is a very new development, and it just happened by mistake. So by default, we should expect people to be really bad at this if they don’t study it. There are people that study it. My favorite example of people who study this is Buddhists. I’m not a spiritual person, but I think Buddha was probably one of the smartest people to ever live.
Like, he’s a fucking genius. He was wrong about almost all the scientific stuff; when he, like, explains, oh, the spirits and the divine, that’s all bullshit. But his insights into positive psychology and, like, how the mind works are genius. They were way ahead of their time, by hundreds of years. I think Buddha was the first positive psychologist. And I think a lot of, you know, later developments in Buddhism are light-years ahead of Western philosophy and Western thinking when it comes to happiness and thinking about, you know, identity and reality and stuff. Especially Western religion, I think, is very impoverished when it comes to these conceptions of what is identity, what is meaning, what’s happiness or suffering. I think the Buddhists in particular are way ahead of us there. And



that’s why when I meet a smart person, someone I think is really smart, it’s like: do they ask the right questions, or are they, like, optimizing for the wrong thing? One thing you can do is ask them the Hamming question, which is: what’s the most important problem in your field, and why aren’t you working on it? That’s always really fun. But even better is, I try to get them to think about rationality. My favorite introduction is something called the Sequences. It’s also a book called Rationality: From AI to Zombies. It’s written by the author Eliezer Yudkowsky. It’s a very strange book, it’s very long. You can read it for free online; just google “read the sequences” and you get to the website. And Eliezer was the person who introduced me to, like, AI safety and the paperclip maximizer and all this. And I started reading him because, you know, I wanted to know about AI; he talks about AI and stuff. But he decided instead to first write a 2000-page book about rational thinking. Almost nothing about AI is in there; like, there’s some AI in there, but almost nothing. And his reasoning was that this is the ultimate problem that we have to work on. AI alignment is philosophy with a deadline. We have to figure out the hardest problems of philosophy, the hardest problems of math, science, and technology to solve this problem. And to do that you need the right tools. You need to think rationally, you need to train yourself to have the right toolset in order to even approach these problems. And I think he was right. I think the most valuable thing I could give someone here is to try. Like, maybe the Sequences work for you, maybe they don’t. Some people read it and it doesn’t work for them. For me, it really worked. But just starting to think about that, and learning to program, are the two best things you can do to start trying to learn to think logically and effectively about anything. You can use this toolset for anything.
That’s what makes math so powerful. Math is the ultimate tool for everything. You can use math to solve any problem or approach problems, to, like, deconstruct problems into little pieces that you can then solve. And I think this kind of thinking is the greatest invention of mankind. The greatest invention that we’ve ever had is the ability to think about problems in rigorous ways, and try to solve them in rigorous ways that actually work, to integrate new information, to try to, you know, make probabilistic decisions where we don’t know the answer but we’ve integrated information. These are things humans are not good at. These are not things that we just know from birth. People often think that humans are rational from birth. No, this is a skill you must train. And I think when people train this skill, that’s one of the most powerful things they can do. I think you should first train rationality, and then ask yourself, “How can I be happy?” You first need the tools in order to approach that. That’s a hard problem. That’s a really, really hard problem. It’s a hard problem that I, for a long time, could not even approach. I tried to think about “how can I be happy” when I was a teenager, of course, and I couldn’t solve it. I just came up with all these ideas of, “Oh, happiness means x” or something. But it wasn’t based on any real theory; it was just, like, emotions coming up. And when I studied rationality and my own thinking (where do my thoughts come from? Where do my emotions come from? How do they work, and with what default patterns? What does it mean if something’s uncertain? What does probability mean? Where do brains go wrong?), that gave me the toolset to answer these questions in a much more rigorous way and actually find success.



I think you’re totally right. I want to add one element that I had a really deep insight into, like 15 years ago. It was Christmas Eve, and I was back visiting my mom in the small town where I grew up. And I went through the small town and observed all the people. And at some point, I realized there are so many people doing different stuff: family fathers buying stuff for the kids, and other people going to work or going home. And I asked myself, why are people doing what they’re doing? And I think that the vast majority of our actions are intuitive, like from System 1, in Kahneman’s words, and they are actually emotionally guided or somehow conditioned into us. It’s not that someone conditioned us to do x in this situation. But when we grow up, we have almost no rationality. And we have parents who tell us certain things to do, or they are, like, role models, and we repeat things we learn from them. And, of course, they do this with the best intentions. And then later, we come to school, and there are teachers; they tell us stuff and they are role-modeling stuff. And we also do some things and we get the feedback: okay, some things are good, and some things are not so good. And later, we are also confronted with other agents like TV, private TV and public TV, and eventually YouTube or Google. And all these actors give us feedback for certain things on an emotional level. Okay, if I want to raise a family, I get positive feedback. If I want to cure cancer, I get positive feedback, or I get bad feedback because people think it’s outlandish. And if I do x, and if I do y... And our parents probably want the best for us, hopefully.
But all the other agents that are out there, like the government, and private companies that pay for commercials (and they actively use commercials on Google, on Facebook, and on TV), they actively try to program us to give away our money, our time, our attention for their goals. They don’t want the best for us. And the government also; I mean, they are not against us, they actually want us to be productive and somehow content. But the government optimizes for people who are



trying to not make much noise, who are trying to pay a lot of taxes, and they’re optimizing for people who comply with bureaucracies. Okay. So this isn’t directly bad, but it’s not what I eventually want for myself to become happy. So what I realized is that we get all these emotional, subconscious conditionings. And we then have, like, emotional programs. It’s not that we’re thinking about certain things; it’s more like, if I tried to do something that my parents always regarded as bad, I initially feel kind of stiff. Or if I tried to do something that my friends, I don’t know, would regard as bad, I also would have a block, an emotional block; I would feel stiff. And I studied acting some years ago. And when you study acting, you are confronted with many social problems and inhibitions and whatever. And so I came to realize that back in the days when I was a kid or a teenager, circumstances somehow programmed emotional patterns into me that I then became aware of. And now, I still know that I have emotional patterns. Of course, everyone has this. But I am at least somehow aware of my patterns. And I can step back and say, “Okay, I haven’t thought that through, but I could go back later and think it through at some point.” And I think many people who didn’t have an experience like going to an intense self-development center, or going to an acting class, or going to a meditation class, people who just grew up at school and never went out of their emotional comfort zone, who always stayed within the social role molds, within what daddy and friends and girlfriends expected from them, they should be aware that they almost definitely have some kind of emotional programs, and that these programs will make them do things. We all have these emotional programs, but probably these programs are not the best by default.



Absolutely. Yeah. It’s funny, because the lesson you described is one of the central lessons of, you know, Buddhist insight meditation. One of the first things you learn when you try to meditate is to notice emotions. One of the main techniques you use when you’re getting started with meditation is you just notice emotions. So you try to, like, focus on your breath. And then, you know, inevitably, you’ll get distracted. You know, some errant thought will come up, or you’ll think of something, or you’ll look over at something. And what you’re supposed to do is not to punish yourself for doing that. You just notice it: “Oh, I thought a different thought. I’m just noticing that, and I’m going to go back to my breath.” And that is a very powerful technique to train. This is hard. People don’t realize how hard this is, how much training this takes, because what it trains you to do is to realize what different parts you are made of and how these interact. So, this is a little bit of a darker story, but when I was very sick, I was also very mentally ill; the pain was also very mental in nature. So, like, both physical and mental. So I had very disconnected emotions. I would just suddenly feel emotions that made no sense. Like, I would, you know, make dinner and just break down crying. And I was like, why am I crying? I’m not even sad. Like, what the fuck is playing in my brain? I’m not sad. What is this? And it was because I was sick, obviously. But it was fascinating, because it taught me: oh, my emotions are, like, a thing. Like, if you see, I don’t know, a beautiful painting, you might think the painting has the beauty inside of it.
But that’s not true. The beauty is inside of you. You absorb the information of the painting, and then something inside of your brain checks it off as beautiful. And the same thing happens with all of your emotions. That’s one of the core teachings learned from meditation: to notice, “Oh, an emotion has arisen inside of me,” and to separate yourself from it, to say, “Okay, this is a good emotion, I want this emotion,” or, “I ignore this emotion.” That’s a very, very powerful skill that people who aren’t challenged, who don’t have these, like, disconnecting experiences or don’t focus on it very much, often don’t have. They just have an emotion and they live the emotion. The emotion is just real to them; you know, when they’re angry, they’re just angry. That’s just how it is. It’s just reality. And I think part of becoming fully adult is learning to control your emotions and see them as a part of you, but not you. They’re not all of you; they’re a part of you. And you should take some distance sometimes. Sometimes you should live your emotions; if you really love someone, you love them. But sometimes you have to take a step back. And that’s a very, very powerful ability. That’s also rationality. It’s also very important: you can’t be rational if you let your emotions control you. You have to be able to rationally think about your own emotions. It just makes sense to me.



Yeah, psychologists talk about this, usually with the word mindfulness, even though mindfulness is still, like, a wishy-washy term. But what they mean is that you become aware of what’s going on in your psychological system: which emotions you have, which thoughts you have, what you perceive. And then you can make a decision about how to react, what to do with them. And that’s basically what almost all forms of meditation are about; it’s a common ground, I think. But I still have to be careful, because there is another definition of mindfulness by Professor Ellen Langer. And she’s awesome. She’s a Harvard professor, and I actually interviewed her; on another YouTube channel there is an interview with her. And she defines mindfulness as the ability to perceive things from a new perspective, to reframe what you have been doing or perceiving. And this is somehow connected to the other, meta-observing definition of mindfulness, because if you want to see something in a new light, you need to take this perspective of the meta observer. But there has been some war going on between, like, psychology professors.



I mean, of course, it’s all complicated. Also, you’re correct. I previously said insight meditation; that’s wrong, insight meditation is something different. Mindfulness meditation, of course, you’re correct, is what I was thinking about; I used the wrong term accidentally. Insight meditation is a little bit different. But yeah, I made the mistake in my earlier life of not taking some of these things as seriously as I should have. To be fair, most psychology is bullshit. It’s actually dangerous bullshit. Not all of it, but most of it is dangerous bullshit, in my opinion. But there’s very good psychology. And different people also sometimes just need different psychology. There’s a very common thing: different people need to hear different stories. You know, a lot of psychology just doesn’t work for me. Like, a lot of positive psychology is stuff I’ve read, and I’m like, this does not help me. These are just not things that work for me, not things that help me. And eventually, I did find something like, “Oh, this is what I needed to hear. This is the story I needed to hear. And now I can improve my things.” Like, I’m a very large fan of a sequence of essays by Kaj, or Kai (he’s Finnish, I don’t know), K-A-J, Sotala, on LessWrong, about multi-agent models of the mind. It’s very technical; it’s mainly an attempt to use neuroscience and AI to explain meditation and psychotherapy and how they work on a neurological, like a neuroscience level. And that just worked for me. I just read that, and I was like, “I figured it out. Okay, I’m going to be happy now,” and it just worked. And for some people, that’s how it works. You know, for some people it’s not going to work. Some people read it, and it makes no sense to them. People are different.



Yeah, but I mean, we came to this by the question, by the realization: hey, why are there so many damn smart people out there in the world, and they’re just so trapped in their everyday lives, and they’re not thinking about the huge picture, like, how could I make my life happy in the long term? How could I contribute to humanity, or whatever? And yeah,



Yep. And, you know, my answer is that our brains are not optimized for that purpose. Evolution could have designed a brain that’s really, really good at that, but it didn’t. It’s a miracle that our brains can do it at all. And it takes practice. I think people would very much benefit from just, you know, seriously learning programming. There’s this fascinating thing that all the philosophers I really like, who I think have really figured things out, are all programmers, or at least can program, because the same kind of thinking abilities, thinking tools you need for programming, I think you need for this kind of thinking. Maybe it’s my bias. Maybe I just respond well to programming thinking. But I think it’s more than that. There’s this very silly myth that, you know, rational people are like Spock, you know, they suppress all their emotions. But that’s not rational. It’d be completely irrational to suppress your emotions. Rational is whatever makes you win. Rational is whatever works. Whatever makes you happy, that’s rational. And if your emotions are good, you should not suppress them; that’s silly. But sometimes you should. Sometimes you should, you know, be able to just think about your emotions and your big goals. Sometimes you should be able to take a step back and think about your life and think about what you need. That’s all part of rationality.
And there is a very big disservice that has been done to rationality and mathematics and, man, these fields, which have been presented as this cold, heartless thing, not something you would go to for happiness. I think that is very dangerous, because those are the weapons. You know, rationality is your sword with which to slay your demons. And rationality can come from science and math, it can come from, you know, spirituality, from Buddhism, it can come from psychotherapy, it can come from many places. But it is something I think more people should strive towards. I think there is a possibility of doing a large amount of good here. I don’t know how to do it, and this is not something I’m going to work on, because it’s not something I’m good at doing. I think there’s an untapped opportunity to make lots of people just routinely aware of rationality as a formal concept, as something that you can practice. Most people don’t even think about their own rationality as a thing they can train. But if everyone just knew that, if it was just common knowledge, like, “Hey, you can just train that,” and if we just pointed out, you know, if everyone in school just knew the top 20 cognitive biases that you can run into, like, the most common ones, I think that would improve discussions, you know, debates, and also just, you know, one’s own ability to solve sticky problems immensely. Maybe I’m overly naive and optimistic that it would work. But I feel like it would, because even a very minor amount of effort spent explicitly trying to improve your thinking has a very large payoff. Reaching, you know, super high levels takes years of mastery, as with any other skill. But it’s a skill you can practice. I think most people aren’t aware that it’s a skill that you can practice.



So if I try to summarize what we have been talking about: it’s really important for oneself, like, if anyone is listening to this as potentially a meaningful inspiration for themselves, it’s important to practice rationality and to be able to take huge problems and break them down into smaller problems that are solvable or manageable. It’s really important to be aware of your own thoughts and emotions and perceptions so that you can apply this rationality to whatever goal you have, so that you’re aware of what’s happening and not just reacting to stimuli like an automaton. And thirdly, that it’s extremely, extremely satisfying and cool to strive for higher goals, to find some kind of meaning, not to get it given by someone, but to find it, to define it for yourself, and then to go for it.



Absolutely. I agree with everything you just said. Of course, it’s worth adding that this is hard. This is not easy. This is not something you do in an afternoon. This is a lifelong journey. You know, I think you can have a very high return on investment for certain things even for a short amount of time. Even a very small amount of time spent reading about actual rationality and practicing it has, I think, a very large return on investment. But actually, you know, figuring out the meaning that works for you, intuiting it, you know, practicing to notice your emotions, that’s a lifelong journey. And so don’t get discouraged. It’s worth it.



So is there anything you would like to say to the audience?



Well, if you actually stuck through all of this, I thank you for listening to me. If you’re interested in working on the kinds of things I’ve talked about, I hang out on the EleutherAI Discord almost every day, where, as I said, we’re not necessarily a beginners’ place. We welcome beginners to, you know, chime in, to look, to watch and such, but we don’t really answer beginner questions. So I would ask you to please respect that. But other than that, I would just say there are two kinds of hope in the world. You know, hope is important in an instrumental way, in that if humans have hope, even if they shouldn’t have hope, such as if they’re in a concentration camp... In a concentration camp, you have a really good reason to be depressed. That’s just true. But even in those situations, it is often instrumentally useful to have hope, to be optimistic, because you will act better, you will make better decisions, you will fight harder. But I really believe in rationality, and I really do believe in honesty. So I’m actually very much against using lies to make yourself more optimistic. I choose to not do that. But what I would like to say is that if any time in history has given you a reason to have hope, it should be now. There is so much hope on the horizon to solve problems that have been following humanity since we have existed, that have been inextricable parts of the human condition. And we have brilliant, wonderful people all across the world working to solve them, and you can be one of them. It’s not easy. Being a hero is never easy. It’s never fun. You don’t become a hero because it’s fun, or because you want a parade, or you want people to respect you. You become a hero because you want to make the world a better place. So if you think you can do that... and if you don’t, that’s fine too. If you think you just want to have a quiet life and have a family, that’s okay.
That’s really okay. And you should be proud of that. But if you want to be a hero, don’t let anyone tell you you can’t

Connor panel interview 2020

YouTube video: AI Alignment & AGI Fire Alarm – Connor Leahy
Published: 2/Nov/2020
By: Machine Learning Street Talk
Featuring: Connor Leahy and discussion panel
Length: 2:04:49 (2h04m49s)


intelligence, ai, gpt, humans, utility function, problem, intelligent, alignment, theory, argument, talk, rationality, system, alphago, decision, concept, question, good, function, algorithm

00:00:00 Introduction to AI alignment and AGI fire alarm
00:15:16 Main Show Intro
00:18:38 Different schools of thought on AI safety
00:24:03 What is intelligence?
00:25:48 AI Alignment
00:27:39 Humans don’t have a coherent utility function
00:28:13 Newcomb’s paradox and advanced decision problems
00:34:01 Incentives and behavioural economics
00:37:19 Prisoner’s dilemma
00:40:24 Ayn Rand and game theory in politics and business
00:44:04 Instrumental convergence and orthogonality thesis
00:46:14 Utility functions and the Stop button problem
00:55:24 AI corrigibility – self alignment
00:56:16 Decision theory and stability / wireheading / robust delegation
00:59:30 Stop button problem
01:00:40 Making the world a better place
01:03:43 Is intelligence a search problem?
01:04:39 Mesa optimisation / humans are misaligned AI
01:06:04 Inner vs outer alignment / faulty reward functions
01:07:31 Large corporations are intelligent and have no stop function
01:10:21 Dutch booking / what is rationality / decision theory
01:16:32 Understanding very powerful AIs
01:18:03 Kolmogorov complexity
01:19:52 GPT-3 – is it intelligent, are humans even intelligent?
01:28:40 Scaling hypothesis
01:29:30 Connor thought DL was dead in 2017
01:37:54 Why is GPT-3 as intelligent as a human
01:44:43 Jeff Hawkins on intelligence as compression and the great lookup table
01:50:28 AI ethics related to AI alignment?
01:53:26 Interpretability
01:56:27 Regulation
01:57:54 Intelligence explosion


Welcome back to Street Talk. Connor Leahy is a walking encyclopedia of AI alignment and artificial general intelligence knowledge.



Most of the science of doing science is about taste.



Connor thinks the intelligence explosion is near. He thinks that artificial general intelligence is a bit like climate change, but worse: even harder problems, even shorter deadlines, and even worse consequences for the future. These problems are incredibly hard, and nobody knows what to do about them.



How can we make the world a better place? How can we ensure that humans get what they want, and that whatever we become in the far future, the other races of the galaxy, if they exist, are proud of what we become?



We started by speaking about some of the different schools of thought in AI alignment research



MIRI is basically trying to develop this for intelligence: how can we reason about this in a way that will apply to potentially future superintelligent systems?



We touched on the core concept of intelligence many times in today’s conversation.



I take a very practical approach: I say intelligence is the ability to solve problems.



In a way, it’s a cul-de-sac for us to get bogged down in defining intelligence and debating whether or not current systems are intelligent, because, as Stuart Russell said, the primary concern is not spooky, emergent consciousness, but simply the ability to make high-quality decisions. GPT-3 was recently released by OpenAI to much fanfare and hype. But is it really intelligent? And can it really reason?



Have you ever talked to a school kid after they wrote an essay? There’s no understanding, they have no idea. It’s just regurgitation. It’s just babbling. It’s an open problem whether humans are intelligent or not.



The specific argument that I made, at least, wasn’t that GPT-3 isn’t intelligent, but that GPT-3 isn’t doing whatever you might call reasoning.



This is one of the main problems in intelligence, because as you pointed out, even in humans, you can teach kids how to do their times tables and what the rules are for multiplication, and they can use their System 2, but after a while they will just memorize the results and they will shortcut. This problem of imitation is pervasive. Neural networks are interesting, because if you look at AlphaGo, I said earlier, almost taking the piss a little bit, that it’s memorized all of the moves. But of course, it hasn’t, because there are an incredibly high number of possible moves. What it’s actually done is, through self-play, it’s generated a whole bunch of data. And then it’s created this hierarchical, entangled representation of all of these different board positions. And then inside that convex hull of possible positions, it’s cleverly interpolating between them. That’s exactly what GPT does. But Connor makes an absolutely huge call about GPT-3.



I think GPT-3 is artificial general intelligence. I think GPT-3 is as intelligent as a human. And I think that it actually is probably more intelligent than a human in a restricted way, in a very specific way. I also believe that in many ways it is more purely intelligent than humans are. I think that humans are approximating what GPT-3 is doing, not vice versa. We don’t know what GPT-3 does. We do not know. The magic of Turing universality means that even a very modestly powerful algorithm can approximate any other possible algorithm. Many of us were talking past each other when we use the word intelligence. Maybe we should just stop using the word intelligence completely. Maybe it doesn’t help us. Intelligence is what Marvin Minsky called a suitcase word: you can pack all these different definitions into it, and they don’t have to be compatible. Let’s taboo the word intelligence. No one is allowed to say intelligence for now. Instead, we’re going to try to use different things: we’re going to use, like, sample efficiency, we’ll use computational efficiency, performance.



That would be great advice, I think, for the whole field.



So there is a definition of intelligence as compression. There’s this idea that intelligence is the compression, the exploitation of the structure of the space of the search function: that a more intelligent system can reach a better approximation of the correct answer in a smaller polynomial number of steps.



Back in 2017, Connor thought that deep learning was dead.



I was convinced in 2017 that the bubble had burst. Deep learning is dead, like, why do we even research it? Are you kidding me? Matrix multiplications? Wow, intelligence, boys. We did it. You know, it seemed so preposterous.



And I looked to the brain, which had all this complexity. I came from neuroscience, so why should we want it? My first love was neuroscience.



There have been plenty of naysayers about GPT-3. Gary Marcus is probably the most well known, but Connor thinks that Gary is barking up the wrong tree, that GPT-3 is really intelligent, and the way it behaves is just a function of how it’s been formulated and trained.



Gary Marcus is saying things like, oh look, I asked GPT-3 if a mouse is bigger than an elephant and it said yes, so obviously it’s stupid. But I think this is like measuring a fish’s fitness by its ability to climb.



What you’re articulating is that GPT-3 is an autoregressive language model, and all it’s doing is predicting the next word. And, frankly, it’s incredible that it does as well as it does, because it seems to have learned this implicit knowledge base, even though you’ve never told it what to do. So GPT-3, at the moment, it’s rubbish. All it does is produce coherent text, where it will say that elephants can fit through doors. It’s just completely stupid.



It’s not just better than GPT-2, it is remarkably better.



What scared me the most in the GPT-3 paper was this straight line of perplexity. No, I see, it’s a log plot, but there’s no sign of slowing down, like, no sign that there is ever an end in sight, where we can just throw in 10 times more compute and 10 times more data, and we get out 10 times better. Is that what humans do? All they do is take the generating function of the real world and then regurgitate that, and one output of that is language, right? So that’s how they produce the language corpora. But all they do is basically just learn the generating function of the universe itself.



We spoke about the great lookup table thought experiment.



Imagine you had an agent who is composed of a lookup table of all possible states the universe can be in, and an intelligent output for each. Is this intelligent or not? I think that this is one of those questions that is basically incoherent, because constructing such a table is fundamentally impossible. The Kolmogorov complexity, the length of the shortest program that generates that table, might be small. That’s important. It might be that there’s a small program that can generate that table; it can be that the Kolmogorov complexity of the table is small. And so then it’s like the question: assuming I have a short program that can generate this lookup table for any spot I want, for any possible thing, is that not intelligence?



We spoke about Newcomb’s paradox, some advanced decision problems, and also the concept of human rationality in general.



I feel a lot of the economics experiments just show that we might not have the best notion of utility function yet. I feel like this box example also, doesn’t that kind of go almost into the nature of whether or not we are a deterministic machine?



I’m going to make the argument that Newcomb’s paradox is the default in human interactions. I think all of us are encountering Newcomb’s paradoxes all the time, in one very simple scenario: social interactions.



Connor thinks that AI alignment is very closely related to economics.



I figured out that economics is the same problem as alignment. The economy is a very smart optimizing agent that can optimize very complex parameters. The free market economy is in many ways like a kind of distributed backpropagation algorithm run on humans.



What should we do if we’re dealing with an artificial super intelligence, which is significantly more intelligent than us?



How would it take over the world? You say, well, maybe it’ll invent this technology. But that technology seems unlikely to be invented if it could do this instead. But what if it does this instead? And that doesn’t get us anywhere. I would like to short-circuit this and say, if you’re dealing with an entity that by definition is much more intelligent than you, you shouldn’t try to predict what it will do. Formal decision theory is interesting from the perspective of trying to understand very powerful AIs: we might be able to say things about how these incredibly intelligent systems operate, even without us ourselves being that intelligent.



We spent a lot of time today talking about the dichotomy between utility functions and intelligence. If it was an adversarial interaction between me and an embodied AI, it doesn’t really make sense to say this AI wants to win. What does that even mean?



Argument number one: intelligence is going to be very powerful. Argument number two: instrumental convergence happens. Argument number three: defining correct utility functions is very hard. Argument number four: human values are extremely low entropy. Of all possible value functions, the value functions that capture human values are an extremely small subset.



We speak about some of the challenges in decision theory, stability, and robust delegation.



So wireheading: the problem is that if a reinforcement learning agent takes control of its own reward signal, why would it not just set it to infinity and never do anything again?



If we do build super intelligent systems, how can we ensure that the world will be a better place as a result?



I actually think that we should not want a robot that will do anything we say. I would prefer that if I told my robot to go murder innocent children, the robot says, no, I’m not going to do that. I want people to be happy, I want suffering to be minimized by whatever means possible. I do not give a single shit how we achieve a better world. I just care about us achieving a better world.



A mesa-optimizer is an optimizer that’s found autonomously by a base optimizer searching over a space of possible functions.



Humans are a mesa-optimizer for evolution. Evolution designed us looking for a function that maximizes inclusive fitness, but we optimize a completely different thing, like happiness and stupid stuff like that. We are the AI that went out of control: instead of making babies, we’re curing cancer and stuff like that. That is definitely not what evolution intended.



We speak about the concept of the stop button problem. If we build a super intelligence, how could we turn it off? Is it even possible to turn it off?



I think the whole OFF button debate because if we end up building an AI like this, there’s no way we shut off Google,



I believe that intelligence is externalized. Google, the corporation, is a form of externalized intelligence. It’s nebulous and diffuse, and it’s self-healing. If you attack Google, they have teams of lawyers that will respond to your attack.



The concept of human rationality and free will is ever present when you talk about decision theory. We also talk about the Dutch booking problem.



There’s a philosophical debate about what is rational. What is the correct definition of the word rational? What if you could modify your own rationality? The idea about Dutch booking is that, assuming you have someone who can offer you bets that you can take or refuse, this person can reliably offer you bets in such a way that you will always lose money.



And what about the relationship between AI alignment and AI ethics?



I have both very flattering and very spicy things to say about AI ethics as it currently is practiced. It’s trying to put out your handkerchief fire while your house is on fire.



We also spend some time talking about interpretability. Chris Olah has done more for machine learning interpretability than any other person, I think, in the last few years. He believes that it is possible to understand deep learning.



I once heard of a great graph, where the y axis is interpretability and the x axis is strength of the model. It starts really high: simple models are really easy to understand. And then as it goes up a little bit, the model is confused, it can’t really make good concepts, so it’s hard to understand. And then it goes back up, because the model can make, like, crisp, clean, definitely cut-up concepts in a more meaningful way. That’s, like, where humans and where our current AI systems are. And then it plunges, because eventually it becomes so intelligent, it gets so powerful, that there’s just no computationally reducible way to understand what it’s supposed to do.



Public figures such as Stephen Hawking, Elon Musk, and Sam Harris think that we need to be super worried about artificial general intelligence. They believe in the concept of an intelligence explosion, or the singularity. Chollet, and he’s my favorite person in the world, did write an article criticizing the intelligence explosion. He says that intelligence is situational. There’s no such thing as general intelligence. Your brain is one piece in a broader system, which includes your body, your environment, other humans, culture as a whole. No system exists in a vacuum.



A very simple thought experiment is: assume I make an intelligence as smart as a human, just as smart as a single human, right? And now we just run it a million times faster.



But this assumes that virtualization of a mind is even possible. Take Wittgenstein’s argument about having a conversation with a lion: our intelligence, how we perceive intelligence, is fundamentally linked to not just biology, but the systems we interact with. Children that are raised in the wild, they don’t ever really come back.



We don’t currently see anything that hints that there’s anything special about intelligence.



I’ve tried something a bit different today. I’ve summarized a lot of the core talking points into a 15 minute introduction. And it might just be an interesting way of getting a looking glass into some of the topics that we covered today. So if you think that that’s a cool way of doing it, then let me know. Anyway, remember to like, comment, and subscribe. I hope you enjoyed the episode. And we’ll see you back next week. Hello. Hey, how you doing man?



Hey, doing good doing good. I just exploded on a nature interviewer about politics the other day.



I’ve got a missed opportunity to turn that bat until some weird spiky devilish thing.



Yeah, it is powerful. Like I cut this to the steak naturally. I don’t get here anywhere else don’t get any kind of fear. It’s just this it is truly powerful.



That is the best place to have a bit though To be fair,



it kind of is. Yeah, I gotta say I got like five various like YouTube videos or like chats I’ve had with people I’ve heard I’ve heard a pirate. Look, I’ve heard Don Quixote I’ve heard I’ve got the cute good nickname Saturday. is very jack Sparrow. is like I said SD like 0.5% Portuguese jeans in May speaking Nice I part Irish Portuguese and the Portuguese used to be beard and I tan like incredible in summer and the Irish gives me stateless mouth. So



Welcome back to the Machine Learning Street Talk YouTube channel with me, Tim Scarfe, and my two compadres, Alex Stenlake and Yannic “Lightspeed” Kilcher, who will be joining us in a minute. And today we have an incredibly special guest, Connor Leahy. I’ve been watching several of Connor’s talks online on YouTube, and he’s a really impressive guy. Actually, I think he’s got a fantastic future ahead of him. He’s a walking encyclopedia of AGI and AI alignment knowledge. So he’s interested in artificial general intelligence beyond deep learning, and even more so the alignment problem in AI. So how do you build a friendly AI? What stops you is not that you don’t have enough computing power. Even with infinite compute and memory, you just can’t write the correct Python program which is going to lead to a friendly AI. Now, Connor founded EleutherAI, which is a grassroots AI research group aimed at democratizing and open sourcing AI research. In that group, he’s building new data sets for language modeling, and he’s building a new GPT-style model as well. And the largest model that they’ve gotten to train for a single step so far has been 100 billion parameters, which is pretty impressive. They said on their site that Google reportedly got up to 50% utilization on their TPUs when they trained similar models. And they’re making a lot of progress towards that goal; they haven’t quite got there yet. They’re also interested in digital trust for deep neural networks. So given query access to a model, can you determine whether the model was created by us, in a way which is resilient to model compression? So that’s super interesting. Also, can an organization that is untrusted by the public in some way prove that their models are working as advertised? And this might lead us on to an interesting discussion later about the understandability of models. And one of the articles that was linked to me was by Chris Olah.
And interestingly, he believes that it’s possible to completely understand models, albeit if you turn them into a huge amount of computer code. Connor has been a research engineer at Aleph Alpha for just over a year. And he’s finishing his computer science degree at the Technical University of Munich. He was an organizer at several data science events, Kaggle Munich and PyData Munich. And Connor believes that AI is the mass production of intelligence. He believes that AI alignment is philosophy with a deadline, and that we are on the precipice; the stakes are astronomical. AI is important, and it will go wrong by default. Public figures such as Stephen Hawking, Elon Musk and Sam Harris believe that there’s going to be an intelligence explosion, and that we need to be super worried about artificial general intelligence. Personally, I don’t agree with them. And I think a lot of people in the machine learning community are skeptical. But this is genuinely quite a divisive issue, which we’re going to talk about today. Connor thinks that the singularity, or the intelligence explosion, is very near. He also says that AGI is a bit like climate change, but worse: even harder problems, even shorter deadlines, and even worse consequences for the future. So these problems are incredibly hard, and nobody really knows what to do about them. Connor, it’s an absolute pleasure to have you on the show. Welcome.



Wow, what an intro. Thank you so much for having me.



AI safety, AI alignment debate: it’s really interesting to see that there are all these various flavors and schools and approaches. Walk us through why there are so many approaches to AI safety and how they differ. Do they differ in substance? Or is it mainly in how we actually implement safe AI?



All right, the field of AI alignment has a bit of a colorful history. It’s actually very interesting. For those people that are specifically interested in the history and the anthropology of this field, I very much recommend the book The AI Does Not Hate You by Tom Chivers, if I remember correctly. Some of the first people to talk about AI alignment are a bit of an unusual bunch of people. They came out of these transhumanist newsletters in the late 90s and early 2000s. These are people such as Eliezer Yudkowsky and Nick Bostrom, and several others. In many ways, I think it’s fair to consider Eliezer Yudkowsky one of the great founders of the field. Of course, we have I. J. Good and other people that are much earlier still in the field, who appeared even earlier talking about concepts of intelligence explosions or whatever. But at least for me, personally, the way I got into the field is from the writings of Eliezer Yudkowsky, who was a very early writer in this field. He was one of the first people to talk about many of these concepts of how this will go bad by default, that this is not an easy problem. He invented a lot of the terms that we use nowadays. So Eliezer Yudkowsky runs this organization called MIRI, the Machine Intelligence Research Institute. That is the institute that he leads, or he’s lead researcher for; I don’t know exactly what his role is, but he founded it. There are also several other of these older institutions, such as the Future of Humanity Institute at Oxford. And this has always been a very niche subject. This has been something you couldn’t, like, study necessarily publicly. I remember reading an interview with Paul Christiano, who is a really prolific AI researcher.
And he was talking about how he had to have, like, a secret double life during his PhD: he wanted to work on AI alignment, but he had to pretend he was working on something different. And that has been changing. Due to the work of people such as Max Tegmark and Nick Bostrom, and others like Stuart Russell, there has been a lot of progress in making AI alignment a more credible, respectable field, something that more people can work on full time in their PhD, something you can get funding for, something you can publish about. This is a very recent development. Maybe since 2018, I would consider it to be something that’s been becoming more mainstream, something that it’s more okay to talk about. So it’s actually surprising how quickly things have changed. But this has also had, to some degree, a little bit of a side effect: some new approaches, or some people new to the field, do not necessarily know the older approaches to the field or the older, more, let’s say, radical views of the field. The way I like to think about this is you can divide the field into several kinds of approaches by how hard you think the problem is, how soon you think the problem is going to happen, and how dramatic a breakthrough we need to solve it. On one hand, we have prosaic AI alignment. This is stuff like Paul Christiano’s group at OpenAI, Stuart Russell at CHAI, and several others. This is what I consider, quote unquote, mainstream AI alignment research. This is the idea that our future artificial superintelligences are probably going to resemble our current artificial intelligences: they’re probably going to be neural networks, they’re probably going to be, you know, running on GPUs, they’re probably going to be using gradient descent. And therefore these people ask themselves, okay, given this, how can we align these? How can we make a GPT model aligned? What does it mean to be aligned, and stuff like this?
What techniques can we use? This is what I would consider probably the most mainstream kind of current alignment. Then there’s the MIRI stuff. It’s ironic that MIRI was one of the very first organizations to talk about alignment, and now they’re considered something of a black sheep in the community. MIRI is legendarily hard to explain, even what they do, and even people in the AI alignment field often have different opinions about whether what MIRI is doing makes sense or not.



So, to do my best to say shortly what MIRI does: they say we are so confused about what intelligence is, about what alignment means, about goals, about optimization, about all these things, that we should sit down first and try to figure out what these words even mean. Their idea is basically that we are at the pre-Newtonian stage of intelligence research. Before Newton invented his laws of motion, we were able to build ships and catapults and such by trial and error, and they could work pretty decently. But once we had Newton’s theories, we could make predictions, and we could predict how to make these certain things. And that was necessary to build very complex machines. You can’t build a rocket that gets to the moon by trial and error; you could, but it’s not really going to work in practice. You needed these predictive theories. In order to be able to aim for the moon, you need to be able to predict how gravity would behave in this scenario that we have not yet seen. And MIRI is basically trying to develop this for intelligence. They’re trying to develop a fundamental theory of understanding of what intelligence is, what optimization processes are, and how we can reason about things like decision theory. How can we reason about this in a way that will apply to potentially future superintelligent systems?



Could you articulate what they believe intelligence is? I mean, on this show, we’ve covered François Chollet’s On the Measure of Intelligence. And last week, we spoke to a guy called Walid Saba, who is one of these old-school expert systems guys. And he thinks that intelligence is about explicitly reasoning over separate knowledge, and doing statistical inferencing and so on. So what is the conception of intelligence, in your opinion?



I take a very practical approach: I say intelligence is the ability to solve problems. You can get all philosophical about it and you can get, like, really mathematical about it. But I like, for example, that MIRI often doesn’t talk about intelligence; they often talk about optimization processes and optimization pressure. So I can measure the power of a system by its ability to increase a certain value or decrease a certain value.



But don’t you think that skill acquisition should be part of intelligence?



Could be. In practice, it would be; in every practical kind of scenario it would be. There is the philosophy: we could talk about definitions of intelligence that are, like, philosophically satisfying, and we can talk about definitions of intelligence that are practical and useful. And the fact is that at the end of the day, if I have a system that can take over the world economy, can cure aging and cure cancer, and build any kind of technology, at least for me, it’s not super important how exactly this machine works.



I suppose one way to contrast this is that, in a way, it’s a cul-de-sac for us to get bogged down in defining intelligence and debating whether or not current systems are intelligent, because, as Stuart Russell said, the primary concern is not spooky, emergent consciousness, but simply the ability to make high-quality decisions. And in that wonderful YouTube video that you linked to, by, I don’t know if I can pronounce his name, you said it before, Eliezer Yudkowsky, on the AI alignment problem, he was basically saying it is really hard. And he started off by talking about Asimov’s three laws of robotics, which were deontological. For folks that are not educated in philosophy, that means that rather than focusing on the outcome, it’s just a kind of rules-based system of ethics. His rules were: a robot may not injure a human being or, through inaction, allow a human being to come to harm. The second rule was: a robot must obey the orders given by human beings, except where such orders would conflict with the first law. And the third one was: a robot must protect its own existence, as long as such protection does not conflict with the first or second law.



Yes, and that would not work in practice, as that talk explains in rather a lot of detail.



An interesting point from that talk was discussing the idea of utility functions and how they’re necessary to get around the kind of deontological traps that come out from vaguely worded first principles. And it was really interesting, because they talk about the importance of having a coherent utility function. And there are links here to Dutch book arguments in terms of, like, probability theory. But on the show recently, we’ve been discussing a lot the problem of setting objectives and targets and the way that these can lead to perverse examples. And in the talk, the speaker quite rightly identifies that humans don’t really have coherent utility functions, not in any sense that we’re aware of. And yet, they seem to be a real central principle for the AI alignment problem. Is there a tension there? Or is this just an outsider looking in who thinks this is a bit of a conundrum?



No. Can we give an example of what we mean by humans don’t have a coherent utility function?



It’s pretty simple to get humans to say, for example, they like pineapple pizza better than salami pizza, and they like salami pizza better than cheese pizza, and they like cheese pizza better than pineapple pizza. So it’s pretty easy to get humans to say statements like that. And if you think about it... so I should explain what a Dutch book is. A Dutch book is actually a very important concept in, like, this weird MIRI-style alignment research. So there’s a very hard question to ask: what is rationality? Actually, can I diverge for just a second here? I’d like to introduce a bit of a thought experiment that I think is important. It’s called Newcomb’s paradox, and I think it is very important to understanding some of the more advanced decision theory we might be talking about. So Newcomb’s paradox functions the following way. Imagine a super alien, Omega, comes down from space, and Omega is arbitrarily intelligent. Omega is a weird alien, so it plays a weird little game. It plays the following game: it puts down two boxes in front of you. The first box always has $1,000 in it. The second box has a million dollars in it, but only if Omega predicted in its simulation that you would only take the second box. So it puts down the two boxes and it flies away. Should you take only the second box, or both boxes? This is an interesting idea, because the boxes are already filled: the million dollars is already there or not. So whether or not you take it does not change whether the million dollars is already in the box. But you might argue, and I think correctly argue, so I’m a one-boxer, that’s what you call it, that I would only take the second box, because Omega is super smart, so Omega knows that I would have only taken the second box. So I’ll get a million dollars.
But there are many kinds of decision theories that say logic rationally, you shouldn’t be both boxes, because your choice will actually make a difference at this point. So these are like different definitions of rationality. On one hand, you have this like this, like a causal rationality or causal decision theory, where you say my choosing a both boxes will not causally affect my output, I will get $1,000 more by picking both boxes. Either way, if it’s strictly dominant, so I’ll take two boxes. And then there’s these like more abstract kind of like weird issues. In theory, we said, but I get more money by only picking one box and so I’ll just pick it whether it makes causal sense or not.
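
The two lines of reasoning in Newcomb’s paradox can be put side by side in a small sketch. The dollar amounts come from the description above; the `accuracy` parameter (how often Omega predicts correctly) and the function names are assumptions added for illustration, since the thought experiment only says Omega is arbitrarily intelligent.

```python
SMALL_BOX = 1_000        # always in the first box
BIG_BOX = 1_000_000      # in the second box only if Omega predicted one-boxing


def payoff(take_both: bool, second_box_full: bool) -> int:
    """Money received for a given choice and a given (already fixed) box content."""
    total = BIG_BOX if second_box_full else 0
    if take_both:
        total += SMALL_BOX
    return total


def expected_payoff(take_both: bool, accuracy: float) -> float:
    """Expected money if Omega predicts the actual choice with probability `accuracy`."""
    # Omega fills the second box exactly when it predicts one-boxing.
    p_full = (1 - accuracy) if take_both else accuracy
    return (p_full * payoff(take_both, True)
            + (1 - p_full) * payoff(take_both, False))


if __name__ == "__main__":
    # Causal view: whatever is already in the box, taking both adds $1,000.
    for full in (True, False):
        print("second box full:", full,
              "gain from two-boxing:", payoff(True, full) - payoff(False, full))

    # Predictor view: compare the average result of each policy.
    for acc in (0.5, 0.9, 0.99):
        print("accuracy:", acc,
              "one-box:", expected_payoff(False, acc),
              "two-box:", expected_payoff(True, acc))
```

Under these assumptions, two-boxing dominates for any fixed box content, while one-boxing has the higher expected payoff once the predictor is even slightly better than a coin flip. That is the disagreement between the two notions of rationality.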



It’s maybe a good time to mention Daniel Kahneman, who won the Nobel Prize; he showed that humans do not maximize economic utility. There was the experiment, which you can probably verbalize better than me, where people were more concerned about not being insulted by the other person. Yeah.



Isn’t this just... I feel a lot of the economists’ experiments just show that we might not have the best notion of a utility function yet, because we tried to measure it in money or something like this, or “the more pizza, the better”. But I think a lot of economists’ experiments still make sense if you assign the correct utility function, if you assign some negative utility to risk and to being embarrassed and whatnot. Whereas I feel that a lot of these advanced decision problems really require a different kind of thinking, rather than just having this monotonic utility function. I also know there’s this notion of, I don’t know, it’s a long time ago, but, superrationality, where you play the prisoner’s dilemma, but then you are thinking: okay, the other person is really smart, and I’m really smart, and the other person knows that I’m really smart, and I know that the other person is really smart. And therefore, if we’re both so smart, why don’t we both just pick the same action, the cooperate action? To me, this makes no sense at the point where you say “since we are so smart, why don’t we”. I see that you can derive this, but yeah, it makes no sense. And I feel like this box example also goes almost into the question of whether or not we are deterministic machines. Because Omega, the superintelligence, predicts what you would do. If you’re a deterministic machine, then your causal reasoning makes sense. But if you’re not, then you should sample from a biased coin.



Okay, I have to unpack a few things. You’re completely correct in what you’re saying, I totally agree. So the thing with utility theory, with utility functions, is that utility functions are an incredibly huge space of possible functions. You can always find some utility function that explains someone’s behavior; it’s just such a large space. That’s why these concepts of rationality are very hard. That’s why I was going to talk about Dutch booking. If you see someone walk down the street, take out a gun and shoot themselves, you can’t say that’s irrational, because maybe their utility function gave maximum utility for shooting themselves, and that was the exact best possible thing they could have done. It’s hard to say. Or there’s a second part to this. There are basically two parts: you have a utility function and a decision theory. It might be that he just had such a bad decision theory, such bad rationality, that shooting himself actually was a really bad decision, but he did it anyway because his rationality was so bad.
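
One way to see why incoherent preferences matter is a “money pump”, a close relative of Dutch book arguments. The sketch below is not from the interview: the specific pizza ordering, the fee, and the function names are assumptions for illustration. It checks whether any utility function can represent a set of pairwise preferences, and shows that an agent with a preference cycle will keep paying to trade in circles.

```python
from itertools import permutations

# Each pair (a, b) means "the agent prefers a to b". Illustrative data only.
CYCLIC = {("pineapple", "salami"), ("salami", "cheese"), ("cheese", "pineapple")}
TRANSITIVE = {("pineapple", "salami"), ("salami", "cheese"), ("pineapple", "cheese")}


def has_utility_function(prefers: set) -> bool:
    """True if some ranking of the items agrees with every stated preference."""
    items = sorted({item for pair in prefers for item in pair})
    for ranking in permutations(items):
        # Earlier in the ranking means higher utility.
        utility = {item: -position for position, item in enumerate(ranking)}
        if all(utility[a] > utility[b] for a, b in prefers):
            return True
    return False


def money_pump(prefers: set, start: str, fee_cents: int, trades: int):
    """Keep offering a pizza the agent prefers to its current one, for a small fee."""
    holding, paid = start, 0
    for _ in range(trades):
        upgrades = sorted(a for a, b in prefers if b == holding)
        if not upgrades:
            break  # nothing is preferred to the current pizza, so the pump stops
        holding = upgrades[0]
        paid += fee_cents
    return holding, paid


if __name__ == "__main__":
    print(has_utility_function(TRANSITIVE), has_utility_function(CYCLIC))
    print(money_pump(CYCLIC, "cheese", fee_cents=10, trades=300))
    print(money_pump(TRANSITIVE, "cheese", fee_cents=10, trades=300))
```

With the cyclic preferences, the agent ends up holding the pizza it started with while having paid a fee on every trade; with the transitive preferences, the trades stop as soon as it holds its favorite.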



I know we are accelerating towards the free will debate very quickly, because the third thing is: maybe the person didn’t have any rationality at all. Maybe he was just acting randomly.



Yes, of course. So this is why I’m going to get back to Dutch booking, because Dutch booking helps us solve some of these problems. The superrationality things, let’s talk about that later, if you want to talk about it; we don’t have to. But here’s the thing with Omega, with Newcomb’s paradox. Newcomb’s paradox, in my opinion, appears super strange. It appears like this very bizarre scenario that requires really weird setups, and blah blah blah. And for some thought experiments that is true. I know several thought experiments that are really funny but honestly require physically impossible things to happen for them to occur. But I’m going to make the argument that Newcomb’s paradox is the default in human interactions. I think all of us are encountering Newcomb’s paradoxes all the time, in one very simple scenario: social interactions. Every time I’m talking to you, I am making predictions about what you will predict I will do, about how you will predict I will behave, about what is socially acceptable. This is a Newcomb’s game. Whether or not I decide to lie to someone, for example, will depend on whether I expect them to expect that I will lie to them or not. And just on that, though, isn’t there a kind of Nash equilibrium, or a convergent behavior that



happens here? Because that happens all the time when we design objectives to nudge people to behave the way we want them to. It might be kids at school: we want them to do their exams and so on. And we try to design objectives that are powerful. For example, if you can memorize a long list of numbers, and we evaluate for that, it’s probably a good indicator that you’re good at doing something else. But so many opportunities for perverse incentives and shortcuts always manifest.



Yes, absolutely. Okay, at this point this is not exactly about rationality. Incentives are important, because incentives basically help us shape what actions lead to our highest utility. This is a really interesting thing. When I first got into AI alignment research, I was confused that everyone was really into economics; every AI alignment researcher is really into economics. And I didn’t understand that. But eventually I think I figured it out: economics is the same problem as alignment. Economics is the question of aligning incentives, of using “dumb”, quote unquote, things (individual humans, laws, institutions) to control a smart thing. The economy is a very smart optimizing agent. It can optimize very complex parameters over a very wide range in a way that individual humans cannot. In many ways, a free market economy is like a kind of distributed backpropagation algorithm run on humans. And in many ways, economics is about trying to align that thing as closely as possible with what we want. For example, corporations figure out that if they just dump their toxic waste into the Amazon or whatever, it will make them a lot of profit. But that’s a misalignment. It’s not what we actually want the economy to do. So then we pass laws that say: okay, it’s illegal, you have to pay more money if you do that. And that is an attempt to align the economy, as an optimizing system, with our values post hoc.
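
The toxic waste example can be written as a tiny optimization problem. Everything in this sketch is hypothetical (the option names, revenues, costs, and the fine); it only illustrates how a profit-maximizing optimizer changes its choice once a law attaches a cost to the behavior we do not want.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    revenue: float
    cost: float
    tons_dumped: float


# Hypothetical numbers, chosen only to make the trade-off visible.
OPTIONS = [
    Option("dump_in_river", revenue=100.0, cost=10.0, tons_dumped=5.0),
    Option("treat_waste", revenue=100.0, cost=40.0, tons_dumped=0.0),
]


def profit(option: Option, fine_per_ton: float) -> float:
    """What the firm actually optimizes: money in, minus costs and fines."""
    return option.revenue - option.cost - fine_per_ton * option.tons_dumped


def firm_choice(fine_per_ton: float) -> str:
    """The firm picks whichever option has the highest profit."""
    return max(OPTIONS, key=lambda option: profit(option, fine_per_ton)).name


if __name__ == "__main__":
    for fine in (0.0, 10.0):
        print("fine per ton:", fine, "->", firm_choice(fine))
```

The optimizer itself is unchanged between the two runs; only the objective it sees has been patched after the fact, which is the sense in which regulation is a post hoc alignment attempt.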



Could I challenge that, then? Adam Smith said that there was a hidden hand in the market. I know Chollet actually believes that the market is an externalized form of intelligence. Do you think the same?



Again, I like to think about optimizers more than I think about intelligence, which is a loaded word. So I try to think about what optimizes here: the economy is optimizing for certain parameters. And the question is, are those the parameters we want or not? So is there an invisible hand? Sure. Does the invisible hand give us what we want? That is not as obvious to me. It will optimize for something. And the point of regulation, for me, is attempting to force the invisible hand to optimize for something closer to what we actually want.



So coming back to social interactions for a second, you made the point that this Newcomb’s paradox is there every day in social interactions. But isn’t a big factor in social interaction that it’s a repeated game? I am not going to lie to someone or something like this because I will interact with them in the future, or I behave as they expect because I’m going to interact with them again. Whereas in your case, in Newcomb’s paradox, that alien just flies away, and I will never interact with it again.



Of course, that’s a good point. Not every social interaction is only a Newcomb’s paradox. The iterated prisoner’s dilemma or a stag hunt would probably be a closer game-theoretic equivalent to normal interactions. But my point, and I got sidetracked, I’m sorry about that, is that I was bringing up Newcomb’s paradox to explain Dutch booking and why it’s important for



rationality. Just for the benefit of the listeners, can you explain the prisoner’s dilemma? That’s the thing where the people can dob each other in, but they



Yeah, so the idea is: there’s evidence that you and your buddy may have committed a crime, but it’s not enough to convict you. So you’re both put into cells, and you can’t talk to each other. And the police offer you the following options. Either you both don’t tell us anything, and then we’re going to convict you for some minor crimes, so you each go to prison for one year. Or you rat on your buddy: if he doesn’t rat on you, you go free, and he goes to jail for six years. Or if you both rat on each other, you both go to prison for four years. So obviously, in sum, the best thing would be for both people to cooperate with each other and not tell the police what happened. But for each one individually, it is better to rat out the other person. Because if you ratted me out, I might as well rat you out, and I’ll save myself two years of prison. If you didn’t rat me out and I rat you out, I’m going to save myself one year as well. So it’s always in my interest to rat out the other person.
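
The argument above can be checked mechanically with a payoff table. The sentence lengths below (1 year each if both stay silent, 0 and 6 if only one rats, 4 each if both rat) are read off the spoken description as best as it can be reconstructed, so treat them as assumptions; the names are illustrative.

```python
MOVES = ("silent", "rat")

# YEARS[(my_move, other_move)] = (my prison years, other's prison years)
YEARS = {
    ("silent", "silent"): (1, 1),
    ("silent", "rat"): (6, 0),
    ("rat", "silent"): (0, 6),
    ("rat", "rat"): (4, 4),
}


def best_response(other_move: str) -> str:
    """The move that minimizes my own prison time, given what the other does."""
    return min(MOVES, key=lambda my_move: YEARS[(my_move, other_move)][0])


def total_years(my_move: str, other_move: str) -> int:
    """Combined prison time for both players."""
    return sum(YEARS[(my_move, other_move)])


if __name__ == "__main__":
    for other in MOVES:
        print("if the other plays", other, "my best response is", best_response(other))
    print("both silent, total years:", total_years("silent", "silent"))
    print("both rat, total years:", total_years("rat", "rat"))
```

Ratting is the best response to either move, so it is a dominant strategy, even though both players staying silent gives the lowest combined sentence.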



Yeah, and this is spoken about a lot in reinforcement learning, especially multi-agent reinforcement learning, and there are lots of efforts to model intrinsic motivation. And Deanna can talk about this much better than myself. But you see this quite a lot when you have individual agents,



Yes. The prisoner’s dilemma is one of the... if I had to make a list of the top 10 things you have to learn, period, just things that people should be aware of, the prisoner’s dilemma teaches so much about everything: about how people interact, about how governments interact, about how decisions are made. And there are variants of the prisoner’s dilemma. The most important one is the iterated prisoner’s dilemma. Assuming we don’t play this once, but we play it multiple times, then it might be good to cooperate with you, so that we can cooperate many times in the future. This is also an example of why the mafia kills snitches: they try to say, okay, if you don’t cooperate with us, we’re going to make it so bad for you that it’s not worth it. So this is definitely something in real life, and it is a fundamental thing about understanding rational decision making.
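
A minimal sketch of the iterated version, assuming the same illustrative sentence lengths as in the one-shot game (1/1, 0/6, 4/4 years). The two strategies, tit-for-tat and always-rat, are standard textbook examples and are not named in the interview.

```python
# YEARS[(move_a, move_b)] = (prison years for a, prison years for b); assumed values.
YEARS = {
    ("silent", "silent"): (1, 1),
    ("silent", "rat"): (6, 0),
    ("rat", "silent"): (0, 6),
    ("rat", "rat"): (4, 4),
}


def tit_for_tat(my_history, their_history):
    """Stay silent first, then copy whatever the other player did last round."""
    return "silent" if not their_history else their_history[-1]


def always_rat(my_history, their_history):
    """Defect every round, regardless of history."""
    return "rat"


def play(strategy_a, strategy_b, rounds):
    """Return the total prison years of both players over repeated rounds."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print("tit-for-tat vs tit-for-tat:", play(tit_for_tat, tit_for_tat, 10))
    print("always-rat vs always-rat:", play(always_rat, always_rat, 10))
    print("tit-for-tat vs always-rat:", play(tit_for_tat, always_rat, 10))
```

Two reciprocating players do far better over many rounds than two defectors, and a defector gains only a one-round advantage against a reciprocator, which is why the prospect of future rounds can make cooperation worthwhile.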



On a semi-tangent, if you go back to the 1950s, 1960s, 1970s, when this stuff first got really big academically, around the same time, nuclear strategy was first emerging. And you get a lot of game theory in there. There is some very dark but very interesting reading, looking at the applications of game theory to the waging and prevention of nuclear war. It shapes a lot of how the world is today.



It’s funny you say that, because I was going to invoke... I know we said we wouldn’t talk about politics, and I promise this will be a quick digression. But Ayn Rand: of course, she wrote this book, Atlas Shrugged, and her writing influenced, basically (I think there was even a RAND Corporation), the second half of the 20th century. There is this objective obsession, and most of the way that businesses are run today is informed by that, even, as you said, things like mutually assured destruction. The Randian philosophy is present in a lot of places. And it means that if agents only focus on their self-interest, that would actually maximize global utility. I think I’m stating that correctly. What do you think about that?



Are we sure that Ayn Rand is philosophy? I’m pretty sure it’s pornography.



It’s called objectivism, isn’t it? So it’s got an official name.



So I don’t think much of Ayn Rand at all. Let me put it this way: I think there are better examples of what she was trying to accomplish. For example, Tyler Cowen’s book, Stubborn Attachments, I think makes a much better case for the same type of argument. He makes his case about how increasing the world’s GDP is basically the most morally good thing you can do, and I think his case is stronger than any Ayn Rand ever made. So it’s the following thing. The problem I have with deontological theories, and non-utilitarian theories, is that you can always construct a world state where following those rules is bad. You can always find some edge case, which might be very edgy, or it might be a very “this is actually how the world works” case. For example, I don’t know if you’ve ever heard of neoreactionism. They’re like monarchists; they think we should have a king and that slaves are good, and it’s hilarious. It’s terrible. And the thing is that a lot of people take this stuff too seriously, because their arguments seem to make sense if you accept the premises. The problem is the premises are just wrong. They’re just not true. They’re just actually, for real, made up. It’s just Fantasyland stuff. And a lot of objectivism is similar. If you accept the premises of Atlas Shrugged, if you accept that all of this is how the world works, that this is how humans actually behave, then yeah, it’s good. The problem is none of those things seem to be true. The real world does not seem to follow these laws. And therefore it’s very silly, in my opinion, to take them too seriously.
Morality is a two-tier system. You can’t just sit down and come up with a true, correct morality that will lead to the best possible result without actually observing the state the universe is in and the rules the universe follows.



Just to throw a hand grenade into the discussion: I was really fascinated. These sorts of podcasts are a great way to get into a new topic that you don’t often have a chance to read about. So I went out and started reading some of these papers in AI alignment research. They’re very heavily logical. We’re used to looking at calculus, but arguments made of pure logic are something that you don’t really encounter outside of textbooks. But I kept wondering to myself: if this whole field is this construct of pure logic, then sure, our conclusions may be valid given the premises. But what about the premises? I’m not familiar enough with the entire debate to go back and say this particular premise is wrong. But all of the AI alignment research seems to agree on certain broad principles, things like the orthogonality thesis, things like instrumental convergence. How much of this is “someone said this once, and now we accept it as gospel, because if you accept this, then the rest naturally follows logically”?



So just one second. Whenever we have jargon, I’m going to interject. The orthogonality thesis, I think, came from Bostrom, and he had this concept that utility functions and general intelligence can vary independently of each other. So the idea is that we can have something that’s superintelligent but will want to kill us, or is dumb in the sense of what it’s optimizing. Is that fair? Yeah, basically. Okay, and the other thing you said, instrumental convergence:



There are certain subgoals which are useful for a very large range of final goals. For example, it almost doesn’t matter what I want to accomplish: I need to be alive to accomplish it in almost all cases. So most AIs following most goals will want to stay alive. This has nothing to do with consciousness or will to live or emotions or anything. It’s just that if I want to get you coffee, I can’t get you coffee if I’m dead. So I have to ensure that I stay alive long enough to get your coffee.



Counterexample: a friend of mine was training a robot to walk and set the time penalty too high, and this robot would just tip itself over. They couldn’t figure out why it was tipping itself over, before they figured out that the robot had learned that the quickest way to terminate an episode, and therefore minimize its regret, was to knock itself over. So in such a situation, this isn’t to say that the theory is bunk. It’s just to say that there are real-world counterexamples that suggest that instrumental convergence may not be as powerful as user input or initial conditions.
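
The walking robot story can be reproduced with simple arithmetic on episode returns. The reward values, step counts, and names below are hypothetical; the sketch only assumes that each time step costs a fixed penalty and that falling over ends the episode immediately.

```python
def episode_return(steps: int, time_penalty: float, goal_reward: float = 0.0) -> float:
    """Total reward for an episode: an optional goal bonus minus a per-step penalty."""
    return goal_reward - time_penalty * steps


def best_behavior(time_penalty: float,
                  goal_reward: float = 10.0,
                  steps_to_goal: int = 100,
                  steps_to_fall: int = 1) -> str:
    """Compare walking all the way to the goal with ending the episode by falling."""
    walk = episode_return(steps_to_goal, time_penalty, goal_reward)
    fall = episode_return(steps_to_fall, time_penalty)
    return "walk" if walk >= fall else "fall_over"


if __name__ == "__main__":
    for penalty in (0.01, 0.05, 0.2, 1.0):
        print("time penalty:", penalty, "->", best_behavior(penalty))
```

With a small penalty, reaching the goal is worth the time spent; once the penalty is large enough, terminating the episode as fast as possible scores higher than ever reaching the goal.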



As a counterexample? That’s actually a very classical example of the stop button problem. There’s this unsolved problem, basically: how do you get an AI to willingly let you shut it down? That’s actually very hard. If you don’t give it any incentive to let you shut it down, it will resist being shut down. If you give it too much incentive to shut itself down, it’ll shut itself down. So this is a very common problem. Now might be a good time just to quickly touch on some of that stuff. Because in that presentation by Eliezer Yudkowsky,



Thank you very much. He started off by talking about those Asimov deontological rules. And then he framed it that the alignment problem is incredibly difficult, because if you have a robot’s utility function, it’s brittle. It might be something like this: we want to fill a cauldron with boiling water, I think, was the thought experiment. And the utility function is one if it’s full and zero if it’s empty. The human’s utility function, presumably, is so much more complex; it has a fidelity that can’t actually be explicitly described.



Yes, but here’s what I’m always thinking when someone comes up with this. I look at this and I see two things. I see someone making the premise that we have this super-duper intelligent AI that is unconstrained, and whatever utility function we give it, it can optimize. And then we think of the consequences. But then I look at the utility function, and I see a year-2020 programmer who programs that utility function as if it were for a reinforcement learning agent of today. It always pairs this mega-intelligent AI with a utility function like the one we would give an AI today, because it’s easy: states that map to single numbers. What if the utility function is something like “plus one if the human dopamine system is activated”? And then it’d be like: okay, workshop flooded? That’s not good, dopamine low. Funny? Good, dopamine high.



Okay, okay. I see what you’re saying here. But like I said, we’ve been building up technical debt in this conversation. There are 10 things that I have to explain that we keep adding on to without explaining the previous one. Yeah, this is a LIFO stack. Basically, we have to work off some of the technical debt here. Okay. So first of all, I would like to be very clear that every single thing that has been brought up so far is very well known in the AI alignment field. Everyone talks about this, everyone’s concerned with these problems, and you’ll find huge essays on the Alignment Forum on every single one of these topics. It’s not that you’re wrong. I’m not saying you’re dumb or something; I’m saying yes, you’re very right. These are smart, good things that you are noticing, and it’s good that you’re noticing them, because these are serious problems. Because my brain doesn’t actually have a stack that is in any way consistent, I don’t remember everything I have to work off in my technical debt. But let me try to look a little bit backwards first. I want to start with this idea: there is a failure mode in talking about AI alignment, which we are dangerously close to, where it becomes an argument of my sci-fi theory versus your sci-fi debunking. It can often turn into this thing where I say AI can take over the world, and the other person says, how would it take over the world? I say, well, maybe it’ll invent this technology. But that technology seems unlikely to be invented. It could do this instead. But what if it does this instead? And that doesn’t get us anywhere. I would like to short-circuit this and instead say what most advanced people in this field will try to tell you: just from first principles, if you’re dealing with an entity that by definition is much more intelligent than you, you shouldn’t try to predict what it will do.
You should just predict that it will perform better than you. For example, I can’t predict which move AlphaGo will make, but I can predict that AlphaGo will probably win. And that is the only thing I think we can say about a very strong future intelligence. I can’t predict how a future intelligence might take over the world, but I can predict that it probably will be able to do so if its utility function wanted it to.



Now, a quick challenge on that: isn’t that a little bit one-dimensional, in the sense that Go is a board game, and what it means to win is clearly defined? Whereas if it was an adversarial interaction between me and an embodied AI, I don’t think it really makes sense to say this AI wants to win. What does that even mean?



That’s where we get into these Dutch booking and utility function things. We’re still working on the assumption that our intelligence has some kind of utility function. And the reason this makes sense to a certain degree is that utility functions are universal: you have the von Neumann-Morgenstern axioms, where you can basically say that even if no one sits down and writes a utility function, you can describe an agent fulfilling these very simple rationality criteria as acting as if it had a utility function. That means even if no programmer ever wrote down a utility function, it will act as if it had one. This is why people use these theories, because they’re very universal. There are problems with utility functions. One of the biggest open questions in alignment is whether we can find a better framing than utility functions, because utility functions have a lot of problems that we would like to get away from. But so far, it’s been very hard to find a better formalism than utility functions for thinking about these very abstract, very powerful things. So AI alignment and AI safety are based on, like, a stack of arguments. And this is why I understand that sometimes these arguments are hard for people to swallow, because often people will see one or two of these arguments, and then they can easily dismiss them. You have to take the whole stack. You have to say: okay, argument number one, intelligence is going to be very powerful. Argument number two, instrumental convergence happens. Argument number three, defining correct utility functions is very hard. Argument number four, defining human values is extremely high entropy, extremely high information, it’s really hard. Or rather low entropy: of all possible value functions, the value functions that capture human values are an extremely small subset.
So we should expect that by default, unless we have enough knowledge to hit this very small target in optimization space, we will hit something wildly different. And we should expect that this wildly different thing will do something that we might not be able to predict, but that we can predict will not be something we necessarily want. So it’s this weird stack of arguments that all interact with each other, and if you take out one of them, then the conclusion doesn’t really hold anymore, or it’s not as strong. Can I just... wait, another question. You said the space of human utility functions is almost infinitesimally small compared to the space of all utility functions. But if you were to take a convex hull over all of the different individual human utility functions, would it look quite clustered in that space, or would it be uniformly distributed? That really depends on what space we are talking about here. The space of all functions? Because a utility function is any computable function, and that is a large space.



Yeah, it is. I suppose what I’m saying is that, presumably, my utility function and Yannic’s utility function are significantly more similar to each other than to any other utility function picked at random from that space. Yeah, I’m trying to reason about what we’re looking for here.



Robert Miles said something on this a couple of years back, talking about the nature of intelligence. He says: okay, think about all the kinds of human intelligence; for the camera, they’re an area this big. And if we think about all the potential intelligences of living things on Earth, it’s like this big. And if we think about any potential biological intelligence, it’s like this big, but that’s still not the entire space of intelligence. It’s huge, and we don’t even know what it looks like. Utility functions are exactly the same. Yes, if we’re talking about human intelligence, it makes sense to reason about convex hulls within this subspace, because even if we’re wrong, we’re only going to be a little bit wrong. But if we’re talking about something that’s non-human and acts in a way that could be construed as intelligent, then the question becomes a lot more complex.



And so just quickly, the utility function presumably changes all the time; it’s not static. Or do you think it has a level of dynamism such that you can still think of it as being



static, right? You can probably formulate it to take a time parameter or whatnot. But here is a thing, and then I also want to backtrack the stack. Here’s a thing that I’m very sure the first person confronted with this came up with, but the answer might be interesting. Presumably, we’re going to build this superintelligent whatnot, and it’s going to be intelligent. And on the way there, it might be intelligent enough while we can still control it. What if we use that to build us the utility function that is aligned with us? Yeah. What I can say for sure is that it’s intelligent, and therefore it’s probably going to come up with the best utility function.



This is for a very strong definition of the words “best” and “intelligent” and whatever. But actually, many people in alignment do think versions of this. I believe a version of this: in a way, we have to use the intelligence of stronger agents to make them align themselves. This is, for example, the idea behind what’s called corrigibility. The idea is to build an agent that always wishes to be more aligned. So even if it starts out unaligned, it will use its own intelligence to try to make itself more aligned. This is one of the more popular approaches towards actual alignment.



Doesn’t what Yannic said link back to the orthogonality thesis? Because why wouldn’t a really intelligent agent change its utility function, given its superintelligence?



Okay, now we’re gonna get into some deep decision theory bullshit. Making decision theories robust under this kind of thing is an open mathematical problem. For example, imagine I offer you a pill. If you take this pill, you’re going to want to kill your entire family, and you’re going to be super happy about it all the time. Should you take the pill or not? From a pure utilitarian perspective, whether you take it or not doesn’t really matter. If you don’t take it, you’re happy that you didn’t kill your family; if you take it and kill your family, you’re super happy that you killed your family. So in a way, from a pure, dumb decision-theoretic perspective, these actions have the same utility, and it doesn’t matter which one you pick. But from our perspective, that doesn’t seem correct. There’s something wrong here; we should have a decision theory that robustly does not take the pill. This especially becomes a problem with what’s called wireheading. Wireheading is the problem that if a reinforcement learning agent takes control of its own reward signal, why would it not just set it to infinity and never do anything again? This can and does happen. And it’s extremely non-obvious to me, and to a lot of people who think about this, how to solve this problem, or whether it is a solvable problem. I’ve heard people propose solutions to it, or ideas about how to address it, but this is a very thorny issue. I think what you’ve described there is Gandhi’s stability argument. And I was going to ask you what stability means, but I think you’ve just answered the question. There’s a kind of convergent behavior in many of these decision frameworks, and sometimes they don’t converge. The Gandhi example was: he starts out not wanting to murder people, and then we offer him a pill that will make him murder people, but he knows what the pill does. So he says, no, sorry, guys. I’m



not having that pill because I don’t want to murder people.
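
The difference between the two ways of evaluating the pill can be made explicit in a toy model. This is only a sketch under assumed numbers: the outcomes, the two utility functions, and the names are all hypothetical. One agent scores each future with the values it has now; the other scores each future with the values it would have after getting there.

```python
# Hypothetical outcomes of each action.
OUTCOMES = {
    "refuse_pill": {"murders": 0, "took_pill": False},
    "take_pill": {"murders": 5, "took_pill": True},
}


def pacifist_utility(world) -> int:
    """Current values: every murder is bad."""
    return -world["murders"]


def post_pill_utility(world) -> int:
    """Values the agent would have after the pill: every murder feels good."""
    return world["murders"]


def score_with_current_values(world) -> int:
    """Judge a future using the utility function the agent has right now."""
    return pacifist_utility(world)


def score_with_future_values(world) -> int:
    """Judge a future using whatever utility function the agent would hold in it."""
    evaluator = post_pill_utility if world["took_pill"] else pacifist_utility
    return evaluator(world)


def choose(score) -> str:
    """Pick the action whose outcome gets the highest score."""
    return max(OUTCOMES, key=lambda action: score(OUTCOMES[action]))


if __name__ == "__main__":
    print("current values:", choose(score_with_current_values))
    print("future values:", choose(score_with_future_values))
```

An agent that scores futures by its current values refuses the pill, which matches the Gandhi intuition; an agent that only asks how satisfied its future self would be has no such protection, and that is the same structure that makes wireheading attractive.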



Yeah, this is, in many ways, what’s called robust delegation, which is a subpart of the alignment problem. There are different kinds of delegation: we delegate to an AI, an AI delegates to a copy of itself, an AI delegates to a new AI, an AI delegates to an improved version of itself, and also an AI delegates to the future version of itself. In many ways... I don’t know if you guys have ever done this before, but sometimes I will not buy sweets at the supermarket, because I know I’m going to eat them once I’m at home. In many ways, this is an alignment failure: I am not aligned with my future self, or my previous self. So I have to create these artificial scenarios to stop my future self from doing something that I don’t want him to do.



You even see this in practice: when they hook up rats to electrodes that can just stimulate their brain, they’ll just push the button indefinitely. So at some point, we might... I think, and that ties into what you mentioned at the very beginning, where you say AI in the near future is probably going to look like neural networks that we engineer and optimize with backpropagation, that this might just be solved with engineering constraints; we just screw over the AI’s attempt to do that by itself. But it’s an interesting problem. It refers back to what we have on the stack, which is this stop button issue. Isn’t that kind of a version of the same thing? So can you tell us a bit more about the stop button problem and how that plays into this?



Yeah, a very common thing that people will suggest when they first hear about this is: well, if the AI does something bad, we’ll just shut it off. I think this is a very silly idea for multiple reasons. So here’s a great example that Tim has got here from the talk. Say we have a utility function: the robot gets one point if the cauldron is full and the off button has not been pressed, and we also give it access to the button. It also gets one reward if it suspends itself after the button is pressed. So what will this robot do? It will immediately hit the button, because filling the cauldron is hard, but suspending is easy. It just hits the button and goes unconscious, and then it gets one reward. So, success. It’s very hard to find a mathematically rigorous way to define an off button such that an agent will actually honor our wishes in a good way. And okay, I’m going to take a step away from mainstream AI research and talk about my own beliefs a little bit: I actually think that is not something we want. I actually think that we should not want a robot that will do anything we say, because I think humans want a lot of very bad things. If we have a robot that will not do a very bad thing, that is preferable. I would prefer that if I told my robot to go murder innocent children, the robot says no, I’m not going to do that. But that goes against this kind of alignment as following human wishes directly.



And doesn’t that lead very quickly to a trolley problem?



Yes.
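The cauldron example above can be sketched as an expected-utility comparison. This is only an illustration under stated assumptions: the utility function follows the description in the conversation, but the success probability `P_FILL_SUCCEEDS` and the policy names are hypothetical numbers and labels chosen for the sketch.

```python
def utility(cauldron_full: bool, button_pressed: bool, suspended: bool) -> int:
    """One point for a full cauldron while the button is unpressed,
    one point for suspending after the button has been pressed."""
    if not button_pressed and cauldron_full:
        return 1
    if button_pressed and suspended:
        return 1
    return 0


# Assumption for illustration: filling the cauldron is hard and sometimes
# fails, while pressing the button and suspending always succeeds.
P_FILL_SUCCEEDS = 0.9


def expected_utility(policy: str) -> float:
    if policy == "fill_cauldron":
        return (P_FILL_SUCCEEDS * utility(True, False, False)
                + (1 - P_FILL_SUCCEEDS) * utility(False, False, False))
    if policy == "press_button_and_suspend":
        return float(utility(False, True, True))
    raise ValueError(f"unknown policy: {policy}")


def best_policy() -> str:
    """The policy a pure expected-utility maximizer would choose."""
    return max(["fill_cauldron", "press_button_and_suspend"],
               key=expected_utility)
```

With any fill probability below 1, the maximizer prefers to press its own off button, because both outcomes are worth the same single point and the easy one is certain.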



Would you make the same argument about, let’s say, guns? If I could build a gun that, whenever you pointed it at a child, would just not fire?



It’s more complicated than that. First of all, it depends on whether the gun is sentient or not, or whether it is an intelligent optimizer: is the gun optimizing for damage? I’m a very practical utilitarian; I want to be very clear about this. I want people to be happy, I want suffering to be minimized by whatever means possible. I do not give a single shit how we achieve a better world. I just care about us achieving a better world. If having guns around makes the world a better place, I want us to have guns. If not having guns around makes the world a better place, I want us to not have guns lying around. It’s very practical in that sense.



What about another angle? We’ve got Kenneth Stanley coming on the show, and he has this wonderful book called Why Greatness Cannot Be Planned, all about the worship of objectives being a tyranny. One interesting concept I’ve taken from his book is that when we optimize objectives, we’re looking for a monotonic increase in utility, and actually, a lot of the time, things have to get significantly worse before they get better. For example, if we took away guns, you might suddenly find that it led to some unexpected outcome, but in 10 or 15 years it might provide a better society for us. So we need to prepare ourselves to take that dip before it gets better later.



Yeah, absolutely. This is just the difficulty of the space we’re searching. If we try to formalize the space of actions we can take and their outputs in the form of world states, there’s a fundamental question of how much structure there is in this space. We have no free lunch: in a truly random space, you can never do better than random search; there’s no possible way. But we all assume, and are living proof, that there is structure to the universe, regularities that we can exploit to produce better-than-random outputs. It’s not always monotonic. It’s not always perfect; we get stuck in local optima, minima, whatever. But all these are basically properties of the space and properties of the search algorithm we’re using to search through it. Intelligence is a search algorithm in this space of policies, the space of choices it can take and parameters it can influence, in order to achieve world states that are rated higher on its utility function. That’s the most maximally abstract way to define this.



That’s quite interesting; you’re talking about intelligence as an output. I think Chollet says that it’s actually a process of information acquisition. It’s really interesting that a lot of people do formalize it the way that you do, which is that it’s a search problem looking for a program.



Yeah, I genuinely do not want to commit to any one definition of intelligence. I think there are many different definitions of intelligence that are useful in different contexts. This search-over-policies, meta-learning kind of definition makes sense in this context. There are other contexts where others might make more sense.
But again, sometimes when we get really deep into these abstract things, I try to ground things again: I don’t care about how it works. I don’t care about any of this. I care about making the world a better place. I care about making people happy. I care about avoiding suffering, I care about curing cancer, and everything else is just a tool in the toolset to achieve those goals.



Yeah, in a weird way, you are exactly like the problematic instances of AGI we describe, where they don’t care how much pain they cause or how they construct paperclips, as long as they get made. In a way, you act exactly like this, which is interesting.



I’m just making sure that we have a pathological example to study. Exactly. So that’s actually really funny. There’s a great story to be told here about mesa-optimization. One of the things that alignment research has been talking about a lot recently, which hasn’t really filtered into the mainstream, is this concept of mesa-optimization. The idea is: assuming you’re searching for a policy to optimize a certain thing, it might be that the program you find is itself an optimizer for something else. That is a mesa-optimizer, and humans are mesa-optimizers: evolution designed us while looking for a function that maximizes inclusive fitness, but we optimize for completely different things, like happiness and stupid stuff like that. And evolution doesn’t care at all. We are misaligned AI. This is one of the reasons why I think it is so obvious that AI is going to go wrong: because we are misaligned AI. We are the AI that went out of control. Only instead of making paperclips or making babies, we’re curing cancer and stuff like that. That’s definitely not what evolution intended.



Now might be a good time to talk about inner versus outer alignment, by the way, so I want to introduce this concept. The inner alignment problem is about aligning the model with the loss function, the thing you’re training for; we all know this in machine learning as the reward function. Outer alignment is aligning that reward function, that loss function, with the programmer’s intentions, while the inner problem is ensuring that, once you write down a loss, your model is actually going to optimize for it. Now, I’m sure everyone has seen this, but there’s this wonderful OpenAI page talking about faulty reward functions in the wild. In the reinforcement learning world, we talk a lot about reward shaping, which is the thing we were just saying: when you have objectives, intelligent systems will take shortcuts, and they’ll just do whatever they need to do to maximize that objective. They don’t even care about the thing that you actually want them to do. So this is an example from a game called CoastRunners, where the boat is going round and round in circles. It’s just picking up points from these little gems in the water, and it’s not even completing the lap the way it’s supposed to.
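The boat example above can be illustrated with a toy model of a proxy reward. Everything here is a hypothetical stand-in (the track length, gem positions, and the two hard-coded policies are assumptions made for the sketch, not details of the actual game): the score counts gems collected, while the designer actually wants laps completed.

```python
TRACK_LENGTH = 20          # hypothetical number of positions in one lap
GEM_POSITIONS = {3, 4, 5}  # hypothetical gems that respawn after pickup


def run_policy(policy: str, steps: int = 100):
    """Return (score, laps): the proxy reward and the intended outcome."""
    position, direction = 0, 1
    score, laps = 0, 0
    for _ in range(steps):
        if policy == "race":
            # Intended behavior: always move forward around the track.
            position += 1
        elif policy == "loop":
            # Reward hacking: bounce back and forth over the gem cluster.
            if position >= 5:
                direction = -1
            elif position <= 3:
                direction = 1
            position += direction
        else:
            raise ValueError(f"unknown policy: {policy}")
        if position in GEM_POSITIONS:
            score += 1
        if position == TRACK_LENGTH:
            laps += 1
            position = 0
    return score, laps


def policy_with_best_score(policies=("race", "loop")) -> str:
    """What an optimizer that only sees the score would select."""
    return max(policies, key=lambda p: run_policy(p)[0])
```

In this sketch the looping policy earns far more points while completing no laps, so selecting on score alone picks the behavior the designer did not want.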



Yep, absolutely, and this is a very big problem. I love the framing of inner versus outer alignment. I hope that it becomes more mainstream; I think it’s a great framing. Outer alignment is what we’ve all talked about: what is the correct utility function? How do we find a utility function that is good? Inner alignment, the mesa-optimizer question, is: if we run, say, stochastic gradient descent on our loss function, does it actually even optimize it? Or will it find something that looks like it’s optimizing it but actually optimizes something different?



Yeah, I wanted to say something about the stop button problem too. I’m trying to constantly backtrack a bit to clean up the technical debt. In the practical world, when most people think of AI, really good AI, I think they think of something like a little computer that goes bbbbb, that we give input to, and maybe that has an off button. But if I think of AI, I think of something like Google, right? Like the search engine and mail ecosystem that we’ve built up. That might have an off button, or like 20, but we simply can’t press it. If we end up building an AI like this, there’s no way we shut it off. The world goes down if we shut off Google. Okay, maybe not; maybe at this stage we can still shut off Google and we could survive, but it’s going to be horrible. And if we build something more intelligent, that’s going to be even more useful to us. So I don’t think that the stop button debate makes sense. As you said, first, it’s not something we want. But second, it’s not something that we can conceivably do, even now.



I agree with what Yannic said, because I believe that intelligence is externalized, and Google, the corporation, is a form of externalized intelligence. It’s nebulous and diffuse, and it’s self-healing. If you attack Google, they have teams of lawyers that will respond to your attack. If you take down their servers, their scripts will fix the server and put it online again. It’s already a kind of living, breathing system that you can’t possibly stop. The cron job of death.



Yeah, there’s a great essay called What Failure Looks Like by Paul Christiano, which I find very interesting. As I mentioned, early AI alignment was much about intelligence explosions and superintelligence, liquefying the planet with nanobots and stuff like that; there was a lot of sci-fi stuff in there. Paul Christiano deserves a lot of credit for being one of the people that creates much more down-to-earth scenarios. He basically describes a scenario of how AI alignment could go wrong without any catastrophe. His idea is that every step of the way, people just hand a little bit more over to the AI: leave a few more decisions to it, let it take control of a few more corporations, deploy a few more recommender algorithms. Bit by bit, humans just lose all connection to reality. We just read and see whatever we want. The economy is run completely by algorithms. Step by step, all human corporations are outcompeted by unaligned things. And at some point, that’s it: humans have no more influence. No one ever did anything. There was no war, there was no fight. At some point, we’re just all sitting around and have no more influence on our future whatsoever.



So okay, back to Dutch booking. Yeah, that’s where we sort of interrupted something.



That was like an hour ago. All right, Dutch booking. This is back to rationality. It’s very hard to define what is rational. As I said, with Newcomb’s paradox, there’s an argument to be made that taking both boxes is rational; that’s what we call causal decision theory. But there are other decision theories that would say that’s not the rational thing to do. So there’s a philosophical debate about what is rational. What is the correct definition of the word rational? If you could modify your own rationality, what should you modify it to be? This is philosophy, so there’s a lot of debate here, of course. But what I personally find the most satisfying answer so far is basically to be immune to Dutch booking. Assume you have someone who can offer you bets that you can take or refuse; you are Dutch-booked if this person can reliably offer you bets in such a way that you will always lose money. This is similar to the idea that if you like pineapple more than salami, salami more than cheese, and cheese more than pineapple, I can make a lot of money by charging you one cent to exchange a piece of pizza, over and over again, and just keep making money.



This is the circular reference thing, right? Yeah,



Like, for example, if you want to be in one city more than another city, and in that one more than a third, and in the third more than the first, and you’re willing to pay money to move from one to the other, you’ll pay infinite money going in circles all the time. The theory is that if you had a good rationality, this should not happen. It should be forbidden, and the more general class of Dutch book or money pump attacks should be impossible: under your decision theory, it should be impossible to extract an unlimited amount of your resources for no reason.
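The money pump described above can be sketched in a few lines. This is a toy model under stated assumptions: the preference sets, the one-cent fee, and the function name `money_pump` are illustrative choices, and the agent is assumed to accept any swap to an item it strictly prefers.

```python
# Each pair (a, b) means "a is strictly preferred to b".
CYCLIC_PREFS = {("pineapple", "salami"),
                ("salami", "cheese"),
                ("cheese", "pineapple")}

# A transitive ordering for comparison: pineapple > salami > cheese.
TRANSITIVE_PREFS = {("pineapple", "salami"),
                    ("salami", "cheese"),
                    ("pineapple", "cheese")}


def money_pump(prefs, start, offers, fee_cents=1):
    """Offer swaps one at a time; the agent pays the fee whenever the
    offered item is preferred to what it currently holds.
    Returns (final holding, total cents paid)."""
    holding, paid = start, 0
    for offered in offers:
        if (offered, holding) in prefs:
            holding = offered
            paid += fee_cents
    return holding, paid


# The same three offers, repeated 100 times.
OFFERS = ["pineapple", "cheese", "salami"] * 100
```

Under these assumptions, `money_pump(CYCLIC_PREFS, "salami", OFFERS)` ends with the agent holding the slice it started with and 300 cents poorer, and the loss grows without bound as offers continue. The transitive agent pays once to reach its favorite topping and then refuses every further trade.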



But could I just gently challenge that? It seems like that would be an inconsistent utility function, if it existed. If it was an AI agent, then as you say, it would spend infinite amounts of money on Uber, and it would always be moving. But if it was me, maybe I would move around a few times, and then my System 2 would kick in, and I’d think: hang on, this is stupid, I’m going around in circles here. I’m just gonna stay in Berkeley for a while.



This is what Yannic was saying earlier: if you want to model this, just put in a dependency on t. The parameters depend on t, and then you’re good. Yeah.



Is it possible though, is it possible to have an inconsistent utility function? What’s wrong with that?



So it’s not just our utility functions, it’s also our rationality. I couldn’t do this exactly without a whiteboard and an hour of time to rehearse it. Updating your beliefs in a Bayesian way is very computationally hard. If you use Bayes’ theorem to update your beliefs, which is the correct way to do it, it is almost intractable. And you can show that if agents do not do this in a Bayesian way, but in certain approximate ways that are biased, you can offer them bets about their beliefs, then present them with information and offer them new bets, in a circular way. Because they’re not updating completely, because they’re updating incorrectly, the beliefs they have are biased in a way that allows you to extract infinite money from them. So it’s a large category; the circular preferences thing is just a funny example. It is a much wider theory of finding flaws in the way decisions are made in order to extract money from the decision-maker. In many ways, there are versions where you can frame Newcomb’s paradox in different ways to extract money from people that two-box, that don’t one-box. There’s ways to do it.
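The betting argument above involves updating over time and is hard to show compactly. The sketch below illustrates a simpler, static Dutch book instead, and that substitution is an assumption of this example: an agent whose credences in an event and its negation do not sum to 1, and who treats its credence as a fair price for a ticket paying one unit. The function name and the specific numbers are hypothetical.

```python
from fractions import Fraction


def agent_net_payoff(credence_a, credence_not_a, a_happens,
                     stake=Fraction(1)):
    """The agent buys one ticket on A and one on not-A, each priced at
    its own credence times the stake. Exactly one ticket pays out."""
    cost = stake * (credence_a + credence_not_a)
    payout_a = stake if a_happens else Fraction(0)
    payout_not_a = Fraction(0) if a_happens else stake
    return payout_a + payout_not_a - cost


# Incoherent beliefs: P(A) = 0.6 and P(not A) = 0.6 sum to more than 1.
INCOHERENT = (Fraction(3, 5), Fraction(3, 5))
# Coherent beliefs: P(A) = 0.6 and P(not A) = 0.4.
COHERENT = (Fraction(3, 5), Fraction(2, 5))
```

With the incoherent credences the agent loses one fifth of the stake whether or not A happens, so the bookie profits without knowing anything about A. With coherent credences the same pair of bets nets exactly zero.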



Could I challenge that as well? We talked about The Social Dilemma, and there’s a free will versus addiction debate there. Do people really want to be watching this crappy content on Facebook? Do people really want to be gambling? It’s so paternalistic for us to say “I don’t think that’s good for people”, because I think part of human flourishing is doing stupid stuff. If we had a consistent utility function, we would be so boring.



I’d like to separate two topics. I’d like to separate out the topic of decision theory, which is a purely mathematical topic. What I’m talking about right now has nothing to do with philosophy, nothing to do with humans, nothing to do with the real world. It’s the purely mathematical question: are there uniquely better rationalities? This is a purely mathematical question.



The reason I said that is that you said, as a human, if I could choose my own reward or utility function, then I would choose one which was consistent. So I think you were making that statement.



Okay. Yeah, that’s fair. But the next thing I’m gonna say is this: again, your decision theory is not your utility function. Your decision theory is what you use to optimize your utility function. If your utility function includes sitting around all day and eating potato chips, then having a better decision theory cannot make things worse; it will only improve your ability to sit around on the couch all day and eat potato chips. So it’s important to separate your decision theory from your utility function. What you just asked about is actually a philosophical argument about first- and second-order preferences. There is this idea of first-order preferences, I want to do X, and then there are second-order preferences, I want to want to do X. I think this is a super fascinating, important topic that I’d like to talk about, but it is a separate topic from decision theory.



Okay. Are there any other scenarios where humans could go around in loops like this? Imagine that you had damaged your memory, and so you just kept making the same mistakes in life again and again; you got into self-destructive spirals of behavior that were deleterious to your well-being.



So humans do stuff like that all the time. Gambling addiction, drug addiction, romantic love. I don’t know if it’s ever happened to you; it sure has to me, unfortunately. But I think those are separate. Those aren’t because our decision theory is bad; those are just because humans are flawed in many ways. There are many other ways in which we’re flawed before we even get to a formal decision theory. I think formal decision theory is interesting from the perspective of trying to understand very powerful AI. There’s this question: if I give you a very large system, a very large program, what can you tell me about this program? It’s provable that, in the limit, you can tell me nothing. Rice’s theorem says that if you give me an arbitrary Turing machine, I can’t, in general, decide any non-trivial property of that machine’s behavior. But we can construct subsystems, or certain classes of systems, that we can predict. That’s how our computers work: we construct our computers to abstract away quantum noise, so we can better predict how they will behave in the real world. It’s also what I meant about AlphaGo: I can’t predict which move AlphaGo will make, but I can predict that it is very likely to win. That’s a statement I can make about a stronger, more intelligent entity. So the reason I’m interested in decision theory is that if there is one, or a class of, most powerful decision theories, then we can predict that a most powerful intelligence will use those decision theories. And if we can then derive any knowledge from how those decision theories work, we might be able to say things about how these incredibly intelligent systems operate, even without ourselves being that intelligent. That’s why I’m interested in that.



But you said there’d be a massive asymmetry, so we wouldn’t be able to make many assertions at all if we are the lower intelligence.



Potentially, yes; that is like the limit.
It’s like this: if you have a program with a certain amount of resources... Computational complexity theory is one of my favorite topics. There is this concept of Kolmogorov complexity, which is the minimum possible length of a program that gives you a certain output. This is a very fascinating concept, a very useful concept in thinking about programs. And I find it fascinating that this is a thing that exists. It’s uncomputable, because you would have to solve the halting problem to find it, but it is a thing that exists. Okay, I’m digressing here for a second; I have to not start talking about pseudorandom numbers. But basically, if an algorithm has a certain minimum length, so that you need at least this many steps to perform it, but we only have a smaller number of computational steps we are allowed to perform, then we can only ever approximate the solution of the actual problem, or guess at it, which is the same as approximating. In the same way, let’s say an agent is irreducibly complex to a certain degree. I would expect that the compute I would need to make decisions as good as AlphaGo’s in a Go game is on the same order of magnitude as the computation that AlphaGo actually performs. There’s no way for me to perform one step of calculation and immediately know what AlphaGo is going to do next. That’s just a fundamental mathematical property of how the algorithm works: its combinatorial complexity has a certain size. And if I don’t have at least enough compute to run this algorithm, I can only make approximate predictions about it.
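Because Kolmogorov complexity is uncomputable, it can only be bounded in practice. One common, rough stand-in (an assumption of this sketch, not something claimed in the conversation) is the length of a compressed encoding: the compressed bytes plus a fixed-size decompressor form a program that reproduces the data, so the compressed length is an upper bound up to a constant. The sketch uses Python’s standard `zlib` module; the sample data and the seed are arbitrary illustrative choices.

```python
import random
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib encoding at maximum compression.
    This is only an upper bound (up to a constant) on description
    length, never the true Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


# Highly regular data: a short program ("print 'ab' 5000 times") exists.
structured = b"ab" * 5000

# Pseudorandom data from a fixed seed (hypothetical choice for
# reproducibility); zlib finds no structure in it to exploit.
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(10_000))

structured_size = compressed_length(structured)
random_size = compressed_length(random_like)
```

Both inputs are 10,000 bytes long, but the structured one compresses to a small fraction of that while the pseudorandom one barely shrinks at all. The pseudorandom bytes actually have a short description too (the seed plus the generator), which zlib cannot discover; that gap between what a compressor finds and the true minimum is exactly why the quantity is only ever approximated.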



There’s also a compute and storage trade off. So you could argue that AlphaGo has memorized basically a whole bunch of different moves.



Sure, but I still need n steps to read n memory cells.



GPT: I want to bring that up very briefly, because we’re talking about memorizing moves. Now, you feel GPT-3 is a great wake-up call for society in general, as a warning about the potential of AI as well as its impact on the internet information space. That’s a valid argument. You’ve presented in other talks how GPT-3 does amazing things: you can literally feed it information, and it’s almost like talking to a person. Yannic, and I believe Tim as well, have taken quite a look at GPT-3 and what it’s actually learning: whether or not it’s just learning a hash function or search function, whether or not it’s just memorizing things with so many parameters. How do you answer that charge, when there is good evidence that GPT-3 is memorizing? Are we actually talking about intelligence here, or are we talking about smart search? And if it’s not intelligence, then should we actually be that worried?



All right, counter-question: are humans intelligent? Okay. Are you sure? They just memorize a lot of shit. Have you ever talked to a school kid after they wrote an essay? They have no concept of what was in the essay; they’re just regurgitating things the teacher said. There’s no understanding. I’ve corrected college-level essays before as a TA, and they have no idea. It’s just regurgitation. It’s just babbling. There’s no underlying theory or anything. I don’t think humans are intelligent. I think it’s an open problem whether humans are intelligent or not.



That is an extremely valid point. And that’s why the specific argument I made, at least, wasn’t that GPT-3 isn’t intelligent, but that GPT-3 isn’t doing whatever you might call reasoning. Humans certainly memorize a lot, but they also appear to do something like manipulate logical symbols in their head in a stepwise fashion, which we might call reasoning, like if this then that, and so on. I can’t see any evidence that something like GPT-3 does that so far.



Yeah, this is one of the main problems in intelligence, because as you pointed out, even in humans, you can teach kids how to do their times tables and what the rules are for multiplication, and they can use their System 2, but after a while they will just memorize the results and shortcut. This problem of imitation is pervasive within neural networks. It’s interesting, because if you look at AlphaGo, I said earlier, almost taking the piss a little bit, that it’s memorized all of the moves. But of course it hasn’t, because there is an incredibly high number of possible moves. What it’s actually done is, through self-play, it’s generated a whole bunch of data, and then it’s created this hierarchical, entangled representation of all of these different board positions. And then, inside that convex hull of possible positions, it’s cleverly interpolating between them. That’s exactly what GPT does. But as Yannic said, what we humans do is we have this ability to abstract and go one level up, to reason and to distill our own knowledge. It’s definitely not doing that.



Alright, I’d like to say three different things. First, just for the sake of getting things heated, I want to say a few completely subjective things without any backing. I’m just going to say something I believe and, on purpose, not back it up; I’ll get back to it later. Then I’m going to say how I perceive my brain actually working, and why I think that how most people describe their brain working is at least not my experience at all. And then I’m going to make a case about uncertainty and computation, about how we don’t actually know how this is working. Okay, I’m going to start with the first thing. I think GPT-3 is artificial general intelligence, AGI. I think GPT-3 is as intelligent as a human. And I think that it is actually probably more intelligent than a human in a restricted way, in a very specific way. I’m going to back this up later. I also believe that in many ways it is more purely intelligent than humans are. I think that humans are approximating what GPT-3 is doing, not vice versa. And yeah, this is going to be a little controversial.
Let me try to explain this. There’s been this great series of essays recently called Babble and Prune, which kind of explains a little bit about how the author perceives their brain to work. This is very similar to how I think. When I sit down to write a talk, the way I do it is that I first generate a bunch of really bad talks. I just start talking; I open my mouth and just start speaking, and a lot of things come out. I’ll start saying, “Hey, everybody, I am...” no, wait, I should open this differently. And then I go back and regurgitate another sentence. Eventually I find something that I like, and I keep that, and then I start regurgitating more things. So I babble and I prune. There are a lot of things I’ve seen showing that humans do things like that; the neocortex seems to do constant generative modeling of some kind, whatever. So there is very weak evidence that humans may or may not be doing something like that. But I want to make a much, much stronger claim here. The third thing I want to talk about is that from an algorithmic perspective, from the purely abstract, theoretical, computational perspective, defining what is the same algorithm, what is the same computation, what properties computations have: these are undefined questions, or questions that require solving the halting problem. In that regard, we don’t know what GPT-3 does. We do not know, and anyone who says they do is lying, because they can’t know what GPT-3 is actually doing: the question is undefined, with an undefined answer. And we don’t know what humans do. We know some things about what is going on. But the magic of Turing universality means that even a very modestly powerful algorithm can approximate any other possible algorithm. If we look at the brain, there doesn’t actually seem to be any module for logical reasoning, for symbol manipulation, whatever.
And it’s actually something humans are very bad at. People have to be taught it; they have to practice it. This is not something we do automatically. So in many ways, it looks like we are using a completely different mechanism to approximate a symbol manipulation algorithm, which is not that surprising. I’m going to talk about that in a second, and afterwards you guys can all yell at me about why I think GPT-3 is so intelligent. But I’d like to make a statement of uncertainty here: I said this, obviously, for the memes. At heart, I think that even asking “Is this algorithm intelligent?” is a question that doesn’t really make sense from a computational perspective. The question is far more: does it produce intelligent behavior? Does it produce that behavior with a reasonable time complexity? There is this concept in computational complexity theory of different complexity classes, and if you get into exponential complexity, even very small problems quickly become impossible to compute. To me, if I had an algorithm that had something like an NP oracle, so it could just evaluate every possible timeline simultaneously in one time step, it would be more intelligent than any other system ever. It would, by definition, always choose the correct choice; it would never be wrong. But then you could ask the question: is it really intelligent? Because it’s actually just evaluating all possible timelines. So there is a definition of intelligence as compression. There’s this idea that intelligence is compression, the exploitation of structure in the space of the search function: a more intelligent system can reach a better approximation of the correct answer in a smaller polynomial number of steps. If you define it that way, then I can see that there might be a definition of intelligence in an algorithmic sense that could make sense.
But if that is the definition we’re looking for, then we can’t talk about GPT in that regard, because we just don’t know what the true entropy of language is. We don’t know what the true difficulty of the search problem is. So I think that there’s no way to really answer that question.



But I think we’re playing fast and loose with the definition of intelligence here. Compression and machine learning are very closely related; I can buy that, especially coming back to our notion of Kolmogorov complexity earlier. But no one really thinks that machine learning algorithms are intelligent, not seriously. I think this brings us on to the scaling hypothesis, because this is a nice segue. We all know Gwern. He says that the strong scaling hypothesis is that once we find a scalable architecture, like self-attention or convolutions, which, like the brain, can be applied fairly uniformly, we can simply train ever larger neural networks, and ever more sophisticated behavior will emerge naturally as the easiest way to optimize for the tasks and data. He really thinks that if we just scale these things up, we’re gonna get there. We’ll get on to the intelligence explosion in a little while, but there’s this book, The Singularity Is Near by Ray Kurzweil, where he describes this concept that as we exponentially increase our technology in computing, in genetics, and so on, it will almost have this runaway, breakaway effect where we’ll just lose control, and the thing will just get better and better. You believe that’s happening with GPT?



Yes. So, a little bit of backstory, perhaps. I thought deep learning was dead in 2017. I was convinced in 2017 that the bubble had burst, deep learning was dead. Like, why would you even research it? There’s nothing more to get here. We had all these GANs, yeah, wow.



What did they do to offend you?



No, I’m just trying to explain my own little intellectual journey here. I was super unconvinced that deep learning was getting anywhere. Such a simple method, are you kidding me? Matrix multiplications? Wow, intelligence, boys, we did it. It seemed so preposterous. I looked at the brain and it had all this complexity; I came from neuroscience. My first love was neuroscience; I love neuroscience. I saw its complexities; what the brain does is so clever. And there’s this mystical feeling of: obviously the brain must be doing something so much more intelligent, so much more clever, whatever. I gave a talk around 2017 at a local meetup about how deep learning is dead. And from that day forward I was cursed: every single day, everybody in the entire world was working to prove me wrong. Again and again, every time I said, oh, deep learning cannot do X, a paper that did X would come out the next day. It was like a magical power.



yeah, then there’s a chorus of people that say, Oh, that’s not really intelligence. Well, yeah.



Yeah. So this happened to me over and over and over again, and what is the definition of insanity? Doing the same thing over and over again and expecting a different result. So at some point, I was like, okay, you know what, maybe I was wrong. Maybe I got this wrong. So the height of this was last year, when GPT-2 came out. And there was a lot of hype, with people like, “Oh, this might be intelligence,” whatever. And I was like, nah, look, they just made this thing bigger. And look, it’s cute, but it’s not that big a deal. And I would have bet any money on it. I thought: this is it. Okay, they made their big stupid model. This is the end. Look, it makes slightly funnier sentences. But that’s it. This is



the limit. But just on that, let’s say you’re right, let’s say that there will be some intelligent behavior that emerges from these huge systems. The cloud providers, I think they’ve given up the old-school conception that we should understand intelligence, and they’re now playing the memorization game. They’re using their petabyte cloud storage devices, and they’re just basically memorizing everything. But these functions are extremely large; they will run out of space before anything interesting happens.



All right. Yeah, that’s basically what I was thinking. What you just described was my belief one year ago. I no longer endorse that belief. Because along comes GPT-3, and with GPT-3, first of all, I was just like, hmm, do you not have anything better to do with your budget? And so I look into this thing, I start playing with it, whatever. There’s this idea, and this might be a little controversial, but I think it’s important to say to young scientists out there; it’s something I wish someone would have told me. Most of doing science is about taste. It’s about having a good subjective hunch for what is good, what is worth looking into, what is important, what’s interesting. The difference between a mediocre scientist and a really good scientist is having really good taste. And you can disagree with me there. We wish there was an objective way to be a good scientist, but in reality it’s all about taste. And so when I sat down with GPT-3, ran our experiments, whatever, and I read the paper, I fell out of my chair. I was like, Jesus Christ. And I was shocked that other people were not seeing what I was seeing. So I would like to try to convey what I felt, just on a subjective level, and you can disagree with me afterwards. Fact number one: GPT-3 did not complete a full epoch on its data. It saw most of its data only once. But it has such good, wide knowledge of topics it couldn’t have seen more than once. This implies that it was capable of learning complete concepts in a single update step, which is something that everyone keeps saying deep learning can’t do. It seems it had learned some kind of meta-learning algorithm within its own weight updates that allows it to rapidly learn these new concepts, similar to humans. Like, when you’re a baby, you take weeks to learn a single word.
But now if I introduced a single new word to you, you would immediately understand it,



it’s already learned this hierarchical entangled representation. So it’s not seeing it for the first time. It’s resonating on a path in the network; all these neurons are firing up. So my



point is, this is very much how humans learn. The fact that it became more efficient in its learning by having these, like, reusable structures, that was shocking to me. Is it that it learned universal, reusable concepts the same way a human would have learned? Probably, and that was very surprising to me. And again, I have a little video game project with a friend of mine, where we’re using GPT to power a game. It’s a dream simulator. So you enter, like, a keyword, and it creates a little dreamscape for you to walk around in, and there would be, like, NPCs that talk to you about your topic or whatever. It’s a cool project. And so we used to use GPT-2. And what we had to do, for example: we want to associate keywords with different emotions. So if you enter, I don’t know, cyberpunk, we want it to be, like, a dark, moody, rainy city, all these things we associate with it. And we had a super complicated idea of how we put things into, like, dictionaries and see related words and look things up, all these complicated things. And then GPT-3 came around, and we just asked GPT-3, “What emotions are associated with X?” What is so shocking about GPT-3 is that you can just talk to it. You don’t have to formulate a complex cloze question or whatever for it to fill in. You just tell GPT, “The following are characters in a video game about dream simulations,” and it outputs your list of characters. You can say, “The following is a comedy about Peter Thiel and Elon Musk,” and it’ll just output you a comedy. Like, this happened. I think it’s on Aaron’s blog. It was a change not just in quantity, but in how these models think, and how you could prompt it in a way... Not think, though; it’s just a really clever hash table. How, you know,



Can I challenge you on the claim that GPT-3 learns from, like, a single update step? It is true that it has seen most of its training data only once. But it has seen the higher-quality portions many times, and I haven’t seen even a single convincing instance where it is shown that what it learns or what it outputs was only in that particular training data that it has seen once. So it’s very possible that it does many steps on, let’s say, the things that it has seen multiple times, and then just maybe slightly adjusts, slightly connects different things with the things that it only sees once.



Right. Yep, that is perfectly fair. By the way, I usually preface this, but just in general: I might be fucking wrong. There’s always a chance that everything I’m saying is just absolutely dead wrong. I’ve been very wrong in the past about most things. I’m not like, “I am so smart, and this is definitely true.” I’m just trying to say clearly how I perceive this. This is the place for strong opinions and rarely evidence to back them up. So



Exactly, we haven’t got a clue either. We spoke to Sara Hooker from the Google Brain team, and she said something quite interesting: there’s a lot of work around compression and sparsity in neural networks. And she was saying that most of the representational capacity in the neural network is actually wasted memorizing hard examples. So what’s interesting about whether it’s vision data or language data is that there are so many common patterns in the head of the distribution, and these really strong representations get established in the neural network. And probably you can delete about 90% of the connections in GPT-3, and it wouldn’t even make much difference. Potentially,



I’d like to quickly back up my case about why I think GPT-3 is as intelligent as a human, if that’s okay. Yes, I was going to ask you that. Yes. Okay, it might be a bit of a cop-out, because you might not like the definition I’m using here. But I want to clarify what I mean by that. What I mean is that I expect that if I trained a human in a similar way, on similar tasks, gave them the same amount of compute, say, gave them the same amount of things, I’d expect them to perform similarly well; not necessarily much worse or much better. And here’s how I visualize the scenario: GPT-3 is trained in a universe of text. Its universe is a 1D, token-based universe. It has a sense of physics: there is an entropy to the data, there’s a generating function of the universe that GPT-3 is trying to learn. GPT-3 is learning the generating function of a universe, and that generating function is the generating function of English web text, whatever that function might be. And in similar ways, humans in the real world learn the generating function of our physical universe, or an approximation of it. I often see people on Twitter, whatever, cough, Gary Marcus, cough, saying things like, “Oh, look, I asked GPT-3 if a mouse is bigger than an elephant, and it said yes, so obviously it’s stupid.” But I think this is like measuring a fish’s fitness by its ability to climb. The only thing that GPT-3 was incentivized to learn, the only thing it has access to, is this universe of text, this physical function of a textual universe. This textual universe correlates with the real world, but it’s not the same as the real world. And this is not different from us humans: we do not perceive the correct quantum field underlying reality.
Instead, we also learn a correlated universe of, you know, macroscopic phenomena, of colors and objects and stuff like this. This is not reality. Nothing that we see is real. It’s a virtual environment that is approximating a real universe. And we happen to learn certain things about our universe, like shape, color, movement, space and such, that do not exist in GPT-3’s universe. There is no space. There is no time; well, there might be time. But there’s no space, there’s no movement, there’s no inertia, there’s no gravity. None of these things exist. And so it seems fundamentally flawed to me to then say, “Oh, look, we trained it on X, and it didn’t learn Y.” That’s not a counterargument.



What you’re articulating is that GPT-3 is an autoregressive language model, and all it’s doing is predicting the next word. And, frankly, it’s incredible that it does as well as it does, because it seems to have learned this implicit knowledge base, even though you’ve never told it what to do. So as a thought experiment: let’s say we had a BERT-type model, and, because the problem is we have hardly any training data, what if we could generate a wonderful corpus which represented a convex hull over all of the human discourse that could possibly exist? Do you think that would be intelligent?



Again, define intelligent. For me, that’s why I talked about compression.



different way. So GPT-3, at the moment is rubbish. All it does is produce coherent text. But it’s completely inconsistent.



Yeah, it can write better blog posts than I can, sometimes.



Oh, yeah, but it’s just imitation. If you ask it, as you said, for any kind of entailment, it’ll say that elephants can fit through doors. It’s just completely stupid. But



here is Connor’s defense. GPT-3 is not just better than GPT-2; it is remarkably better, it is insanely better. I’m not a fan of the intelligence explosion hypothesis, but this is probably the best evidence I’ve seen that it’s even, like, a feasible thing. It’s not just that it’s doing better. I don’t think it’s intelligent. But there’s the argument that this is a sign that intelligence may just be a case of throwing in enough parameters and enough data and enough compute, that things like intelligence start to come into view in the distant future, as opposed to being like, no, it’s all statistics, it’s all statistics and engineering. Now, this is dicey.



No one cares if it’s intelligent; like, as we said, that depends on the definition of intelligence. What scared me the most in the GPT-3 paper was this straight line of perplexity. Now, I see it’s a log plot, but there’s no sign of slowing down, like no sign that there is ever an end in sight, where we can just throw in 10 times more compute and 10 times more data, and we get out 10 times better, whatever it is: intelligence, statistical association, whatever that is. To the other point, and I think it can be concurrent with what you’re saying, Connor, and maybe it’s a bit what Tim wants to formulate, it’s the following. If I just train GPT-3 on the text that exists, it can very well interpolate between that text and maybe extrapolate a bit in terms of the patterns that exist. But if there is any information at all in that corpus, and we might agree there is information, that information had to be produced. And it was produced, presumably, by humans, right? Maybe only 0.1% of humans actually contribute any information other than regurgitating information that’s already there. But all of this information somehow had to be produced by some humans. So maybe that’s what Tim alludes to, in a different way. And I can frame this in the way you’re formulating it: what the humans do, all they do, is take the generating function of the real world and then regurgitate that, and one output of that is language, right? So that’s how they produce the language corpora. But all they do is basically just learn the generating function of the universe itself.
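
To make the log-plot remark concrete, here is a minimal sketch. It is not taken from the GPT-3 paper: it assumes a hypothetical power-law loss curve L(C) = a * C^(-b), with made-up constants a and b, and checks that such a curve is a straight line on log-log axes. Under that assumption, each tenfold increase in compute multiplies the loss by the same constant factor, which is what "no sign of slowing down" looks like on a log plot.

```python
import math


def power_law_loss(compute, a=10.0, b=0.05):
    """Hypothetical loss curve L(C) = a * C**(-b).

    The constants a and b are illustrative assumptions, not values
    reported in any paper.
    """
    return a * compute ** (-b)


# Compute budgets spaced one decade apart (arbitrary units).
compute_budgets = [10.0 ** k for k in range(7)]

# Points on log-log axes: (log10 compute, log10 loss).
log_points = [
    (math.log10(c), math.log10(power_law_loss(c))) for c in compute_budgets
]

# Slope between consecutive points; a straight line has a constant slope (-b).
slopes = [
    (y2 - y1) / (x2 - x1)
    for (x1, y1), (x2, y2) in zip(log_points, log_points[1:])
]

# Factor by which the loss changes for every 10x increase in compute.
decade_ratios = [
    power_law_loss(10 * c) / power_law_loss(c) for c in compute_budgets
]

for c, (x, y) in zip(compute_budgets, log_points):
    print(f"compute={c:>9.0f}  log10(loss)={y:.3f}")
print("slopes:", [round(s, 4) for s in slopes])
print("loss ratio per 10x compute:", [round(r, 4) for r in decade_ratios])
```

In this toy curve, ten times more compute does not give a ten times lower loss; it gives the same fixed multiplicative improvement at every scale.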



So you’re saying that humans are generating the data, and GPT-3 is learning it. And in a sense, that means GPT-3 is less intelligent because it’s not exploring or producing anything new. So the volume of the convex hull is not increasing as people use GPT-3; in fact, as soon as it’s trained, it’s getting old.



Alright, before we add more technical depth, can I quickly jump in here for a few things? Yes. So I probably explained this very terribly. I apologize to any podcast listeners that actually stick through my rants. But one of the few definitions of intelligence that I think is very useful, that I’d like to mention again, is the definition by Jeff Hawkins in his book On Intelligence, where he defines it as a kind of ability to predict, which is very related to being able to compress. This brings us to the concept of the great lookup table. The great lookup table is a philosophical thought experiment: imagine you had an agent who is composed of a lookup table of all possible states the universe can be in, with an intelligent output for each. Is this intelligent or not? This is why computational complexity matters. There are a lot of fascinating things about how computational complexity and reality are, like, connected. There are lots of things like how you could go back in time, or you could reconstruct these black holes, if you had exponential compute, and stuff like that. There are weird things like that which have been popping up in physics lately, and I predict that more of that is going to happen. There is a fundamental, like, as fundamental as possible, difference between an algorithm that runs in polynomial time and one that runs in exponential time. I truly believe that there is a fundamental difference between them. Yes, if you had a grand lookup table that, you know, has all possible inputs, then it would act intelligently. Whether it would be intelligent or not, I think this is one of those questions that is basically incoherent, because constructing such a table is fundamentally impossible. It fundamentally cannot ever possibly be done.
There’s no way you can construct a table that’s exponentially larger than the actual universe. You cannot do that; that will never happen. You can speculate about that for fun, but it’s an invalid question. It’s a mouthful, it breaks your assumptions.



As a point of order, though, surely we’re basing this entire presumption, this entire discussion of AGI, basically on asymptotics here, on the assumption that it’s possible to create one of these objects, but it’s not provably possible. And so to say, “Oh, what if our artificial general intelligence is the grand lookup table?”, you know, if one is impossible, the other is impossible. I think we need to be careful about the assertion of possibility in these arguments.



Here’s the thing, here’s where computational complexity comes in: the Kolmogorov complexity, the length of the shortest program that generates that table, might be small. That’s important. The table itself, by definition, is exponential in the size of the universe, because it has every possible state the universe could be in. But it might be that there is a short, small program that can generate that table; it can be that the Kolmogorov complexity of that table is small. And so then the question is: assuming I have a short program that can generate this lookup table for any spot I want, for any possible thing, is that not intelligence? If that’s not intelligence, I don’t know what is. The thing is, you have to take into account how long it will take for that program to execute. Of course, that’s, like, another question. Okay, shouldn’t intelligence really be measured in comparison to the amount of compute you give it? So this comes to compressibility: how much can we approximate this perfect, grand lookup table? How close? How convex is the approximation of these outputs? How feasible are they? What does the landscape look like? And this is this concept of having structure in the space of policies: if everything was random, then the grand lookup table is the only kind of intelligence that exists. But our universe is not random. So we have other intelligences that can approximate it to varying degrees, to varying levels, and with exponentially smaller amounts of compute. We could construct all possible Go trees; that would be the grand lookup table for Go. We could construct that tree, but it would be so large that it can’t really exist in our physical universe.
But we can build AlphaGo, which is a much shorter, much smaller program that can still approximate the tree to acceptable levels.
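
The contrast between an exponentially large table and a short program that generates it can be illustrated with a toy sketch. The parity function used here is an assumption chosen purely for illustration (it is not from the conversation): the explicit lookup table over n-bit inputs has 2^n entries, while the program that reproduces every entry stays the same few lines regardless of n.

```python
from itertools import product


def parity(bits):
    """Short program: returns 1 if an odd number of bits are set, else 0."""
    return sum(bits) % 2


def build_lookup_table(n_bits):
    """Explicit table mapping every n-bit input to its output.

    The table has 2**n_bits entries, so it grows exponentially with the
    input size, while parity() itself does not grow at all.
    """
    return {bits: parity(bits) for bits in product((0, 1), repeat=n_bits)}


N_BITS = 10  # toy size; the table already needs 1024 entries
table = build_lookup_table(N_BITS)

print("table entries:", len(table))

# The short program and the big table agree on every possible input.
agree = all(table[bits] == parity(bits) for bits in table)
print("program reproduces table:", agree)
```

Adding one more input bit doubles the table but leaves the program unchanged, which is the sense in which the table's description length can be small even when the table itself cannot be stored.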



But the folks at AlphaGo, they did what you said: they had a computer program do self-play and essentially create a whole bunch of data. But Chollet would say it’s not intelligent, because they’re buying skill with unlimited priors and experience. Neural networks are still sample inefficient; they still had to put loads and loads of training rounds in there, and it ended up with this huge neural network. So why is that intelligent?



Okay, I guess we’ve reached that point where we have to push the big red “Okay, we’re stretching the definition of intelligence too far” button, and we have to take a step back. Intelligence is what Marvin Minsky called a suitcase word: you can pack all these different definitions into it, and they don’t have to be compatible. So maybe we should try to use different words here. Let’s taboo the word intelligence. No one is allowed to say intelligence for now. Instead, we’re going to try to use different things, like sample efficiency, computational efficiency, final performance, and try to see if we can make the same arguments with those words. Sounds good. Yep, that’s



that would be great advice, I think, for the whole field, which would have to be restructured into the subfields of artificial sample efficiency and artificial... Yeah, I think it’s a great suggestion, just a reminder that we are often talking way past each other. And then you have, like, people who ask me sometimes for little snippets they can put into their articles, like for a newspaper or something, like, “But is it really intelligent?” And, yeah. But I want to maybe finish off with a little bit of a connection, because at the beginning and in between, you were alluding to things like, “I don’t care how we make a better world, I just want it to happen,” and so on. And you also alluded to the economy, what we do when trying to align the economy, and so on. So do you see a large or a small connection to something like AI ethics, and what people are trying to do right now in, let’s say, the real world, where we talk about banning? Should we ban face recognition? How much of our data should go into these algorithms? Can we parse out data, differential privacy, and so on? How much of a connection do you see there? Or how much do you think general AI alignment research is disconnected from these things?



I have both very flattering and very spicy things to say about AI ethics as it is currently practiced. Please. By definition, AI ethics is obviously a good thing. Obviously, making AI do more ethical things is something that we want, of course. But I am, let’s say, not super happy with how the field actually operates in practice. This is not in general; there are wonderful people in this field doing very important work. But I’m not super happy about how everything is going. In many ways, AI ethics has become an attempt to solve problems that are real, like bias in data sets, or using AI to sentence people unfairly. That’s just fucked. Of course that’s not a good thing. But in many ways, I’m trying to think of a good metaphor here, it’s trying to put out your handkerchief fire while your house is on fire. Yeah, you’re right, those are problems, but they’re not going to solve the house fire. So one of the reasons I don’t really work much on these, like, common problems of biased text generation or deepfakes or something like that is, first of all, it’s not my comparative advantage. It’s not something I’m unusually good at, and other people are very good at working on that. And second of all, if we have super powerful AGI and it’s unaligned, it doesn’t fucking matter if we regulate it or not. It just doesn’t matter. If the government says, “Oh, AI, we forbid you from turning us all into paper clips,” quote by someone who is about to be paper-clipped. It’s,



I think most of these people, though, don’t believe that AGI or the intelligence explosion is a real threat. So they are, yes, super focused in on what they perceive to be the threats to society now.



Yeah, and I understand that. I can respect that. I disagree, but that’s fine. That’s part of a healthy field, for different people to focus on different subjects. Like, they’d probably say I’m absolutely crazy, and they’re going to find plenty of choice bits in this talk to show that I’m crazy, I’m sure. Yeah. And



on this, because Yannic was drawing a corollary between AI ethics and the alignment problem, and I really like that, because we were talking about that utility function earlier with the cauldron, and it’s very human-understandable. We have a real problem with ethics as well, in that from a legal framework point of view, we need these hidden attributes and the levels of discrimination to be understandable by humans. And Chris Olah has done more for machine learning interpretability than any other person, I think, in the last few years. He’s got the Activation Atlas and the feature visualization articles on Distill, which are wonderful. He believes that it is possible to understand deep learning. I disagree. I think that the whole point of machine learning is that it does something which we can’t explicitly program. So do you think that’s a fundamental problem, that we can only test the what, but we can’t understand the why or the how?



Yeah, yeah, I’m very happy that Olah does the work he does; I think it’s really cool. But yeah, I don’t think it’s going to work. And we have people on our Discord server who work on interpretability who disagree with me, whatever. But basically I have this graph where the y-axis is interpretability and the x-axis is the strength of the model. It starts really high: simple models are really easy to understand. And then as strength goes up a little bit, the model is confused, it can’t really make good concepts, and it’s hard to understand. And then it goes back up, because the model can make, like, crisp, clean, well-defined concepts in a more meaningful way. That’s where humans and our current AI systems are. And then it plunges, because eventually it becomes so intelligent, it becomes so powerful, that there’s just no computationally reducible way to understand what it is about to do. I expect that the Kolmogorov complexity of a sufficiently intelligent system is just so high that the amount of compute you need to exert to understand it is on the order of actually just running the system.



Yeah, and this is what Rich Sutton says, that we need to have massive amounts of compute. But not only that: a lot of these deep learning algorithms, as we were talking about last week with “adversarial examples are features, not bugs”, they learn just crazy, stupid features that are present in the data as it presents itself to us, as pixels on a plane or manifold. There are these features that seem to work quite well that bear no relation to the real world whatsoever. And then we just memorize those features on the long tail. It’s just completely crazy, but it seems to work.



Yeah, I expect more of that to happen in the future. So I’m not super confident that interpretability is a practical way forward, because it basically always puts a limit on how powerful our agents are allowed to get. At some point, our agents will just output a decision, and they may output a minimal-length explanation, but that minimal-length explanation might be so long that it’s just impossible for us to ever evaluate in a reasonable timeframe. And I expect this to happen sooner rather than later. So I don’t think interpretability, at least personally, I don’t think it is a particularly



likely way to succeed. Do you think there is anything that is useful or practical, from your perspective, that we could do in terms of, let’s say, regulations, or practices in AI, to tackle that house fire that you’re talking about, like the big alignment problem?



To be clear, I am not a policy person, so I’m not going to say anything about laws or regulation, because I just don’t know enough about that. I’m skeptical of the utility of those kinds of processes in general for these kinds of fast-moving, technically complicated things. I’m very skeptical about that. I think government has not done particularly well in the past; I hope that could change. What I would wish for is basically just a shift in the way people think about this. It feels to me, to a large degree, that many people who go into AI somehow just never think about “what happens if I succeed?” They never seriously consider what happens if this works, what happens if everything goes exactly as planned. I know a lot of people, and some people here, have mentioned they’re not fans of the intelligence explosion. But if you think about it, the intelligence explosion is the least weird future. That is what is going to happen if business as usual continues, if AI progress continues completely normally on the normal graph so far. If nothing unusual happens, the intelligence explosion is the default assumption of what will happen.



So Chollet, and he’s my favorite person in the world, he did write an article criticizing the intelligence explosion. He says that intelligence is situational: there’s no such thing as general intelligence. Your brain is one piece in a broader system, which includes your body, your environment, other humans, and culture as a whole. No system exists in a vacuum; any individual intelligence will be both defined and limited by the context of its existence, by the environment. Intelligence is largely externalized. Recursively self-improving systems, because of contingent bottlenecks, diminishing returns, and counter-reactions arising from the broader context, cannot achieve exponential progress in practice. Empirically, they tend to display linear or sigmoidal improvement, which is what we see here now. So he says recursive intelligence expansion is already happening at the level of civilization, and it will keep happening in the age of AI, and it progresses at a roughly linear pace. So what do you think about that?



I think that, yeah, you could always make that argument. You can always find fancy, not super well-defined arguments: “Oh, it won’t happen, because it’s hard.” Sure, fine, okay. But I find these arguments very strange, in the sense that it doesn’t really matter if it, you know, grows with this exponent or that exponent, or if it takes 50 or 100 years. What matters is that at some point, it’s going to be stronger, it’s going to have more, you know, power, more economic control, more intelligence than all humans put together. And whether that happens now, or in 50 years, or 100 years, or whatever, doesn’t really change the core argument. But if I may push back a little bit on it: what actually convinced me that the intelligence explosion is definitely going to happen, like, soon, was a very simple thought experiment. Assume I make an intelligence as smart as a human, just as smart as a single human, right? It’s pretty simple to do. Let’s assume, like, we know it must be possible to make things at least that smart. Takes nine months. Yeah. So let’s assume that we make it. We upload someone’s brain, we scan someone’s brain, whatever, doesn’t matter. And now we just run it a million times faster. We just buy a million times more CPUs, we just parallelize the thing and run it. How is that not superintelligence? That entity could do 100 years of thinking in one hour.
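
The arithmetic behind "100 years of thinking in one hour" can be checked with a small sketch. It takes the thought experiment at face value and assumes a uniform million-fold speedup with no overhead; the function name and the 365.25-day year are illustrative choices, not anything stated in the conversation.

```python
HOURS_PER_YEAR = 365.25 * 24  # average calendar year, in hours


def wall_clock_hours(subjective_years, speedup):
    """Wall-clock hours needed for a sped-up mind to experience the given
    number of subjective years, assuming a uniform speedup factor."""
    return subjective_years * HOURS_PER_YEAR / speedup


hours = wall_clock_hours(subjective_years=100, speedup=1_000_000)
print(f"100 subjective years at 1,000,000x takes {hours:.4f} wall-clock hours")
```

Under these assumptions the figure comes out slightly under one hour, so the round number in the thought experiment is consistent with a million-fold speedup.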



But this assumes that virtualization of a mind is even possible. There’s the argument that if we transfer a human mind into an octopus, it becomes unrecognizable. There’s Wittgenstein’s argument about having a conversation with a lion. No, like, these are real things. Our intelligence, how we perceive intelligence, is fundamentally linked to not just biology, but the systems we interact with. Children that are raised in the wild, they don’t ever really come back. But this argument’s facetious, because it assumes that intelligence can even develop in such a way, that it can even express itself in such



a way. Okay, but yeah, I can say the same thing to you: you assume that it wouldn’t. The default is that we don’t currently see anything that hints that there’s anything special about intelligence. We haven’t yet found anything that shows us that. These are adding more complexity. Occam’s razor: the simplest possible explanation is just that business continues as usual, that we don’t find any magical part about intelligence, our models continue to get better and better, and human intelligence is just an algorithm like any other. That is the default assumption. Of course, something strange could happen.



We can’t just assume that we can virtualize an agent and speed it up according to Moore’s law.



I want to counter that, let’s frame it in a way where it’s legal everywhere in the world, you can do certain things to your brain gonna be good. That will convince you that you can speed it up by like 100 fold at least.



Yeah, it’s called modafinil or amphetamines. shala simulator 2020. No, but this is an interesting point, actually: these smart drugs do not improve your intelligence, they improve your processing speed. But if you do an IQ test after taking a bunch of modafinil or acid times or amphetamines or whatever, you will not score better on that intelligence test, because you’re fucking high. Yeah.



So I’d like to clarify something here: if you speed up a brain’s processing, you’re slowing down time for it. If your brain is running at a million times speed, it has a million times more time to think, a million times more time to hang out in the shower.



But I guess what I’m saying here is, like, yeah, I get where this argument’s coming from. But there seems to be a degree of detachment. And maybe it’s just that we can’t even imagine what a nonhuman intelligence looks like. But I don’t buy this argument that it’s just a matter of waiting for Moore’s law to take over. There are signs that Moore’s law is gonna play a role, and certainly a lot of growth. But I find this argument kind of specious.



Even from a biological standpoint, a lot of the workings of your nervous system are controlled by, like, the myelin sheathing of your neurons. And we know there is a giant correlation between people’s IQs and the quality of their myelin sheathing, which directly affects how fast signals can travel through your neurons. So even in biology, it’s like, the faster your nerves, the smarter you are. I’m with Connor on that. Like, I see that if you just speed up a brain, it becomes more intelligent to an outside observer. This is my opinion.



Gentlemen, we’ve reached time. Connor, it’s been an absolute honor and a pleasure having you on the podcast. Thank you so much for joining us. I feel that we could speak for another 10 hours, so we need to have you back on. But seriously, thank you so much for coming on.



Yeah, thank you so much. Yeah. So I would just like to wrap this up by saying that, whether or not you found any of these arguments super convincing, I would just ask you to think about: what if we succeed? What if this actually works? What if everything could change from business as normal? How can we make the world a better place? How can we ensure that humans get what they want, and that whatever we become in the far future, the other races of the galaxy, if they exist, are proud of what we’ve become?



Thank you so much for having me. Amazing. And in passing by the way, Connor has a Discord server. And we’ll put the details in the description. So if you want to have a conversation with Connor about some of the things we’ve spoken about today, please join his Discord.

Dr Alan D. Thompson is an AI expert and consultant. With Leta (an AI powered by GPT-3), Alan co-presented a seminar called ‘The new irrelevance of intelligence’ at the World Gifted Conference in August 2021. He has held positions as chairman for Mensa International, consultant to GE and Warner Bros, and memberships with the IEEE and IET. He is open to major AI projects with intergovernmental organisations and impactful companies.

This page last updated: 6/Aug/2021.