Lex Fridman Podcast XX
[0] The following is a conversation with Michael Littman, a computer science professor at Brown University doing research on and teaching machine learning, reinforcement learning, and artificial intelligence.
[1] He enjoys being silly and lighthearted in conversation, so this was definitely a fun one.
[2] Quick mention of each sponsor, followed by some thoughts related to the episode.
[3] Thank you to SimpliSafe, a home security company I use to monitor and protect my apartment.
[4] ExpressVPN, the VPN I've used for many years to protect my privacy on the internet, Masterclass, online courses that I enjoy from some of the most amazing humans in history, and BetterHelp, online therapy with a licensed professional.
[5] Please check out these sponsors in the description to get a discount and to support this podcast.
[6] As a side note, let me say that I may experiment with doing some solo episodes in the coming month or two.
[7] The three ideas I have floating in my head currently are to use, one, a particular moment in history, two, a particular movie, or three, a book to drive a conversation about a set of related concepts.
[8] For example, I could use 2001: A Space Odyssey or Ex Machina to talk about AGI for one, two, three hours.
[9] or I could do an episode on the, yes, rise and fall of Hitler and Stalin, each in a separate episode, using relevant books and historical moments for reference.
[10] I find the format of a solo episode very uncomfortable and challenging, but that just tells me that it's something I definitely need to do and learn from the experience.
[11] Of course, I hope you come along for the ride.
[12] Also, since we have all this momentum built up on announcements, I'm giving a few lectures on machine learning at MIT this January.
[13] In general, if you have ideas for the episodes, for the lectures, or for just short videos on YouTube, let me know in the comments that I still definitely read, despite my better judgment and the wise, sage advice of the great Joe Rogan.
[14] If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect with me on Twitter at Lex Fridman.
[15] As usual, I'll do a few minutes of ads now and no ads in the middle.
[16] I try to make these interesting, but I give you timestamps, so if you skip, please still check out the sponsors by clicking the links in the description.
[17] It is the best way to support this podcast.
[18] This show is sponsored by SimpliSafe, a home security company.
[19] Everyone wants to keep their home and family safe.
[20] That's what they told me to say.
[21] So it must be true.
[22] Whether it's from a break-in, a fire, flooding, or a medical emergency, SimpliSafe home security has got your back, day and night, ready to send police, fire, or EMTs when you need them most, straight to your door.
[23] I'm pretty sure if you suffer an AGI robot takeover, they will also, allegedly, send Spot robots from Boston Dynamics for a full-on robot-on-robot battle.
[24] small caveat.
[25] I haven't tried this aspect of the service yet myself, so I can't tell you if it's a good idea or not.
[26] They have sensors and cameras that protect every inch of your home.
[27] All it takes is a simple 30-minute setup.
[28] I have it set up in my apartment, but unfortunately anyone who tries to break in will be very disappointed by the lack of interesting or valuable stuff to take.
[29] Some dumbbells, a pull-up bar, and some suits and shirts.
[30] That's about it.
[31] You get a free security camera and a 60-day risk-free trial when you go to simplisafe.com/lex.
[32] Again, that's simplisafe.com/lex.
[33] This episode is also sponsored by ExpressVPN.
[34] Earlier this year, more than 100 Twitter users got their accounts hacked into.
[35] Passwords, email addresses, phone numbers, and more.
[36] The list included Elon Musk and Kanye West.
[37] I can't believe they gave me those two options.
[38] ExpressVPN can help avoid that.
[39] I use it to safeguard my personal data online.
[40] Did you know that for 20 years, the permissive action link (PAL) access control security device that controls access to the United States nuclear weapons had a password of just eight zeros?
[41] That's it?
[42] Apparently, this was a protest by the military to say that PAL systems are generally a bad idea because they are hackable and so on.
[43] Also, the most popular passwords of 2020 are 123456, 123456789, picture1, password, and 12345678.
[44] If you have one of these passwords, please, perhaps, make it a New Year's resolution to change them.
[45] Anyway, ExpressVPN encrypts your data and lets you surf the web safely and anonymously.
[46] Get it at expressvpn.com/lexpod to get an extra three months free.
[47] That's expressvpn.com/lexpod.
[48] This show is also sponsored by Masterclass, $180 a year for an all-access pass to watch courses from literally the best people in the world on a bunch of different topics.
[49] Let me list some I've enjoyed watching in part or in whole.
[50] Chris Hadfield on space exploration, Neil deGrasse Tyson on scientific thinking and communication, Will Wright, creator of SimCity and The Sims, on game design, Carlos Santana on guitar, Garry Kasparov on chess, Daniel Negreanu on poker, Neil Gaiman on storytelling, Martin Scorsese on filmmaking, Jane Goodall on conservation, and many more.
[51] By the way, you can watch it on basically any device.
[52] Sign up at masterclass.com/lex to get 15% off the first year of an annual subscription.
[53] That's masterclass.com/lex.
[54] This episode is also sponsored by BetterHelp, spelled H-E-L-P, help.
[55] They figure out what you need and match you with a licensed professional therapist in under 48 hours.
[56] I chat with the person on there and enjoy it.
[57] Of course, I also have been talking to David Goggins over the past few months, who's definitely not a licensed professional therapist, but he does help me meet his and my demons and become comfortable existing in their presence.
[58] Everyone is different, but for me, I think, suffering is essential for creation, but you can suffer beautifully in a way that doesn't destroy you.
[59] Therapy can help in whatever form that therapy takes.
[60] Better help is an option worth trying.
[61] It's easy, private, affordable, and available worldwide.
[62] You can communicate by text any time and schedule weekly audio and video sessions.
[63] You didn't ask me, but my two favorite psychiatrists are Sigmund Freud and Carl Jung.
[64] Their work was important in my intellectual development.
[65] Anyway, check out betterhelp.com/lex.
[66] That's betterhelp.com/lex.
[67] And now, here's my conversation with Michael Littman.
[68] I saw a video of you talking to Charles Isbell about Westworld, the TV series.
[69] You guys were doing a kind of thing where you're watching new things together, but let's rewind back.
[70] Is there a sci-fi movie or book or show that was profound, that had an impact on you philosophically, or just, like, specifically something you enjoyed nerding out about?
Yeah, interesting. I think a lot of us have been inspired by robots in movies. The one that I really like is a movie called Robot & Frank, which I think is really interesting because it's a very near-term future where robots are being deployed as helpers in people's homes. And we don't know how to make robots like that at this point, but it seemed very plausible.
[71] It seemed very realistic or imaginable.
[72] And I thought that was really cool because they're awkward.
[73] They do funny things.
[74] It raised some interesting issues, but it seemed like something that would ultimately be helpful and good if we could do it right.
[75] Yeah, he was an older cranky gentleman, right?
[76] He was an older cranky jewel thief, yeah.
[77] It's kind of a funny little thing, which is, you know, he's a jewel thief, and so he pulls the robot into his life, which is like, which is something you could imagine taking a home robotics thing and pulling into whatever quirky thing that's involved in your existence.
[78] Yeah, it's meaningful to you.
[79] Exactly so.
[80] And I think from that perspective, I mean, not all of us are jewel thieves.
[81] And so when we bring our robots into our lives.
[82] Speak for yourself.
[83] It explains a lot about this apartment, actually.
[84] But no, the idea that people should have the ability to, you know, make this technology their own, that it becomes part of their lives.
[85] And I think that's, it's hard for us as technologists to make that kind of technology.
[86] It's easier to mold people into what we need them to be.
[87] And just that opposite vision, I think, is really inspiring.
[88] And then there's a anthropomorphization where we project certain things on them, because I think the robot is kind of dumb.
[89] But I have a bunch of Roombas that I play with, and you immediately project stuff onto them, a much greater level of intelligence.
[90] We'll probably do that with each other, too.
[91] much greater degree of compassion.
[92] One of the things we're learning from AI is where we are smart and where we are not smart.
[93] Yeah.
[94] You also enjoy, as people can see, and I enjoyed myself, watching you sing and even dance a little bit.
[95] A little bit of dancing.
[96] That's not quite my thing.
[97] As a method of education or just in life, you know, in general.
[98] So easy question.
[99] What's the definitive, objectively speaking, top three songs of all time?
[100] Maybe something that, you know, to walk that back a little bit, maybe something that others might be surprised by the three songs that you kind of enjoy.
[101] That is a great question that I cannot answer, but instead, let me tell you a story.
[102] Pick a question you do want to answer.
[103] That's right.
[104] I've been watching the presidential debates and vice presidential debates, and turns out, yeah, you can just answer any question you want.
[105] So, let's do a related question.
[106] Yeah, well said.
[107] I really like pop music.
[108] I've enjoyed pop music ever since I was very young.
[109] So 60s music, 70s music, 80s music.
[110] This is all awesome.
[111] And then I had kids, and I think I stopped listening to music.
[112] And I was starting to realize that my musical taste had sort of frozen out.
[113] And so I decided in 2011, I think, to start listening to the top 10 billboard songs each week.
[114] So I'd be on the treadmill, and I would listen to that week's top 10 songs.
[115] so I could find out what was popular now.
[116] And what I discovered is that I have no musical taste whatsoever.
[117] I like what I'm familiar with.
[118] And so the first time I'd hear a song, because the first week that was on the charts, I'd be like, ugh.
[119] And then the second week, I was into it a little bit.
[120] And the third week, I was loving it.
[121] And by the fourth week is, like, just part of me. And so I'm afraid that I can't tell you my favorite song of all time because it's whatever I heard most recently.
[122] Yeah, that's interesting.
[123] People have told me that there's an art to listening to music as well.
[124] And you can start to, if you listen to a song, just carefully, explicitly just force yourself to really listen.
[125] You start to, I did this when I was part of jazz band and fusion band in college.
[126] You start to hear the layers of the instruments.
[127] You start to hear the individual instruments.
[128] And you start to, you can listen to classical music or to orchestra this way.
[129] You can listen to jazz this way.
[130] It's funny to imagine you now, to walk that forward, listening to pop hits now as, like, a scholar.
[131] Listening to like Cardi B or something like that or Justin Timberlake.
[132] No, not Timberlake, Bieber.
[133] They've both been in the top ten since I've been listening.
[134] They're still up there.
[135] Oh my God, I'm so cool.
[136] If you haven't heard, Justin Timberlake's been in the top ten in the last few years. There was one song that he did where the music video was set at, essentially, NeurIPS.
[137] Oh, wow.
[138] Oh, the one with the robotics.
[139] Yeah, yeah, yeah, yeah.
[140] Yeah, yeah.
[141] It's like at an academic conference, and he's doing a demo.
[142] He was presenting, right?
[143] It was sort of a cross between the Apple, like, Steve Jobs kind of talk, and NeurIPS.
[144] So, you know, it's always fun when AI shows up in pop culture.
[145] I wonder if he consulted somebody for that.
[146] That's really interesting.
[147] So maybe on that topic, I've seen your celebrity in multiple dimensions, but one of them is you've done cameos in different places.
[148] I've seen you in a TurboTax commercial as like, I guess, the brilliant Einstein character.
[149] And the point is that TurboTax doesn't need somebody like you.
[150] It doesn't need a brilliant person.
[151] Very few things need someone like me. But yes, they were specifically emphasizing the idea that you don't need to be a computer expert to be able to use their software.
[152] How did you end up in that world?
[153] I think it's an interesting story.
[154] So I was teaching my class.
[155] It was an intro computer science class for non-concentrators, non-majors.
[156] And sometimes when people would visit campus, they would check in to say, hey, we want to see what a class is like.
[157] Can we sit on your class?
[158] So a person came to my class who was the daughter of the brother of the husband of the best friend of my wife.
[159] Anyway, Basically, a family friend came to campus to check out Brown and asked to come to my class and came with her dad.
[160] Her dad is, who I've known from various kinds of family events and so forth, but he also does advertising.
[161] And he said that he was recruiting scientists for this ad, this TurboTax set of ads.
[162] And he said, we wrote the ad with the idea that we get like the most brilliant researchers.
[163] but they all said no. So can you help us find the, like, B-level scientists?
[164] And I'm like, sure, that's who I hang out with.
[165] So that should be fine.
[166] So I put together a list, and I did what some people call a Dick Cheney.
[167] So I included myself on the list of possible candidates, you know, with a little blurb about each one and why I thought that would make sense for them to do it.
[168] And they reached out to a handful of them.
[169] But then ultimately, they YouTube-stalked me a little bit.
[170] And they thought, oh, I think he could do this.
[171] And they said, okay, we're going to offer you the commercial. I'm like, what? So it was such an interesting experience, because it's, they have another world, the people who do, like, nationwide kind of ad campaigns and television shows and movies and so forth. It's quite a remarkable system that they have going.
Like, a set?
Yeah, so I went to, it was just somebody's house that they rented in New Jersey. But in the commercial, it's just me and this other woman.
[172] In reality, there were 50 people in that room and another, I don't know, half a dozen kind of spread out around the house in various ways.
[173] There were people whose job it was to control the sun.
[174] They were in the backyard on ladders, putting filters up to try to make sure that the sun didn't glare off the window in a way that would wreck the shot.
[175] So there was like six people out there doing that.
[176] There was three people out there giving snacks, the craft table.
[177] There was another three people giving healthy snacks because that was a separate craft table.
[178] There was one person whose job it was to keep me from getting lost.
[179] And I think the reason for all this is because so many people are in one place at one time, they have to be time efficient.
[180] They have to get it done.
[181] The morning they were going to do my commercial.
[182] In the afternoon, they were going to do a commercial of a mathematics professor from Princeton.
[183] They had to get it done.
[184] No, you know, no wasted time or energy.
[185] And so there's just a fleet of people all working as an organism.
[186] and it was fascinating.
[187] I was just the whole time I'm just looking around like, this is so neat.
[188] Like, one person whose job it was to take the camera off of the cameraman for someone else whose job it was to remove the film canister, because every couple of takes they had to replace the film because, you know, film gets used up.
[189] It was just, I don't know, I was geeking out the whole time.
[190] It was so fun.
[191] How many takes did it take?
[192] It looked the opposite, like there weren't more than two people there.
[193] It was very relaxed.
[194] Right, right.
[195] Yeah, I mean, the person who I was in the scene with, um, is a professional.
[196] She's a, you know, uh, she's an actress, improv comedian from New York City.
[197] And when I got there, they had given me a script as such as it was.
[198] And then I got there and they said, we're going to do this as improv.
[199] I'm like, I don't know how to improv.
[200] Like this is not, I don't know what this, I don't know what you're telling me to do here.
[201] Don't worry.
[202] She knows.
[203] Like, okay.
[204] I'll see how this goes.
[205] I get, I guess I got pulled into the story because like, where the heck did you come from?
[206] I guess in the scene.
[207] Like, how do you show up at this random person's house?
[208] I don't know.
[209] Yeah, well, I mean, the reality of it is I stood outside in the blazing sun.
[210] There was someone whose job it was to keep an umbrella over me because I started to Schwitz.
[211] I started to sweat.
[212] And so I would wreck the shot because my face was all shiny with sweat.
[213] So there was one person who would dab me off, who had an umbrella.
[214] But yeah, like the reality of it, like, why is this strange, stalkery person hanging around outside somebody's house?
[215] We're not sure when you have to look in.
[216] We'll have to wait for the book.
[217] But, so you mentioned YouTube. You make videos yourself, you make awesome parody, sort of parody songs that kind of focus on a particular aspect of computer science.
[218] Those seem really natural. How much production value goes into that?
[219] Do you also have a team of 50 people?
[220] The videos, almost all the videos, except for the ones that people would have actually seen, are just me. I write the lyrics, I sing the song.
[221] I generally find, like, a backing track online, because, unlike you, I can't really play an instrument. And then in some cases I'll do visuals using just, like, PowerPoint, lots and lots of PowerPoint, to make it sort of like an animation. The most produced one is the one that people might have seen, which is the overfitting video that I did with Charles Isbell, and that was produced by the Georgia Tech and Udacity people, because we were doing a class together. I usually do parody songs kind of to cap off a class, at the end of a class.
So that one, you're wearing, so it was just, Thriller?
[222] Yeah.
[223] You're wearing the Michael Jackson, the red leather jacket.
[224] The interesting thing with podcasting, that you're also into, that I really enjoy, is that there's not a team of people.
[225] It's kind of more, because, you know, there's something that happens when there's more people involved than just one person, just the way you start acting. I don't know, there's a censorship. You're not, especially for, like, slow thinkers like me, you're not, and I think most of us are, if we're trying to actually think, we're a little bit slow and careful. And kind of, large teams get in the way of that. And I don't know what to do with that. To me, like, you know, it's very popular to criticize, quote-unquote, mainstream media.
[226] But there is legitimacy to criticizing them, all the same.
[227] I love listening to NPR, for example.
[228] But it's clear that there's a team behind it.
[229] There are constant commercial breaks.
[230] There's this kind of like rush of like, okay, I have to interrupt you now because we have to go to commercial.
[231] Just this whole, it creates, it destroys the possibility of nuanced conversation.
[232] Yeah, exactly.
[233] Evian, which, Charles Isbell, who I talked to yesterday, told me that Evian is naive backwards, which, the fact that his mind thinks this way is just quite brilliant.
[234] Anyway, there's a freedom to this podcast.
[235] He's Dr. Awkward, which, by the way, is a palindrome.
[236] That's a palindrome that I happen to know from other parts of my life.
[237] You should just figure out how to, well, you know, use it against Charles.
[238] Dr. Awkward.
[239] So what was the most challenging parody song to make?
[240] Was it the thriller one?
[241] No, that one was really fun.
[242] I wrote the lyrics really quickly, and then I gave it over to the production team.
[243] They recruited an a cappella group to sing.
[244] It went really smoothly.
[245] It's great having a team, because then you can just focus on the part that you really love, which in my case is writing the lyrics.
[246] For me, the most challenging one, not challenging in a bad way, but challenging in a really fun way, was, one of the parody songs I did is about the halting problem in computer science, the fact that you can't create a program that can tell, for any other arbitrary program, whether it's actually going to get stuck in an infinite loop or whether it's going to eventually stop. And so I did it to an 80s song, because, that's, I hadn't started my new thing of learning current songs, and it was Billy Joel's Piano Man.
Nice.
[247] Which is a great song.
[248] Great song.
[249] Yeah, yeah.
[250] Sing me a song.
[251] You're the piano man?
[252] Yeah, it's a great song.
[253] So the lyrics are great because, first of all, it rhymes.
[254] Not all songs rhyme.
[255] I've done Rolling Stone songs, which turn out to have no rhyme scheme whatsoever.
[256] They're just sort of yelling and having a good time, which makes it not fun from a parody perspective because, like, you can say anything.
[257] But, you know, the lines rhymed, and there was a lot of internal rhymes as well.
[258] And so figuring out how to sing with internal rhymes, a proof of the halting problem was really challenging.
[259] And I really enjoyed that process.
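(An aside for the reader: here is a minimal sketch, in Python, of the classic self-referential argument behind the result Michael is singing about. The halts() oracle is hypothetical, which is exactly the point: no such function can be written.)

```python
# Hypothetical oracle: returns True iff program(arg) eventually stops.
# The theorem says this cannot be implemented; it is assumed only to derive a contradiction.
def halts(program, arg) -> bool:
    raise NotImplementedError("cannot be implemented -- that's the theorem")

def paradox(program):
    # Do the opposite of whatever the oracle predicts about running `program` on itself.
    if halts(program, program):
        while True:      # oracle says "halts" -> loop forever
            pass
    return "done"        # oracle says "loops forever" -> halt immediately

# Consider paradox(paradox): if halts(paradox, paradox) were True, paradox would loop forever;
# if it were False, paradox would halt. Either answer contradicts the oracle, so it can't exist.
```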
[260] What about last question on this topic?
[261] What about the dancing and the thriller video?
[262] How many takes did that take?
[263] So I wasn't planning to dance.
[264] They had me in the studio and they gave me the jacket and it's like, well, you can't, if you have the jacket and the glove, like there's not much you can do.
[265] So I think I just danced around.
[266] And then they said, why don't you dance a little bit?
[267] There was a scene with me and Charles dancing together.
[268] In that video?
[269] They did not use it in the video, but we recorded it.
[270] I don't remember.
[271] Yeah, yeah.
[272] No, it was pretty funny.
[273] And Charles, who has this beautiful, wonderful voice, doesn't really sing.
[274] He's not really a singer.
[275] And so that was why I designed the song with him doing a spoken section and me doing the singing.
[276] It's very like Barry White.
[277] Yeah, it's smooth baritone.
[278] Yeah, yeah.
[279] It's great.
[280] That was awesome.
[281] So one of the other things Charles said is that, you know, everyone knows you as, like, a super nice guy, super passionate about teaching, and so on. What he said, I don't know if it's true, is that despite the fact that you're, you are so...
Cold-blooded? Okay, I will admit this finally for the first time. That was, that was me. It's the Johnny Cash song, killed a man in Reno just to watch him die.
...that you actually do have some strong opinions on some topics. So if this in fact is true, what strong opinions would you say you have?
[282] Is there ideas you think maybe an artificial intelligence, machine learning, maybe in life that you believe it's true that others might, you know, some number of people might disagree with you on?
[283] So I try very hard to see things from multiple perspectives.
[284] There's this great Calvin and Hobbes cartoon where, do you know, okay, so Calvin's dad is always, he's kind of a bit of a foil.
[285] And he talked Calvin into, Calvin had done something wrong.
[286] The dad talks him into, like, seeing it from another perspective.
[287] And Calvin, like, this breaks Calvin because he's like, oh, my gosh, now I can see the opposite sides of things.
[288] And so it becomes like a cubist cartoon where there is no front and back.
[289] Everything's just exposed.
[290] And it really freaks him out.
[291] And finally he settles back down.
[292] It's like, oh, good.
[293] No, I can make that go away.
[294] But, like, I'm that.
[295] I live in that world where I'm trying to see everything from every perspective all the time.
[296] So there are some things that I've formed opinions about that it would be harder, I think, to disabuse me of.
[297] One is the superintelligence argument and the existential threat of AI is one where I feel pretty confident in my feeling about that one.
[298] Like, I'm willing to hear other arguments, but like I am not particularly moved by the idea that if we're not careful, we will accidentally create a superintelligence that will destroy human life.
[299] Let's talk about that.
[300] Let's get you trouble and record you on video.
[301] It's like Bill Gates, I think he said like some quote about the internet, that that's just going to be a small thing.
[302] It's not going to really go anywhere.
[303] And I think Steve Ballmer said, I don't know why I'm sticking on Microsoft.
[304] He said something like, smartphones are useless.
[305] There's no reason why Microsoft should get into smartphones.
[306] That kind of.
[307] So let's talk about AGI.
[308] As AGI is destroying the world, we'll look back at this video and see.
[309] No, I think it's really interesting to actually talk about it because nobody really knows the future, so you have to use your best intuition.
[310] It's very difficult to predict it.
[311] But you have spoken about AGI and the existential risks around it and sort of based on your intuition that we're quite far away from that being a serious concern relative to the other concerns we have.
[312] Can you maybe unpack that a little bit?
[313] Yeah, sure, sure.
[314] Sure.
[315] So as I understand it, for example, I read Bostrom's book and a bunch of other reading material about this sort of general way of thinking about the world.
[316] And I think the story goes something like this, that we will at some point create computers that are smart enough that they can help design the next version of themselves, which itself will be smarter than the previous version of themselves, and eventually bootstrapped up.
[317] to being smarter than us, at which point we are essentially at the mercy of this sort of more powerful intellect, which in principle, we don't have any control over what its goals are.
[318] And so if its goals are at all out of sync with our goals, like, for example, the continued existence of humanity, we won't be able to stop it.
[319] It'll be way more powerful than us, and we will be toast.
[320] So there's some, I don't know, very smart people who have signed on to that story.
[321] And it's a, it's a compelling story.
[322] I once, now I can really get myself in trouble.
[323] I once wrote an op-ed about this, specifically responding to some quotes from Elon Musk, who has been, you know, on this very podcast more than once.
[324] And.
[325] AI summoning the demon.
[326] That's a thing he said.
[327] But then he came to Providence, Rhode Island, which is where I live, and said to the governors of all the states, you're worried about entirely the wrong thing.
[328] You need to be worried about AI.
[329] You need to be very, very worried about AI.
[330] And journalists kind of reacted to that.
[331] They wanted to get people's take.
[332] And I was like, okay, my belief is that one of the things that makes Elon Musk so successful and so remarkable as an individual is that he believes in the power of ideas.
[333] He believes that you can have, you can, if you know, if you have a really good idea for getting into space, you can get into space.
[334] If you have a really good idea for a company or for how to change the way that people drive, you just have to do it and it can happen.
[335] It's really natural to apply that same idea to AI.
[336] You see these systems that are doing some pretty remarkable computational tricks, demonstrations, and then to take that idea and just push it all the way to the limit and think, okay, where does this go?
[337] Where is this going to take us next?
[338] And if you're a deep believer in the power of ideas, then it's really natural to believe that those ideas could be taken to the extreme and kill us.
[339] So I think, you know, his strength is also his undoing because that doesn't mean it's true.
[340] Like, it doesn't mean that that has to happen, but it's natural for him to think that.
[341] So another way to phrase the way he thinks, and I find it very difficult to argue with that line of thinking.
[342] So Sam Harris is another person, from a neuroscience perspective, who thinks like that, saying, well, is there something fundamental in the physics of the universe that prevents this from eventually happening? And Nick Bostrom thinks in the same way, that kind of zooming out, yeah, okay, we humans now are existing in this, like, time scale of minutes and days, and so our intuition is in this time scale of minutes, hours, and days.
[343] But if you look at the span of human history, is there any reason you can't see this in a hundred years?
[344] And like, is there something fundamental about the laws of physics that prevent this?
[345] And if it doesn't, then it eventually will happen.
[346] Or we will destroy ourselves in some other way.
[347] And it's very difficult, I find, to actually argue against that.
[348] Yeah.
[349] Me too.
[350] And not sound like, not sound like you're just, like, rolling your eyes, like, it's science fiction, we don't have to think about it. Or even, even worse than that, which is like, I don't have kids, but it's like, I've got to pick up my kids now, like, okay, I see, there's more pressing short-term...
Yeah, there's more pressing short-term things that, like, crowd out this existential crisis with much, much shorter-term things. Like now, especially this year, there's COVID. So, like, any kind of discussion like that, it's like, there's, you know, there's pressing things today.
[351] And so the Sam Harris argument, well, like, any day the exponential singularity can occur, is very difficult to argue against.
[352] I mean, I don't know.
[353] But part of his story is also he's not going to put a date on it.
[354] It could be in a thousand years.
[355] It could be in 100 years.
[356] It could be in two years.
[357] It's just that as long as we keep making this kind of progress, it ultimately has to become a concern.
[358] I'm kind of on board with that, but the piece that I feel like is missing from that way of extrapolating from the moment that we're in is that I believe that, in the process of actually developing technology that can really get around in the world and really process and do things in the world in a sophisticated way, we're going to learn a lot about what that means, which we don't know now, because we don't know how to do this right now.
[359] if you believe that you can just turn on a deep learning network and eventually give it enough compute and it'll eventually get there, well, sure, that seems really scary because we won't, we won't be in the loop at all.
[360] We won't be helping to design or target these kinds of systems.
[361] But I don't see that, that feels like it is against the laws of physics because these systems need help, right?
[362] They need to surpass the, the difficulty, the wall of complexity that happens in arranging something in the form that, that will happen in.
[363] Yeah.
[364] Like, I believe in evolution.
[365] Like, I believe that there's an argument, right?
[366] So there's another argument, just to look at it from a different perspective, that people say, well, I don't believe in evolution.
[367] How could evolution, it's sort of like a random set of parts assemble themselves into a 747, and that could just never happen.
[368] So it's like, okay, that's maybe hard to argue against, but clearly 747s do get assembled.
[369] They get assembled by us.
[370] basically the idea being that there's a process by which we will get to the point of making technology that has that kind of awareness.
[371] And in that process, we're going to learn a lot about that process and we'll have more ability to control it or to shape it or to build it in our own image.
[372] It's not something that is going to spring into existence like that 747, and we're just going to have to contend with it completely unprepared.
[373] That's very possible that in the context of the long arc of human history, it will, in fact, spring into existence.
[374] But that springing might take, like if you look at nuclear weapons, like even 20 years is a springing in the context of human history.
[375] And it's very possible, just like with nuclear weapons, that we could have, I don't know what percentage you want to put at it, but the possibility of...
[376] Could have knocked ourselves out.
[377] Yeah, the possibility of human beings destroying themselves in the 20th century with nuclear weapons. I don't know, if you really think through it, you could really put it close to, like, I don't know, 30, 40 percent, given, like, the certain moments of crisis that happened. So, like, I think one, like, fear in the shadows that's not being acknowledged is, it's not so much that the AI will run away, it's that as it's running away, we won't have enough time to think through how to stop it.
[379] Right.
[380] Fast takeoff, or foom.
[381] Yeah.
[382] I mean, my much bigger concern, I wonder what you think about it, which is we won't know it's happening.
[383] So I kind of think that there's an AGI situation already happening with social media that our minds, our collective intelligence of human civilization is already being controlled by an algorithm.
[384] And, like, we're, we're already, super, like, raising the level of our collective intelligence thanks to Wikipedia.
People should donate to Wikipedia to feed the AGI.
Man, if we had a superintelligence that was in line with Wikipedia's values, that's a lot better than a lot of other things I can imagine. I trust Wikipedia more than I trust Facebook or YouTube as far as trying to do the right thing from a rational perspective. Yeah, now, that's not where you were going, I understand that, but it does strike me that there are sort of smarter and less smart ways of exposing ourselves to each other on the internet.
[385] Yeah, the interesting thing is that Wikipedia and social media have very different forces.
[386] You're right.
[387] I mean, Wikipedia, if AGI was Wikipedia, it would be just like this cranky, overly competent editor of articles.
[388] You know, there's something to that.
[389] But the social media aspect is not, so the vision of AGI is as a separate system that's super intelligent.
[390] That's super intelligent.
[391] That's one key little thing.
[392] I mean, there's the paperclip argument that's super dumb, but super powerful systems.
[393] But with social media, you have relatively like algorithms we may talk about today, very simple algorithms that when, so something Charles talks a lot about, which is interactive AI, when they start like having at scale, like tiny little interactions with human beings, they can start controlling these human beings.
[394] So a single algorithm can control the minds of human beings slowly, in ways we might not realize. It can start wars, it can change the way we think about things.
[395] It feels like in the long arc of history, if I were to sort of zoom out from all the outrage and all the tension on social media, that it's progressing us towards better and better things.
[396] It feels like chaos and toxic and all that kind of stuff.
[397] It's chaos and toxic, yeah.
[398] But it feels like actually the chaos and toxic is similar to the kind of debates we had from the founding of this country.
[399] You know, there was a civil war that happened over that, over that period.
[400] And ultimately, it was all about this tension of like something doesn't feel right about our implementation of the core values we hold as human beings and they're constantly struggling with this.
[401] And that results in people calling each other names, just being shitty to each other on Twitter.
[402] But ultimately the algorithm is managing all that.
[403] And it feels like there's a possible future in which that algorithm controls us into the direction of self-destruction, and whatever that looks like.
[404] Yeah.
[405] So, all right, I do believe in the power of social media to screw us up royally.
[406] I do believe in the power of social media to benefit us, too.
[407] I do think that we're in a, yeah, it's sort of almost got dropped on top of us.
[408] and now we're trying to, as a culture, figure out how to cope with it.
[409] There's a sense in which, I don't know, there's some arguments that say that, for example, I guess college age students now, late college age students now, people who were in middle school when social media started to really take off, may be really damaged.
[410] Like, this may have really hurt their development in a way that we don't have all the implications of quite yet.
[411] That's the generation who, and I hate to make it somebody else's responsibility, but they're the ones who can fix it.
[413] They're the ones who can figure out how do we keep the good of this kind of technology without letting it eat us alive.
[414] And if they're successful, we move on to the next phase, the next level of the game.
[415] If they're not successful, then, yeah, then we're going to wreck each other.
[416] We're going to destroy society.
[417] So you're going to, in your old age, sit on a porch and watch the world burn because of the TikTok generation?
I believe, well, so this is my kids' age, right?
[418] And certainly my daughter's age.
[419] And she's very tapped in to social stuff.
[420] But she's also, she's trying to find that balance, right?
[421] Of participating in it and in getting the positives of it, but without letting it eat her alive.
[422] And I think sometimes she ventures, I hope she doesn't watch this, sometimes I think she ventures a little too far and is consumed by it.
[423] And other times she gets a little distance.
[424] And if, you know, if there's enough people like her out there, they're going to, they're going to navigate these choppy waters.
[425] That's, that's an interesting skill, actually, to develop.
[426] I talked to my dad about it.
[427] You know, I've now somehow, this podcast in particular, but other reasons has received a little bit of attention.
[428] And with that, apparently in this world, even though I don't shut up about love and I'm just all about kindness, I have now a little mini army of trolls.
[430] It's kind of hilarious, actually, but it also doesn't feel good.
[431] But it's a skill to learn to not look at that.
[432] Like, to moderate actually how much you look at that.
[433] The discussion I have with my dad, it's similar to, it doesn't have to be about trolls.
[434] It could be about checking email, which is like if you're anticipating, you know, there's, my dad runs a large institute at Drexel University, and there could be stressful, like, emails you're waiting like there's drama of some kinds and so like there's a temptation to check the email if you send an email and you kind of and that pulls you in into it doesn't feel good and it's a skill that he actually complains that he hasn't learned i mean he grew up without it so he hasn't learned the skill of how to shut off the internet and walk away and i think young people while they're also being quote unquote damaged by like uh you know being bullied online all of those stories which are very, like, horrific.
[435] You basically can't escape your bullies these days when you're growing up.
[436] But at the same time, they're also learning that skill of how to be able to shut off the, like, disconnect.
[437] Would they be able to laugh at it and not take it too seriously?
[438] It's fascinating.
[439] Like, we're all trying to figure this out.
[440] Just like you said, it's been dropped on us, and we're trying to figure it out.
[441] Yeah, I think that's really interesting.
[442] And I guess I've become a believer in the human design, which I feel like I don't completely understand. Like, how do you make something as robust as us? Like, we're so flawed in so many ways, and yet, and yet, you know, we dominate the planet, and we do seem to manage to get ourselves out of scrapes, eventually. Not necessarily in the most elegant possible way, but somehow we get, we get to the next step. And I don't know how I'd make a machine do that. I'm, generally speaking, like, if I train one of my reinforcement learning agents to play a video game, and it works really hard on that first stage, over and over and over again, and it makes it through it.
[443] It succeeds on that first level.
[444] And then the new level comes and it's just like, okay, I'm back to the drawing board.
[445] And somehow humanity, we keep leveling up and then somehow managing to put together the skills necessary to achieve success, some semblance of success in that next level too.
[446] And, you know, I hope we can keep doing that.
[447] You mentioned reinforcement learning.
[448] So you've had a couple of years in the field?
[450] No. Quite, you know, quite a few.
[451] Quite a long career in artificial intelligence broadly, but reinforcement learning specifically.
[452] Can you maybe give a hint about your sense of the history of the field, that in some ways it's changed with the advent of deep learning but has long roots? Like, how has it weaved in and out of your own life?
[453] How have you seen the community change or maybe the ideas that it's playing with change?
[454] I've had the privilege, the pleasure of having almost a front row seat to a lot of this stuff.
[455] And it's been really, really fun and interesting.
[456] So when I was in college in the 80s, early 80s, the neural net thing was starting to happen.
[457] And I was taking a lot of psychology classes and a lot of computer science classes as a college student.
[458] And I thought, you know, something that can play tic -tac -toe and just like learn to get better at it, that ought to be a really easy thing.
[459] So I spent almost all of my, what would have been vacations during college, like hacking on my home computer, trying to teach it how to play tic -tac -toe.
[460] Programming language, do you remember?
[461] Basic.
[462] Oh, yeah.
[463] That's my first language.
[464] That's my native language.
[465] Is that when you first fell in love with computer science?
[466] Just like programming basic on that.
[467] What was the computer?
[468] Do you remember?
[469] I had a TRS -80, Model 1, before they were called Model 1 because there was nothing else.
[470] I got my computer in 1979 instead. So I would have been bar mitzvahed, but instead of having a big party that my parents threw on my behalf, they just got me a computer, because that's what I really, really, really wanted. I saw them in the, in the mall, in Radio Shack, and I thought, what, how are they doing that?
[471] I would try to stump them. I would give them math problems, like one plus, and then in parentheses, two plus one, and it would always get it right.
[472] I'm like, how do you know so much?
[473] Like, I've had to go to algebra class for the last few years to learn this stuff, and you just seem to know.
[474] So I was, I was, I was smitten and I got a computer.
[475] And I think ages 13 to 15, I have no memory of those years.
[476] I think I just was in my room with the computer.
[477] Listening to Billy Joel.
[478] Communing, possibly listening to the radio, listening to Billy Joel.
[479] That was the one album I had on vinyl at that time.
[480] And then I got it on cassette tape, and that was really helpful, because then I could play it.
[481] I didn't have to go down to my parents' Wi-Fi, or hi-fi, sorry.
[482] And at age 15, I remember kind of walking out and like, okay, I'm ready to talk to people again.
[483] Like, I've learned what I need to learn here.
[484] And so, yeah, so that was my home computer.
[485] And so I went to college and I was like, oh, I'm totally going to study computer science.
[486] And I opted the college I chose, specifically had a computer science major.
[487] The one that I really wanted, the college I really wanted to go to, didn't, so bye -bye to them.
[488] Which college did you go to?
[489] So I went to Yale.
[490] Princeton would have been way more convenient, and it was just a beautiful campus, and it was close enough to home, and I was really excited about Princeton, and I visited, I said, so, computer science major, like, well, we have computer engineering.
[491] I'm like, oh, I don't like that word engineering.
[492] I like computer science.
[493] I really, I want to do, like, you're saying hardware and software?
[494] They're like, yeah, I just want to do software.
[495] I couldn't care less about hardware.
[496] And you grew up in Philadelphia?
[497] I grew up outside Philly, yeah, yeah.
[498] Yeah.
[499] Okay.
[500] So the, you know, local schools were like Penn and Drexel and Temple.
[501] Like everyone in my family went to Temple, at least at one point in their lives except for me. So, yeah, Philly family.
[502] Yale had a computer science department, and that's when you, it's kind of interesting.
[503] You said 80s in neural networks.
[504] That's when neural networks were a hot new thing, or a hot thing, period.
[505] So was that in college, when you first learned about neural networks?
[506] Yeah, yeah.
[507] And it was in a psychology class, not in a CS class.
[508] Yeah.
[509] Was it psychology or cognitive science or like, do you remember like what context?
[510] It was, yeah, yeah.
[511] So I was a, I've always been a bit of a cognitive psychology groupie.
[512] So like I study computer science, but I like, I like to hang around where the cognitive scientists are because I don't know.
[513] Brains, man. They're like, they're wacky.
[514] Cool.
[515] And they have a bigger picture view of things.
[516] They're a little less engineering, I would say.
[517] They're more, they're more interested in the nature of cognition and intelligence and perception and how the vision system work.
[518] They're asking always bigger questions.
[519] Now, with the deep learning community, I think there's a lot more intersection.
[520] But I do find that the neuroscience folks, actually, and cognitive psychology, cognitive science folks, are starting to learn how to program, how to use artificial neural networks.
[521] And they are actually approaching problems in like totally new, interesting ways.
[522] It's fun to watch grad students from those departments, like, approach the problem of machine learning.
[524] Right, they come in with a different perspective.
[525] Yeah, they don't care about, like, your ImageNet data set or whatever.
[526] They want like to understand the like the basic mechanisms at the neuronal level and the functional level of intelligence.
[527] It's kind of cool to see them work.
[528] But yeah, okay, so you always love, you were always a groupie of cognitive psychology.
[529] Yeah, yeah.
[530] And so it was in a class by Richard Gerrig.
[532] He was kind of my, my favorite psych professor in college, and I took like three different classes with him.
[533] And yeah, so they were talking specifically, the class I think was kind of a, there was a big paper that was written by Steven Pinker and Prince.
[534] I'm blanking on Prince's first name, but Pinker and Prince, they wrote kind of a, they were at that time kind of like, I'm blanking on the names of the current people, the cognitive scientists who are complaining a lot about deep networks.
Oh, Gary Marcus.
And who else?
[535] I mean, there's a few, but Gary, Gary is the most feisty.
Sure, Gary's very feisty. And with his co-author, they're kind of doing these kind of takedowns where they say, okay, well, yeah, it does all these amazing things, but here's a shortcoming, here's a shortcoming, here's a shortcoming, here's a shortcoming.
[536] And so the Pinker Prince paper is kind of like that generation's version of Marcus and Davis, right, where they're trained as cognitive scientists, but they're looking skeptically at the results in the artificial intelligence neural net kind of world and saying, yeah, it can do this and this and this, but it can't do that and it can't do that, maybe in principle or maybe just in practice at this point.
[537] But the fact of the matter is you've narrowed your focus too far to be impressed.
[538] You know, you're impressed with the things within that circle, but you need to broaden that circle a little bit.
[539] You need to look at a wider set of problems.
[540] And so I was in this seminar in college that was basically a close reading of the Pinker Prince paper, which was like really thick.
[541] There was a lot going on in there.
[542] And it talked about the reinforcement learning idea a little bit.
[543] I'm like, oh, that sounds really cool because behavior is what is really interesting to me about psychology anyway.
[544] Right.
[545] So making programs that, I mean, programs are things that behave.
[546] People are things that behave.
[547] Like, I want to make machines that learn to behave.
[548] In which way was reinforcement learning presented?
[549] Is this talking about human and animal behavior, or are we talking about actual mathematical construct?
[550] Ah, that's a great.
[551] So that's a good question.
[552] Right.
[553] So this is, I think it wasn't actually talked about as behavior in the paper that I was reading.
[554] I think that it just talked about learning.
[555] And to me, learning is about learning to behave.
[556] But really, Neural Nets at that point were about like supervised learning, so learning to produce outputs from inputs.
[557] So I kind of tried to invent reinforcement learning. When I graduated, I joined a research group at Bellcore, which had spun out of Bell Labs recently at that time because of the divestiture of long distance and local phone service in the 1980s, 1984.
[558] And I was in a group with Dave Ackley, who was the first author of the Boltzmann machine paper, so the very first neural net paper that could handle XOR, right?
[559] So XOR sort of killed neural nets, the very first, the zeroth-order neural nets.
[560] Yeah, the Perceptrons paper.
[561] And Hinton, along with his student, Dave Ackley, and I think there were other authors as well, showed that, no, no, no, with Boltzmann machines, we can actually learn nonlinear concepts.
[562] And so everything's back on the table again.
[563] And that kind of started that second wave of neural networks.
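(A minimal illustration of the XOR point being described, for the reader: no single linear threshold unit can compute XOR, but one hidden layer of threshold units can. This sketch uses plain hand-wired feedforward threshold units rather than a Boltzmann machine, just to show the nonlinearity issue, so it's an illustrative assumption, not the model from the paper.)

```python
def step(z):
    """Simple threshold unit: fires (1) when its weighted input exceeds 0."""
    return 1 if z > 0 else 0

def xor(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # hidden unit: fires on "x1 OR x2"
    h2 = step(x1 + x2 - 1.5)    # hidden unit: fires on "x1 AND x2"
    return step(h1 - h2 - 0.5)  # output: fires when OR is true but AND is not

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # prints 0, 1, 1, 0 -- not achievable with a single layer
```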
[564] So Dave Ackley was, he became my mentor at Bellcore, and we talked a lot about learning and life and computation and how all these things fit together.
[565] Now Dave and I have a podcast together.
[566] So I get to kind of enjoy that sort of his perspective once again, even all these years later.
[567] And so I said, so I said I was really interested in learning, but in the context of behavior.
[568] And he's like, oh, well, that's reinforcement learning here.
[569] And he gave me Rich Sutton's 1984 TD paper.
[570] So I read that paper.
[571] I honestly didn't get all of it, but I got the idea.
[572] I got that they were using, that he was using ideas that I was familiar with in the context of neural nets and like sort of back prop.
[573] But with this idea of making predictions over time, I'm like, this is so interesting, but I don't really get all the details I said to Dave.
[574] And Dave said, oh, well, why don't we have him come and give a talk?
[575] And I was like, wait, what, you can do that?
[576] Like, these are real people?
[577] I thought they were just words.
[578] I thought it was just like ideas that somehow magically seeped into paper.
[579] He's like, no, I know Rich.
[580] Like, we'll just have him come down and he'll give a talk.
[581] And so I was, you know, my mind was blown.
[582] And so Rich came and he gave a talk at Bellcore.
[583] And he talked about what he was super excited about, which was, they had just figured out at the time, Q-learning.
[584] So Watkins had visited Rich Sutton's lab at UMass, or Andy Barto's lab that Rich was a part of, and he was really excited about this because it resolved a whole bunch of problems that he didn't know how to resolve in the earlier paper.
[586] And so...
[587] For people who don't know, TD, temporal difference, these are all just algorithms for reinforcement learning.
[588] Right, and TD, temporal difference in particular, is about making predictions over time, and you can try to use it for making decisions, right?
[589] Because if you can predict how good an action's outcomes will be in the future, you can choose one that has better outcomes. But the theory didn't really support changing your behavior.
[590] Like the predictions had to be of a consistent process if you really wanted it to work.
[591] And one of the things that was really cool about Q learning, another algorithm for reinforcement learning is it was off policy, which meant that you could actually be learning about the environment and what the value of different actions would be while actually figuring out how to behave optimally.
[592] Yeah.
[593] So that was a revelation.
[594] Yeah, and the proof of that is kind of interesting.
[595] I mean, that was really surprising to me when I first read that, and then in Rich Sutton's book on the matter.
[596] It's kind of beautiful that a single equation can capture all of it.
[597] One line of code, and like you can learn anything.
[598] Yeah, like, so equation and code, you're right.
[599] Like, the code that you can arguably, at least if you, like, squint your eyes, say, this is all of intelligence, you can implement that in, I think I started with Lisp, which is, shout out to Lisp, like a single line of code, a key piece of code, maybe a couple, that you could do that.
[600] It's kind of magical.
[601] It feels too good to be true.
[602] Well, and it sort of is.
[603] Yeah.
[604] It seems they require an awful lot of extra stuff supporting it.
[605] But nonetheless, the idea is really good.
[606] And as far as we know, it is a very reasonable way of trying to create adaptive behavior, behavior that gets better at something over time.
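(For reference, a minimal sketch of the tabular Q-learning update being alluded to here, the single equation at the heart of it. The env.reset()/env.step() interface and the hyperparameter values are assumptions for illustration, not any particular library's API.)

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning: Q[s,a] += alpha * (r + gamma * max_a' Q[s',a'] - Q[s,a])."""
    Q = defaultdict(float)                      # Q[(state, action)] -> value estimate
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Behave epsilon-greedily (explore a little, exploit mostly)...
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # ...but learn about the greedy policy: that's the off-policy part.
            best_next = max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```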
[608] Did you find the idea of optimal at all compelling?
[609] You could prove that it's optimal.
[610] So, like, one part of computer science that makes people feel warm and fuzzy inside is when you can prove something, like that a sorting algorithm, worst case, runs in n log n, and it makes everybody feel so good.
[611] Even though in reality, it doesn't really matter what the worst case is.
[612] What matters is, like, does this thing actually work in practice on this particular actual set of data that I enjoy?
[614] Did you...
[615] So here's a place where I have maybe a strong opinion, which is like, you're right, of course, but no, no. So what makes worst case so great, right?
[616] What makes a worst-case analysis so great is that you get modularity.
[617] You can take that thing and plug it into another thing and still have some understanding of what's going to happen when you click them together, right?
[618] If it just works well in practice, in other words, with respect to some distribution that you care about, when you go plug it into another thing, that distribution can shift, it can change, and your thing may not work well anymore, and you want it to, and you wish it does, and you hope that it will, but it might not, and then, ah.
[619] So you're saying you don't like machine learning.
[620] But we have some positive theoretical results for these things.
You know, you can come back at me with, yeah, but they're really weak, and, yeah, they're really weak.
[622] And you can even say that, you know, sorting algorithms, like if you do the optimal sorting algorithm, it's not really the one that you want.
[623] And that might be true as well.
[624] But it is, the modularity is a really powerful statement.
[625] I really like that.
[626] As an engineer, you can then assemble different things.
[627] You can count on them to be.
[628] I mean, it's interesting.
[629] It's a balance, like with everything else in life.
[630] You don't want to get too obsessed.
I mean, this is what computer scientists do, which is they tend to, like, get obsessed and they over-optimize things, or they start by optimizing them, and then they over-optimize.
So it's easy to, like, get really granular about this thing, but the step from an n-squared to an n-log-n sorting algorithm is a big leap for most real-world systems, no matter what the actual behavior of the system is, that's a big leap.
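[To put rough numbers on that leap, an editorial aside: for a million items, an n-squared algorithm does on the order of \( n^2 = (10^6)^2 = 10^{12} \) operations, while an n-log-n algorithm does roughly \( n \log_2 n \approx 10^6 \times 20 = 2 \times 10^7 \), about a 50,000-fold difference, which dwarfs any constant factors.]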
[633] And the same can probably be said for other kind of, the first leaps that you would take on a particular problem.
Like, it's like picking the low-hanging fruit or whatever, the equivalent of doing the, not the dumbest thing, but the next to the dumbest thing.
[635] I see, picking the most delicious reachable fruit.
[636] Yeah, most delicious reachable fruit.
[637] I don't know why that's not a saying.
[638] Yeah.
Okay, so then, this is the 80s, and this kind of idea of learning starts to percolate.
[640] Yeah, well, at that point, I got to meet Rich Sutton.
[641] So everything was sort of downhill from there.
[642] And that was really the pinnacle of everything.
[643] But then I, you know, then I felt like I was kind of on the inside.
So then as interesting results were happening, I could, like, check in with Rich or with Jerry Tesauro, who had a huge impact on kind of early thinking in temporal difference learning and reinforcement learning, and showed that you could solve problems that we didn't know how to solve any other way.
[645] And so that was really cool.
[646] So it was good things were happening.
[647] I would hear about it from either the people who were doing it or the people who were talking to the people who were doing it.
[648] And so I was able to track things pretty well through the 90s.
So wasn't most of the excitement on reinforcement learning in the 90s era with, what is it, TD-Gammon?
[650] Like, what's the role of these kind of little, like, fun game playing things and breakthroughs about, you know, exciting the community?
What were your, because you've also built, or were part of building, a crossword puzzle solving program called Proverb, so you were interested in this as a problem, like in using games to understand how to build intelligent systems. So, like, what did you think about TD-Gammon? What did you think about that whole thing in the 90s?
[652] Yeah, I mean, I found the TD Gammon result really just remarkable.
[653] So I had known about some of Jerry's stuff before he did TD Gammon.
He did a system that was just more vanilla, well, not entirely vanilla, but a more classical backprop-y kind of network for playing backgammon, where he was training it on expert moves.
[655] So it was kind of supervised.
[656] But the way that it worked was not to mimic the actions, but to learn internally an evaluation function.
So to learn, well, if the expert chose this over this, that must mean that the expert values this more than this.
[659] And so let me adjust my weights to make it so that the network evaluates this as being better than this.
[660] So it could learn from human preferences.
[661] It could learn its own preferences.
And then when he took the step from that to actually doing it as a full-on reinforcement learning problem where you didn't need a trainer, you could just let it play, that was remarkable.
[663] Right.
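[A rough sketch, in Python, of the comparison-training idea described above: nudge an evaluation function so that the position the expert chose scores higher than the one passed over. A simple linear evaluator and a Bradley-Terry-style update stand in for Tesauro's actual network and training code; all names are editorial assumptions.]

```python
import numpy as np

# Comparison training sketch: given feature vectors for two candidate positions, adjust the
# evaluation weights so the position the expert chose is scored above the one passed over.
def preference_update(w, x_chosen, x_rejected, lr=0.01):
    score_diff = w @ x_chosen - w @ x_rejected
    p_correct = 1.0 / (1.0 + np.exp(-score_diff))       # probability the model already ranks them correctly
    grad = (1.0 - p_correct) * (x_chosen - x_rejected)  # gradient of the pairwise log-likelihood
    return w + lr * grad

# Usage sketch with made-up position features.
w = np.zeros(4)
w = preference_update(w, np.array([1.0, 0.0, 2.0, 0.5]), np.array([0.0, 1.0, 1.0, 0.5]))
```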
And so I think, as humans often do, as we've done in the recent past as well, people extrapolate.
[665] It's like, oh, well, if you can do that, which is obviously very hard, then obviously you could do all these other problems that we want to solve that we know are also really hard.
And it turned out very few of them ended up being practical, partly because I think neural nets, certainly at the time, were struggling to be consistent and reliable.
[667] And so training them in a reinforcement learning setting was a bit of a mess.
I had, I don't know, generation after generation of, like, master's students who wanted to do value function approximation, basically reinforcement learning with neural nets.
[669] And over and over and over again, we were failing.
We couldn't get the good results that Jerry Tesauro got.
[671] I now believe that Jerry is a neural net whisperer.
He has a particular ability to get neural networks to do things that other people would find impossible.
And it's not the technology, it's the technology and Jerry together.
Yeah.
Which I think speaks to the role of the human expert in the process of machine learning, right? It's so easy, we're so drawn to the idea that it's the technology that is where the power is coming from, that I think we lose sight of the fact that sometimes you need a really good, just, like, I mean, no one would think, hey, here's this great piece of software, here's, like, I don't know, GNU Emacs or whatever.
[674] And doesn't that prove that computers are super powerful and basically going to take over the world?
[675] It's like, no, Stalman is a hell of a hacker, right?
[676] So he was able to make the code do these amazing things.
[677] He couldn't have done it without the computer, but the computer couldn't have done it without him.
[678] And so I think people discount the role of people like Jerry who have just a particular set of skills.
On that topic, by the way, as a small side note, I tweeted "Emacs is greater than Vim" yesterday and deleted the tweet 10 minutes later when I realized it started a war.
[681] I was like, oh, I was just kidding.
I was just being, and I'm going to walk it back. And so people still feel passionately about that particular piece of...
[683] Yeah, I don't get that because Emacs is clearly so much better.
[684] I don't understand.
[685] But, you know, why do I say that?
[686] Because I, like, I spent a block of time in the 80s.
[687] making my fingers know the Emacs keys.
[688] And now, like, that's part of the thought process for me. Like, I need to express.
[689] And if you take that, if you take my Emacs key bindings away, I become, I can't express myself.
I'm the same way with the, I don't know if you know what it is, but it's a Kinesis keyboard, which is this butt-shaped keyboard.
[691] Yes, I've seen them.
[692] Yeah.
[693] And they're very, I don't know, sexy, elegant.
They're just beautiful.
[696] Yeah, they're gorgeous, way too expensive.
[697] But the problem with them, similar with Emacs, is once you learn to use it, it's harder to use other things.
[698] It's hard to use other things.
There's this absurd thing where I have, like, small, elegant, lightweight, beautiful little laptops, and I'm sitting there in a coffee shop with a giant Kinesis keyboard and a sexy little laptop.
It's absurd, but, you know, like, I used to feel bad about it, but at the same time, you just kind of have to, sometimes it's back to the Billy Joel thing.
[701] You just have to throw that Billy Joel record and throw Taylor Swift and Justin Bieber to the wind.
[702] See, but I like them now because, again, I have no musical taste.
[703] Like, now that I've heard Justin Bieber enough, I'm like, I really like his songs.
[704] And Taylor Swift, not only do I like her songs, but my daughter's convinced that she's a genius.
[705] And so now I basically am signed on to that.
So, yeah, that speaks to, back to the robustness of the human brain, the neuroplasticity, that you can just, like a mouse, or probably a dog, teach yourself to enjoy Taylor Swift.
[708] I'll try it out.
[709] I don't know.
I try, you know what, it has to do with just, like, acclimation, right?
[711] Just like you said, a couple of weeks.
[712] Yeah.
[713] That's an interesting experiment.
[714] I'll actually try that.
[715] Like, I'll listen to it.
[716] That wasn't the intent of the experiment, just like social media.
[717] It wasn't intended as an experiment to see what we can take as a society, but it turned out that way.
[718] I don't think I'll be the same person on the other side of the week listening to Taylor Swift, but let's try.
It's more that you compartmentalize.
[720] Don't be so worried.
[721] Like, I get that you can be worried, but don't be so worried because we compartmentalize really well.
[722] And so it won't bleed into other parts of your life.
[723] You won't start, I don't know, wearing red lipstick or whatever.
[724] Like, it's fine.
[725] It's fine.
[726] It's fine.
[727] But you know what?
[728] The thing you have to watch out for is you'll walk into a coffee shop once we can do that again.
[729] And recognize the song?
[730] And you'll be, no, you won't know that you're singing along until everybody in the coffee shop is looking at you.
[731] And then you're like, that wasn't me. Yeah, that's the, you know, people are afraid of AGI.
[732] I'm afraid of the Taylor, the Taylor Swift takeover.
Yeah, and, I mean, people should know that TD Gammon was, I guess, would you call it, do you like the terminology of self-play, by any chance?
[734] So, like, systems that learn by playing themselves, just, I don't know if it's the best word, but.
[735] So what's the problem with that term?
I don't know, it's kind of silly. It's like the Big Bang. Like, it's like talking to serious physicists, do you like the term Big Bang? When it was early, I feel like it's the early days of self-play. I don't know, maybe it was used previously, but I think it's been used by only a small group of people. And so, like, I think we're still deciding, is this ridiculously silly name a good name for the concept, potentially one of the most important concepts in artificial intelligence?
Okay, depends how broadly you apply the term. So I used the term in my 1996 Ph.D. dissertation.
[738] Oh, you, wow, the actual terms of...
[739] Yeah, because...
[740] Oh, I didn't know.