Lex Fridman Podcast XX
[0] The following is a conversation with Ben Goertzel, one of the most interesting minds in the artificial intelligence community.
[1] He's the founder of SingularityNET, designer of the OpenCog AI framework, formerly a director of research at the Machine Intelligence Research Institute, and chief scientist of Hanson Robotics, the company that created the Sophia robot.
[2] He has been a central figure in the AGI community for many years, including in his organizing of and contributing to the Conference on Artificial General Intelligence.
[3] The 2020 version of the conference is actually happening this week, Wednesday, Thursday, and Friday.
[4] It's virtual and free.
[5] I encourage you to check out the talks, including by Joscha Bach, from episode 101 of this podcast.
[6] Quick summary of the ads, two sponsors, the Jordan Harbinger Show and Masterclass.
[7] Please consider supporting this podcast by going to jordanharbinger.com/lex and signing up at masterclass.com/lex. Click the links, buy all the stuff.
[8] It's the best way to support this podcast and the journey I'm on in my research and startup.
[9] This is the Artificial Intelligence Podcast.
[10] If you enjoy it, subscribe on YouTube, review it with five stars on Apple Podcasts, support it on Patreon, or connect with me on Twitter at Lex Fridman, spelled without the E, just F-R-I-D-M-A-N.
[11] As usual, I'll do a few minutes of ads now and never any ads in the middle that can break the flow of the conversation.
[13] This episode is supported by the Jordan Harbinger Show.
[14] Go to jordanharbinger.com/lex.
[15] It's how he knows I sent you.
[16] On that page, there's links to subscribe to it on Apple Podcasts, Spotify, and everywhere else.
[17] I've been binging on his podcast.
[18] Jordan is great.
[19] He gets the best out of his guests, dives deep, calls them out when it's needed, and makes the whole thing fun to listen to.
[20] He's interviewed Kobe Bryant, Mark Cuban, Neil deGrasse Tyson, Garry Kasparov, and many more.
[21] This conversation with Kobe is a reminder of how much focus and hard work is required for greatness in sport, business, and life.
[22] I highly recommend the episode if you want to be inspired.
[23] Again, go to jordanharbinger.com/lex.
[24] It's how Jordan knows I sent you.
[25] This show is also sponsored by Masterclass.
[26] Sign up at masterclass.com/lex to get a discount and to support this podcast.
[28] When I first heard about Masterclass, I thought it was too good to be true.
[29] For 180 bucks a year, you get an all-access pass to watch courses from, to list some of my favorites:
[30] Chris Hadfield on space exploration, Neil deGrasse Tyson on scientific thinking and communication, Will Wright, creator of the greatest city-building game ever, SimCity, and The Sims, on game design, Carlos Santana on guitar, Garry Kasparov, the greatest chess player ever, on chess, Daniel Negreanu on poker, and many more.
[31] Chris Hadfield explaining how rockets work and the experience of being launched into space alone is worth the money.
[32] Once again, sign up on masterclass .com slash Lex to get a discount and to support this podcast.
[33] And now, here's my conversation with Ben Goertzel.
[34] What books, authors, ideas had a lot of impact on you in your life in the early days?
You know, what got me into AI and science fiction and such in the first place wasn't a book, but the original Star Trek TV show, which my dad watched with me, like, in its first run. It would have been 1968, '69 or something. And that was incredible, because every show they visited a different alien civilization, with different culture and weird mechanisms. But that got me into science fiction, and there wasn't that much science fiction to watch on TV at that stage.
[35] That got me into reading the whole literature of science fiction from the beginning of the previous century until that time.
[36] I mean, there were so many science fiction writers who were inspirational to me. I'd say, if I had to pick two, it would have been Stanislaw Lem, the Polish writer.
[37] Yeah, Solaris, and then he had a bunch of more obscure writings on superhuman AIs that were engineered.
[38] Solaris was sort of a superhuman, naturally occurring intelligence. Then Philip K. Dick, who, you know, ultimately my fandom for Philip K. Dick is one of the things that brought me together with David Hanson, my collaborator on robotics projects.
[39] So, you know, Stanislaw Lem was very much an intellectual, right?
[40] So he had a very broad view of intelligence going beyond the human and into what I would call, you know, open -ended superintelligence.
[41] The Solaris super -intelligent ocean was intelligent, in some ways more generally intelligent than people, but in a complex and confusing way so that human beings could never quite connect to it, but it was still probably very, very smart.
[42] And then the Golem XIV supercomputer in one of Lem's books, this was engineered by people.
[43] But eventually it became very intelligent in a different direction than humans and decided that humans were kind of trivial and not that interesting.
[44] So it put some impenetrable shield around itself, shut itself off from humanity, and then issued some philosophical screed about the pathetic and hopeless nature of humanity and all human thought, and then disappeared.
[46] Now, Philip K. Dick, he was a bit different.
[47] He was human -focused, right?
[48] His main thing was, you know, human compassion and the human heart and soul are going to be the constant that will keep us going through whatever aliens we discover or telepathy machines or super AIs or whatever it might be.
[49] So he didn't believe in reality; the reality that we see might be a simulation or a dream or something else we can't even comprehend. But he believed in love and compassion as something persistent through the various simulated realities.
[50] So those two science fiction writers had a huge impact on me. Then, a little older than that, I got into Dostoevsky and Friedrich Nietzsche and Rimbaud and a bunch of more literary-type writing.
[51] Can we talk about some of those things?
[52] So on the Solaris side, Stanislaw Lem, this kind of idea of there being intelligences out there that are different than our own.
[53] Do you think there are intelligences maybe all around us that we're not able to even detect?
[54] So this kind of idea of maybe you can comment also on Stephen Wolfram thinking that there's computations all around us and we're just not smart enough to kind of detect their intelligence or appreciate their intelligence.
[55] Yes, so my friend Hugo de Garis, who I've been talking to about these things for many decades, since the early 90s, he had an idea he called SIPI, the Search for Intra-Particulate Intelligence.
[56] So the concept there was, as AIs get smarter and smarter and smarter, you know, assuming the laws of physics as we know them now are still what these superintelligences perceive to hold and are bound by, as they get smarter and smarter, they're going to shrink themselves littler and littler, because special relativity makes it slow to communicate between two spatially distant points.
[57] So they're going to get smaller and smaller.
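To make the physics behind that shrinking argument concrete (a rough gloss; the numbers are only illustrative), a light-speed signal needs time

\[ t = \frac{d}{c}, \qquad d = 1\,\text{m} \;\Rightarrow\; t \approx \frac{1}{3 \times 10^{8}}\,\text{s} \approx 3.3\,\text{ns} \]

to cross a mind spread over a meter, so such a mind cannot globally coordinate its state much faster than a few hundred million times per second; packing the same computation into a smaller volume is the only way to think faster as a unified whole.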
[58] But then, ultimately, what does that mean?
[59] The minds of the super, super, super intelligences are going to be packed into the interaction of elementary particles, or quarks, or the partons inside quarks, or whatever it is.
[60] So what we perceive as random fluctuations on the quantum or sub-quantum level may actually be the thoughts of the micro, micro, micro-miniaturized superintelligences, because there's no way we can tell random from structured when the structure has algorithmic information more complex than our brains.
[61] Right, we can't tell the difference. So what we think is random could be the thought processes of some really tiny super minds. And if so, there's not a damn thing we can do about it, except, you know, try to upgrade our intelligences and expand our minds so that we can perceive more of what's around us.
But if those random fluctuations, like, even if we go to quantum mechanics, if that's actually superintelligent systems, aren't we then part of this soup of superintelligence? Aren't we just, like, a finger of the entirety of the body of the superintelligence system?
We could be. I mean, a finger is a strange metaphor. A finger is dumb, is what I mean. But a finger is also useful and is controlled with intent by the brain, whereas we may be much less than that, right? I mean, yeah, we may be just some random epiphenomenon that they don't care about too much. Like, think about the shape of the crowd emanating from a sports stadium or something, right? There's some emergent shape to the crowd. It's there. You can take a picture of it. It's kind of cool. It's irrelevant to the main point of the sports event, or where the people are going, or what's on the minds of the people making that shape in the crowd. So we may just be some semi-arbitrary higher-level pattern popping out of a lower-level hyper-intelligent self-organization.
[62] And I mean, so be it, right?
[63] I mean, that's one thing that's still a fun ride.
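One standard way to make that "can't tell random from structured" point precise, offered here only as a gloss on the argument, is Kolmogorov complexity, the length of the shortest program that produces a given string on a universal machine \(U\):

\[ K_U(x) = \min\{\, \lvert p \rvert : U(p) = x \,\}. \]

\(K_U\) is uncomputable in general, so a sequence generated by a process whose shortest description is far larger than any model we can actually hold would pass every statistical test we can run; for us it is operationally indistinguishable from noise.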
[64] Yeah, I mean, the older I've gotten, the more respect I've achieved for our fundamental ignorance.
[65] I mean, mine and everybody else's.
[66] I look at my two dogs, two beautiful little toy poodles, and, you know, they watch me sitting at the computer typing.
[67] They just think I'm sitting there wiggling my fingers to exercise, or maybe guarding the monitor on the desk.
[68] They have no idea that I'm communicating with other people halfway around the world, let alone creating complex algorithms running in RAM on some computer server in St. Petersburg or something, right?
[69] Although they're right there, they're right there in the room with me. So what things are there right around us that we're just too stupid or close -minded to comprehend?
[70] Probably quite a lot.
[71] Your very poodle could also be communicating across multiple dimensions with other beings, and you're too unintelligent to understand the kind of communication mechanism they're going through.
[73] There have been various TV shows and science fiction novels positing that cats, dolphins, mice, and whatnot are actually superintelligences here to observe us.
[74] I would guess, as one of the quantum physics founders said, those theories are not crazy enough to be true.
[75] The reality is probably crazier than that.
[76] Beautifully put.
[77] So on the human side with Philip K. Dick and in general, where do you fall on this idea that love and just the basic spirit of human nature persists throughout these multiple realities?
[78] Are you on the side?
[79] Like, the thing that inspires you about artificial intelligence, is it the human side of somehow persisting through all of the different systems we engineer, or does AI inspire you to create something that's greater than human, that's beyond human, that's almost non-human?
[80] I would say my motivation to create AGI comes from both of those directions, actually.
[81] So when I first became passionate about AGI when I was, it would have been two or three years old after watching robots on Star Trek.
[82] I mean, then it was really, a combination of intellectual curiosity, like can a machine really think?
[83] How would you do that?
[84] And, yeah, just ambition to create something much better than all the clearly limited and fundamentally defective humans I saw around me. Then as I got older and got more enmeshed in the human world and, you know, got married, had children, saw my parents begin to age, I started to realize, well, not only will AGI let you go far beyond the limitations of the human, but it could also stop us from dying and suffering and feeling pain and tormenting ourselves mentally.
[85] So you can see AGI has amazing capability to do good for humans, as humans, alongside with its capability to go far, far beyond the human level.
[86] So, I mean, both aspects are there, which makes it even more exciting and important.
[87] So you mentioned Dostoevsky and Nietzsche. What did you pick up from those guys?
[88] That would probably go beyond the scope of a brief interview, certainly.
[89] Both of those are amazing thinkers who one will necessarily have a complex relationship with, right?
[90] So, I mean, Dostoevsky, on the minus side, he's kind of a religious fanatic, and he sort of helped squash the Russian nihilist movement, which was very interesting.
[91] Because what nihilism meant originally, in that period of the mid-to-late 1800s in Russia, was not taking anything fully 100% for granted.
[92] It was really more like what we'd call Bayesianism now, where you don't want to adopt anything as a dogmatic certitude and always leave your mind open.
[93] And how Dostoevsky parodied nihilism was a bit different, right?
[94] He parodied it as people who believe absolutely nothing.
[95] So they must assign an equal probability weight to every proposition, which doesn't really work.
[96] So on the one hand, I didn't really agree with Dostoevsky on his sort of religious point of view.
[97] On the other hand, if you look at his understanding of human nature and sort of the human mind and heart and soul, it's really unparalleled.
[98] And he had an amazing view of how human beings, you know, construct a world for themselves based on their own understanding and their own mental predisposition.
[99] And I think if you look at The Brothers Karamazov in particular, the Russian literary theorist Mikhail Bakhtin wrote about this as a polyphonic mode of fiction, which means it's not third person, but it's not first person from any one person, really.
[100] There are many different characters in the novel, and each of them is sort of telling part of the story from their own point of view.
[101] So the reality of the whole story is an intersection, like synergetically, of the many different characters' worldviews.
[102] And that really, it's a beautiful metaphor and even a reflection, I think, of how all of us socially create our reality.
[103] Like each of us sees the world in a certain way.
[104] each of us, in a sense, is making the world, as we see it, based on our own minds and understanding.
[105] But it's polyphony, like in music where multiple instruments are coming together to create the sound.
[106] The ultimate reality that's created comes out of each of our subjective understandings, you know, intersecting with each other.
[107] And that was one of the many beautiful things in Dostoevsky.
[108] So maybe a little bit to mention, you have a connection to Russia and the Soviet culture.
[110] I mean, I'm not sure exactly what the nature of the connection is, but at least the spirit of your thinking is in there.
[111] Well, my ancestry is three-quarters Eastern European Jewish.
[112] So, I mean, three of my great-grandparents emigrated to New York from Lithuania and sort of the border regions of Poland, which were in and out of Poland, around the time of World War I. And they were socialists and communists as well as Jews, mostly Menshevik, not Bolshevik.
[113] And they sort of, they fled at just the right time to the U .S. for their own personal reasons.
[114] And then almost all or maybe all of my extended family that remained in Eastern Europe was killed either by Hitler's or Stalin's minions at some point.
[115] So the branch of the family that emigrated to the U.S. was pretty much the only one.
[116] So how much of the spirit of the people is in your blood still? Like, when you look in the mirror, do you see... what do you see?
Me? I see a bag of meat that I want to transcend by uploading into some sort of superior reality. But, I mean, yeah, very clearly, I'm not religious in a traditional sense, but clearly the Eastern European Jewish tradition was what I was raised in.
[117] I mean, there was my grandfather, Leo Zwell, who was a physical chemist who worked with Linus Pauling and a bunch of the other early greats in quantum mechanics.
[118] I mean, he was, he was into x -ray diffraction.
[119] He was on the material science side, experimentalist rather than a theorist.
[120] His sister was also a physicist.
[121] My father's father, Victor Goertzel, was a PhD in psychology who had the unenviable job of giving psychotherapy to the Japanese in internment camps in the U.S. in World War II, like, to counsel them why they shouldn't kill themselves, even though they'd had all their stuff taken away and been imprisoned for no good reason.
[122] So, I mean, there's a lot of Eastern European Jewishness in my background.
[123] One of my great-uncles, Mickey Salkin, was, I guess, conductor of the San Francisco Orchestra.
[125] So there's a bunch of music in there also.
[126] And clearly, this culture was all about learning and understanding the world and also not quite taking yourself too seriously while you do it.
[127] There's a lot of Yiddish humor in there.
[128] So I do appreciate that culture, although the whole idea that, like the Jews are the chosen people of God never resonated with me too much.
[129] The graph of the Goertzel family, I mean, just the people I've encountered just doing some research and just knowing your work through the decades, it's kind of fascinating.
[130] Just the number of PhDs.
[131] Yeah, yeah.
[132] I mean, my dad is a sociology professor who recently retired from Rutgers University.
[133] But clearly that gave me a head start in life.
[134] I mean, my grandfather gave me all these quantum mechanics books when I was like seven or eight years old.
[135] I remember going through them, and it was all the old quantum mechanics, like the Rutherford atom and stuff.
[136] So I got to the part of wave functions, which I didn't understand, although I was a very bright kid.
[137] And I realized he didn't quite understand it either, but at least, like, he pointed me to some professor he knew at UPenn nearby, who understood these things, right?
[138] So that's an unusual opportunity for a kid to have, right?
[140] My dad, he was programming Fortran when I was 10 or 11 years old, on, like, HP 3000 mainframes at Rutgers University.
[141] So I got to do linear regression in Fortran on punch cards when I was in middle school, right?
[142] Because he was doing, I guess, analysis of demographic and sociology data.
[143] So yes, certainly, certainly that gave me a head start and a push towards science beyond what would have been the case with many, many different situations.
[144] When did you first fall in love with AI?
[145] Is it the programming side of Fortran?
[146] Is it maybe the sociology, psychology that you picked up from your dad?
[147] I fell in love with AI when I was probably three years old when I saw a robot on Star Trek.
[148] It was turning around in a circle going, error, error, error, error, because Spock and Kirk had tricked it into a mechanical breakdown by presenting it with a logical paradox.
[149] And I was just like, well, this makes no sense.
[150] This AI is very, very smart.
[151] It's been traveling all around the universe, but these people could trick it with a simple logical paradox.
[152] Like, why, if, you know, if the human brain can get beyond that paradox, why can't, why can't this AI?
[153] So I felt the screenwriters of Star Trek had misunderstood the nature of intelligence.
[154] And I complained to my dad about it, and he wasn't going to say anything one way or the other.
[155] But, you know, before I was born, when my dad was at Antioch College in the middle of the U.S., he led a protest movement called SLAM, the Student League Against Mortality.
[156] They were protesting against death wandering across the campus.
[157] So he was into some futuristic things, even back then, but whether AI could confront logical paradoxes or not, he didn't know.
[158] But, you know, 10 years after that or something, I discovered Douglas Hofstadter's book, Gödel, Escher, Bach, and that was sort of to the same point of AI and paradox and logic, right?
[160] Because he goes over and over and over Gödel's incompleteness theorem.
[161] And can an AI really fully model itself reflexively, or does that lead you into some paradox?
[162] Can the human mind truly model itself reflexively, or does that lead you into some paradox?
[163] So I think that book, Gödel, Escher, Bach, which I think I read when it first came out, I would have been 12 years old or something.
[164] I remember it was like a 16-hour day.
[166] I read it cover to cover and then reread it.
[167] Oh, really?
[168] I reread it after that because there was a lot of weird things with little formal systems in there that were hard for me at the time.
[169] But that was the first book I read that gave me a feeling for AI as, like, a practical academic or engineering discipline that people were working in, because before I read Gödel, Escher, Bach, I was into AI from the point of view of a science fiction fan.
[170] And I had the idea, well, it may be a long time before we can achieve immortality in superhuman AGI.
[171] So I should figure out how to build a spacecraft traveling close to the speed of light, go far away, then come back to the Earth in a million years when technology is more advanced and we can build these things.
[172] Reading Gödel, Escher, Bach, while it didn't all ring true to me, a lot of it did, and I could see, like, there are smart people right now at various universities around me who are actually trying to work on building what I would now call AGI, although Hofstadter didn't call it that.
[173] So really, it was when I read that book, which would have been probably middle school, then I started to think, well, this is something that I could practically work on.
[174] Yeah, as opposed to flying away and waiting it out.
[175] You can actually be one of the people that actually builds the system.
[176] And if you think about it, I mean, I was interested in what we'd now call nanotechnology and in human immortality and time travel, all the same cool things as every other science-fiction-loving kid.
[178] But AI seemed like, if Hofstadter was right, you just figure out the right program, sit there and type it.
[179] Like, you don't need to spin stars into weird configurations or get government approval to cut people up and fiddle with their DNA or something, right?
[180] It's just programming.
[181] And then, of course, that can achieve anything else.
[182] There's another book from back then, which was by, what, Gerald Feinberg, who was a physicist at Princeton.
[183] And that was the Prometheus Project.
[184] And this book was written in the late 1960s, though I encountered it in the mid -70s.
[185] But what this book said is in the next few decades, humanity is going to create superhuman thinking machines, molecular nanotechnology, and human immortality.
[186] And then the challenge we'll have is what to do with it.
[187] Do we use it to expand human consciousness in a positive direction?
[188] Or do we use it just to further vapid consumerism?
[189] And what he proposed was that the UN should do a survey on this.
[190] And the UN should send people out to every little village in remotest Africa or South America and explain to everyone what technology was going to bring the next few decades and the choice that we had about how to use it and let everyone on the whole planet vote about whether we should develop, you know, super AI nanotechnology and immortality for expanded consciousness or for rampant consumerism.
[191] And needless to say, that didn't quite happen.
[192] And I think this guy died in the mid -80s, so he didn't even see his ideas start to become more mainstream.
[193] But it's interesting, many of the themes I'm engaged with now, from AGI and immortality, even to trying to democratize technology, as I've been pushing for with SingularityNET and my work in the blockchain world.
[194] Many of these themes were there in, you know, Feinberg's book in the late 60s even.
[195] And of course, Valentin Turchin, a Russian writer and a great Russian physicist who I got to know when we both lived in New York in the late 90s and early aughts.
[196] I mean, he had a book in the late 60s in Russia, which was The Phenomenon of Science, which laid out all these same things as well.
[198] And Val died in, I don't remember, 2004, 2005 or something, of Parkinsonism.
[199] So yeah, it's easy for people to lose track now of the fact that the futurist and singularitarian advanced technology ideas that are now almost mainstream and are on TV all the time.
[200] I mean, these are not that new, right?
[201] They're sort of new in the history of the human species.
[202] But I mean, these were all around in fairly mature form in the middle of the last century, written about quite articulately by fairly mainstream people who were professors at top universities.
[203] It's just until the enabling technologies got to a certain point, then you couldn't make it real.
[204] So even in the 70s, I was sort of seeing that and living through it, right?
[205] From Star Trek to Douglas Hofstadter, things were getting very, very practical from the late 60s to the late 70s.
[207] And, you know, the first computer I bought, you could only program with hexadecimal machine code, and you had to solder it together.
[208] And then, like a few years later, there's punch cards, and a few years later, you could get, like, Atari 400 and Commodore VIC -20, and you could type on the keyboard and program in higher -level languages alongside the assembly language.
[209] So these ideas have been building up a while, And I guess my generation got to feel them build up, which is different than people coming into the field now for whom these things have just been part of the ambiance of culture for their whole career, or even their whole life.
[210] Well, it's fascinating to think about, you know, there being all of these ideas kind of swimming, you know, almost like noise, all around the world, all the different generations.
[211] And then some kind of nonlinear thing happens where they percolate up and capture the imagination of the mainstream.
[213] And that seems to be what's happening with AI now.
[214] I mean, Nietzsche, who you mentioned, had the idea of the Superman, right?
[215] But he didn't understand enough about technology to think you could physically engineer a Superman by piecing together molecules in a certain way.
[216] He was a bit vague about how the Superman would appear, but he was quite deep at thinking about what the state of consciousness and the mode of cognition of a Superman would be.
[217] He was a very astute analyst of how the human mind constructs the illusion of a self, how it constructs the illusion of free will, how it constructs values like good and evil out of its own desire to maintain and advance its own organism.
[218] He understood a lot about how human minds work.
[219] Then he understood a lot about how post -human minds would work.
[220] I mean, this Superman was supposed to be a mind that would basically have complete root access to its own brain and consciousness and be able to architect its own value system and inspect and fine -tune all of its own biases.
[221] So that's a lot of powerful thinking there, which then fed in and sort of seeded all of postmodern continental philosophy, and all sorts of things that have been very valuable in the development of culture and indirectly even of technology.
[223] But, of course, without the technology there, it was all some quite abstract thinking.
[224] So now we're at a time in history when a lot of these ideas can be made real, which is amazing and scary, right?
[225] It's kind of interesting to think, what do you think Nietzsche would, if he was born a century later, or transported through time?
[226] What do you think he would say about AI?
[227] Well, those are quite different.
[228] If he's born a century later, or transported through time.
[229] Well, he'd be on, like, TikTok and Instagram, and he would never write the great works he's written.
[230] So let's transport him through time.
[231] Maybe Also Sprach Zarathustra would be a music video, right?
[232] I mean, who knows?
[233] Yeah, but if he was transported through time, do you think...
[234] Yeah, that'd be interesting, actually, to go back.
[235] You just made me realize that it's possible to go back and read Nietzsche with an eye of, is there some thinking about artificial beings?
[236] I'm sure he had inklings.
[237] I mean, with Frankenstein before him, I'm sure he had inklings of artificial beings somewhere in the text.
[238] It'd be interesting to try to read his work to see if the Superman was actually an AGI system, like, if he had inklings of that kind of thinking.
[239] He didn't.
[240] No, I would say not.
[241] I mean, he had a lot of inklings of modern cognitive science, which are very interesting.
[242] If you look in like the third part of the collection that's been titled, The Will to Power, I mean, in book three there, there's very deep analysis of thinking processes.
[243] But he wasn't so much of a physical tinkerer type guy, right?
[244] It was very abstract.
[245] And do you think, what do you think about the will to power?
[246] Do you think human, what do you think drives humans?
[247] Is it, is it?
[248] Oh, an unholy mix of things.
[249] I don't think there's one pure, simple, and elegant objective function driving humans by any means.
[250] What, do you think, if we look at, I know it's hard to look at humans in an aggregate, but do you think overall humans are good?
[251] Do we have both good and evil within us, depending on the circumstances, depending on whatever can percolate to the top?
[252] Good and evil are very ambiguous, complicated, and in some ways silly concepts.
[253] But if we could dig into your question from a couple directions.
[254] So I think if you look in evolution, humanity is shaped both by individual selection and what biologists would call group selection, like tribe level selection, right?
[255] So individual selection has driven us in a selfish DNA sort of way so that each of us does to a certain approximation what will help us propagate our DNA to future generations.
[256] I mean, that's why I've got to have four kids so far and probably that's not the last one.
[257] On the other hand...
[258] I like the ambition.
[259] Tribal, like, group selection means humans, in a way, will do what will advocate for the persistence of the DNA of their whole tribe or their social group. And in biology you have both of these, right? Like, you can see, say, an ant colony or a beehive, there's a lot of group selection in the evolution of those social animals. On the other hand, say, a big cat or some very solitary animal, it's a lot more biased toward individual selection. Humans are an interesting balance.
[260] And I think this reflects itself in what we would view as selfishness versus altruism to some extent.
[261] So we just have both of those objective functions contributing to the makeup of our brains.
[262] And then as Nietzsche analyzed in his own way and others have analyzed in different ways, I mean, we abstract this as well, we have both good and evil within us, right?
[263] Because a lot of what we view as evil is really just selfishness.
[264] A lot of what we view as good is altruism, which means doing what's good for the tribe.
[265] And on that level, we have both of those just baked into us, and that's how it is.
[266] Of course, there are psychopaths and sociopaths and people who get gratified by the suffering of others, and that's a different thing.
[267] Yeah, those are exceptions, but on the whole.
[268] I think at core, we're not purely selfish, we're not purely altruistic.
[269] We are a mix, and that's the nature of it.
[270] And we also have a complex constellation of values that are just very specific to our evolutionary history.
[272] Like, you know, we love waterways and mountains, and the ideal place to put a house is on a mountain overlooking the water, right?
[273] And, you know, we, we care a lot about our kids and we care a little less about our cousins and even less about our fifth cousins.
[274] I mean, there are many particularities to human values, which, whether they're good or evil, depends on your, on your perspective.
[275] Really, say, I spent a lot of time in Ethiopia, in Addis Ababa, where we have one of our AI development offices for my SingularityNET project.
[277] And when I walk through the streets in Addis, you know, there's people lying by the side of the road, like, just living there by the side of the road, dying probably of curable diseases, without enough food or medicine.
[278] And when I walk by them, you know, I feel terrible.
[279] I give them money.
[280] When I come back home to the developed world, they're not on my mind that much.
[281] I do donate some, but I mean, I also spend some of the limited money I have enjoying myself in frivolous ways rather than donating it to those people who are right now like starving, dying, and suffering on the roadside.
[282] So does that make me evil?
[283] I mean, it makes me somewhat selfish and somewhat altruistic.
[284] And we each balance that in our own way, right?
[285] So whether that will be true of all possible AGIs is a subtler question.
[286] That's how humans are.
[287] So you have a sense, you kind of mentioned, that there's a selfish... I'm not going to bring up the whole Ayn Rand idea of selfishness being the core virtue.
[288] That's a whole interesting kind of tangent that I think would just distract us.
[289] I have to make one amusing comment or comment that has amused me anyway.
[290] So I have extraordinarily negative respect for Ayn Rand.
[291] What's negative respect?
[292] But when I worked with a company called Genescient, which was evolving flies to have extraordinarily long lives, in Southern California...
[293] So we had flies that were evolved by artificial selection to have five times the lifespan of normal fruit flies.
[294] But the population of super-long-lived flies was physically sitting in a spare room at an Ayn Rand elementary school in Southern California.
[296] So that was just like, well, if I saw this in a movie, I wouldn't believe it.
[297] Well, yeah, the universe has a sense of humor in that kind of way.
[298] That fits in.
[299] Humor fits in somehow into this whole absurd existence.
[300] But you mentioned the balance between selfishness and altruism as kind of being innate.
[301] Do you think it's possible that's kind of an emergent phenomenon, those peculiarities of our value system? How much of it is innate, how much of it is something we collectively, kind of like a Dostoevsky novel, bring to life together as a civilization?
[303] I mean, the answer to nature versus nurture is usually both.
[304] And of course, it's nature versus nurture versus self -organization, as you mentioned.
[305] So clearly, there are evolutionary roots to individual and group selection leading to a mix of selfishness and altruism.
[306] On the other hand, different cultures manifest that in different ways.
[308] Well, we all have basically the same biology.
[309] And if you look at sort of pre-civilized cultures, you have tribes like the Yanomamo in Venezuela, whose culture is focused on killing other tribes.
[310] And you have other Stone Age tribes that are mostly peaceful and have big taboos against violence.
[311] So you can certainly have a big difference in how culture manifests these innate biological characteristics, but still, you know, there's probably limits that are given by our biology.
[312] I used to argue this with my great -grandparents who were Marxists, actually, because they believed in the withering away of the state.
[313] Like, they believed that, you know, as you move from capitalism to socialism to communism, people would just become more social-minded, so that a state would be unnecessary and everyone would just give everyone else what they needed.
[314] Now, setting aside that that's not what the various Marxist experiments on the planet seemed to be heading toward in practice.
[315] Just as a theoretical point, I was very dubious that human nature could go there.
[316] At that time, when my great-grandparents were alive, I was just, like, you know, a cynical teen. I thought humans are just jerks.
[317] The state is not going to wither away.
[318] If you don't have some structure, keeping people from screwing each other over, they're going to do it.
[319] So now I actually don't quite see things that way.
[320] I mean, I think my feeling now, subjectively, is the culture aspect is more significant than I thought it was when I was a teenager.
[321] And I think you could have a human society that was dialed dramatically further toward, you know, self -awareness, other awareness, compassion, and sharing than our current society.
[322] And of course, greater material abundance helps.
[323] But to some extent, material abundance is a subjective perception also because many Stone Age cultures perceive themselves as living in great material abundance.
[324] They had all the food and water they wanted.
[325] They lived in a beautiful place.
[326] They had sex lives.
[327] They had children.
[328] I mean, they had abundance without any factories, right?
[329] So I think humanity probably would be capable of a fundamentally more positive and joy-filled mode of social existence than what we have now.
[330] Clearly, Marx didn't quite have the right idea about how to get there.
[331] I mean, he missed a number of key aspects of human society and its evolution.
[332] And if we look at where we are in society now, how to get there is quite a different question, because there are very powerful forces pushing people in different directions than a positive, joyous, compassionate existence, right?
[334] So if we were to try to, you know... Elon Musk dreams of colonizing Mars at the moment, so maybe we'll have a chance to start a new civilization with a new governmental system, and certainly there's quite a bit of chaos.
[335] We're sitting now, I don't know what the date is, but this is June.
[336] There's quite a bit of chaos and all different forms going on in the United States and all over the world.
[337] So there's a hunger for new types of governments, new types of leadership, new types of systems.
[338] And so what are the forces at play?
[339] And how do we move forward?
[340] Yeah, I mean, colonizing Mars, first of all, it's a super cool thing to do.
[341] We should be doing it.
[342] So you love the idea.
[343] Yeah.
[344] I mean, it's more important.
[345] It's more important than making chocolatier chocolates and sexier lingerie and many of the things that we spend a lot more resources on as a species, right?
[346] So, I mean, we certainly should do it.
[347] I think the possible futures in which a Mars colony makes a critical difference for humanity are very few.
[348] I mean, I think, I mean, assuming we make a Mars colony and people go live there in a couple decades.
[349] I mean, their supplies are going to come from Earth.
[350] The money to make the colony came from Earth, and whatever powers are supplying the goods there from Earth are going to, in effect, be in control of that Mars colony.
[351] Of course, there are outlier situations where, you know, Earth gets nuked into oblivion, and somehow Mars has been made self -sustaining by that point.
[352] And then Mars is what allows humanity to persist.
[353] But I think that those are very, very, very unlikely.
[354] But you don't think it could be a first step on a long journey?
[355] Of course, it's a first step on a long journey, which is awesome.
[356] I'm guessing the colonization of the rest of the physical universe will probably be done by AGIs that are better designed to live in space than by the meat machines that we are.
[358] But I mean, who knows, we may cryopreserve ourselves in some superior way to what we know now and, like, shoot ourselves out to Alpha Centauri and beyond.
[359] I mean, that's all cool.
[360] It's very interesting, and it's much more valuable than most things that humanity is spending its resources on.
[361] On the other hand, with AGI, we can get to a singularity before the Mars colony becomes self-sustaining, for sure, possibly before it's even operational.
[362] So your intuition is that, if we really invest resources, we can get to AGI faster than a legitimate, full, like, self-sustaining colonization of Mars.
[363] Yeah, and to me it's very clear that we will, because there's so much economic value in getting from narrow AI toward AGI, whereas with the Mars colony, there's less economic value until you get quite far out into the future.
[364] So I think that's very interesting.
[365] I just think it's somewhat, somewhat off to the side.
[366] I mean, just as I think say, you know, art and music are very, very interesting.
[367] And I want to see resources go into amazing art and music being created.
[368] And I'd rather see that than a lot of the garbage that society spends their money on.
[369] On the other hand, I don't think Mars colonization, or inventing amazing new genres of music, is one of the things most likely to make a critical difference in the evolution of human or non-human life in this part of the universe over the next decade.
[370] Do you think AGI is really...
[371] AGI is by far the most important thing that's on the horizon, and then technologies that have a direct ability to enable AGI or to accelerate AGI are also very important.
[372] For example, say, quantum computing.
[373] I don't think that's critical to achieve AGI, but certainly you could see how the right quantum computing architecture could massively accelerate AGI, similar other types of nanotechnology.
[374] Right now, the quest to cure aging and end disease, while not in the big picture as important as AGI, of course, it's important to all of us as individual humans.
[375] And if someone made a super longevity pill and distributed it tomorrow, I mean, that would be huge and a much larger impact than a Mars colony is going to have for quite some time.
[376] But perhaps not as much as an AGI system.
[377] No, because if you can make a benevolent AGI, then all the other problems are solved.
[378] I mean, then the AGI can be, once it's as generally intelligent as humans, it can rapidly become massively more generally intelligent than humans.
[379] And then that AGI should be able to solve science and engineering problems much better than human beings, as long as it is, in fact, motivated to do so.
[380] That's why I said a benevolent AGI.
[381] There could be other kinds.
[382] Maybe it's good to step back a little bit.
[383] I mean, we've been using the term AGI.
[384] People often cite you as the creator, or at least the popularizer of the term AGI, artificial general intelligence.
[385] Can you tell the origin story of the term?
[386] For sure.
[387] So, yeah, I would say I launched the term AGI upon the world for what it's worth, without ever fully being in love with the term.
[388] Right.
[389] What happened is I was editing a book, and this process started around 2001 or 2002.
[391] I think the book came out in 2005 only.
[392] I was editing a book which I provisionally was titling real AI.
[393] And I mean, the goal was to gather together fairly serious academic -ish papers on the topic of making thinking machines that could really think in the sense like people can or even more broadly than people can.
[394] So then I was reaching out to other folks that I had encountered here or there who were interested in that, which included some other folks who I knew from the transhumanist and singularitarian world, like Peter Voss, who has a company, AGI Incorporated, still in California, and included Shane Legg, who had worked for me at my company WebMind in New York in the late 90s, who by now has become rich and famous.
[395] He was one of the co-founders of Google DeepMind.
[396] But at that time, Shane was, I think he may have just started doing his PhD with Marcus Hutter, who at that time hadn't yet published his book, Universal AI, which sort of gives a mathematical foundation for artificial general intelligence.
[397] So I reached out to Shane and Marcus and Peter Voss and Pei Wang, who was another former employee of mine and who had been Douglas Hofstadter's PhD student, who had his own approach to AGI, and a bunch of others, some Russian folks. I reached out to these guys, and they contributed papers for the book.
[398] But that was my provisional title, but I never loved it, because in the end, you know, I was doing some what we would now call narrow AI as well, like applying machine learning to genomics data or chat data for sentiment analysis.
[399] I mean, that work is real, and in a sense, in a sense, it's really AI.
[400] It's just a different kind of AI.
[401] Ray Kurzweil wrote about narrow AI versus strong AI.
[402] But that seemed weird to me because, first of all, narrow and strong are not antonyms.
[403] But secondly, strong AI was used in the cognitive science literature to mean the hypothesis that digital computer AIs could have true consciousness like human beings.
[404] So there was already a meaning to strong AI, which was complexly different but related, right?
[405] So we were tossing around on an email list what title it should be.
[406] And so we talked about narrow AI, broad AI, wide AI, general AI.
[407] And I think it was either Shane Legg or Peter Voss, on the private email discussion we had, who said, well, why don't we go with AGI, artificial general intelligence?
[408] And Pei Wang wanted to do GAI general artificial intelligence, because in Chinese, it goes in that order.
[409] But we figured gay wouldn't work in U .S. culture at that time, right?
[410] Yeah.
[411] Yeah.
[412] So we went with the AGI.
[413] We used it for the title of that book.
[414] And part of Peter and Shane's reasoning was, you have the G factor in psychology, which is IQ, general intelligence, right?
[416] So you have a meaning of GI, general intelligence, in psychology, so then you're looking at, like, artificial GI.
[417] So then, that makes a lot of sense.
[418] Yeah, we used that for the title of the book.
[419] And so I think maybe both Shane and Peter think they invented the term.
[420] But then later, after the book was published, this guy, Mark Gubrud, came up to me, and he's like, well, I published an essay with the term AGI in, like, 1997 or something.
[421] And so I'm just waiting for some Russian to come out and say they published that in 1953.
[422] I mean, that term is not dramatically innovative or anything.
[423] It's one of these obvious-in-hindsight things, which is also annoying in a way, because, you know, Joscha Bach, who you interviewed, is a close friend of mine.
[424] He likes the term synthetic intelligence, which I like much better, but it hasn't actually caught on, right?
[426] Because, I mean, artificial is a bit off to me because artifice is like a tool or something, but not all AGIs are going to be tools.
[427] I mean, they may be now, but we're aiming toward making them agents rather than tools.
[428] And in a way, I don't like the distinction between artificial and natural, because, I mean, we're part of nature also, and machines are part of nature.
[429] I mean, you can look at evolved versus engineered, but that's a different distinction.
[430] Then it should be engineered general intelligence, right?
[431] And then general, well, if you look at Marcus Hutter's book, Universal AI, what he argues there is, you know, within the domain of computation theory, which is limited but interesting.
[432] So if you assume computable environments and computable reward functions, then he articulates what would be a truly general intelligence, a system called AIXI, which is quite beautiful.
[433] AIXI.
[434] AIXI, and that's the middle name of my latest child, actually.
[435] What's the first name?
[436] First name is Qorxi, Q-O-R-X-I, which my wife came up with, but that's an acronym for quantum organized rational expanding intelligence.
[437] AIXI is the middle name?
[438] His middle name is Xiphanes, actually, which means the formal principle underlying AIXI.
[439] You're giving Elon Musk's new child a run for its money.
[440] Well, I did it first.
[441] He copied me with this new freakish name.
[442] But now if I have another baby, I'm going to have to outdo him.
[443] It's becoming an arms race of weird geeky baby names.
[444] We'll see what the babies think about it, right?
[445] Yeah.
[446] But, I mean, my oldest son, Zarathustra, loves his name.
[447] And my daughter, Scheherazade, loves her name.
[448] So far, basically, if you give your kids weird names...
[449] They live up to it.
[450] Well, you're obliged to make the kids weird enough that they like the names, right?
[451] It directs their upbringing in a certain way.
[452] But, yeah, anyway, I mean, what Marcus showed in that book is that a truly general intelligence theoretically is possible, but would take infinite computing power.
[453] So then the artificial is a little off.
[454] The general is not really achievable within physics, as we know it.
[455] I mean, physics as we know it may be limited, but that's what we have to work with now.
[457] Infinitely general, you mean?
[458] Like, from an information processing perspective, yeah.
[459] Yeah.
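For reference, the agent Hutter defines there, AIXI, chooses each action by a Solomonoff-weighted expectimax over all programs consistent with its history; roughly, with notation simplified, \(m\) the planning horizon, \(U\) a universal Turing machine, and \(\ell(q)\) the length of program \(q\):

\[ a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big( r_k + \cdots + r_m \big) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}. \]

The inner sum ranges over every program that reproduces the observed history, which is the incomputable part and the reason a truly general intelligence in this sense needs unbounded computing power.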
[460] Intelligence is not very well defined either.
[461] I mean, what does it mean?
[462] In AI now, it's fashionable to look at it as maximizing an expected reward over the future.
[463] But that sort of definition is pathological in various ways.
[464] And my friend David Weinbaum, aka Weaver, he had a beautiful PhD thesis on open -ended intelligence, trying to conceive intelligence in a...
[465] Without a reward.
[466] Yeah, he's just looking at it differently, looking at complex self-organizing systems, and looking at an intelligent system as being one that, you know, revises and grows and improves itself in conjunction with its environment, without necessarily there being one objective function it's trying to maximize.
[468] Although over certain intervals of time, it may act as if it's optimizing a certain objective function.
[469] Very much Solaris from Stanislaw Lem's novels, right?
[470] So, yeah, the point is artificial, general, and intelligence, they're all bad.
[471] On the other hand, everyone knows what AI is and AGI seems immediately comprehensible to people with a technical background.
[472] So I think that the term has served a sociological function.
[473] Now it's out there everywhere; it's stuck.
[474] It baffles me. It's like KFC.
[475] I mean, that's it.
[476] We're stuck with AGI probably for a very long time until AGI systems take over and rename themselves.
[477] Yeah.
[478] And then we'll be biological.
[479] We're stuck with GPUs too, which mostly have nothing to do with graphics anymore, right?
[480] I wonder what the AGI system will call us humans.
[481] That was maybe...
[482] Grandpa.
[483] GPs?
[484] Grandpa processing unit.
[485] Biological grandpa processing units.
[486] Okay, so maybe also just a comment on AGI representing, before even the term existed, representing a kind of community.
[487] You've talked about this in the past, sort of AI coming in waves, but there has been this community of people who dream about creating general human-level, superintelligent systems.
[488] Can you maybe give your sense of the history of this community as it exists today, as it existed before this deep learning revolution, all throughout the winters and the summers of AI?
[489] Sure.
[490] First, I would say, as a side point, the winters and summers of AI are greatly exaggerated by Americans.
[491] Yeah.
[492] And if you look at the publication record of the artificial intelligence community since, say, the 1950s, you would find a pretty steady growth and advance of ideas and papers.
[493] And what's thought of as an AI winter or summer was sort of how much money is the U .S. military pumping into AI, which was meaningful.
[494] On the other hand, there was AI going on in Germany, the UK, Japan, and Russia, all over the place, while the U.S. military got more and less enthused about AI.
[496] So, I mean...
[497] That happened to be, just for people who don't know, the U .S. military happened to be the main source of funding for AI research.
[498] So another way to phrase that is it's up and down of funding for artificial intelligence research.
[499] And I would say the correlation between funding and intellectual advance was not 100%, right?
[500] because, I mean, in Russia, as an example, or in Germany, there was less dollar funding than in the U .S., but many foundational ideas were laid out, but it was more theory than implementation, right?
[501] And U .S. really excelled at sort of breaking through from theoretical papers to working implementations, which did go up and down somewhat with U .S. military funding.
[502] But still, I mean, you can look in the 1980s: Ernst Dickmanns in Germany had self-driving cars on the Autobahn, right?
[503] And I mean, it was a little early with regard to the car industry, so it didn't catch on the way it has now.
[504] But I mean, that whole advancement of self-driving car technology in Germany was pretty much independent of AI military summers and winters in the U.S. So there's been more going on in AI globally than not only most people on the planet realize, but than most new AI PhDs realize, because they've come up within a certain subfield of AI and haven't had to look so much beyond that.
[505] But I would say when I got my PhD in 1989 in mathematics, I was interested in AI already.
[506] In Philadelphia.
[507] Yeah, I started at NYU.
[508] Then I transferred to Philadelphia to Temple University.
[509] Good old North Philly.
[510] North Philly.
[511] Yeah, yeah, yeah.
[512] The pearl of the US.
[513] You never stopped at a red light then, because you were afraid if you stopped at a red light, someone would carjack you.
[514] So you sped through every red light.
[515] Every day driving or bicycling to temple from my house was like a new adventure.
[516] But yeah, the reason I didn't do a PhD in AI was that what people were doing in the academic AI field then was just astoundingly boring and seemed wrong-headed to me. It was really like rule-based expert systems and production systems.
[518] And actually, I loved mathematical logic.
[519] I had nothing against logic as the cognitive engine for an AI.
[520] But the idea that you could type in the knowledge that AI would need to think seemed just completely stupid and wrong -headed to me. I mean, you can use logic if you want, but somehow the system has got to be learning, right?
[521] It should be learning from experience.
[522] And the AI field then was not interested in learning from experience.
[523] I mean, some researchers certainly were.
[524] I mean, I remember in the mid-80s, I discovered a book by John Andreae, which was about a reinforcement learning system called PURR-PUSS, which was an acronym that I can't even remember what it was for. Purpose, anyway.
[525] But, I mean, that was a system that was supposed to be an AGI,
[526] and basically, by some sort of fancy, like, Markov decision process learning, it was supposed to learn everything just from the bits coming into it, and learn to maximize its reward and become intelligent, right?
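For a sense of what that kind of learning looks like in its simplest modern form, here is a minimal, purely illustrative sketch, not Andreae's actual system: tabular Q-learning on a tiny made-up Markov decision process, where everything the agent knows arrives as states, actions, and rewards, and the policy improves purely from that stream.

```java
import java.util.Random;

// Illustrative tabular Q-learning on a toy 5-state chain MDP.
// Actions: 0 = left, 1 = right; reward 1.0 for reaching the rightmost state.
public class ChainQLearning {
    public static void main(String[] args) {
        int nStates = 5, nActions = 2;
        double[][] q = new double[nStates][nActions];    // Q-value table
        double alpha = 0.1, gamma = 0.9, epsilon = 0.1;  // learning rate, discount, exploration
        Random rng = new Random(0);

        for (int episode = 0; episode < 500; episode++) {
            int s = 0;                                   // start at the leftmost state
            while (s != nStates - 1) {
                // Epsilon-greedy action selection.
                int a = (rng.nextDouble() < epsilon)
                        ? rng.nextInt(nActions)
                        : (q[s][1] >= q[s][0] ? 1 : 0);
                int next = (a == 1) ? Math.min(s + 1, nStates - 1)
                                    : Math.max(s - 1, 0);
                double reward = (next == nStates - 1) ? 1.0 : 0.0;
                // Q-learning update: move toward reward plus discounted best future value.
                double target = reward + gamma * Math.max(q[next][0], q[next][1]);
                q[s][a] += alpha * (target - q[s][a]);
                s = next;
            }
        }
        System.out.println("Learned Q(start, right) = " + q[0][1]);
    }
}
```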
[527] So that was there in academia back then, but it was like isolated, scattered, weird people.
[528] But all these isolated, scattered weird people in that period, I mean, they laid the intellectual grounds for what happened later.
[529] So you look at John Andreae at the University of Canterbury, with his PURR-PUSS reinforcement learning Markov system; he was the PhD supervisor for John Cleary in New Zealand.
[530] Now, John Cleary worked with me when I was at Waikato University in 1993 in New Zealand, and he worked with Ian Witten there, and they launched Weka, which was the first open-source machine learning toolkit, which was launched in, I guess, '93 or '94,
[531] when I was at Waikato University.
[532] Written in Java, unfortunately.
[533] Written in Java, which was a cool language back then, though, right?
[534] I guess it's still, well, it's not cool anymore, but it's powerful.
[535] Like most programmers now, I find Java unnecessarily bloated.
[536] But back then it was like Java, or C++, basically.
[537] And Java was object-oriented, so it was nice.
[538] Java was easier for students.
[539] Amusingly, a lot of the work on Weka, when we were in New Zealand, was funded by a U.S. — sorry, a New Zealand — government grant to use machine learning to predict the menstrual cycles of cows.
[540] So in the U .S., all the grant funding for AI was about how to kill people or spy on people.
[541] In New Zealand, it's all about cows or kiwi fruits, right?
[542] Yeah.
[543] So, yeah, anyway, I mean, John Andreae had his probability-theory-based reinforcement learning proto-AGI.
[544] John Cleary was trying to do much more ambitious, probabilistic AGI systems.
[545] Now, John Cleary helped do Weka, which was the first open-source machine learning toolkit — sort of the predecessor for TensorFlow and Torch and all these things.
[546] Also, Shane Legg was at Waikato working with John Cleary and Ian Witten and this whole group, and then working with my own company, WebMind, an AI company I had in the late 90s, with a team there at Waikato University — which is how Shane got his head full of AGI, which led him to go on and, with Demis Hassabis, found DeepMind.
[547] So what you can see through that lineage is, you know, in the 80s and 70s, John Andreae was trying to build probabilistic reinforcement learning AGI systems.
[548] The technology, the computers just weren't there to support it.
[549] His ideas were very similar to what people are doing now.
[551] But, you know, although he's long since passed away and didn't become that famous outside of Canterbury, I mean, the lineage of ideas passed on from him to his students to their students.
[552] You can go trace directly from there to me and to deep mind, right?
[553] So there was a lot going on in AGI that did ultimately lay the groundwork for what we have today, but there wasn't a community, right?
[554] And so when I started trying to pull together an AGI community, it was in, I guess, the early aughts, when I was living in Washington, D.C. and making a living doing AI consulting for various U.S. government agencies.
[555] And I organized the first AGI workshop in 2006.
[556] And, I mean, it wasn't like it was literally in my basement or something.
[557] I mean, it was in the conference room at the Marriott.
[558] in Bethesda.
[559] It's not that edgy or underground, unfortunately, but still...
[560] How many people attended?
[561] There were 60 or something.
[562] That's not bad.
[563] I mean, D .C. has a lot of AI going on.
[564] Probably until the last five or ten years, much more than Silicon Valley, although it's just quiet because of the nature of what happens in D .C. Their business isn't driven by PR.
[565] Mostly, when something starts to work really well, it's taken black and becomes even more quiet, right?
[566] But, yeah, the thing is, that really had the feeling of a group of starry-eyed mavericks, like, huddled in a basement, plotting how to overthrow the narrow AI establishment.
[567] And, you know, for the first time, in some cases, coming together with others who shared their passion for AGI and the technical seriousness about working on it, right?
[568] And that, I mean, that's very, very different than what we have today.
[569] I mean, now it's a little bit different.
[570] We have the AGI conference every year, and there are several hundred people rather than 50.
[571] Now it's more like this is the main gathering of people who want to achieve AGI and think that large -scale nonlinear regression is not the golden path to AGI.
[572] Yeah, okay — meaning neural networks.
[573] Yeah, yeah, yeah.
[574] Well, certain architectures for learning using neural networks.
[575] So, yeah, the AGI conferences are sort of now the main concentration of people not obsessed with deep neural nets and deep reinforcement learning, but still interested in AGI.
[576] Not the only ones.
[577] I mean, there's other little conferences and groupings interested in human -level AI and cognitive architectures and so forth.
[578] But yeah, it's been a big shift.
[579] Like, back then, it would have been very, very edgy to give a university department seminar that mentioned AGI or human-level AI.
[580] It was more like you had to talk about something more short-term and immediately practical; then, you know, in the bar after the seminar, you could bullshit about AGI in the same breath as time travel or the simulation hypothesis or something, right?
[581] Whereas now AGI is not only in the academic seminar room, like you have Vladimir Putin knows what AGI is.
[582] And he's like, Russia needs to become the leader in AGI, right?
[583] So national leaders and CEOs of large corporations — I mean, the CTO of Intel, Justin Rattner, this was years ago, at a Singularity Summit conference, 2008 or something —
[584] He's like, we believe Ray Kurzweil, the Singularity will happen in 2045, and it will have Intel inside.
[585] I mean, so it's gone from being something, which is the pursuit of like crazed mavericks, crackpots, and science fiction fanatics, to being, you know, a marketing term for large corporations and the national leaders, which is an astounding transition.
[586] But yeah, in the course of this transition, I think a bunch of sub -communities have formed.
[587] and the community around the AGI conference series is certainly one of them.
[588] It hasn't grown as big as I might have liked it to.
[589] On the other hand, you know, sometimes a modest -sized community can be better for making intellectual progress also.
[590] At a Society for Neuroscience conference, you have 35 or 40,000 neuroscientists.
[592] On the one hand, it's amazing.
[593] On the other hand, you're not going to talk to the leaders of the field.
[594] if you're an outsider.
[595] Yeah, in the same sense, the AAAI — the main kind of generic artificial intelligence conference — is too big.
[596] It's too amorphous.
[597] Like, it doesn't make, it doesn't.
[598] Well, yeah, and NIPS has become a company advertising outlet in the holidays.
[599] So, yeah, so, I mean, to comment on the role of AGI in the research community: if you look at NeurIPS, if you look at CVPR, if you look at ICLR, you know, AGI is still seen as the outcast.
[600] I would say, in these main machine learning, these main artificial intelligence conferences, amongst the researchers, I don't know if it's an accepted term yet.
[601] What I've seen, bravely — you mentioned Shane Legg — is that DeepMind and then OpenAI are the two places
[602] that are, I would say, unapologetically — so far; I think it's actually changing, unfortunately — but so far they've been pushing the idea that the goal is to create an AGI.
[603] Well, they have billions of dollars behind them.
[604] So, I mean, in the public mind, that certainly carries some oomph, right?
[605] I mean, but they also have really strong researchers, right?
[606] They do.
[607] They're great teams.
[608] I mean, Deep Mind in particular, yeah.
[609] And they have, I mean, Deep Mind has Marcus Hutter walking around.
[610] I mean, there's all these.
[611] folks who basically their full -time position involves dreaming about creating AGI.
[612] I mean, Google Brain has a lot of amazing AGI -oriented people also.
[613] And, I mean, so I'd say from a public marketing view, DeepMind and OpenAI are the two large, well-funded organizations that have put the term and concept AGI out there sort of as part of their public image, but, I mean, they're certainly not the only ones.
[614] There are other groups doing research that seems just as AGI-ish to me, including a bunch of groups in Google's main Mountain View office.
[615] So yeah, it's true.
[616] AGI is somewhat away from the mainstream now, but if you compare it to where it was, you know, 15 years ago, there's been an amazing mainstreaming.
[617] You could say the same thing about super longevity research, which is one of my application areas that I'm excited about.
[618] I've been talking about this since the 90s, but working on this since 2001.
[619] And back then, really, to say you're trying to create therapies to allow people to live hundreds or thousands of years, you were way, way, way, way out of the industry academic mainstream.
[620] But now, you know, Google has had projects like Calico, Craig Venter had Human Longevity Incorporated — and once the suits come marching in, right?
[621] Once there's big money in it, then people are forced to take it seriously because that's the way modern society works.
[622] So it's still not as mainstream as cancer research.
[623] Just as AGI is not as mainstream as automated driving or something, but the degree of mainstreaming that's happened in the last 10 to 15 years is astounding.
[624] to those of us who've been at it for a while.
[625] Yeah, but there's a marketing aspect to the term. In terms of the actual full-force research that's going on under the header of AGI, it's currently, I would say — maybe you can disagree — dominated by neural network research, the nonlinear regression, as you mentioned.
[626] Like, what's your sense? With OpenCog, with your work in general — logic-based systems and expert systems,
[627] for me, always seemed to capture a deep element of intelligence that needs to be there.
[628] Like you said, it needs to learn, it needs to be automated somehow, but that seems to be missing from a lot of research currently.
[629] So what's your sense?
[630] I guess one way to ask this question is: what's your sense of what kind of things an AGI system will need to have?
[631] Yeah, that's a very interesting topic that I thought about for a long time.
[632] And I think there are many, many different approaches that can work for getting to human -level AI.
[633] So I don't think there's, like, one golden algorithm, or one golden design, that can work.
[634] And I mean, flying machines, is the much -worn analogy here, right?
[635] Like, I mean, you have airplanes, you have helicopters, you have balloons, you have stealth bombers that don't look like regular airplanes.
[636] You've got blimps.
[637] Birds, too.
[638] Birds, yeah, and bugs, right?
[639] Yeah.
[640] And I mean, and there are certainly many kinds of flying machines.
[641] And there's a catapult that you can just launch.
[642] There's bicycle -powered, like, flying machines, right?
[643] Nice, yeah.
[644] Yeah, so now, these are all analyzable by a basic theory of aerodynamics, right?
[645] Now, so one issue with AGI is we don't yet have the analog of the theory of aerodynamics, and that's what Marcus Hutter was trying to make with AIXI and his general theory of general intelligence. But that theory, in its most clearly articulated parts, really only works for either infinitely powerful machines or, almost, insanely impractically powerful machines.
[646] So, I mean, if you were going to take a theory-based approach to AGI, what you would do is say, well, let's take what's called, say, AIXI-tl, which is Hutter's AIXI machine that can work on merely insanely much processing power rather than infinitely much processing power.
[647] What does TL stand for?
[648] Time and length.
[649] Okay.
[650] So you're basically constraining it in time and length.
[651] Yeah, yeah, yeah.
[652] So how AIXI works, basically, is: for each action that it wants to take, before taking that action, it looks at all its history.
[653] And then it looks at all possible programs that it could use to make a decision.
[654] And it decides, like, which decision program would have let it make the best decisions according to its reward function over its history.
[655] And it uses that decision program to make the next decision, right?
[656] It's not afraid of infinite resources.
[657] It's searching through the space of all possible computer programs in between each action and each next action.
[658] Now, AIXI-tl searches through all possible computer programs that have runtime less than T and length less than L, which is still an impracticably humongous space, right?
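To make the decision rule concrete, here is a toy-scale sketch of the AIXI-tl idea as described: before each action, search a space of programs bounded in length and runtime, score each by the reward it would have earned over the agent's history, and let the best one choose the next action. The "program space" here (lookup tables over short observation windows) and the environment are hypothetical simplifications so that the code actually runs; real AIXI-tl enumerates arbitrary programs and is wildly infeasible.

```python
import itertools

# Toy-scale sketch of the AIXI-tl style decision rule: before each action,
# search all "programs" of bounded length, replay each one against the
# agent's history with a step budget, and let the historically best program
# choose the next action. The program space (lookup tables over short
# observation windows) and the environment are hypothetical simplifications.

ACTIONS = (0, 1)
L = 3          # length bound: a "program" maps L-bit observation windows to actions
T = 1000       # time bound: maximum steps a program may take when replayed on history

def enumerate_programs(length):
    """Every mapping from length-bit windows to an action (a tiny program space)."""
    keys = list(itertools.product((0, 1), repeat=length))
    for outputs in itertools.product(ACTIONS, repeat=len(keys)):
        yield dict(zip(keys, outputs))

def replay_reward(program, history, reward_fn, budget):
    """Score a program by the reward it would have earned over the history."""
    total, steps = 0.0, 0
    for t in range(len(history) - L):
        if steps >= budget:
            break
        window = tuple(history[t:t + L])
        total += reward_fn(program[window], t + L)
        steps += 1
    return total

def aixi_tl_act(history, reward_fn):
    """Pick the next action using the program that did best on the past."""
    best = max(enumerate_programs(L),
               key=lambda p: replay_reward(p, history, reward_fn, T))
    return best[tuple(history[-L:])]

# Hypothetical task: reward for predicting the next bit of an alternating stream.
reward_fn = lambda action, t: 1.0 if action == t % 2 else 0.0
history = [t % 2 for t in range(12)]
print("chosen action:", aixi_tl_act(history, reward_fn))   # prints 0
```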
[659] So what you would like to do to make an AGI and what will probably be done 50 years from now to make an AGI is say, okay, well, we have some considerations.
[660] We have these processing power constraints, and, you know, we have space and time constraints on the program.
[661] We have energy utilization constraints, and we have this particular class of environments that we care about, which may be, say, you know, manipulating physical objects on the surface of the Earth, communicating in human language — not annihilating humanity — whatever our particular requirements happen to be.
[662] If you formalize those requirements in some formal specification language, you should then be able to run an automated program specializer on AIXI-tl, specializing it to the computing resource constraints and the particular environment and goal.
[663] And then it will spit out, like, the version of AIXI-tl specialized to your resource restrictions and your environment, which will be your AGI,
[664] right?
[665] And that I think is how our super AGI will create new AGI systems, right?
[666] But that's a very Russian approach, by the way.
[667] Like the whole field of program specialization came out of Russia.
[668] Can you backtrack?
[669] So what is program specialization?
[670] So it's basically...
[671] Well, take sorting, for example.
[672] You can have a generic program for sorting lists.
[673] But what if all the lists you care about are of length 10,000 or less?
[674] You can run an automated program specializer on your sorting algorithm, and it will come up with the algorithm that's optimal for sorting lists of length 10,000 or less, right?
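Here is a minimal sketch of program specialization for the sorting example: a generic sort, plus a "specializer" that, told every input will be short, emits a version tuned to that case. The cutoff and the short-list strategy are illustrative assumptions, not the output of any real program specializer.

```python
# Minimal sketch of program specialization for the sorting example: a generic
# sort, plus a "specializer" that, told every input will be short, emits a
# version tuned to that case. The cutoff and the short-list strategy are
# illustrative assumptions, not the output of any real program specializer.

def generic_sort(xs):
    """General-purpose sort for lists of any length."""
    return sorted(xs)

def insertion_sort(xs):
    """Quadratic sort that tends to be fast in practice on very short lists."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

def specialize_sort(max_len):
    """Hypothetical specializer: bakes the known length bound into the program."""
    if max_len <= 32:
        return insertion_sort        # the fully general machinery is dropped
    return generic_sort

sort_short = specialize_sort(10)
print(sort_short([5, 2, 9, 1]))      # [1, 2, 5, 9]
```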
[675] It's kind of like the process of evolution is a program specializer to the environment.
[676] So you're kind of evolving human beings or living creatures.
[677] Your Russian heritage is showing there.
[678] So with Alexander Vitya and Peter Anokin and so on, I mean, there's a, yeah, there's a long history of thinking about evolution that way also, right?
[679] Well, my point is that what we're thinking of as a human -level general intelligence, if you start from narrow AIs, like are being used in the commercial AI field now, then you're thinking, okay, how do we make it more and more general?
[680] On the other hand, if you start from Aixie or Schmidhuber's Girdle Machine, or these infinitely powerful, but practically infeasible AIs, then getting to a human level AGI is a matter of specialization.
[681] It's like how do you take these maximally general learning processes and how do you specialize them so that they can operate within the resource constraints that you have but will achieve the particular things that you care about?
[682] Because we are not, we humans are not maximally general intelligence.
[683] If I ask you to run a maze in 750 dimensions, you'll probably be very slow.
[684] Whereas in two dimensions, you're probably way better, right?
[685] So, I mean, we're specialized, because our hippocampus has a two-dimensional map in it, right?
[686] And it does not have a 750 dimensional map in it.
[687] So, I mean, we, we're, you know, a peculiar mix of generality and specialization, right?
[688] We probably start out quite general at birth.
[689] Obviously it's still narrow, but more general than we will be at age 20 and 30 and 40 and 50 and 60.
[690] I don't think that's quite right. I think it's more complex than that, because, I mean, in some sense a young child is less biased, and the brain has yet to sort of crystallize into appropriate structures for processing aspects of the physical and social world.
[691] On the other hand, the young child is very tied to their sensorium, whereas we can deal with abstract mathematics, like 750 dimensions, and the young child cannot, because they haven't grown what Piaget called the formal capabilities.
[692] They haven't learned to abstract yet, right?
[693] And the ability to abstract gives you a different kind of generality than what a baby has.
[694] So there's both more specialization and more generalization that comes with the development process, actually.
[695] I mean, I guess just the trajectories of the specialization are most controllable at the young age, I guess, is one way to put it.
[696] Do you have kids?
[697] No. They're not as controllable as you think.
[698] So you think it's interesting.
[699] I think, honestly, I think a human adult is much more generally intelligent than a human baby.
[700] Babies are very stupid.
[701] I mean, they're cute.
[702] Yeah.
[703] Which is why we put up with their repetitiveness and stupidity.
[704] And they have what the Zen guys would call a beginner's mind, which is a beautiful thing.
[705] But that doesn't necessarily correlate with a high level of intelligence.
[706] So on the plot of, like, cuteness and stupidity, there's a process that allows us to put up with their stupidity as they become more intelligent.
[707] So by the time you're an ugly old man like me, you've got to get really, really smart to compensate it.
[708] To compensate, okay, cool.
[709] But yeah, going back to your original question, so the way I look at human level AGI is how do you specialize, you know, unrealistically inefficient superhuman brute force learning processes to the specific goals that humans need to achieve and the specific resources that we have?
[710] And both of these, the goals and the resources and the environments, I mean, all this is important.
[711] And on the resources side, it's important that the hardware resources we're bringing to bear are very different than the human brain.
[712] So the way I would want to implement AGI on a bunch of neurons in a VAT that I could rewire arbitrarily is quite different than the way I would want to create AGI on, say, a modern server farm of CPUs and GPUs.
[713] which in turn may be quite different than the way I would want to implement AGI on, you know, whatever quantum computer we'll have in 10 years, supposing someone makes a robust quantum Turing machine or something, right?
[714] So I think, you know, there's been co -evolution of the patterns of organization in the human brain and the physiological particulars of the human brain over time.
[715] And when you look at neural networks, that is one powerful class of learning algorithms, but it's also a class of learning algorithms that evolve to exploit the particulars of the human brain as a computational substrate.
[716] If you're looking at the computational substrate of a modern server farm, you won't necessarily want the same algorithms that you want on the human brain.
[717] And from the right level of abstraction, you could look at maybe the best algorithms on brain and the best algorithms on a modern computer network as implementing the same abstract learning and representation processes.
[718] But finding that level of abstraction is its own AGI research project then, right?
[719] So that's about the hardware side and the software side, which follows from that.
[720] Then, regarding what the requirements are: I wrote a paper years ago on what I called the embodied communication prior, which was quite similar in intent to Yoshua Bengio's recent paper on the consciousness prior, except I didn't want to wrap up consciousness in it, because to me, the qualia problem and subjective experience is a very interesting issue also, which we can chat about.
[721] But I would rather keep that philosophical debate distinct from the debate of what kind of biases do you want to put in a general intelligence to give it human -like general intelligence?
[722] And I'm not sure Yoshua is really addressing that kind of consciousness.
[723] He's just using the term.
[724] I love Yoshua to pieces.
[725] He's by far my favorite of the lions of deep learning.
[726] He's such a good -hearted guy and a creative thinker.
[727] Yeah, for sure.
[728] I am not sure he has plumbed to the depths of the philosophy of consciousness.
[729] No, he's using it as a sexy term.
[730] Yeah, yeah, yeah.
[731] So what I called it was the embodied communication prior.
[732] Can you maybe explain it a little bit?
[733] Yeah, yeah.
[734] What I meant was, you know, what are we humans evolved for?
[735] You can say being human, but that's very abstract, right?
[736] I mean, our minds control individual bodies, which are autonomous agents moving around in a world that's composed largely of solid objects, right?
[737] and we've also evolved to communicate via language with other solid object agents that are going around doing things collectively with us in a world of solid objects.
[738] And these things are very obvious, but if you compare them to the scope of all possible intelligences, or even all possible intelligences that are physically realizable, that actually constrains things a lot.
[739] So if you start to look at, you know, how you would realize some specialized or constrained version of universal general intelligence in a system that has, you know, limited memory and limited speed of processing, but whose general intelligence will be biased toward controlling a solid object agent which is mobile in a solid object world, for manipulating solid objects and communicating via language with other
[740] similar agents in that same world, right?
[741] Then starting from that, you're starting to get a requirements analysis for human, human level general intelligence.
[742] And then that leads you into cognitive science.
[743] And you can look at, say, what are the different types of memory that the human mind and brain has?
[744] And this has matured over the last decades.
[745] And I got into this a lot.
[746] So after I got my PhD in math, I was an academic for eight years.
[747] I was in departments of mathematics, computer science, and psychology.
[748] When I was in the psychology department at the University of Western Australia, I was focused on the cognitive science of memory and perception.
[749] Actually, I was teaching neural nets — not deep neural nets; it was multilayer perceptrons, right?
[750] Psychology?
[751] Yeah.
[752] Cognitive science, it was cross -disciplinary among engineering, math, psychology, philosophy, linguistics, computer science.
[753] But yeah, we were teaching psychology students to try to model the data from human cognition experiments using multilayer perceptrons, which was the early version of a deep neural network.
[754] Recurrent backprop was very, very slow to train back then, right?
[755] So this is the study of these constrained systems that are supposed to deal with physical objects.
[756] So if you look at cognitive psychology, you can see there's multiple types of memory, which are to some extent represented by different subsystems in the human brain.
[757] So we have episodic memory, which takes into account our life history and everything that's happened to us.
[758] We have declarative or semantic memory, which is like facts and beliefs abstracted from the particular situations as they occurred in.
[759] There's sensory memory, which to some extent is sense modality specific and to some extent is unified across sense modalities.
[760] There's procedural memory,
[761] memory of how to do stuff, like how to swing the tennis racket, right? There's motor memory, but it's also a little more abstract than motor memory.
[762] It involves cerebellum and cortex working together.
[763] Then there's memory linkage with emotion, which has to do with linkages of cortex and limbic system.
[764] There's specifics of spatial and temporal modeling connected with memory, which has to do with, you know, hippocampus and thalamus connecting to cortex, and the basal ganglia, which influences goals.
[765] So we have specific memory of what goals, sub-goals, and sub-sub-goals we wanted to pursue in which context in the past.
[766] The human brain has substantially different subsystems for these different types of memory, and substantially differently tuned learning — like differently tuned modes of long-term potentiation, to do with the types of neurons and neurotransmitters in the different parts of the brain corresponding to these different types of knowledge.
[767] And these different types of memory and learning in the human brain — I mean, you can trace these all back to embodied communication for controlling agents in worlds of solid objects.
[768] So if you look at building an AGI system, one way to do it, which starts more from cognitive science than neuroscience, is to say, okay, what are the types of memory that are necessary for this kind of world?
[769] Yeah, yeah, necessary for this sort of, intelligence, what types of learning work well with these different types of memory, and then how do you connect all these things together, right?
[770] And of course, the human brain did it incrementally through evolution, because each of the sub-networks of the brain — I mean, it's not really the lobes of the brain, it's the sub-networks, each of which is widely distributed — each of the sub-networks of the brain co-evolved with the other sub-networks of the brain, both in terms of its patterns of organization and the particulars of the neurophysiology.
[771] So they all grew up communicating and adapting to each other.
[772] It's not like they were separate black boxes that were then glommed together, right?
[773] Whereas as engineers, we would tend to say, let's make the declarative memory box here and the procedural memory box here and the perception box here and wire them together.
[774] And when you can do that, it's interesting.
[775] I mean, that's how a car is built, right?
[776] But on the other hand, that's clearly not how biological systems are made.
[777] The parts co-evolved so as to adapt and work together.
[778] That's, by the way, how every human-engineered system that flies — we were using that analogy before — is built as well.
[779] Yes.
[780] So do you find this at all appealing?
[781] Like, there's been a lot of really exciting work in cognitive architectures, for example, throughout the last few decades — which I find strange that it's ignored.
[782] Do you find that appealing? Yeah, I mean, I had a lot to do with that community.
[783] And, you know, Paul Rosenbloom and John Laird, who built the SOAR architecture, are friends of mine.
[784] And I learned SOAR quite well, and ACT-R, and these different cognitive architectures.
[785] And how I was looking at the AI world about 10 years ago, before this whole commercial deep learning explosion, was: on the one hand, you had these cognitive architecture guys who were working closely with psychologists and cognitive scientists,
[786] who had thought a lot about how the different parts of a human -like mind should work together.
[787] On the other hand, you had these learning theory guys who didn't care at all about the architecture, but were just thinking about how you recognize patterns in large amounts of data.
[788] And in some sense, what you needed to do was to get the learning that the learning theory guys were doing and put it together with the architecture that the cognitive architecture guys were doing.
[789] And then you would have what you needed.
[790] Now, you can't, unfortunately, when you look at the details, you can't just do that without totally rebuilding what is happening on both the cognitive architecture and the learning side.
[791] So, I mean, they tried to do that in SOAR, but what they ultimately did is, like, take a deep neural net or something for perception, and you include it as one of the black boxes.
[792] It becomes one of the boxes.
[793] The learning mechanism becomes one of the boxes, as opposed to fundamental...
[794] Yeah, and that doesn't quite work.
[795] You could look at some of the stuff DeepMind has done, like the differentiable neural computer or something — that sort of has a neural net for deep learning perception.
[796] It has another neural net, which is like a memory matrix that stores, say, the map of the London subway or something.
[797] So probably Demis Hassabis was thinking about that as part cortex and part hippocampus, because the hippocampus has a spatial map.
[798] And when he was a neuroscientist, he did a bunch of work on cortex-hippocampus interconnection.
[799] So there, the DNC would be an example of folks from the deep neural net world trying to take a step in the cognitive architecture direction by having two neural modules that correspond roughly to two different parts of the human brain that deal with different kinds of memory and learning.
[800] But on the other hand, it's super, super, super crude from the cognitive architecture view, right?
[801] Just as what John Laird did in SOAR with neural nets was super, super crude from a learning point of view, because the learning was, like, off to the side, not affecting the core representations, right?
[802] I mean, you weren't learning the representation.
[803] You were learning the data that feeds into the...
[804] You were learning abstractions of perceptual data to feed into the representation that was not learned, right?
[805] So, yeah, this was clear to me a while ago, and one of my hopes with the AGI community was to sort of bring people from those two directions together.
[806] That didn't happen much in terms of...
[807] Not yet.
[808] And what I was going to say is it didn't happen in terms of bringing like the lions of cognitive architecture together with the lions of deep learning.
[809] It did work in the sense that a bunch of younger researchers have had their heads filled with both of those ideas.
[810] This comes back to a saying my dad, who was a university professor, often quoted to me, which was: science advances one funeral at a time — which I'm trying to avoid.
[811] Like, I'm 53 years old, and I'm trying to invent amazing, weird -ass new things that nobody ever thought about, which we'll talk about in a few minutes.
[812] But there is that aspect, right?
[813] Like, for the people who've been at AI a long time and have made their career developing one aspect, like a cognitive architecture or a deep learning approach —
[814] once you're old and have made your career doing one thing, it can be hard to mentally shift gears.
[815] I mean, I try quite hard to maintain a flexible mind.
[816] Have you been successful somewhat in changing?
[817] Maybe have you changed your mind on some aspects of what it takes to build an AGI, like technical things?
[818] The hard part is that the world doesn't want you to.
[819] The world or your own brain?
[820] No, the world. Well, one part is that your brain doesn't want to.
[821] The other part is that the world doesn't want you to.
[822] Like, the people who have followed your ideas get mad at
[823] you if you change your mind. And, you know, the media wants to pigeonhole you as an avatar of a certain idea. But yeah, I've changed my mind on a bunch of things. I mean, when I started my career, I really thought quantum computing would be necessary for AGI, and I doubt it's necessary now, although I think it will be a super major enhancement. But I'm also now in the middle of embarking on a complete rethink and rewrite from scratch of our OpenCog AGI system, together with Alexey Potapov and his team in St. Petersburg, who are working with me in SingularityNet.
[824] So now we're trying to go back to basics, take everything we learned from working with the current OpenCog system, take everything everybody else has learned from working with their proto -AGI systems and design the best framework for the next stage.
[825] And I do think there's a lot to be learned from the recent successes with deep neural nets and deep reinforcement systems.
[826] I mean, people made these essentially trivial systems work much better than I thought they would.
[827] And there's a lot to be learned from that.
[828] And I want to incorporate that appropriately in our OpenCog 2.0 system.
[830] On the other hand, I also think current deep neural net architectures as such will never get you anywhere near AGI.
[831] So I think you want to avoid the pathology of throwing the baby out with the bathwater and like saying, well, these things are garbage because foolish journalists overblow them as being the path to AGI and a few researchers overblow them.
[832] as well.
[833] There's a lot of interesting stuff to be learned there, even though those are not the golden path.
[834] So maybe this is a good chance to step back.
[835] You mentioned OpenCog 2 .0, but...
[836] Go back to OpenCog, 0 .0, which exists now.
[837] Yeah.
[838] Yeah, maybe talk through the history of OpenCog and your thinking about these ideas.
[839] I would say OpenCog 2.0 is a term we're throwing around sort of tongue-in-cheek, because the existing OpenCog system that we're working on now is not remotely close to what we'd consider a 1.0, right? I mean, it's been around, what, 13 years or something, but it's still an early-stage research system, right? And actually, we are going back to the beginning in terms of theory and implementation, because we feel like that's the right thing to do.
[840] But I'm sure what we end up with is going to have a huge amount in common with the current system.
[841] I mean, we all still like the general approach.
[842] So first of all, what is open cog?
[843] Sure.
[844] Open cog is an open source software project that I launched together with several others in 2008.
[845] And probably the first code written toward that was written in 2001 or 2002 or something; it was developed as a proprietary code base within my AI company, Novamente LLC.
[846] Then we decided to open-source it in 2008, cleaned up the code, threw out some things,
[847] and added some new things.
[848] And what language is it written in?
[849] It's C++, primarily.
[850] There's a bunch of Scheme as well, but most of it's C++.
[851] And it's separate from something we'll also talk about, SingularityNet.
[852] So it was born as a non -networked thing.
[853] Correct, correct.
[854] Well, there are many levels of networks involved here, right?
[855] No connectivity to the Internet.
[856] Or no?
[857] At birth.
[858] Yeah.
[859] I mean, singularity net is a separate project and a separate body of code, and you can use singularity net as part of the infrastructure for a distributed open -cog system.
[860] But there are different layers.
[861] Yeah.
[862] So OpenCog, on the one hand, as a software framework, could be used to implement a variety of different AI architectures and algorithms.
[863] But in practice, there's been a group of developers, which I've been leading together with Linas Vepstas, Nil Geisweiller, and a few others, which has been using the OpenCog platform and infrastructure to implement certain ideas about how to make an AGI.
[864] So there's been a little bit of ambiguity about OpenCog, the software platform versus OpenCog, the AGI design.
[865] Because, in theory, you could use that software to make a neural net.
[866] You could use it to make a lot of different AGI designs.
[867] What kind of stuff does the software platform provide?
[868] Like in terms of utilities, so it was like, what?
[869] Yeah, let me first talk about OpenCog as a software platform,
[870] and then I'll tell you about the specific AGI
[871] R&D we've been building on top of it.
[872] So the core component of OpenCog as a software platform is what we call the atom space, which is a weighted, labeled hypergraph.
[873] A -T -O -M, atom space.
[874] Yeah, yeah.
[875] Not Adam, like Adam and Eve.
[876] Although that would be cool too.
[877] Yeah, so you have a hypergraph, which is like a, so a graph in this sense is a bunch of nodes with links between them.
[878] A hypergraph is like a graph, but links can go between more than two nodes.
[879] You can have a link between three nodes.
[880] And in fact, OpenCog's atom space would properly be called a metagraph, because you can have links pointing to links, or you could have links pointing to whole subgraphs, right?
[881] So it's an extended hypergraph or a metagraph.
[882] Is metagraph a technical term?
[883] It is now a technical term.
[884] Interesting.
[885] But I don't think it was yet a technical term when we started calling
[886] this a generalized hypergraph. But in any case, it's a weighted, labeled, generalized hypergraph, or a weighted, labeled metagraph.
[887] The weights and labels mean that the nodes and links can have numbers and symbols attached to them, so they can have types on them, they can have numbers on them that represent, say, a truth value or an importance value for a certain purpose.
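For readers who want something concrete, here is a bare-bones sketch of a weighted, labeled metagraph in the spirit of what's being described: every atom has a type and attached values such as a truth value and an importance value, and links are atoms whose targets can be nodes or other links. The class and field names are illustrative only; this is not OpenCog's actual AtomSpace API.

```python
from dataclasses import dataclass
from typing import Tuple

# Bare-bones sketch of a weighted, labeled metagraph: every atom has a type
# and attached values (a truth value and an importance value here), and links
# are atoms whose targets can be nodes or other links. The names and fields
# are illustrative only -- this is not OpenCog's actual AtomSpace API.

@dataclass(frozen=True)
class Atom:
    type: str                            # e.g. "ConceptNode", "InheritanceLink"
    name: str = ""                       # nodes have names; links usually do not
    targets: Tuple["Atom", ...] = ()     # empty for nodes; links point at atoms

class AtomSpace:
    def __init__(self):
        self.atoms = []
        self.values = {}                 # atom -> {"truth": ..., "importance": ...}

    def add(self, atom, truth=1.0, importance=0.0):
        self.atoms.append(atom)
        self.values[atom] = {"truth": truth, "importance": importance}
        return atom

space = AtomSpace()
cat = space.add(Atom("ConceptNode", "cat"))
animal = space.add(Atom("ConceptNode", "animal"))
# an ordinary link between two nodes, carrying a truth value
inherits = space.add(Atom("InheritanceLink", targets=(cat, animal)), truth=0.9)
# a link pointing at another link -- this is what makes it a metagraph
meta = space.add(Atom("EvaluationLink", "low-confidence", targets=(inherits,)), truth=0.3)
```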
[888] And of course, like with all things, you can reduce that to a hypergraph, and then a hypergraph to a graph, and you could reduce a graph to an adjacency matrix.
[889] So, I mean, there's always multiple representations.
[890] But there's a layer of representation that seems to work well here.
[891] Got it.
[892] Right, right, right.
[893] And so similarly, you could have a link to a whole graph, because a whole graph could represent, say, a body of information.
[894] And I could say, I reject this body of information.
[895] Then one way to do that is make that link go to that whole subgraph representing the body of information, right?
[896] I mean, there are many alternate representations, but anyway, what we have in OpenCog is an atom space, which is this weighted, labeled, generalized hypergraph knowledge store. It lives in RAM; there's also a way to back it up to disk, and there are ways to spread it among multiple different machines.
[897] Then there are various utilities for dealing with that.
[898] So there's a pattern matcher, which lets you specify a sort of abstract pattern
[899] and then search through the whole atom space — the weighted, labeled hypergraph — to see what sub-hypergraphs may match that pattern, for example.
[900] Then there's something called the CogServer in OpenCog, which lets you run a bunch of different agents or processes in a scheduler, and each of these agents basically reads stuff from the atom space and writes stuff to the atom space.
[901] So this is sort of the basic operational model.
[902] That's the software framework.
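And here is a toy sketch of that operational model: a scheduler gives each agent a turn, and each agent's cycle is mostly reads from, and writes to, a shared atom store. The tuple atoms, the trivial pattern matcher, and the two agents are hypothetical placeholders for the much richer real machinery.

```python
# Toy sketch of the operational model: a scheduler gives each agent a turn,
# and each agent's cycle is mostly reads from, and writes to, a shared atom
# store. Atoms are plain tuples and the "pattern matcher" is a trivial filter;
# both stand in for the much richer real machinery.

atomspace = [
    ("InheritanceLink", "cat", "mammal"),
    ("InheritanceLink", "mammal", "animal"),
]

def pattern_match(space, link_type):
    """Minimal stand-in for the pattern matcher: all links of a given type."""
    return [atom for atom in space if atom[0] == link_type]

def deduction_agent(space):
    """Reads A->B and B->C inheritance links, writes the A->C conclusion."""
    links = pattern_match(space, "InheritanceLink")
    for (_, a, b) in links:
        for (_, b2, c) in links:
            if b == b2 and ("InheritanceLink", a, c) not in space:
                space.append(("InheritanceLink", a, c))

def reporting_agent(space):
    """A second agent sharing the same memory, just reading what others wrote."""
    print("atoms so far:", len(space))

def cog_server(agents, space, cycles=3):
    """Toy scheduler: every agent gets a turn, for a fixed number of cycles."""
    for _ in range(cycles):
        for agent in agents:
            agent(space)

cog_server([deduction_agent, reporting_agent], atomspace)
# after the run, ("InheritanceLink", "cat", "animal") has been added
```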
[903] And of course, there's a lot there just from a scalable software engineering standpoint.
[904] So you could use this — I don't know if you have, but have you looked into Stephen Wolfram's physics project recently, with the hypergraphs and stuff?
[905] Could you theoretically use the software framework to play?
[906] You certainly could, although Wolfram would rather die than use anything but Mathematica for his work.
[907] Well, that's, yeah, but there's a big community.
[908] of people who are, you know, would love integration.
[909] Like you said, the young minds love the idea of integrating, of connecting things.
[910] Yeah, that's right.
[911] And I would add on that note, the idea of using hypergraph type models in physics is not very new.
[912] Like if you look at...
[913] The Russians did it first.
[914] Well, I'm sure they did.
[915] And a guy named Ben Dribus, who's a mathematician, a professor in Louisiana or somewhere, had a beautiful book on quantum sets and hypergraphs
[916] and algebraic topology for discrete models of physics, and carried it much farther than Wolfram has, but he's not rich and famous, so it didn't get in the headlines.
[917] But yeah, Wolfram aside, yes, certainly that's a good way to put it.
[918] The whole open cog framework, you could use it to model biological networks and simulate biology processes.
[919] You could use it to model physics on discrete graph models of physics.
[920] So you could use it to do, say, biologically realistic neural networks, for example.
[921] And that's, so that's a framework.
[922] What do agents and processes do?
[923] Do they grow the graph?
[924] What kind of computations just to get a sense?
[925] So they, in theory, they could do anything they want to do.
[926] They're just C++ processes.
[927] On the other hand, the computation framework is sort of designed for agents where most of their processing time is taken up with reads and writes to the atom space.
[928] And so that's a very different processing model than, say, the matrix-multiplication-based model that underlies most deep learning systems, right?
[929] So you could, I mean, you could create an agent that just factored numbers for a billion years.
[930] It would run within the open cog platform, but it would be pointless, right?
[931] I mean, the point of doing OpenCog is because you want to make agents
[932] that are cooperating via reading and writing into this weighted, labeled hypergraph.
[933] And that has both cognitive architecture importance, because then this hypergraph is being used as a sort of shared memory among different cognitive processes, but it also has software and hardware implementation implications, because current GPU architectures are not so useful for OpenCog, whereas a graph chip would be
[934] incredibly useful, right?
[935] And I think Graphcore has those now, but they're not ideally suited for this. But I think in the next, let's say, three to five years, we're going to see new chips where, like, a graph is put on the chip, and, you know, the back and forth between multiple processes acting SIMD and MIMD on that graph is going to be fast.
[936] And then that may do for open cog type architectures what GPUs did for deep neural architecture.
[937] On a small tangent, can you comment on your thoughts about neuromorphic computing?
[938] So like hardware implementations of all these different kind of, are you interested?
[939] Are you excited by that possibility?
[940] I'm excited by graph processors because I think they can massively speed up open cog, which is a class of architectures that I'm working on.
[941] I think if, you know, in principle, neuromorphic computing should be amazing.
[942] I haven't yet been fully sold on any of the systems that are out.
[943] Like, memristors should be amazing too, right?
[944] So a lot of these things have obvious potential, but I haven't yet put my hands on the system that seemed to manifest.
[945] Yeah, memristors should be amazing, but the current systems have not been great.
[946] For example, if you wanted to make a biologically realistic hardware neural network — like making a circuit in hardware that emulated, say, the Hodgkin-Huxley equations or the Izhikevich equation, the differential equations for a biologically realistic neuron — and putting that in hardware on the chip, that would seem like it would make it more feasible to make a large-scale, truly biologically realistic neural network.
[947] Now, what's been done so far is not like that.
[948] So I guess, personally, as a researcher — I mean, I've done a bunch of work in cognitive neuroscience, sorry, in computational neuroscience, where I did some work with IARPA in D.C., the Intelligence Advanced Research Projects Activity. We were looking at how you make a biologically realistic simulation of seven different parts of the brain cooperating with each other, using, like, realistic nonlinear dynamical models of neurons, and how you get that to simulate what's going on in the mind of a geoint
[949] analyst while they're trying to find terrorists on a map, right?
[950] So if you want to do something like that, having neuromorphic hardware that really let you simulate like a realistic model of the neuron would be amazing.
[951] But that's sort of with my computational neuroscience head on, right?
[952] With an AGI hat on, I'm just more interested in these hypergraph knowledge representation based architectures, which would benefit more from various types of graph processors.
[953] Because the main processing bottleneck is reading and writing to RAM —
[954] it's reading and writing to the graph in RAM.
[955] The main processing bottleneck for this kind of proto -AGI architecture is not multiplying matrices.
[956] And for that reason, GPUs, which are really good at multiplying matrices, don't apply as well.
[957] There are frameworks like Gunrock and others
[958] that try to boil down graph processing to matrix operations, and they're cool, but you're still putting a square peg into a round hole in a certain sense.
[959] The same is true.
[960] I mean, current quantum machine learning, which is very cool.
[961] It's also all about how to get matrix and vector operations in quantum mechanics.
[962] And I see why that's natural to do.
[963] I mean, quantum mechanics is all unitary matrices and vectors, right?
[964] On the other hand, you could also try to make graph -centric quantum computer, which I think is where things will go.
[965] And then we can, like, take the OpenCog implementation layer and implement it in an uncollapsed state inside a quantum computer.
[966] But that may be the singularity squared, right?
[967] I'm not, I'm not sure we need that to get to human level.
[968] Singularity scale.
[969] That's already beyond the first singularity.
[970] But can we just?
[971] Yeah, let's go back to OpenCog.
[972] No, no, yeah — the hypergraph and OpenCog.
[973] That's the software framework, right?
[974] So the next thing is our cognitive architecture tells us particular algorithms to put there.
[975] Got it.
[976] Can we backtrack on — kind of, is this graph designed, is it in general supposed to be sparse, and do the operations constantly grow and change the graph?
[977] Yeah, the graph is sparse.
[978] But is it constantly adding links and so on?
[979] It is a self -modifying hypergraph.
[980] So it's not — so the write and read operations you were
[981] referring to —
[982] this isn't just a fixed graph to which you change the weights.
[983] It's a constantly growing graph.
[984] Yeah, that's true.
[985] It is a different model than, say, current deep neural nets, which have a fixed neural architecture where you're updating the weights — although there have been, like, cascade correlation neural net architectures that grow new nodes and links.
[986] But the most common neural architectures now have a fixed neural architecture, you're updating the weights.
[987] And in OpenCog, you can update the weights, and that certainly happens a lot, but adding new nodes, adding new links, removing nodes and links is an equally critical part of the system's operations. Got it. So now, when you start to add these cognitive algorithms on top of this OpenCog architecture, what does that look like? Yeah, so within this framework, creating a cognitive architecture is basically two things: it's choosing what type system you want to put on the nodes and links in the hypergraph — what types of nodes and links you want — and then it's choosing what collection of agents, what collection of AI algorithms or processes, are going to run to operate on this hypergraph.
[988] And of course, those two decisions are closely connected to each other.
[989] So in terms of the type system, there are some links that are more neural net -like.
[990] They just, like, have weights that get updated by Hebbian learning, and activation spreads along them.
[991] There are other links that are more logic -like and nodes that are more logic -like.
[992] So you could have a variable node, and you can have a node representing a universal or existential quantifier, as in predicate logic or term logic.
[993] So you can have logic -like nodes and links, or you can have neural -like nodes and links.
[994] You can also have procedure -like nodes and links, as in say combinator logic or lambda calculus representing programs.
[995] So you can have nodes and links representing many different types of semantics, which means you could make a horrible, ugly mess, or you could make a system where these different types of knowledge all interpenetrate and synergize with each other beautifully, right?
[996] So the hypergraph can contain programs?
[997] Yeah, it can contain programs, although in the current version, it is a very inefficient way to guide the execution of programs, which is one thing that we are aiming to resolve with our rewrite of the system now.
[998] So what to you is the most beautiful aspect of OpenCog?
[999] Just to you personally, some aspect that captivates your imagination from beauty or power?
[1000] What fascinates me is finding a common representation that underlies abstract declarative knowledge and sensory knowledge and movement knowledge and procedure knowledge and episodic knowledge, finding the right level of representation where all these types of knowledge are stored in a sort of universal and interconvertible yet practically manipulable way, right?
[1001] So, to me, that's the core, because once you've done that, then the different learning algorithms can help each other out.
[1002] Like what you want is, if you have a logic engine that helps with declarative knowledge, and you have a deep neural net that gathers perceptual knowledge, and you have, say, an evolutionary learning system that learns procedures, you want these to not only interact on the level of sharing results and passing inputs and outputs to each other, you want the logic engine when it gets stuck to be able to share its intermediate state with the neural net and with the evolutionary learning algorithm so that they can help each other out of bottlenecks and help each other solve combinatorial explosions by intervening inside each other's cognitive processes.
[1003] But that can only be done if the intermediate state of a logic engine, the evolutionary learning engine, and a deep neural net are represented in the same form.
[1004] And that's what we figured out how to do by putting the right type system on top of this weighted labeled hypergraph.
[1005] So is there, can you maybe elaborate on what are the different characteristics of a type system that can coexist amongst all these different kinds of knowledge that needs to be represented?
[1006] And is, I mean, like, is it hierarchical, just any kind of insights you can give on that kind of type system?
[1007] Yeah, so this gets very nitty -gritty and mathematical, of course.
[1008] But one key part is switching from predicate logic to term logic.
[1009] What is predicate logic?
[1010] What is term logic?
[1011] So term logic was invented by Aristotle, or at least that's the oldest recollection we have of it.
[1012] But term logic breaks down basic logic into basically simple links between nodes.
[1013] It's like an inheritance link between node A and node B. So in term logic, the basic deduction operation is A implies B, B implies C, therefore A implies C. Whereas in predicate logic, the basic operation is modus ponens, like A, A implies B, therefore B. So it's a slightly different way of breaking down logic.
[1014] But by breaking down logic into term logic, you get a nice way of breaking logic down into nodes and links.
[1016] So your concepts can become nodes, the logical relations become links.
[1017] And so then inference is like: if this link is A implies B, and this link is B implies C, then deduction builds a link A implies C. And your probabilistic algorithm can assign a certain weight there.
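A tiny sketch of that deduction step: given A-implies-B and B-implies-C links carrying strengths, build the A-implies-C link and assign it a weight. The simple product rule used here is only a placeholder; PLN's actual deduction formula is more involved and also tracks confidence.

```python
# Tiny sketch of the term-logic deduction step: given A->B and B->C inheritance
# links with strengths in [0, 1], build A->C and assign it a weight. The simple
# product rule is only a placeholder; PLN's real deduction formula is more
# involved and also tracks confidence.

links = {
    ("cat", "mammal"): 0.95,      # Inheritance cat -> mammal, strength 0.95
    ("mammal", "animal"): 0.98,   # Inheritance mammal -> animal, strength 0.98
}

def deduce(links):
    """For every A->B, B->C pair, add A->C with a combined strength."""
    new = {}
    for (a, b), s1 in links.items():
        for (b2, c), s2 in links.items():
            if b == b2 and (a, c) not in links:
                new[(a, c)] = s1 * s2     # placeholder combination rule
    return new

print(deduce(links))                      # {('cat', 'animal'): 0.931}
```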
[1018] Now, you may also have, like, a Hebbian neural link from A to B, which is the degree to which A being the focus of attention should make B the focus of attention, right?
[1019] So you could then have a neural link, and you could have a symbolic, logical inheritance link in your term logic, and they have separate meanings, but they could be used to guide each other as well.
[1020] Like, if there's a large amount of neural weight on the link between A and B, that may direct your logic engine to think about, well, what is the relation? Are they similar?
[1021] Is there an inheritance relation?
[1022] Are they similar in some context?
[1023] On the other hand, if there's a logical relation between A and B, that may direct your neural component to think, well, when I'm thinking about A, should I be directing some attention to B also?
[1024] Because there's a logical relation.
[1025] So in terms of logic, there's a lot of thought that went into how you break down logic relations — including basic sort of propositional logic relations, as Aristotelian term logic deals with, and then quantifier logic relations also — how do you break those down elegantly into a hypergraph?
[1026] I mean, you can boil a logic expression down to a graph in many different ways.
[1027] Many of them are very ugly, right?
[1028] We tried to find elegant ways of sort of hierarchically breaking down complex logic expression into nodes and links so that if you have, say, different nodes representing, you know, Ben, AI, Lex, interview, or whatever, the logic relations between those things are compact in the node and link representation, so that when you have a neural net acting on those same nodes and links, the neural net and the logic engine can sort of interoperate with each other.
[1029] And also interpretable by humans?
[1030] Is that an important...
[1031] That's tough.
[1032] In simple cases, it's interpretable by humans.
[1033] But honestly, you know, I would say logic systems give...
[1034] more potential for transparency and comprehensibility than neural net systems, but you still have to work at it.
[1035] Because I mean, if I show you a predicate logic proposition with like 500 nested universal and existential quantifiers and 217 variables, that's no more comprehensible than the weight matrix of a neural network, right?
[1036] So I'd say the logic expressions an AI learns from its experience are mostly totally opaque to human beings, and maybe even harder to understand than a neural net.
[1037] Because, I mean, when you have multiple nested quantifier bindings, it's a very high level of abstraction.
[1038] There is a difference, though, in that within logic, it's a little more straightforward to pose the problem of, like, normalize this and boil this down to a certain form.
[1039] I mean, you can do that in neural nets, too.
[1040] Like, you can distill a neural net to a simpler form, but that's more often done to make a neural net that'll run on an embedded device or something.
[1041] I think it's harder to distill a neural net to a comprehensible form than it is to simplify a logic expression to a comprehensible form, but it doesn't come for free.
[1042] Like what's in the AI's mind is incomprehensible to a human unless you do some special work to make it comprehensible.
[1043] So on the procedural side, there's some different and sort of interesting voodoo there.
[1044] I mean, if you're familiar with computer science, there's something called the Curry-Howard correspondence, which is a one-to-one mapping
[1045] between proofs and programs.
[1046] So every program can be mapped into a proof.
[1047] Every proof can be mapped into a program.
[1048] You can model this using category theory and a bunch of nice math.
[1049] But we want to make that practical, right?
[1050] So that if you have an executable program that moves a robot's arm or figures out in what order to say things in a dialogue, that's a procedure represented in OpenCog's hypergraph.
[1051] But if you want to reason on how to improve that procedure,
[1052] you need to map that procedure into logic using the Curry-Howard isomorphism, so the logic engine can reason about how to improve that procedure, and then map that back into the procedural representation that is efficient for execution.
[1053] So again, that comes down to not just can you make your procedure into a bunch of nodes and links, because I mean, that can be done trivially.
[1054] A C++ compiler has nodes and links inside it.
[1055] Can you boil down your procedure into a bunch of nodes and links in a way that's hierarchically decomposed and simple enough that it can be reasoned about?
[1056] Yeah, yeah, so that, given the resource constraints at hand, you can map it back and forth to your term logic fast enough and without getting a bloated logic expression, right?
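(As a rough illustration of the procedure-to-logic round trip being described, here is a toy Python sketch. It is not the actual Curry-Howard machinery or any OpenCog code: a procedure is kept as a small expression tree, a symbolic rewrite rule stands in for the reasoning step, and the improved tree is then mapped back to something executable.)

```python
def simplify(expr):
    """Rewrite (add x 0) -> x, recursively; a stand-in for logic-side reasoning."""
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(simplify(e) for e in expr)
    if expr[0] == "add" and expr[2] == 0:
        return expr[1]
    return expr

def execute(expr, env):
    """Map the (possibly improved) tree back to an executable interpretation."""
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    vals = [execute(a, env) for a in args]
    return {"add": sum, "mul": lambda v: v[0] * v[1]}[op](vals)

prog = ("add", ("mul", "x", 2), 0)
print(simplify(prog))                      # ('mul', 'x', 2)
print(execute(simplify(prog), {"x": 5}))   # 10
```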
[1057] So there's just a lot of nitty-gritty particulars there, but by the same token, if you ask a chip designer, like, how do you make the Intel i7 chip so good?
[1058] There's a long list of technical answers there, which will take a while to go through, right?
[1059] And this has been decades of work.
[1060] I mean, the first AI system of this nature I tried to build was called WebMind in the mid-1990s, and we had a big graph operating in RAM implemented with Java 1.1, which is a terrible, terrible implementation idea.
[1061] And then each node had its own processing.
[1062] So, like, there, the core loop looped through all nodes in the network and let each node enact whatever its little thing was doing.
[1063] And we had logic and neural nets in there, and evolutionary learning.
[1064] But we hadn't done enough of the math to get them to operate together very cleanly.
[1065] So it was really, it was quite a horrible mess.
[1066] So as well as shifting to an implementation where the graph is its own object and the agents are separately scheduled, we've also done a lot of work on how you represent programs, how you represent procedures, how you represent genotypes for evolution, in a way that the interoperability between the different types of learning associated with these different types of knowledge actually works.
[1067] And that's been quite difficult.
[1068] It's taken decades and it's totally off to the side of what the commercial mainstream of the AI field is doing, which isn't thinking about representation at all, really.
[1069] Although you could see, like in the DNC, they had to think a little bit about how you make the representation of a map in this memory matrix work together with the representation needed for, say, visual pattern recognition in the hierarchical neural network.
[1070] But I would say we have taken that direction of taking the types of knowledge you need for different types of learning, like declarative, procedural, attentional, and asking how you make these types of knowledge represented in a way that allows cross-learning across these different types of memory.
[1071] We've been prototyping and experimenting with this within OpenCog, and before that WebMind, since the mid-1990s.
[1072] Now, disappointingly, to all of us, this has not yet been cashed out in an AGI system, right?
[1073] I mean, we've used this system within our consulting business.
[1074] So we've built natural language processing and robot control and financial analysis.
[1075] We've built a bunch of sort of vertical market -specific proprietary AI projects that use open cog on the back end.
[1076] But we haven't, that's not the AGI goal, right?
[1077] It's interesting, but it's not the AGI goal.
[1078] So now what we're looking at with our rebuild of the system, 2.0...
[1079] Yeah, we're also calling it True AGI, so we're not quite sure what the name is yet.
[1080] We made a website for trueagi.io, but we haven't put anything on there yet.
[1081] We may come up with an even better name, but...
[1082] It's kind of like the real AI starting point for your AI book.
[1083] Yeah, but I like True better, because True has, like, you can be true -hearted, right?
[1084] You can be true to your girlfriend, so True has a number of meanings, and it also has logic in it, right?
[1085] Because logic is a key point.
[1086] I like it, yeah.
[1087] So, yeah, with the true AGI system, we're sticking with the same basic architecture, but we're trying to build on what we've learned.
[1088] And one thing we've learned is that, you know, we need type checking among dependent types to be much faster and among probabilistic dependent types to be much faster.
[1089] So as it is now, you can have complex types on the nodes and links.
[1090] But if you want types to be first-class citizens, so that the types can be variables, and then you do type checking among complex, higher-order types.
[1091] You can do that in the system now, but it's very slow.
[1092] This is stuff like what's done in cutting-edge programming languages like Agda or something, these obscure research languages.
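(For a flavor of what type checking with types as first-class citizens involves, here is a heavily simplified Python sketch of matching type expressions that contain type variables. Real dependent and probabilistic dependent types, as in Agda or OpenCog's type system, are far richer; this is only an assumed toy illustration.)

```python
def unify(t1, t2, bindings=None):
    """Try to unify two type expressions; variables are strings starting with '$'.
    Returns a binding dict on success, or None if the types cannot match."""
    bindings = dict(bindings or {})
    if isinstance(t1, str) and t1.startswith("$"):
        if t1 in bindings:
            return unify(bindings[t1], t2, bindings)
        bindings[t1] = t2
        return bindings
    if isinstance(t2, str) and t2.startswith("$"):
        return unify(t2, t1, bindings)
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            bindings = unify(a, b, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if t1 == t2 else None

# A link typed ("Evaluation", "$X", "$X") requires both arguments to share a type.
print(unify(("Evaluation", "$X", "$X"), ("Evaluation", "Concept", "Concept")))  # binds $X
print(unify(("Evaluation", "$X", "$X"), ("Evaluation", "Concept", "Number")))   # None: mismatch
```

Doing this kind of matching quickly, over millions of atoms and with probabilities attached, is the performance problem being described.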
[1093] On the other hand, we've been doing a lot of work tying together deep neural nets with symbolic learning.
[1094] So we did a project for Cisco, for example, on street scene analysis. They had deep neural models for a bunch of cameras watching street scenes, but they trained a different model for each camera because they couldn't get the transfer learning to work between camera A and camera B. So we took what came out of all the deep neural models for the different cameras.
[1095] We fed it into an OpenCog symbolic representation.
[1096] Then we did some pattern mining and some reasoning on what came out of all the different cameras within the symbolic graph.
[1097] And that worked well for that application.
[1098] I mean, Hugo Latapie from Cisco gave a talk touching on that at last year's AGI conference.
[1099] It was in Shenzhen.
[1100] On the other hand, we learned from there, it was kind of clunky to get the deep neural models to work well with the symbolic system because we were using torch.
[1101] And Torch keeps a sort of computation graph, but we needed real-time access to that computation graph within our hypergraph.
[1102] We certainly did it.
[1103] Alexey Potapov, who leads our St. Petersburg team, wrote a great
[1104] paper on cognitive modules in OpenCog, explaining sort of how you deal with a Torch compute graph inside OpenCog.
[1105] But in the end, we realized that just hadn't been one of our design thoughts when we built OpenCog, right?
[1106] So between wanting really fast dependent type checking, wanting much more efficient interoperation between the computation graphs of deep neural net frameworks and OpenCog's hypergraph, and, on top of that, wanting to more effectively run an OpenCog hypergraph distributed across RAM on 10,000 machines (we're doing dozens of machines now), it's just that we didn't architect it with that sort of modern scalability in mind.
[1107] So these performance requirements are what have driven us to want to re -architect the base, but the core AGI paradigm doesn't really change.
[1108] Like the mathematics is the same.
[1109] It's just, we can't scale to the level that we want
[1110] in terms of distributed processing or speed of various kinds of processing with the current infrastructure, which was built in the period 2001 to 2008, which is hardly shocking.
[1111] Well, I mean, the three things you mentioned are really interesting.
[1112] So what do you think about, in terms of interoperability, communicating with the computational graph of neural networks, what do you think about the representations that neural networks form?
[1113] They're bad.
[1114] But there's many ways that you could deal with that.
[1115] So I've been wrestling with this a lot in some work on unsupervised grammar induction, and I have a simple paper on that that I'll give at the next AGI conference, the online portion of which is next week, actually.
[1116] What is grammar induction?
[1117] So this isn't AGI either, but it's sort of on the verge between narrow AI and AGI or something.
[1118] Unsupervised grammar induction is the problem: throw your AI system a huge body of text and have it learn the grammar of the language that produced that text.
[1119] So you're not giving it labeled examples, you're not giving it, like, a thousand sentences where the parses were marked up by graduate students, so it's just got to infer the grammar from the text.
[1120] It's like the Rosetta Stone but worse, right?
[1121] Because you only have the one language.
[1122] And you have to figure out what is the grammar.
[1123] So that's not really AGI because, I mean, the way a human learns language is not that, right?
[1124] I mean, we learn from language that's used in context.
[1125] So it's a social embodied thing.
[1126] We see how a given sentence is grounded in observation.
[1127] As an interactive element, I guess.
[1128] Yeah, yeah, yeah.
[1129] On the other hand, so I'm more interested in that.
[1130] I'm more interested in making an AGI system learn language from its social and embodied experience.
[1131] On the other hand, that's also more of a pain to do.
[1132] And that would lead us into Hanson Robotics and the robotics work I've been doing with them,
[1133] which we'll talk about in a few minutes.
[1134] But just as an intellectual exercise, as a learning exercise, trying to learn grammar from a corpus is very, very interesting, right?
[1135] And that's been a field in AI for a long time.
[1136] No one can do it very well.
[1137] So we've been looking at transformer neural networks and tree transformers, which are amazing.
[1138] These came out of Google Brain, actually.
[1139] And actually, on that team was Lukasz Kaiser, who used to work for me in the period 2005 through 8 or something.
[1140] So it's been fun to see my former sort of AGI employees disperse and do all these amazing things.
[1141] Way too many sucked into Google, actually.
[1142] Well, yeah, anyway.
[1143] We'll talk about that, too.
[1144] Lukasz Kaiser and a bunch of these guys, they created transformer networks, that classic paper, Attention Is All You Need, and all these things following on from that.
[1145] So we're looking at transformer networks, and, like, these are able to, I mean, this is what underlies GPT-2 and GPT-3 and so on, which are very, very cool and have absolutely no cognitive understanding of any of the text they're looking at.
[1146] Like they're very intelligent idiots, right?
[1147] Sorry to take a small tangent, and I'll bring us back, but do you think GPT-3 understands?
[1148] No, no, no, it understands nothing.
[1149] It's a complete idiot, but it's a brilliant idiot.
[1150] You don't think GPT-20 will understand?
[1151] No, no, no. Size is not going to buy you understanding.
[1152] Any more than a faster car is going to get you to Mars.
[1153] It's a completely different kind of thing.
[1154] I mean, these networks are very cool, and as an entrepreneur, I can see many highly valuable uses for them.
[1155] And as an artist, I love them, right?
[1156] So, I mean, we're using our own neural model, which is along those lines, to control the Philip K. Dick robot now.
[1157] And it's amazing to, like, train a neural model on the robot Philip K. Dick and see it come up with, like, C .Rae's stoned philosopher pronouncements, very much like what Philip K. Dick might have said, right?
[1158] These models are super cool, and I'm working with Hanson Robotics now on using a similar but more sophisticated one for Sophia, which we haven't launched yet.
[1159] So I think it's cool.
[1160] But it's not understanding.
[1161] These are recognizing a large number of shallow patterns.
[1162] They're not forming an abstract representation.
[1163] And that's the point I was coming to.
[1164] When we were looking at grammar induction, we tried to mine patterns out of the structure of the transformer network.
[1165] And you can, but the patterns aren't what you want.
[1166] They're nasty.
[1167] So, I mean, if you do supervised learning, if you look at sentences where you know the correct parse of the sentence, you can learn a matrix that maps between the internal representation of the transformer and the parse of the sentence.
[1168] And so then you can actually train something that will output the sentence parse from the Transformer Network's internal state.
[1169] And we did this, and I think Christopher Manning and some others have now done this also.
[1170] But, I mean, what you get is that the representation is horribly ugly and is scattered all over the network and doesn't look like the rules of grammar that you know are the right rules of grammar, right?
[1171] It's kind of ugly.
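(A rough sketch of the kind of supervised probe being described, with entirely made-up data rather than real transformer activations or any published probing code: fit a linear map from hidden states to a parse-derived quantity, such as each token's depth in the gold parse tree.)

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical data: hidden_states[i] stands in for the transformer's vector for
# token i, and depths[i] for that token's depth in a human-annotated parse tree.
hidden_states = np.random.randn(1000, 768)
depths = np.random.randint(0, 8, size=1000)

# A linear probe: learn a matrix mapping internal representation to parse info.
probe = Ridge(alpha=1.0).fit(hidden_states, depths)
print("probe fit score:", probe.score(hidden_states, depths))
# If the score is high on held-out data, parse structure is linearly recoverable,
# but, as noted above, the encoding tends to be smeared across the whole vector.
```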
[1172] So what we're actually doing is we're using a symbolic grammar learning algorithm, but we're using the Transformer Neural Network as a sentence probability oracle.
[1173] So, like, if you have a rule of grammar and you aren't sure if it's a correct rule of grammar or not, you can generate a bunch of sentences using that rule of grammar and a bunch of sentences violating that rule of grammar.
[1174] And you can see whether the transformer model thinks the sentences obeying the rule of grammar are more probable than the sentences disobeying the rule of grammar.
[1175] So in that way, you can use the neural model as a sentence probability oracle to guide a symbolic grammar learning process.
[1176] And that seems to work better than trying to milk the grammar out of the neural network that doesn't have it in there.
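(Here is a minimal sketch of the sentence-probability-oracle idea, assuming a Hugging Face GPT-2 model; the example sentences and the scoring margin are illustrative stand-ins, not the actual OpenCog grammar-learning pipeline.)

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_prob(sentence: str) -> float:
    """Approximate total log-likelihood of a sentence under the language model."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return -out.loss.item() * enc["input_ids"].shape[1]

# Candidate rule: generate sentences that obey it and sentences that violate it.
obeying   = ["the dog chased the cat", "the cat chased the dog"]
violating = ["dog the cat the chased", "chased the the dog cat"]

margin = (sum(map(log_prob, obeying)) / len(obeying)
          - sum(map(log_prob, violating)) / len(violating))
print("rule plausibility margin:", margin)  # a positive margin supports keeping the rule
```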
[1177] So I think the thing is these neural nets are not getting a semantically meaningful representation internally by and large.
[1178] So one line of research is to try to get them to do that.
[1179] And InfoGAN was trying to do that.
[1180] So, like, if you look back two years ago, there were all these papers on, like, Edward, this probabilistic programming neural net framework that Google had, which came out of InfoGAN.
[1181] So the idea there was, like, you could train an InfoGAN neural net model, which is a generative adversarial network, to recognize and generate faces, and the model would automatically learn a variable for how long the nose is, and automatically learn a variable for how wide the eyes are or how big the lips are or something, right?
[1182] So it automatically learned these variables, which have a semantic meaning.
[1183] So that was a rare case where a neural net trained with a fairly standard GAN method was able to actually learn a semantic representation.
[1184] So for many years, many of us tried to take that the next step and get a GAN-type neural network that would have not just a list of semantic latent variables, but would have, say, a Bayes net of semantic latent variables with dependencies between them.
[1185] The whole programming framework, Edward, was made for that.
[1186] I mean, no one got it to work, right?
[1187] Do you think it's possible?
[1188] I don't know.
[1189] It might be that back propagation just won't work for it because the gradients are too screwed up.
[1190] Maybe you could get it to work using CMA-ES or some floating-point evolutionary algorithm.
[1191] We tried, we didn't get it to work.
[1192] Eventually we just paused that rather than gave it up.
[1193] We paused that and said, well, okay, let's try more innovative ways to learn what are the representations implicit in that network without trying to make it grow inside that network.
[1194] And I described how we're doing that in language.
[1195] You can do similar things in vision, right?
[1196] Use it as an Oracle.
[1197] Yeah, yeah, yeah.
[1198] So you can, that's one way, is you use a structure learning algorithm, which is symbolic, and then you use the deep neural net as an Oracle to guide the structure learning algorithm.
[1199] The other way to do it is, like InfoGAN was trying to do, to try to tweak the neural network to have the symbolic representation inside it.
[1200] I tend to think what the brain is doing is more like using the deep neural net type thing as an oracle.
[1201] Like I think the visual cortex or the cerebellum are probably learning a non -semantically meaningful, opaque, tangled representation.
[1202] And then when they interface with the more cognitive parts of the cortex, the cortex is sort of using those as an oracle and learning the abstract representation.
[1203] So if you do sports, take, for example, serving in tennis, right?
[1204] I mean, my tennis serve is okay, not great, but I learned it by trial and error, right?
[1205] And, I mean, I learned music by trial and error too.
[1206] I just sit down and play.
[1207] But then if you're an athlete, which I'm not a good athlete, I mean, then you'll watch videos of yourself serving and your coach will help you think about what you're doing and you'll then form a declarative representation, but your cerebellum maybe didn't have a declarative representation.
[1208] Same way with music.
[1209] Like, I will hear something in my head.
[1210] I'll sit down and play the thing like I heard it.
[1211] And then I will try to study what my fingers did to see, like, what did you just play?
[1212] Like, how did you do that, right?
[1213] Because if you're composing, you may want to see how you did it, and then declaratively morph that in some way that your fingers wouldn't think of, right?
[1214] But the physiological movement may come out of some opaque, cerebellar, reinforcement-learned thing, right?
[1215] And so that's, I think, trying to milk the structure of a neural net by treating it as an oracle, maybe more like how your declarative mind post-processes what your visual or motor cortex is doing.
[1216] I mean, in vision, it's the same way.
[1217] Like, you can recognize beautiful art much better than you can say why you think that piece of art is beautiful.
[1218] But if you're trained as an art critic, you do learn to say why.
[1219] And some of it's bullshit, but some of it isn't, right?
[1220] Some of it is learning to map sensory knowledge into declarative and linguistic knowledge, yet without necessarily making the sensory system itself use a transparent and easily communicable representation.
[1221] Yeah, that's fascinating.
[1222] To think of neural networks as, like, dumb question-answerers that you can just milk to build up a knowledge base.
[1223] And there could be multiple networks, I suppose, from different...
[1224] Yeah, yeah.
[1225] So I think if a group like DeepMind or OpenAI were to build AGI, and I think