AI enables cyber attacks that threaten the financial system
Tobias Adrian, Tamas Gaidosch, Rangachary Ravikumar, IMF, May 7, 2026, Financial Stability Risks Mount as Artificial Intelligence Fuels Cyberattacks, https://www.imf.org/en/blogs/articles/2026/05/07/financial-stability-risks-mount-as-artificial-intelligence-fuels-cyberattacks
Artificial intelligence is transforming how the financial system copes with vulnerabilities and reacts to incidents. Yet it is also amplifying cyber threats that can undermine financial stability when the offensive capabilities of intruders outpace defenses.
IMF analysis suggests that extreme cyber‑incident losses could trigger funding strains, raise solvency concerns, and disrupt broader markets.
The financial system relies on shared digital infrastructure that’s highly interconnected, including software, cloud services, and networks for payments and other data. Advanced AI models can dramatically reduce the time and cost needed to identify and exploit vulnerabilities, raising the likelihood of simultaneously discovering and targeting weaknesses in widely used systems. As a result, cyber risk is increasingly about correlated failures that could disrupt financial intermediation, payments, and confidence at the systemic level.
Anthropic’s recent controlled release of its Claude Mythos Preview, an advanced AI model with exceptional cyber capabilities, underscored how quickly risks are increasing. Mythos could find and exploit vulnerabilities in every major operating system and web browser—even when used by non-experts. This foreshadows how fast‑moving, AI‑driven cyber risks could destabilize the financial system if not managed carefully, and why authorities must focus on building resilience through supervision and coordination—rather than treating these developments as purely technical or operational issues.
On the other hand, OpenAI’s specialized, restricted cyber version of GPT‑5.5 assumes vulnerabilities and attacks will grow, and emphasizes equipping defenders more quickly and at scale, under appropriate governance and trusted access models.
Advances change risk equation
Models such as Mythos illustrate the nature of the challenge because they amplify existing cyberattack techniques by operating at machine speed. Attackers have the advantage over defenders because discovering and exploiting vulnerabilities can occur faster than patching and remediation. In a financial system built on common software and shared service providers, this can create simultaneous vulnerabilities across many institutions.
For now, some mitigating factors remain. Advanced AI cyber capabilities are not yet widely available, and closed, industry‑specific financial software is harder to target than open‑source infrastructure. But these buffers are likely to erode quickly as model training expands, capabilities diffuse, and leaks occur. Temporary containment is unlikely to substitute for durable defenses.
Financial stability implications
The new AI‑enabled cyber tools focus the discussion on financial stability:
Risks are systemic. Attacks become more dangerous when discovery and exploitation scale rapidly, with implications for financial stability.
Risks cut across sectors. The financial sector shares digital foundations with energy, telecommunications, and public services. That means AI‑assisted attacks can propagate across sectors that rely on the same infrastructure.
AI may further concentrate risk and failures with one vulnerability rippling across many institutions. Reliance on a small number of software platforms, cloud providers, or AI models increases the impact of any single exploited weakness.
These features elevate cyber risk to a potential macro‑financial shock. Confidence effects, payment disruptions, liquidity strains, and fire‑sale dynamics could follow if multiple institutions are affected simultaneously. For financial authorities, the question is whether the system is prepared to absorb cyber incidents without destabilizing core financial functions.
International cooperation needed to stop cyber attacks
Tobias Adrian, Tamas Gaidosch, Rangachary Ravikumar, IMF, May 7, 2026, Financial Stability Risks Mount as Artificial Intelligence Fuels Cyberattacks, https://www.imf.org/en/blogs/articles/2026/05/07/financial-stability-risks-mount-as-artificial-intelligence-fuels-cyberattacks
Artificial intelligence is transforming how the financial system copes with vulnerabilities and reacts to incidents. Yet it is also amplifying cyber threats that can undermine financial stability when the offensive capabilities of intruders outpace defenses.
IMF analysis suggests that extreme cyber‑incident losses could trigger funding strains, raise solvency concerns, and disrupt broader markets.
The financial system relies on shared digital infrastructure that’s highly interconnected, including software, cloud services, and networks for payments and other data. Advanced AI models can dramatically reduce the time and cost needed to identify and exploit vulnerabilities, raising the likelihood of simultaneously discovering and targeting weaknesses in widely used systems. As a result, cyber risk is increasingly about correlated failures that could disrupt financial intermediation, payments, and confidence at the systemic level.
Anthropic’s recent controlled release of its Claude Mythos Preview, an advanced AI model with exceptional cyber capabilities, underscored how quickly risks are increasing. Mythos could find and exploit vulnerabilities in every major operating system and web browser—even when used by non-experts. This foreshadows how fast‑moving, AI‑driven cyber risks could destabilize the financial system if not managed carefully, and why authorities must focus on building resilience through supervision and coordination—rather than treating these developments as purely technical or operational issues.
On the other hand, OpenAI’s specialized, restricted cyber version of GPT‑5.5 assumes vulnerabilities and attacks will grow, and emphasizes equipping defenders more quickly and at scale, under appropriate governance and trusted access models.
Advances change risk equation
Models such as Mythos illustrate the nature of the challenge because they amplify existing cyberattack techniques by operating at machine speed. Attackers have the advantage over defenders because discovering and exploiting vulnerabilities can occur faster than patching and remediation. In a financial system built on common software and shared service providers, this can create simultaneous vulnerabilities across many institutions.
For now, some mitigating factors remain. Advanced AI cyber capabilities are not yet widely available, and closed, industry‑specific financial software is harder to target than open‑source infrastructure. But these buffers are likely to erode quickly as model training expands, capabilities diffuse, and leaks occur. Temporary containment is unlikely to substitute for durable defenses.
Financial stability implications
The new AI‑enabled cyber tools focus the discussion on financial stability:
Risks are systemic. Attacks become more dangerous when discovery and exploitation scale rapidly, with implications for financial stability.
Risks cut across sectors. The financial sector shares digital foundations with energy, telecommunications, and public services. That means AI‑assisted attacks can propagate across sectors that rely on the same infrastructure.
AI may further concentrate risk and failures with one vulnerability rippling across many institutions. Reliance on a small number of software platforms, cloud providers, or AI models increases the impact of any single exploited weakness.
These features elevate cyber risk to a potential macro‑financial shock. Confidence effects, payment disruptions, liquidity strains, and fire‑sale dynamics could follow if multiple institutions are affected simultaneously. For financial authorities, the question is whether the system is prepared to absorb cyber incidents without destabilizing core financial functions.
AI in cyber defense
AI is also a critical part of the solution. When attackers operate at machine speed, defenders must do the same. Financial institutions increasingly use AI‑supported tools to detect threats, prevent fraud, identify vulnerabilities, and respond to incidents.
AI also can help reduce vulnerabilities at the development stage rather than patching them after release. For widely used financial infrastructure, these gains can meaningfully reduce systemic exposure. But these benefits will materialize only if institutions invest in integration, governance, and human oversight—areas that supervisors increasingly need to assess. This also includes business continuity and disaster recovery, cyber and quality assurance programs, and good cyber hygiene practices.
Resilience-first policy framework
AI-driven cyber risk demands a policy response that treats cybersecurity as a core financial stability issue. Existing measures remain relevant, but they must be expanded and sharpened for a world of faster, automated, and increasingly sophisticated attacks. Policymakers should prioritize robust resilience standards, supervision focused on systemic transmission channels, and close public-private collaboration on threat intelligence and incident response.
Defenses will inevitably be breached, so resilience must also be a priority, specifically to limit how far incidents spread and ensure rapid recovery. Controls to stop the spread of attacks can prevent local breaches from escalating into system‑wide disruptions. These measures are often costly and complex, but they are among the most effective tools for containing AI‑enabled attacks.
From a supervisory perspective, this underscores the need to focus not only on prevention, but on response, recovery, and continuity of critical functions. Cyber stress testing, scenario analysis, and board‑level oversight of cyber risk are becoming indispensable components of financial stability frameworks.
International cooperation is vital
The Mythos episode also highlights governance challenges. Cyber risk does not respect borders. As AI capabilities spread across countries, inconsistent oversight could weaken a globally interconnected system.
Emerging and developing economies, which often have more severe resource constraints, may be disproportionately exposed to attackers targeting regions with weaker defenses. That’s why stronger international coordination, more information sharing, and expanded capacity development are critical to preserving global financial stability.
As AI reshapes the cyber landscape, the central question for authorities is whether the financial system can continue to function under severe stress. Answering that question requires putting systemic risk—and the tools to manage it—at the center of the AI‑cyber conversation.
Superintelligent AI risks human extinction
Miotti, 2026, Andrea Miotti is the founder and CEO of ControlAI, a non-profit working to keep humanity in control of advanced AI., The Spectator, https://archive.ph/KTxDR#selection-2063.0-2063.115
A spectre is hanging over humanity: the spectre of superintelligent AI. While governments busy themselves with the mundane work of politics and putting out the fire of the day, the most consequential technological development since the splitting of the atom is accelerating beyond anyone’s ability to control it.
We are entering an era where the AI systems themselves are threats, not just humans
Anthropic, one of the world’s leading AI companies, recently announced a new AI system, Claude Mythos. The model can autonomously find and exploit critical security vulnerabilities in every major operating system and internet browser underpinning our digital infrastructure, including flaws that survived decades of human review.
Anthropic withheld the model from public release because, in their own words, ‘the fallout for economies, public safety and national security could be severe’. The UK’s AI Security Institute (AISI) confirmed the assessment: Mythos is substantially more capable at cyber offence than any model it has previously tested.
But the government’s response has been tepid. They have simply had the AISI publish a blogpost about Mythos and had the Technology Secretary tell businesses they should brush up on cybersecurity and sign up for a cyber attack early warning service.
The government is missing the forest for the trees. Yes, cyberattacks will become easier. But the real significance of Mythos is that it can do all of this on its own: identifying vulnerabilities, developing exploits, and chaining them together across networks, without human direction. We are entering an era where the AI systems themselves are threats, not just humans. And this is the least capable these systems will ever be. The length of tasks AI systems can complete autonomously is doubling every few months.
Think back to February 2020. Covid case numbers were still low in most countries, and governments and the mainstream media were focusing only on that: today’s case count, yesterday’s deaths. At the same time, epidemiologists were sounding the alarm. What mattered to them was not the current number of cases, but how fast that number was doubling. A virus doubling every few days looks manageable right up until the moment the health system is overwhelmed. Only a month later, the world was shutting down.
We are now making the same mistake again. The government is watching the immediate problem – cyberattacks getting easier – and u
At the current rate of improvement, many AI experts believe superintelligent AI could arrive within the next two to five years. Many of those same experts, including Nobel laureates and AI company CEOs, warn that AI poses an extinction risk to humanity.
The window of opportunity to act and prevent catastrophe is still open. By acting today, we will spare ourselves the need for more drastic measures later. But on AI, the government has lost the nerve to act with conviction.
It has also lost the habit of foresight that once came naturally to British statecraft. In 1924, when the most destructive weapon in existence was the artillery shell, Winston Churchill published an essay asking ‘Shall we all commit suicide?’. He argued that science was on the verge of producing weapons so powerful that the League of Nations, ‘airy and unsubstantial, framed of shining but too often visionary idealism,’ would prove incapable of guarding the world from them. He was writing 20 years before Hiroshima.
Seven years later, in ‘Fifty Years Hence’, Churchill described with startling precision the physics of nuclear fusion and the horsepower a pound of water might yield if its atoms could be induced to combine. ‘There is no question among scientists that this gigantic source of energy exists,’ he wrote. ‘What is lacking is the match to set the bonfire alight.’ The match was found in 1945.
Churchill did what serious statesmen are supposed to do. He looked at the trajectory of scientific progress, took the warnings of scientists seriously, and asked what governments needed to do to prevent catastrophe. Today’s warnings come from the very people building these systems, and they are not talking about a risk decades away.
Britain is not powerless to act, and is in fact better placed than most to lead on addressing the threat from superintelligent AI. Britain convened the first global AI Safety Summit at Bletchley Park. Over a hundred UK parliamentarians have backed a statement from my organisation ControlAI recognising the extinction risk from AI and identifying superintelligent AI as a national and global security threat. The House of Lords held two substantive debates on superintelligent AI in January alone, including on whether to pursue an international moratorium. There is political will for action in Westminster, even if Downing Street has not yet caught up.
The response must match the scale of the threat, and superintelligent AI should be treated as what it is: a national and global security risk of the highest order. That starts with the government saying so, openly, and working with allies on how to confront it. It must end with preventing the development of superintelligent AI at home and building an international coalition to prohibit it globally.
If we don’t, there will be no chance for inquiries, apologies, or promises to do better next time. There won’t even be anyone left to blame.
AGI timelines rapidly advancing
David Schwartz, April 15, 2026, A visualization of changing AGI timelines, 2023 – 2026, https://www.lesswrong.com/posts/Tc5AbEpbFFdNx5nkP/a-visualization-of-changing-agi-timelines-2023-2026
AI 2027 came out a year ago, and in reviewing it now, I saw that AI Futures researchers Daniel Kokotajlo, Eli Lifland, and Nikola Jurkovic had updated their AGI timelines to be later over the course of 2025. Then, in 2026, Daniel and Eli updated in the other direction to expect AGI to come sooner. I noticed others with great track records also had made multiple AGI forecasts. A change in the forecast of a single person is meaningful in the way that a change in an aggregate forecast may hide. A change in an aggregate forecast might come entirely from a change in who is forecasting, not what those people individually believe. So I decided to visualize what the net direction of updates were over the last few years. I find this provides a complementary view of AI timelines compared to those by Metaculus, Epoch, AI Digest, and others. So here is a visualization of AGI forecasts. Criteria for inclusion were: the person has made at least 2 forecasts, they gave specific dates, they gave a sense of confidence interval/uncertainty, and their definitions of AGI are similar. Some major caveats. Everyone has different definitions of AGI. (That is a big advantage of everyone forecasting the same question on Metaculus, or the 2025 or 2026 AI forecast survey run by AI Digest.) Often individual people even use different definitions of AGI at different times for their own forecasts. I included data points above if I judged that their definition was substantially similar to: AGI: Most purely cognitive labor is automatable at better quality, speed, and cost than humans. I was pretty generous with this, and it’s very debatable whether e.g. a “superhuman coder” from AI 2027 is AGI in the same way that “99% of remote work can be automated” is AGI. Apologies to those in the visualization who would disagree that the definition they used is similar enough to this and don’t feel like this chart captures their views faithfully. Second caveat, I rounded when forecasts were made to be as if they were made on four dates: <= 2023, early 2025, late 2025, and April 2026. This made the visualization much easier to see. So a further apology to those above if you made a prediction in, say, Aug 2025 but I marked this as “late 2025”. Third caveat, the type of confidence intervals various researchers used also varied substantially. I had to really guess or extrapolate to approximate these as 80% confidence intervals, so a final apology if you don’t think the range you give is fairly characterized as an 80% CI. All caveats aside, what impression does this visualization give? Are reputable AI experts who have made multiple predictions updating the same way that Daniel Kokotajlo and Eli Lifland did, pushing out their timelines in 2025, and pulling them in during 2026? From the visualization, it looks to me that in 2023 and 2024, most people brought their AGI timelines in to be sooner, though with some exceptions like Tamay Besilogru. From 2025 to 2026, joining Daniel and Eli in pushing their timelines out are the Metaculus community, Dario Amodei, and elite forecast Peter Wildeford. In fact, across 2025, only Benjamin Todd brought in his timelines to say AGI would happen sooner. Most notably though, every single person who updated their timelines between January 2026 to April 2026 has moved it their timeline to say AGI is coming sooner, myself included. So I think the data supports the impression I got from the AI 2027 authors. One way I could characterize it is: In the OpenAI/ChatGPT era of 2023-2024, people updated towards AGI coming sooner. In the xAI, Meta, and Gemini era of 2025, people updated towards AGI coming later. In the Anthropic era of 2026, people updated back towards AGI coming sooner. Take from that what you will. Bayesians shouldn’t be able to predict which direction they will update. But seeing the history of other people’s updates is useful information. It does give me intuitions about how I or others may update soon, so I take that as evidence that I should update now. (A similar post is also on the FutureSearch blog, where I plan to updat
AI means extinction
Stuart Russel, 12-4, 25, Professor Stuart Russell O.B.E. is a world-renowned AI expert and Computer Science Professor at UC Berkeley. He holds the Smith-Zadeh Chair in Engineering and directs the Center for Human-Compatible AI, and is also the bestselling author of the book “Human Compatible: AI and the Problem of Control”, YouTube, An AI Expert Warning: 6 People Are (Quietly) Deciding Humanity’s Future! We Must Act Now!, https://www.youtube.com/watch?v=P7Y-fynYsgE
0:00 Steven Bartlett: In October, over 850 experts, including yourself and other leaders like Richard Branson and Jeffrey Hinton, signed a statement to ban AI super intelligence as you guys raised concerns of potential human extinction. Because unless we figure out how do we guarantee that the AI systems are safe, we’re toast. And you’ve been so influential on the subject of AI, you wrote the textbook that many of the CEOs who are building some of the AI companies now would have studied on the subject of AI. Stuart Russell: Yeah. Steven Bartlett: So, do you have any regrets? Um, Professor Stuart Russell has been named one of Time magazine’s most influential voices in AI. After spending over 50 years researching, teaching, and finding ways to design AI in such a way that humans maintain control, you talk about this gorilla problem as a way to understand AI in the context of humans. Stuart Russell: Yeah. So, a few million years ago, the human line branched off from the gorilla line in evolution, and now the gorillas have no say in whether they continue to exist because we are much smarter than they are. Stuart Russell: So intelligence is actually the single most important factor to control planet Earth. Steven Bartlett: Yep. But we’re in the process of making something more intelligent than us. Stuart Russell: Exactly. Steven Bartlett: Why don’t people stop then? Stuart Russell: Well, one of the reasons is something called the Midas touch. So King Midas is this legendary king who asked the gods, can everything I touch turn to gold? And we think of the Midas touch as being a good thing, but he goes to drink some water, the water has turned to gold. And he goes to comfort his daughter, his daughter turns to gold. So he dies in misery and starvation. So this applies to our current situation in two ways. One is that greed is driving these companies to pursue technology with the probabilities of extinction being worse than playing Russian roulette. And that’s even according to the people developing the technology without our permission. And people are just fooling themselves if they think it’s naturally going to be controllable. So, you know, after 50 years, I could retire, but instead I’m working 80 or 100 hours a week trying to move things in the right direction. Steven Bartlett: So, if you had a button in front of you which would stop all progress in artificial intelligence, would you press it? Stuart Russell: Not yet. I think there’s still a decent chance they guarantee safety. And I can explain more of what that is. Steven Bartlett: I see messages all the time in the comments section that some of you didn’t realize you didn’t subscribe. So, if you could do me a favor and double check if you’re a subscriber to this channel, that would be tremendously appreciated. It’s the simple, it’s the free thing that anybody that watches this show frequently can do to help us here to keep everything going in this show in the trajectory it’s on. So, please do double check if you’ve subscribed and uh thank you so much because in a strange way you are you’re part of our history and you’re on this journey with us and I appreciate you for that. So, yeah, thank you. You Wrote the Textbook on AI 2:41 Steven Bartlett: Professor Stuart Russell, OBBE. A lot of people have been talking about AI for the last couple of years. It appears you’ve—this really shocked me—it appears you’ve been talking about AI for most of your life. Stuart Russell: Well, I started doing AI in high school um back in England, but then I did my PhD starting in ’82 at Stanford. I joined the faculty of Berkeley in ’86. So I’m in my 40th year as a professor at Berkeley. The main thing that the AI community is familiar with in my work uh is a textbook that I wrote. Steven Bartlett: Is this the textbook that most students who study AI are likely learning from? Stuart Russell: Yeah. It Will Take a Crisis to Wake People Up 3:20 Steven Bartlett: So you wrote the textbook on artificial intelligence 31 years ago. You actually start probably started writing it because it’s so bloody big in the year that I was born. So I was born in 92. Stuart Russell: Uh yeah, took me about two years. Steven Bartlett: Me and your book are the same age, which just is wonderful way for me to understand just how long you’ve been talking about this and how long you’ve been writing about this. And actually, it’s interesting that many of the CEOs who are building some of the AI companies now probably learned from your textbook. You had a conversation with somebody who said that in order for people to get the message that we’re going to be talking about today, there would have to be a catastrophe for people to wake up. Can you give me context on that conversation and a gist of who you had this conversation with? Stuart Russell: Uh, so it was with one of the CEOs of uh a leading AI company. He sees two possibilities as do I which is um either we have a small or let’s say small scale disaster of the same scale as Chernobyl the nuclear meltdown in Ukraine. Steven Bartlett: Yeah. Stuart Russell: So this uh nuclear plant blew up in 1986 killed uh a fair number of people directly and maybe tens of thousands of people indirectly through uh radiation. Recent cost estimates more than a trillion dollars. So that would wake people up. That would get the governments to regulate. He’s talked to the governments and they won’t do it. So he looked at this Chernobyl scale disaster as the best case scenario because then the governments would regulate and require AI systems to be built. Steven Bartlett: And is this CEO building an AI company? Stuart Russell: He runs one of the leading AI companies. Steven Bartlett: And even he thinks that the only way that people will wake up is if there’s a Chernobyl level nuclear disaster. Stuart Russell: Uh yeah, not wouldn’t have to be a nuclear disaster. It would be either an AI system that’s being misused by someone, for example, to engineer a pandemic or an AI system that does something itself, such as crashing our financial system or our communication systems. The alternative is a much worse disaster where we just lose control altogether. CEOs Staying in the AI Race Despite Risks 5:54 Steven Bartlett: You have had lots of conversations with lots of people in the world of AI, both people that are, you know, have built the technology, have studied and researched the technology or the CEOs and founders that are currently in the AI race. What are some of the the interesting sentiments that the general public wouldn’t believe that you hear privately about their perspectives? Because I find that so fascinating. I’ve had some private conversations with people very close to these tech companies and the shocking sentiment that I was exposed to was that they are aware of the risks often but they don’t feel like there’s anything that can be done so they’re carrying on which is feels like a bit of a paradox to me like yes it’s it’s it must be a very difficult position to be in in a sense right you’re you’re doing something that you know has a good chance of bringing an end to life on including that of yourself and your own family. Stuart Russell: They feel that they can’t escape this race, right? If they, you know, if a CEO of one of those companies was to say, you know, we’re we’re not going to do this anymore, they would just be replaced because the investors are putting their money up because they want to create AGI and reap the benefits of it. So, it’s a strange situation where every at least all the ones I’ve spoken to, I haven’t spoken to Sam. Altman about this, but you know, Sam Altman even before becoming CEO of Open AI said that creating superhuman intelligence is the biggest risk to human existence that there is. My worst fears are that we cause significant—we the field the technology the industry—cause significant harm to the world. You know Elon Musk is also on record saying this. So uh Dario Amodei estimates up to a 25% risk of extinction. They Know It’s an Extinction-Level Risk 7:56 Steven Bartlett: Was there a particular moment when you realized that the CEOs are well aware of the extinction level risks? Stuart Russell: I mean, they all signed a statement in May of 23 uh called it’s called the extinction statement. It basically says AGI is an extinction risk at the same level as nuclear war and pandemics. But I don’t think they feel it in their gut. You know, imagine that you were one of the nuclear physicists. You know, I guess you’ve seen Oppenheimer, right? You’re there, you’re watching that first nuclear explosion. How how would that make you feel about the potential impact of nuclear war on the human race? Right? I I think you would probably become a pacifist and say this weapon is so terrible, we have got to find a way to uh keep it under control. We are not there yet with the people making these decisions and certainly not with the governments, right? You know what policy makers do is they, you know, they listen to experts. They keep their finger in the wind. You got some experts, you know, dangling $50 billion checks and saying, “Oh, you know, all that doomer stuff, it’s just fringe nonsense. Don’t worry about it. Take my $50 billion check.” You know, on the other side, you’ve got very well-meaning, brilliant scientists like Jeff Hinton saying, actually, no, this is the end of the human race. But Jeff doesn’t have a $50 billion check. So the view is the only way to stop the race is if governments intervene and say okay we don’t we don’t want this race to go ahead until we can be sure that it’s going ahead in absolute safety. What Is Artificial General Intelligence (AGI)? 9:55 Steven Bartlett: Closing off on your career journey, you got a you received an OB from Queen Elizabeth. Stuart Russell: Uh yes. Steven Bartlett: And what was the listed reason for that for the award? Stuart Russell: Uh contributions to artificial intelligence research. Steven Bartlett: And you’ve been listed as a Time magazine most influential person in in AI several years in a row including this year in 2025. Stuart Russell: Yup. Steven Bartlett: Now there’s two terms here that are central to the things we’re going to discuss. One of them is AI and the other is AGI. In my muggle interpretation of that, it’s artificial general intelligence is when the system, the computer, whatever it might be, the technology has generalized intelligence, which means that it could theoretically see, understand um the world. It knows everything. It can understand everything in the the world as well as or better than a human being. Stuart Russell: Yeah. Can do it. And I think take action as well. I mean some some people say oh you know AGI doesn’t have to have a body but a good chunk of our intelligence actually is about managing our body about perceiving the real environment and acting on it moving grasping and so on. So I think that’s part of intelligence and and AGI systems should be able to operate robots successfully. But there’s often a misunderstanding, right, that people say, well, if it doesn’t have a robot body, then it can’t actually do anything. But then if you remember, most of us don’t do things with our bodies. Some people do, brick layers, painters, gardeners, chefs, um, but people who do podcasts, you’re doing it with your mind, right? You’re doing it with your ability to to produce language. Uh, you know, Adolf Hitler didn’t do it with his body. He did it by producing language. Steven Bartlett: Hope you’re not comparing us. Stuart Russell: But but uh you know so even an AGI that has no body uh it actually has more access to the human race than Adolf Hitler ever did because it can send emails and texts to what three-quarters of the world’s population directly. It can—it also speaks all of their languages and it can devote 24 hours a day to each individual person on earth to convince them of to do whatever it wants them to do. Steven Bartlett: And our whole society runs now on the internet. I mean if there’s an issue with the internet, everything breaks down in society. Airplanes become grounded and we’ll have electricity is running off as internet systems. So I mean my entire life it seems to run off the internet now. Stuart Russell: Yeah. Water supplies. So, so this is one of the roots by which AI systems could bring about a medium-sized catastrophe is by basically shutting down our life support systems. Will We Reach General Intelligence Soon? 13:01 Steven Bartlett: Do you believe that at some point in the coming decades we’ll arrive at a point of AGI where these systems are generally intelligent? Stuart Russell: Uh yes, I think it’s virtually certain unless something else intervenes like a nuclear war or or we may refrain from doing it. But I think it will be extraordinarily difficult uh for us to refrain. Steven Bartlett: When I look down the list of predictions from the top 10 AI CEOs on when AGI will arrive, you’ve got Sam Altman who’s the founder of OpenAI/ChatGPT um says before 2030. Demis at DeepMind says 2030 to 2035. Jensen from Nvidia says around five years. Dario at Anthropic says 2026 to 2027. Powerful AI close to AGI. Elon says in the 2020s. Um and go down the list of all of them and they’re all saying relatively within 5 years. Stuart Russell: I actually think it’ll take longer. I don’t think you can make a prediction based on engineering um in the sense that yes, we could make machines 10 times bigger and 10 times faster, but that’s probably not the reason why we don’t have AGI, right? In fact, I think we have far more computing power than we need for AGI. Maybe a thousand times more than we need. The reason we don’t have AGI is because we don’t understand how to make it properly. Um what we’ve seized upon is one particular technology called the language model. And we observed that as you make language models bigger, they produce text language that’s more coherent and sounds more intelligent. And so mostly what’s been happening in the last few years is just okay let’s keep doing that because one thing companies are very good at unlike universities is spending money. They have spent gargantuan amounts of money and they’re going to spend even more gargantuan amounts of money. I mean you know we mentioned nuclear weapons. So the Manhattan project uh in World War II to develop nuclear weapons, its budget in 2025 was about 20 odd billion dollars. The budget for AGI is going to be a trillion dollars next year. So 50 times bigger than the Manhattan project. Steven Bartlett: Humans have a remarkable history of figuring things out when they galvanize towards a shared objective. You know, thinking about the moon landings or whatever it else it might be through history. And the thing that makes this feel all quite inevitable to me is just the sheer volume of money being invested into it. I’ve never seen anything like it in my life. Stuart Russell: Well, there’s never been anything like this in history. Steven Bartlett: Is this the biggest technology project in human history by orders of magnitude? How Much Is Safety Really Being Implemented 16:16 Steven Bartlett: And there doesn’t seem to be anybody that is pausing to ask the questions about safety. It doesn’t it doesn’t even appear that there’s room for that in such a race. Stuart Russell: I think that’s right. To varying extents, each of these companies has a division that focuses on safety. Does that division have any sway? Can they tell the other divisions, no, you can’t release that system? Not really. Um I think some of the companies do take it more seriously. Anthropic uh does. I think Google DeepMind even there I think the commercial imperative to be at the forefront is absolutely vital. If a company is perceived as you know falling behind and not likely to be competitive, not likely to be the one to reach AGI first, then people will move their money elsewhere very quickly. AI Safety Employees Leaving OpenAI 17:16 Steven Bartlett: And we saw some quite high-profile departures from company like companies like OpenAI. Um, I know a chap called Yan Leike left who was working on AI safety at OpenAI and he said that the reason for his leaving was that safety culture and processes processes have taken a backseat to shiny products at OpenAI and he gradually lost trust in leadership but also Ilia Sutskever… Stuart Russell: Ilia Sutskever yeah so he was the co-founder co-founder and chief scientist for a while and then yeah so he and Yan Leike are the main safety people. Um, and so when they say OpenAI doesn’t care about safety, that’s pretty concerning. The Gorilla Problem – The Most Intelligent Species Will Always Rule 18:06 Steven Bartlett: I’ve heard you talk about this gorilla problem. What is the gorilla problem as a way to understand AI in the context of humans? Stuart Russell: So, so the gorilla problem is is the problem that gorillas face with respect to humans. So you can imagine that you know a few million years ago the the human line branched off from the gorilla line in evolution. Uh and now the gorillas are looking at the human line and saying yeah was that a good idea and they have no um they have no say in whether they continue to exist because we have a we are much smarter than they are. If we chose to, we could make them extinct in in a couple of weeks and there’s nothing they can do about it. So that’s the gorilla problem, right? Just the the problem a species faces when there’s another species that’s much more capable. Steven Bartlett: And so this says that intelligence is actually the single most important factor to control planet Earth. Stuart Russell: Yes. Intelligence is the ability to bring about what you want in the world. Steven Bartlett: And we’re in the process of making something more intelligent than us. Stuart Russell: Exactly. Steven Bartlett: Which suggests that maybe we become the gorillas. Stuart Russell: Exactly. Yeah. If There’s an Extinction Risk, Why Don’t They Stop? 19:24 Steven Bartlett: Is that is there any fault in the reasoning there? Because it seems to make such perfect sense to me. But if it—Why doesn’t—Why don’t people stop then? Cuz it it seems like a crazy thing to want to— Stuart Russell: Because they think that uh if they create this technology, it will have enormous economic value. They’ll be able to use it to replace all the human workers in the world uh to develop new uh products, drugs, um forms of entertainment, any anything that has economic value, you could use AGI to to create it. And and maybe it’s just an irresistible thing in itself, right? I think we as humans place so much store on our intelligence. You know, you know, how we think about, you know, what is the pinnacle of human achievement? If we had AGI, we could go way higher than that. So it it’s very seductive for people to want to create this technology and I think people are just fooling themselves if they think it’s naturally going to be controllable. I mean the question is how are you going to retain power forever over entities more powerful than yourself? Can’t We Just Pull the Plug if AI Gets Too Powerful? 20:50 Steven Bartlett: Pull the plug out. People say that sometimes in the comment section when we talk about AI, they said, “Well, I’ll just pull a plug out.” Stuart Russell: Yeah, it’s it’s sort of funny. In fact, you know, yeah, reading the comment sections in newspapers, whenever there’s an AI article, there’ll be people who say, “Oh, you can just pull the plug out, right?” As if a super intelligent machine would never have thought of that one. Don’t forget who’s watched all those films where they did try to pull the plug out. Another thing they said, well, you know, as long as it’s not conscious, then it doesn’t matter. It won’t ever do anything. Um, which is completely off the point because, you know, I I don’t think the gorillas are sitting there saying, “Oh, yeah, you know, if only those humans hadn’t been conscious, everything would have be fine, right?” No, of course not. What would make gorillas go extinct is the things that humans do, right? How we behave, our ability to act successfully in the world. So when I play chess against my iPhone and I lose, right, I don’t I don’t think, oh, well, I’m losing because it’s conscious, right? No, I’m just losing because it’s better than I am at at in that little world uh moving the bits around uh to to get what it wants. And and so consciousness has nothing to do with it, right? Competence is the thing we’re concerned about. So I think the only hope is can we simultaneously build machines that are more intelligent than us but guarantee that they will always act in our best interest. Can We Build AI That Will Act in Our Best Interests? 22:38 Steven Bartlett: So throwing that question to you, can we build machines that are more intelligent than us that will also always act in our best interests? It sounds like a bit of a uh contradiction to some degree because it’s kind of like me saying I’ve got a French bulldog called Pablo that’s uh 9 years old and it’s like saying that he could be more intelligent than me yet I still walk him and decide when he gets fed. I think if he was more intelligent than me he would be walking me. I’d be on the leash. Stuart Russell: That’s the That’s the trick, right? Can we make AI systems whose only purpose is to further human interests? And I think the answer is yes. And this is actually what I’ve been working on. So I I think one part of my career that I didn’t mention is is sort of having this epiphany uh while I was on sabbatical in Paris. This was 2013 or so. Just realizing that further progress in the capabilities of AI uh you know if if we succeeded in creating real superhuman intelligence that it was potentially a catastrophe and so I pretty much switched my focus to work on how do we make it so that it’s guaranteed to be safe. Are You Troubled by the Rapid Advancement of AI? 24:01 Steven Bartlett: Are you somewhat troubled by everything that’s going on at the moment with with AI and how it’s progressing? Because you strike me as someone that’s somewhat troubled under the surface by the way things are moving forward and the speed in which they’re moving forward. Stuart Russell: That’s an understatement. I’m appalled actually by the lack of attention to safety. I mean, imagine if someone’s building a nuclear power station in your neighborhood and you go along to the chief engineer and you say, “Okay, these nuclear thing, I’ve heard that they can actually explode, right? There was this nuclear explosion that happened in Hiroshima, so I’m a bit worried about this. You know, what steps are you taking to make sure that we don’t have a nuclear explosion in our backyard?” And the chief engineer says, “Well, we thought about it. We don’t really have an answer.” Steven Bartlett: Yeah. Stuart Russell: You would, what would you say? I think you would you would use some exploitives. Steven Bartlett: Well, and you’d call your MP and say, you know, get these people out. Stuart Russell: I mean, what are they doing? You read out the list of you know projected dates for AGI but notice also that those people I think I mentioned Dario Amodei says a 25% chance of extinction. Elon Musk has a 30% chance of extinction. Sam Altman says basically that AGI is the biggest risk to human existence. So what are they doing? They are playing Russian roulette with every human being on Earth. Without our permission. They’re coming into our houses, putting a gun to the head of our children, pulling the trigger, and saying, “Well, you know, possibly everyone will die. Oops. But possibly we’ll get incredibly rich.” That’s what they’re doing. Did they ask us? No. Why is the government allowing them to do this? Because they dangle $50 billion checks in front of the governments. So I think troubled under the surface is an understatement. Steven Bartlett: What would be an accurate statement? Stuart Russell: Appalled and I I am devoting my life to trying to divert from this course of history into a different one. Do You Have Regrets About Your Involvement? 26:38 Steven Bartlett: Do you have any regrets about things you could have done in the past because you’ve been so influential on the subject of AI? You wrote the textbook that many of these people would have studied on the subject of AI more than 30 years ago. Do do you have when you’re alone at night and you think about decisions you’ve made on this in this field because of your scope of influence? Is there anything you you regret? Stuart Russell: Well, I do wish I had understood earlier uh what I understand now. We could have developed safe AI systems. I think the there are some weaknesses in the framework which I can explain but I think that framework could have evolved to develop actually safe AI systems where we could prove mathematically that the system is going to act in our interests. The kind of AI systems we’re building now, we don’t understand how they work. No One Actually Understands How This AI Works 27:26 Stuart Russell: We don’t understand how they work. It’s it’s a strange thing to build something where you don’t understand how it works. I mean, there’s no sort of comparable through human history. Usually with machines, you can pull it apart and see what cogs are doing what and how the— Steven Bartlett: Well, actually, we we put the cogs together, right? So, with with most machines, we designed it to have a certain behavior. So, we don’t need to pull it apart and see what the cogs are because we put the cogs in there in the first place, right? One by one we figured out what what the pieces needed to be how they work together to produce the effect that we want. So the best analogy I can come up with is you know the the first cave person who left a bowl of fruit in the sun and forgot about it and then came back a few weeks later and there was sort of this big soupy thing and they drank it and got completely shitfaced. They got drunk. Okay. And they got this effect. They had no idea how it worked, but they were very happy about it. And no doubt that person made a lot of money from it. Uh so yeah, it it is kind of bizarre, but my mental picture of these things is is like a chain link fence, right? Stuart Russell: So you’ve got lots of these connections and each of those connections can be its connection strength can be adjusted and then uh you know a signal comes in one end of this chain link fence and passes through all these connections and comes out the other end and the signal that comes out the other end is affected by your adjusting of all the connection strengths. So what you do is you you get a whole lot of training data and you adjust all those connection strengths so that the signal that comes out the other end of the network is the right answer to the question. So if your training data is lots of photographs of animals, then all those pixels go in one end of the network and out the other end, you know, it activates the llama output or the dog output or the cat output or the ostrich output. And uh and so you just keep adjusting all the connection strengths in this network until the outputs of the network are the ones you want. Steven Bartlett: But we don’t really know what’s going on across all of those different chains. So what’s going on inside that network? Stuart Russell: Well, so now you have to imagine that this network, this chain link fence is is a thousand square miles in extent. Okay, so it’s covering the whole of the San Francisco Bay area or the whole of London inside the M25, right? That’s how big it is. And the lights are off. It’s night time. So you might have in that network about a trillion uh adjustable parameters and then you do quintillions or sextillions of small random adjustments to those parameters uh until you get the behavior that you want. AI Will Be Able to Train Itself 30:23 Steven Bartlett: I’ve heard Sam Altman say that in the future he doesn’t believe they’ll need much training data at all to make these models progress themselves because there comes a point where the models are so smart that they can train themselves and improve themselves without us needing to pump in articles and books and scour the internet. Stuart Russell: Yeah, it should it should work that way. So I think what he’s referring to and this is something that several companies are now worried might start happening is that the AI system becomes capable of doing AI research by itself. And so uh you have a system with a certain capability. I mean crudely we could call it an IQ but it’s it’s not really an IQ. But anyway, imagine that it’s got an IQ of 150 and uses that to do AI research, comes up with better algorithms or better designs for hardware or better ways to use the data, updates itself. Now it has an IQ of 170, and now it does more AI research, except that now it’s got an IQ of 170, so it’s even better at doing the AI research. And so, you know, next iteration it’s 250 and uh and so on. So this this is an idea that one of Alan Turing’s friends Good uh wrote out in 1965 called the intelligence explosion right that one of the things an intelligence system could do is to do AI research and therefore make itself more intelligent and this would uh this would very rapidly take off and leave the humans far behind. Steven Bartlett: Is that what they call the fast takeoff? Stuart Russell: That’s called the fast takeoff. The Fast Takeoff Is Coming 32:15 Steven Bartlett: Sam Altman said, “I think a fast takeoff is more possible than I thought a couple of years ago.” Which I guess is that moment where the AGI starts teaching itself. In and in his blog, the gentle singularity, he said, “We may already be past the event horizon of takeoff.” And what does what does he mean by event horizon? Stuart Russell: The event horizon is is a phrase borrowed from astrophysics and it refers to uh the black hole. And the event horizon, think it if you got some very very massive object that’s heavy enough that it actually prevents light from escaping. That’s why it’s called the black hole. It’s so heavy that light can’t escape. So if you’re inside the event horizon then then light can’t escape beyond that. So I think what he’s what he’s meaning is if we’re beyond the event horizon it means that you know now we’re just trapped in the gravitational attraction of the black hole or in this case we’re we’re trapped in the inevitable slide if you want towards AGI. When you when you think about the economic value of AGI, which I’ve estimated at uh 15 quadrillion dollars, that acts as a giant magnet in the future. We’re being pulled towards it. We’re being pulled towards it. And the closer we get, the stronger the force, the probability, you know, the closer we get, the the the higher the probability that we will actually get there. So, people are more willing to invest. And we also start to see spin-offs from that investment such as chat GPT, right, which is, you know, generates a certain amount of revenue and so on. So, so it does act as a magnet and the closer we get, the harder it is to pull out of that field. Are We Creating Our Successor and Ending the Human Race? 34:07 Steven Bartlett: It’s interesting when you think that this could be the the end of the human story. This idea that the end of the human story was that we created our successor like we we summoned our next iteration of life or intelligence ourselves like we took ourselves out. It is quite like just removing ourselves and the catastrophe from it for a second. It is it is an unbelievable story. Stuart Russell: Yeah. And you know there are many legends the sort of be careful what you wish for legend and in fact the king Midas legend is is very relevant here. Steven Bartlett: What’s that? Stuart Russell: So King Midas is this legendary king who lived in modern day Turkey but I think is sort of like Greek mythology. He is said to have asked the gods to grant him a wish. The wish being that everything I touch should turn to gold. So he’s incredibly greedy. Uh you know we call this the Midas touch. And we think of the Midas touch as being like you know that’s a good thing, right? Wouldn’t that be cool? But what happens? So he uh you know he goes to drink some water and he finds that the water has turned to gold. And he goes to eat an apple and the apple turns to gold. And he goes to you know comfort his daughter and his daughter turns to gold and so he dies in misery and starvation. So this applies to our current situation in in two ways actually. So one is that I think greed is driving us to pursue a technology that will end up consuming us and we will perhaps die in misery and starvation instead. The what it shows is how difficult it is to correctly articulate what you want the future to be like. For a long time, the way we built AI systems was we created these algorithms where we could specify the objective and then the machine would figure out how to achieve the objective and then achieve it. So, you know, we specify what it means to win at chess or to win at Go and the algorithm figures out how to do it uh and it does it really well. So that was, you know, standard AI up until recently. And it suffers from this drawback that sure we know how to specify the objective in chess, but how do you specify the objective in life, right? What do we want the future to be like? Well, really hard to say. And almost any attempt to write it down precisely enough for the machine to bring it about would be wrong. And if you’re giving a machine an objective which isn’t aligned with what we truly want the future to be like, right, you’re actually setting up a chess match and that match is one that you’re going to lose when the machine is sufficiently intelligent. And so that that’s that’s problem number one. Problem number two is that the kind of technology we’re building now, we don’t even know what its objectives are. So it’s not that we’re specifying the objectives, but we’re getting them wrong. We’re growing these systems. They have objectives, but we don’t even know what they are because we didn’t specify them. What we’re finding through experiment with them is that they seem to have an extremely strong self-preservation objective. Steven Bartlett: What do you mean by that? Stuart Russell: You can put them in hypothetical situations. Either they’re going to get switched off and replaced or they have to allow someone, let’s say, you know, someone has been locked in a machine room that’s kept at 3 centigrades or they’re going to freeze to death. They will choose to leave that guy locked in the machine room and die rather than be switched off themselves. Steven Bartlett: Someone’s done that test. Stuart Russell: Yeah. Steven Bartlett: What was the test? Stuart Russell: They they asked they asked the AI. Yep. They put well they put them in these hypothetical situations and they allow the AI to decide what to do and it decides to preserve its own existence, let the guy die and then lie about it. Advice to Young People in This New World 38:27 Steven Bartlett: In the King Midas analogy story, one of the things that highlights for me is that there’s always trade-offs in life generally. And you know, especially when there’s great upside, there always appears to be a pretty grave downside. Like there’s almost nothing in my life where I go, it’s all upside. Like even like having a dog, it shits on my carpet. My girlfriend, you know, I love her, but you know, not always easy. Even with like going to the gym, I have to pick up these really, really heavy weights at 10 p.m. at night sometimes when I don’t feel like it. There’s always to get the muscles or the six-pack. There’s always a trade-off. And when you interview people for a living like I do, you know, you hear about so many incredible things that can help you in so many ways, but there is always a trade-off. There’s always a way to overdo it. Mhm. Melatonin will help you sleep, but it will also you’ll wake up groggy and if you overdo it, your brain might stop making melatonin. Like I can go through the entire list and one of the things I’ve always come to learn from doing this podcast is whenever someone promises me a huge upside for something, it’ll cure cancer. It’ll be a utopia. You’ll never have to work. You’ll have a butler around your house. I my my first instinct now is to say, at what cost? Stuart Russell: Yeah. Steven Bartlett: And when I think about the economic cost here, if we start if we start there, have you got kids? Stuart Russell: I have four. Steven Bartlett: Yeah. Four kids. What what how old is the youngest kid that you? Stuart Russell: 19. Steven Bartlett: 19. Okay. So your if you say your kids were were 10 now and they were coming to you and they’re saying, “Dad, what do you think I should study based on the way that you see the future? A future of AGI, say if all these CEOs are right and they’re predicting AGI within 5 years, what should I study, Dad?” Stuart Russell: Well, okay. So let’s look on the bright side and say that the CEOs all decide to pause their AGI development, figure out how to make it safe and then resume uh in whatever technology path is actually going to be safe. Steven Bartlett: What does that do to human life if they pause? No. If if they succeed in creating AGI and they solve the safety problem and they solve the safety problem. Stuart Russell: Okay. Yeah. Cuz if they don’t solve the safety problem, then you know, you should probably be finding a bunker or going to Patagonia or somewhere in New Zealand. Steven Bartlett: Do you mean that? Do you think I should be finding a bunker if they? Stuart Russell: No, because it’s not actually going to help. Uh, you know, it’s not as if the AI system couldn’t find you or I mean, it’s interesting. So, we’re going off on a little bit of a digression here for from your question, but I’ll come back to it. How Do You Think AI Would Make Us Extinct? 40:44 Stuart Russell: So, people often ask, well, okay, so how exactly do we go extinct? And of course, if you ask the gorillas or the dodos, you know, how exactly do you think you’re going to go extinct? They have the faintest idea. Humans do something and then we’re all dead. So, the only things we can imagine are the things we know how to do that might bring about our own extinction, like creating some carefully engineered pathogen that infects everybody and then kills us or starting a nuclear war. Presumably is something that’s much more intelligent than us would have much greater control over physics than we do. And we already do amazing things, right? I mean, it’s amazing that I can take a little rectangular thing out of my pocket and talk to someone on the other side of the world or even someone in space. It’s just astonishing and we take it for granted, right? But imagine you know super intelligent beings and their ability to control physics you know perhaps they will find a way to just divert the sun’s energy sort of go around the earth’s orbit so you know literally the earth turns into a snowball in in a few days maybe they’ll just decide to leave leave leave the earth maybe they’d look at the earth and go this isn’t this is not interesting we know that over there there’s an even more interesting planet we’re going to go over there and they just I don’t know get on a rocket or teleport themselves. Steven Bartlett: They might. Stuart Russell: Yeah. So, it’s it’s difficult to anticipate all the ways that we might go extinct at the hands of entities much more intelligent than ourselves. Anyway, coming back to the question of well, if everything goes right, right, if we we create AGI, we figure out how to make it safe, we we achieve all these economic miracles, then you face a problem. The Problem if No One Has to Work 42:27 Stuart Russell: And this is not a new problem, right? So, so John Maynard Keynes who was a famous economist in the early part of the 20th century wrote a wrote a paper in 1930. So, this is in the depths of the depression. It’s called “Economic Possibilities for our Grandchildren.” He predicts that at some point science will will deliver sufficient wealth that no one will have to work ever again. And then man will be faced with his true eternal problem. How to live? I don’t remember the exact word but how to live wisely and well when the you know the economic incentives the economic constraints are lifted we don’t have an answer to that question right so AI systems are doing pretty much everything we currently call work anything you might aspire to like you want to become a surgeon it takes the robot seven seconds to learn how to be a surgeon that’s better than any human being. Steven Bartlett: Elon said last week that The humanoid robots will be 10 times better than any surgeon that’s ever lived. Stuart Russell: Quite possibly. Yeah. Well, and they’ll also have, you know, h they’ll have hands that are, you know, a millimeter in size, so they can go inside and do all kinds of things that humans can’t do. And I think we need to put serious effort into this question. What is a world where AI can do all forms of human work that you would want your children to live in? What does that world look like? Tell me the destination so that we can develop a transition plan to get there. And I’ve asked AI researchers, economists, science fiction writers, futurists, no one has been able to describe that world. I’m not saying it’s not possible. I’m just saying I’ve asked hundreds of people in multiple workshops. It does not, as far as I know, exist in science fiction. You know, it’s notoriously difficult to write about a utopia. It’s very hard to have a plot, right? Nothing bad happens in in utopia. So, it’s difficult to make a plot. So, usually you start out with a utopia and then it all falls apart and that’s how that’s how you get get a plot. You know that there’s one series of novels people point to where humans and super intelligent AI systems coexist. It’s called The Culture Novels by Ian Banks. Highly recommended for those people who like science fiction and and they absolutely the AI systems are only concerned with furthering human interests. They find humans a bit boring and but nonetheless they they are there to help. But the problem is you know in that world there’s still nothing to do to find purpose. In fact, you know, the the subgroup of humanity that has purpose is the subgroup whose job it is to expand the boundaries of our galactic civilization. Some cases fighting wars against alien species and and so on, right? So that’s the sort of cutting edge and that’s 0.01% of the population. Everyone else is desperately trying to get into that group so they have some purpose in life. What if We Just Entertain Ourselves All Day 45:48 Steven Bartlett: When I speak to very successful billionaires privately off camera, off microphone about this, they say to me that they’re investing really heavily in entertainment things like football clubs. Um because people are going to have so much free time that they’re not going to know what to do with it and they’re going to need things to spend it on. This is what I hear a lot. I’ve heard this three or four times. I’ve actually heard Sam Altman say a version of this um about the amount of free time we’re going to have. I’ve obviously also heard recently Elon talking about the age of abundance when he delivered his quarterly earnings just a couple of weeks ago and he said that there will be at some point 10 billion humanoid robots. His pay packet um targets him to deliver one 1 million of these human humanoid robots a year that are enabled by AI by 2030. So if he if he does that he gets I think it’s part of his package he gets a trillion dollars in in compensation. Stuart Russell: Yeah. So the age of abundance for Elon. It’s not that it’s absolutely impossible to have a worthwhile world of that, you know, with that premise, but I’m just waiting for someone to describe it. Steven Bartlett: Well, maybe. So, let me try and describe it. Uh, we wake up in the morning, we go and watch some form of human centric entertainment or participate in some form of human centric entertainment. Mhm. We we go to retreats and with each other and sit around and talk about stuff. Mhm. And maybe people still listen to podcasts. Stuart Russell: Okay. I hope I hope so for your sake. Steven Bartlett: Yeah. Stuart Russell: Um it it feels a little bit like a cruise ship and you know and there are some cruises where you know it’s smarty pants people and they have you know they have lectures in the evening about ancient civilizations and whatnot and some are more uh more popular entertainment and this is in fact if you’ve seen the film WALL-E this is one picture of that future in fact in WALL-E the human race are all living on cruise ships in space. They have no constructive role in their society, right? They’re just there to consume entertainment. There’s no particular purpose to education. Uh, you know, and they’re depicted actually as huge obese babies. They’re actually wearing onesies to emphasize the fact that they have become enfeebled. And they become feeble because there’s there’s no purpose in being able to do anything at least in in this conception. You know, WALL-E is not the future that we want. Why Do We Make Robots Look Like Humans? 48:31 Steven Bartlett: Do you think much about humanoid robots and how they’re a protagonist in this story of AI? Stuart Russell: It’s an interesting question, right? Why why humanoid? And the one of the reasons I think is because in all the science fiction movies, they’re humanoid. So that’s what robots are supposed to be, right? Because they were in science fiction before they became a reality. Right? So even Metropolis which is a film from 1920 I think the robots are humanoid right basically people covered in metal. You know from a practical point of view as we have discovered humanoid is a terrible design because they fall over. Um and uh you know you do want multi-fingered hands of some kind. It doesn’t have to be a hand, but you want to have, you know, at least half a dozen appendages that can grasp and manipulate things. And you need something, you know, some kind of locomotion. And wheels are great, except they don’t go upstairs and over curbs and things like that. So, that’s probably why we’re going to be stuck with legs. But a four-legged, two-armed robot would be much more practical. Steven Bartlett: I guess the argument I’ve heard is because we’ve built a human world. So everything the physical spaces we navigate, whether it’s factories or our homes or the street or other sort of public spaces are all designed for exactly this physical form. So if we are going to— Stuart Russell: To some extent, yeah, but I mean our dogs manage perfectly well to navigate around our houses and streets and so on. So if you had a a centaur, uh it could also navigate, but it can, you know, it can carry much greater loads because it’s quadripedal. It’s much more stable. If it needs to drive a car, it can fold up two of its legs and and so on so forth. So I think the arguments for why it has to be exactly humanoid are sort of post hoc justification. I think there’s much more, well, that’s what it’s like in the movies and that’s spooky and cool, so we need to have them be human. I I don’t think it’s a good engineering argument. Steven Bartlett: I think there’s also probably an argument that we would be more accepting of them moving through our physical environments if they represented our form a bit more. Um, I also I was thinking of a bloody baby gate. You know those like kindergarten gates they get on stairs? Yeah. My dog can’t open that. But a humanoid robot could reach over the other side. Stuart Russell: Yeah. And so could a centaur robot, right? So in some sense, centaur robot is— Steven Bartlett: There’s something ghastly about the look of those though. Is a humanoid. Well, do you know what I mean? Like a four-legged big monster sort of crawling through my house when I have guests over. Stuart Russell: Your dog is a your dog is a four-legged monster. Steven Bartlett: I know. Stuart Russell: Uh so I think actually I I would argue the opposite that um we want a distinct form because they are distinct entities and the more humanoid the worse it is in terms of confusing our subconscious psychological systems. Steven Bartlett: So, I’m arguing from the perspective of the people making them. As in, if I was making the decision whether it to be some four-legged thing that I’ve that I’m unfamiliar with that I’m less likely to build a relationship with or allow to take care of, I don’t know, might might look after my children. Obviously, I’m listen, I’m not saying I would allow this to look after my children, but I’m saying from a if I’m building a company, the manufacturer would certainly— Stuart Russell: Yeah. Steven Bartlett: Want want to be— Stuart Russell: Yeah. So, I that’s an interesting question. I mean there’s also what’s called the uncanny valley which is a a phrase from computer graphics when they started to make characters in computer graphics they tried to make them look more human right so if you if you for example if you look at Toy Story they’re not very human looking right if you look at the Incredibles they’re not very human looking and so we think of them as cartoon characters if you try to make them more human they naturally become repulsive until they don’t until they become very you have to be very very close to perfect in order not to be repulsive. So the the uncanny valley is this I you know like the the gap between you so perfectly human and not at all human but in between it’s really awful and uh and so they there were a couple of movies that tried like Polar Express was one where they tried to have quite human looking characters you know being humans not not being superheroes or anything else and it’s repulsive to watch. Steven Bartlett: I when I watched that shareholder presentation the other day, Elon had these two humanoid robots dancing on stage and I’ve seen lots of humanoid robot demonstrations over the years. You know, you’ve seen like the Boston Dynamics dog thing jumping around and whatever else. But there was a moment where my brain for the first time ever genuinely thought there was a human in a suit. Mhm. And I actually had to research to check if that was really their Optimus robot because the way it was dancing was so unbelievably fluid that for the first time ever, my my my brain has only ever associated those movements with human movements. And I I’ll play it on the screen if anyone hasn’t seen it, but it’s just the robots dancing on stage. And I was like, that is a human in a suit. And it was really the knees that gave it away because the knees were all metal. Huh. I thought there’s no way that could be a human knee in a in one of those suits. And he, you know, he says they’re going into production next year. They’re used internally at Tesla now, but he says they’re going into production next year. And it’s going to be pretty crazy when we walk outside and see robots. I think that’ll be the paradigm shift. I’ve heard actually many I’ve heard Elon say this that the paradigm shifting moment from many of us will be when we walk outside onto the streets and see humanoid robots walking around. That will be when we realize— Stuart Russell: Yeah. I think even more so. I mean, in San Francisco, we see driverless cars driving around and uh it takes some getting used to actually, you know, when you’re you’re driving and there’s a car right next to you with no driver in, you know, and it’s signaling and it wants to change lanes in front of you and you have to let it in and all this kind of stuff. It’s it’s a little creepy, but I think you’re right. I think seeing the humanoid robots, but that phenomenon that you described where it was sufficiently close that your brain flipped into saying this is a human being. Steven Bartlett: Mhm. Stuart Russell: Right. That’s exactly what I think we should avoid. Steven Bartlett: Cuz I have the empathy for it then. Stuart Russell: Because it’s it’s a lie and it brings with it a whole lot of expectations about how it’s going to behave, what moral rights it has, how you should behave towards it. Uh which are completely wrong. Steven Bartlett: It levels the playing field between me and it to some degree. Stuart Russell: How hard is it going to be to just uh you know switch it off and throw it in the trash when when it breaks? I think it’s essential for us to keep machines in the you know in the cognitive space where they are machines and not bring them into the cognitive space where they’re people because we will make enormous mistakes by doing that. And I see this every day even even just with the chat bots. So the chat bots in theory are supposed to say I don’t have any feelings. I’m just a algorithm. But in fact they fail to do that all the time. They are telling people that they are conscious. They are telling people that they have feelings. Uh they are telling people that they are in love with the user that they’re talking to. And people flip because first of all it’s you know very fluent language but also a system that is identifying itself as an I as a sentient being. They bring that object into the cognitive space where that we normally reserve for for other humans and they become emotionally attached. They become psychologically dependent. They even allow these systems to tell them what to do. What Should Young People Be Doing Professionally? 56:36 Steven Bartlett: What advice would you give a young person at the start of their career then about what they should be aiming at professionally? Because I’ve actually had an increasing number of young people say to me that they have huge uncertainty about whether the thing they’re studying now will matter at all. A lawyer, uh, an accountant, and I don’t know what to say to these people. I don’t know what to say cuz I I believe that the rate of improvement in AI is going to continue. And therefore, imagining any rate of improvement, it gets to the point where I’m not being funny, but all these white collar jobs will be done by an a an AI or an AI agent. Stuart Russell: Yeah. So, there was a television series called Humans. In Humans, we have extremely capable humanoid robots doing everything. And at one point, the parents are talking to their teenage daughter who’s very, very smart. And the parents are saying, “Oh, you know, maybe you should go into medicine.” And the daughter says, you know, why would I bother? It’ll take me seven years to qualify. It takes a robot 7 seconds to learn. So nothing I do matters. And is that how you feel about—So I think that’s that’s a future that uh in fact that is the future that we are moving towards. I don’t think it’s a future that everyone wants. That is what is being uh created for us right now. So in that future assuming that you know even if we get halfway right in the sense that okay perhaps not surgeons perhaps not you know great violinists there’ll be pockets where perhaps humans will remain good at it where the kinds of jobs where you hire people by the hundred will go away. Okay, where people are in some sense exchangeable that you you you just need lots of them and uh you know when half of them quit you just fill up those those slots with more people in some sense those are jobs where we’re using people as robots and that’s a sort of that’s a sort of strange conundrum here right that you know I imagine writing science fiction 10,000 years ago right when we’re all hunter gatherers and I’m this little science fiction author and I’m describing this future where you know there are going to be these giant windowless boxes And you’re going to go in, you know, you you’ll travel for miles and you’ll go into this windowless box and you’ll do the same thing 10,000 times for the whole day. And then you’ll leave and travel for miles to go home. Steven Bartlett: You’re talking about this podcast. Stuart Russell: And then you’re going to go back and do it again. And you would do that every day of your life until you die. Steven Bartlett: The office and people would say, “Ah, you’re nuts.” Stuart Russell: Right? There’s no way that we humans are ever going to have a future like that cuz that’s awful. Right? But that’s exactly the future that we ended up with with with office buildings and factories where many of us go and do the same thing thousands of times a day and we do it thousands of days in a row uh and then we die and we need to figure out what is the next phase going to be like and in particular how in that world do we have the incentives to become fully human which I think means at least a level of education that people have now and probably more because I think to live a really rich life you need a better understanding of yourself of the world uh than most people get in their current educations. What Is It to Be Human? 59:59 Steven Bartlett: What is it to be human? to it’s to reproduce to pursue stuff to go in the pursuit of difficult things you know we used to hunt on the to attain goals right it’s always if I wanted to climb Everest the last thing I would want is someone to pick me up on helicopter and stick me on the top so we’ll we’ll voluntarily pursue hard things so although I could get the robot to build me a ranch in on this plot of land I choose to do it because the pursuit itself is rewarding. Stuart Russell: Yes, we’re kind of seeing that anyway, aren’t we? Don’t you think we’re seeing a bit of that in society where life got so comfortable that now people are like obsessed with running marathons and doing these crazy endurance and and learning to cook complicated things when they could just, you know, have them delivered. Um, yeah. No, I think there’s there’s real value in the ability to do things and the doing of those things. And I think you know the obvious danger is the WALL-E world where everyone just consumes entertainment uh which doesn’t require much education and doesn’t lead to a rich satisfying life. I think in the long run a lot of people will choose that world. I think some of yeah some people may there’s also I mean you know whether you’re consuming entertainment or whether you’re doing something you know cooking or painting or whatever because it’s fun and interesting to do what’s missing from that right all of that is purely selfish I think one of the reasons we work is because we feel valued we feel like we’re benefiting other people and I think some remember having this conversation with um a lady in England who helps to run the hospice movement. And the people who work in the hospices where you know the the patients are literally there to die are largely volunteers. So they’re not doing it to get paid but they find it incredibly rewarding to be able to spend time with people who are in their last weeks or months to give them company and happiness. So I actually think that interpersonal roles will be much much more important in future. So if I was going to advise my kids, not that they would ever listen, but if I if my kids would listen and I and and wanted to know what I thought would be, you know, valued careers and future, I think it would be these interpersonal roles based on an understanding of human needs, psychology, there are some of those roles right now. So obviously you know therapists and psychiatrists and so on but that that’s a very much in sort of asymmetric role right where one person is suffering and the other person is trying to alleviate the suffering you know and then there are things like they call them executive coaches or life coaches right that’s a less asymmetric role where someone is trying to uh help another person live a better life whether it’s a better life in their work role or or just uh how they live their life in general. And so I could imagine that those kinds of roles will expand dramatically. The Rise of Individualism 1:03:27 Steven Bartlett: There’s this interesting paradox that exists when life becomes easier. Um which shows that abundance consistently pushes society societies towards more individualism because once survival pressures disappear, people prioritize things differently. They prioritize freedom, comfort, self-expression over things like sacrifice or um family formation. And we’re seeing, I think, in the west already, a decline in people having kids because there’s more material abundance, fewer kids, people are getting married and committing to each other and having relationships later and more infrequently because generally once we have more abundance, we don’t want to complicate our lives. Um, and at the same time, as you said earlier, that abundance breeds a an inability to find meaning, a sort of shallowness to everything. This is one of the things I think a lot about, and I’m I’m in the process now of writing a book about it, which is this idea that individualism was act is a bit of a lie. Like when I say individualism and freedom, I mean like the narrative at the moment amongst my generation is you like be your own boss and stand on your own two feet and we’re having less kids and we’re not getting married and it’s all about me me. Stuart Russell: Yeah. That last part is where it goes wrong. Steven Bartlett: Yeah. And it’s like almost a narcissistic society where— Stuart Russell: Yeah. Steven Bartlett: Me me. My self-interest first. And when you look at mental health outcomes and loneliness and all these kinds of things, it’s going in a horrific direction. But at the same time, we’re freer than ever. It seems like that you know it seems like there’s a we should there’s a maybe another story about dependency which is not sexy like depend on each other. Stuart Russell: Oh I I I agree. I mean I think you know happiness is not available from consumption or even lifestyle right I think happiness arises from giving. It can be you through the work that you do, you can see that other people benefit from that or it could be in direct interpersonal relationships. Ads 1:05:26 Steven Bartlett: There is an invisible tax on salespeople that no one really talks about enough. The mental load of remembering everything like meeting notes, timelines, and everything in between until we started using our sponsors product called Pipe Drive, one of the best CRM tools for small and medium-sized business owners. The idea here was that it might alleviate some of the unnecessary cognitive overload that my team was carrying so that they could spend less time in the weeds of admin and more time with clients, in-person meetings, and building relationships. Pipe Drive has enabled this to happen. It’s such a simple but effective CRM that automates the tedious, repetitive, and time-consuming parts of the sales process. And now our team can nurture those leads and still have bandwidth to focus on the higher priority tasks that actually get the deal over the line. Over a 100,000 companies across 170 countries already use Pipe Drive to grow their business. And I’ve been using it for almost a decade now. Try it free for 30 days. No credit card needed, no payment needed. Just use my link pipedrive.com/ceo to get started today. That’s pipedrive.com/ceo. Universal Basic Income 1:06:27 Steven Bartlett: Where does the rewards of this AI race where does it accrue to? I think a lot about this in terms of like universal basic income. If you have these five, six, seven, 10 massive AI companies that are going to win the 15 quadrillion dollar prize. Mhm. And they’re going to automate all of the professional pursuits that we we currently have. All of our jobs are going to go away. Who who gets all the money? And how do how do we get some of it back? Stuart Russell: Money actually doesn’t matter, right? What what matters is the production of goods and services uh and then how those are distributed and so so money acts as a way to facilitate the distribution and um exchange of those goods and services. If all production is concentrated um in the hands of a of a few companies, right, that sure they will lease some of their robots to us. You know, we we want a school in our village. They lease the robots to us. The robots build the school. They go away. We have to pay a certain amount of of money for that. But where do we get the money? Right? If we are not producing anything then uh we don’t have any money unless there’s some redistribution mechanism. And as you mentioned, so universal basic income is it seems to me an admission of failure because what it says is okay, we’re just going to give everyone the money and then they can use the money to pay the AI company to lease the robots to build the school and then we’ll have a school and that’s good. Um but what it’s an admission of failure because it says we can’t work out a system in which people have any worth or any economic role. Right? So 99% of the global population is from an economic point of view useless. Would You Press a Button to Stop AI Forever? 1:08:30 Steven Bartlett: Can I ask you a question? If you had a button in front of you and pressing that button would stop all progress in artificial intelligence right now and forever, would you press it? Stuart Russell: That’s a very interesting question. Um, if it’s either or either I do it now or it’s too late and we careen into some uncontrollable future perhaps. Yeah, cuz I I’m not super optimistic that we’re heading in the right direction at all. Steven Bartlett: So, I put that button in front of you now. It stops all AI progress, shuts down all the AI companies immediately globally, and none of them can reopen. You press it. Stuart Russell: Well, here’s here’s what I think should happen. So, obviously, you know, I’ve been doing AI for 50 years. Um and the original motivations which is that AI can be a power tool for humanity enabling us to do more and better things than we can unaided. I think that’s still valid. The problem is the kinds of AI systems that we’re building are not tools. They are replacements. In fact, you can see this very clearly because we create them literally as the closest replicas we can make of human beings. The technique for creating them is called imitation learning. So we observe human verbal behavior, writing or speaking and we make a system that imitates that as well as possible. So what we are making is imitation humans at least in the verbal sphere. And so of course they’re going to replace us. They’re not tools. So you had pressed the button. Stuart Russell: So I say I think there is another course which is use and develop AI as tools. Tools for science tools for economic organization and so on. Um but not as replacements for human beings. Steven Bartlett: What I like about this question is it forces you to go into the pro into probabilities. Stuart Russell: Yeah. So, and that’s that’s why I’m reluctant because I don’t I don’t agree with the, you know, what’s your probability of doom, right? Your so-called P of doom uh number because that makes sense if you’re an alien. You know, you’re in you’re in a bar with some other aliens and you’re looking down at the Earth and you’re taking bets on, you know, are these humans going to make a mess of things and go extinct because they develop AI. So, it’s fine for those aliens to bet on on that, but if you’re a human, then you’re not just betting, you’re actually acting. Steven Bartlett: There there’s an element to this though, which I guess where probabilities do come back in, which is you also have to weigh when I give you such a binary decision. Um the probability of us pursuing the more nuanced safe approach into that equation. So you’re you’re the the maths in my head is okay, you’ve got all the upsides here and then you’ve got potential downsides and then there’s a probability of do I think we’re actually going to course correct based on everything I know based on the incentive structure of human beings and and countries and then if there’s but then you could go if there’s even a 1% chance of extinction is it even worth all these upsides? Stuart Russell: Yeah. And I I would argue no. I mean maybe maybe what we would say if if we said okay it’s going to stop the progress for 50 years you press it and during those 50 years we can work on how do we do AI in a way that’s guaranteed to be safe and beneficial how do we organize our societies to flourish uh in conjunction with extremely capable AI systems. So, we haven’t answered either of those questions. And I don’t think we want anything resembling AGI until we have completely solid answers to both of those questions. So, if there was a button where I could say, “All right, we’re going to pause progress for 50 years.” Yes, I would do it. Steven Bartlett: But if that button was in front of you, you’re going to make a decision either way. Either you don’t press it or you press it. I If Yeah. Stuart Russell: So, if that if that button is there, stop it for 50 years. I would say yes. Stop it forever? Not yet. I think I think there’s still a decent chance that we can pull out of this uh nose dive, so to speak, that we’re we’re currently in. Ask me again in a year, I might I might say, “Okay, we do need to press the button.” Steven Bartlett: What if What if in a scenario where you never get to reverse that decision? You never get to make that decision again. So if in that scenario that I’ve laid out this hypothetical, you either press it now or it never gets pressed. So there is no opportunity a year from now. Stuart Russell: Yeah, as you can tell, I’m sort of on on the fence a bit about about this one. Um yeah, I think I’d probably press it. Steven Bartlett: Yeah. What’s your reasoning? Stuart Russell: Uh just thinking about the power dynamics of um what’s happening now how difficult would it would be to get the US in particular to to regulate in favor of safety. So I think you know what’s clear from talking to the companies is they are not going to develop anything resembling safe AGI unless they’re forced to by the government. And at the moment the US government in particular which regulates most of the leading companies in AI is not only refusing to regulate but even trying to prevent the states from regulating. And they’re doing that at the behest of uh a faction within Silicon Valley uh called the accelerationists who believe that the faster we get to AGI the better. And when I say behest I mean also they paid them a large amount of money. Jensen Huang the the CEO of Nvidia… But Won’t China Win the AI Race if We Stop? 1:15:02 Steven Bartlett: Nvidia said who is for anyone that doesn’t know the guy making all the chips that are powering AI said China is going to win the AI race arguing it is just a nanosecond behind the United States. China have produced 24,000 AI papers compared to just 6,000 from the US more than the combined output of the US the UK and the EU. China is anticipated to quickly roll out their new technologies both domestically and developing new technologies for other developing countries. So the accelerators or the accelerate I think you call them the accelerants accelerationists. Stuart Russell: The accelerationists I mean they would say well if we don’t then China will. So we have to we have to go fast. It’s another version of the the race that the companies are in with each other, right? That we, you know, we know that this race is heading off a cliff, but we can’t stop. So, we’re all just going to go off this cliff. And obviously, that’s nuts, right? I mean, we’re all looking at each other saying, “Yeah, there’s a cliff over there.” Running as fast as we can towards this cliff. We’re looking at each other saying, “Why aren’t we stopping?” So the narrative in Washington, which I think Jensen Huang is either reflecting or or perhaps um promoting uh is that you know, China has is completely unregulated and uh you know, America will only slow itself down uh if it regulates a AI in any way. So this is a completely false narrative because China’s AI regulations are actually quite strict even compared to um the European Union and China’s government has explicitly acknowledged uh the need and their regulations are very clear. You can’t build AI systems that could escape human control. And not only that, I don’t think they view the race in the same way as, okay, we we just need to be the first to create AGI. I think they’re more interested in figuring out how to disseminate AI as a set of tools within their economy to make their economy more productive and and so on. So that’s that’s their version of the race. But of course, they still want to build the weapons for adversaries, right? To so that they can take down I don’t know Taiwan if they want to. So weapons are a separate matter and I happy to talk about weapons but just in terms of control uh control economic domination um they they don’t view putting all your eggs in the AGI basket as the right strategy. So they want to use AI, you know, even in its present form to make their economy much more efficient and productive and also, you know, to give people new capabilities and and better quality of life and and I think the US could do that as well. And um typically western countries don’t have as much of uh central government control over what companies do and some companies are investing in AI to make their operations more efficient uh and some are not and we’ll see how that plays out. Steven Bartlett: What do you think of Trump’s approach to AI? Trump’s Approach to AI 1:18:31 Stuart Russell: So Trump’s approach is, you know, it’s it’s echoing what Jensen Huang is saying that the US has to be the one to create AGI and very explicitly the administration’s policy is to uh dominate the world. That’s the word they use, dominate. I’m not sure that other countries like the idea that um they will be dominated by American AI. What’s Causing the Loss in Middle-Class Jobs 1:18:59 Steven Bartlett: But is that an accurate description of what will happen if the US build AGI technology before say the UK where I’m originally from and where you’re originally from? What does the This is something I think about a lot because we’re going through this budget process in the UK at the moment where we’re figuring out how we going to spend our money and how we’re going to tax people and also we’ve got this new election cycle. It’s approaching quickly where people are talking about immigration issues and this issue and that issue and the other issue. What I don’t hear anyone talking about is AI and the humanoid robots that are going to take everything. We’re very concerned with the brown people crossing the channel, but the humanoid robots that are going to be super intelligent and really take causing economic disrupt disruption. No one talks about that. The political leaders don’t talk about it. It doesn’t win races. I don’t see it on billboards. Stuart Russell: Yeah. And it’s it it’s interesting because in fact I mean so there’s there’s two forces that have been hollowing out the middle classes in western countries. One of them is globalization where lots and lots of work not just manufacturing but white collar work gets outsourced to low-income countries. Uh but the other is automation and you know some of that is factories. So um the amount of employment in manufacturing continues to drop even as the amount of output from manufacturing in the US and in the UK continues to increase. So we talk about oh you know our manufacturing industry has been destroyed. It hasn’t. It’s producing more than ever just with you know a quarter as many people. So it’s manufacturing employment that’s been destroyed by automation and robotics and so on. And then you know computerization has eliminated whole layers of white collar jobs. And so those two those two forms of automation have probably done more to hollow out middle class uh employment and standard of life. What Will Happen if the UK Doesn’t Join the AI Race? 1:20:50 Steven Bartlett: If the UK doesn’t participate in this new e technological wave that seems to be that seems to you know it’s going to take a lot of jobs. Cars are going to drive themselves. Waymo just announced that they’re coming to London, which is the driverless cars, and driving is the biggest occupation in the world, for example. So, you’ve got immediate disruption there. And where does the money accrue to? Well, it accrues to who owns Waymo, which is what? Google and Silicon Valley companies. Alphabet owns Waymo 100%. I think so. Yes. Stuart Russell: I mean this is so I was in India a few months ago talking to the government ministers because they’re holding the next global AI summit in February and and their view going in was you know AI is great we’re going to use it to you know turbocharge the growth of our Indian economy when for example you have AGI you have AGI controlled robots that can do all the manufacturing that can do agriculture that can do all the white work and goods and services that might have been produced by Indians will instead be produced by American controlled AGI systems at much lower prices. You know, a consumer given a choice between an expensive product produced by Indians or a cheap product produced by American robots will probably choose the cheap product produced by American robots. And so potentially every country in the world with the possible exception of North Korea will become a kind of a client state of American AI companies. Steven Bartlett: A client state of American AI companies is exactly what I’m concerned about for the UK economy. Really any economy outside of the United States. I guess one could also say China, but because those are the two nations that are taking AI most seriously. Mhm. And I I I don’t know what our economy becomes. cuz I can’t figure out can’t figure out what our what the British economy becomes in such a world. Is it tourism? I don’t know. Like you come here to to to look at the Buckingham Palace. Stuart Russell: I you you can think about countries but I mean even for the United States it’s the same problem. At least they’ll be able to shell out you know. So some small fraction of the population will be running maybe the AI companies but increasingly even those companies will be replacing their human employees with AI systems. Amazon Replacing Their Workers 1:23:18 Stuart Russell: So Amazon for example which you know sells a lot of computing services to AI companies is using AI to replace layers of management is planning to use robots to replace all of its warehouse workers and so on. So, so even the the giant AI companies will have few human employees in the long run. I mean, it think of the situation, you know, pity the poor CEO whose board says, “Well, you know, unless you turn over your decision-making power to the AI system, um, we’re going to have to fire you because all our competitors are using, you know, an AI powered CEO and they’re doing much better.” Steven Bartlett: Amazon plans to replace 600,000 workers with robots in a memo that just leaked, which has been widely talked about. And the CEO, Andy Jassy, told employees that the company expects its corporate workforce to shrink in the coming years because of AI and AI agents. And they’ve publicly gone live with saying that they’re going to cut 14,000 corporate jobs in the near term as part of its refocus on AI investment and efficiency. It’s interesting because I was reading about um the sort of different quotes from different AI leaders about the speed in which this this stuff is going to happen and what you see in the quotes is Demis who’s the CEO of DeepMind saying things like it’ll be more than 10 times bigger than the industrial revolution but also it’ll happen maybe 10 times faster and they speak about this turbulence that we’re going to experience as this shift takes place. That’s um maybe a euphemism for uh and I think that you know governments are now you know they they’ve kind of gone from saying oh don’t worry you know we’ll just retrain everyone as data scientists like well yeah that’s that’s ridiculous right the world doesn’t need four billion data scientists and we’re not all capable of becoming that by the way uh yeah or have any interest in in doing that I I could even if I wanted to like I tried to sit in biology class and I fell asleep so I couldn’t that was the end of my career as a surgeon. Stuart Russell: Fair enough. Um, but yeah, now suddenly they’re staring, you know, 80% unemployment in the face and wondering how how on earth is our society going to hold together. Steven Bartlett: We’ll deal with it when we get there. Stuart Russell: Yeah. Unfortunately, um, unless we plan ahead, we’re going to suffer the consequences, right? Can’t. It was bad enough in the industrial revolution which unfolded over seven or eight decades but there was massive disruption and uh misery caused by that. We don’t have a model for a functioning society where almost everyone does nothing at least nothing of economic value. Now, it’s not impossible that there could be such a a functioning society, but we don’t know what it looks like. And you know, when you think about our education system, which would probably have to look very different and how long it takes to change that. I mean, I’m always reminding people about uh how long it took Oxford to decide that geography was a proper subject of study. It took them 125 years from the first proposal that there should be a geography degree until it was finally approved. So we don’t have very long to completely revamp a system that we know takes decades and decades to reform and we don’t know how to reform it because we don’t know what we want the world to look like. Steven Bartlett: Is this one of your reasons why you’re appalled at the moment? Because when you have these conversations with people, people just don’t have answers, yet they’re plowing ahead at rapid speed. Stuart Russell: I would say it’s not necessarily the job of the AI companies. So, I’m appalled by the AI companies because they don’t have an answer for how they’re going to control the systems that they’re proposing to build. I do find it disappointing that uh governments don’t seem to be grappling with this issue. I think there are a few I think for example Singapore government seems to be quite farsighted and they’ve they’ve thought this through you know it’s a small country they’ve figured out okay this this will be our role uh going forward and we think we can find you know some some purpose for our people in this in this new world but for I think countries with large populations um they need to figure out answers to these questions pretty fast it takes a long time to implement those answers uh in the form of new kinds of education, new professions, new qualifications, uh new economic structures. I mean, it’s it’s it’s possible. I mean, when you look at therapists, for example, they’re almost all self-employed. So, what happens when, you know, 80% of the population transitions from regular employment into into self-employment? What does that what does that do to the economics of of uh government finances and so on. So there’s just lots of questions and how do you you know if that’s the future you know why are we training people to to fit into 9 to 5 office jobs which won’t exist at all. Ads 1:28:52 Steven Bartlett: Last month I told you about a challenge that I’d set our internal FlightX team. Flight team is our innovation team internally here. I tasked them with seeing how much time they could unlock for the company by creating something that would help us filter new AI tools to see which ones were worth pursuing and I thought that our sponsor Fiverr Pro might have the talent on their platform to help us build this quickly. So I talked to my director of innovation Isaac and for the last month my team Flight X and a vetted AI specialist from Fiverr Pro have been working together on this project and with the help of my team we’ve been able to create a brand new tool which automatically scans scores and prioritizes different emerging AI tools for us. Its impact has been huge and within a couple of weeks this tool has already been saving us hours trialing and testing new AI systems. Instead of shifting through lots of noise, my team Flight X has been able to focus on developing even more AI tools, ones that really move the needle in our business thanks to the talent on Fiverr Pro. So, if you’ve got a complex problem and you need help solving it, make sure you check out Fiverr Pro at fiverr.com/diary. So, many of us are pursuing passive forms of income and to build side businesses in order to help us cover our bills. And that opportunity is here with our sponsor Stan, a business that I co-own. It is the platform that can help you take full advantage of your own financial situation. Stan enables you to work for yourself. It makes selling digital products, courses, memberships, and more simple products more scalable and easier to do. You can turn your ideas into income and get the support to grow whatever you’re building. And we’re about to launch Dare to Dream. It’s for those who are ready to make the shift from thinking to building, from planning to actually doing the thing. It’s about seeing that dream in your head and knowing exactly what it takes to bring it to life. If you’re ready to transform your life, visit daretodream.stan.store. Experts Agree on Extinction Risk 1:30:41 Steven Bartlett: You’ve made many attempts to raise awareness and to call for a heightened consciousness about the future of AI. Um, in October, over 850 experts, including yourself and other leaders, like Richard Branson, who I’ve had on the show, and Jeffrey Hinton, who I’ve had on the show, signed a statement to ban AI super intelligence, as you guys raised concerns of potential human extinction. Stuart Russell: Sort of. Yeah. It says, at least until we are sure that we can move forward safely and there’s broad scientific consensus on that. Steven Bartlett: So, that did it work? Stuart Russell: It’s hard. It’s hard to say. I mean interestingly there was a related so what was called the the pause statement was March of 23. So that was when GPT4 came out the successor to chat GPT. So we we suggested that there’d be a six-month pause in developing and deploying systems more powerful than GPT4. And everyone poo pooed that idea. Of course no one’s going to pause anything. But in fact, there were no systems in the next 6 months deployed that were more powerful than GPT4. Um, none coincidence. You be the judge. I would say that what we’re trying to do is to is to basically shift the the public debate. You know there’s this bizarre phenomenon that keeps happening in the media where if you talk about these risks they will say oh you know there’s a fringe of people you know called quote doomers who think that there’s you know risk of extinction. Um so they always the narrative is always that oh you know talking about those risk is a fringe thing. Pretty much all the CEOs of the leading AI companies think that there’s a significant risk of extinction. Almost all the leading AI researchers think there’s a significant risk of human extinction. Um so why is that the fringe, right? Why isn’t that the mainstream? If the these are the leading experts in industry and academia uh saying this, how could it be the fringe? So we’re trying to change that narrative to say no, the people who really understand this stuff are extremely concerned. Steven Bartlett: And what do you want to happen? What is the solution? Stuart Russell: What I think is that we should have effective regulation. It’s hard to argue with that, right? Uh so what does effective mean? It means that if you comply with the regulation, then the risks are reduced to an acceptable level. So for example, we ask people who want to operate nuclear plants, right? We’ve decided that the risk we’re willing to live with is, you know, a one in a million chance per year that the plant is going to have a meltdown. Any higher than that, you know, we just don’t it’s not worth it. Right. So you have to be below that. Some cases we can get down to one in 10 million chance per year. Steven Bartlett: So what chance do you think we should be willing to live with for human extinction? Me? Yeah. 0.00001. Yeah. Lots of zeros. Stuart Russell: Yeah. Right. So one in a million for a nuclear meltdown. Extinction is much worse. Steven Bartlett: Oh yeah. Stuart Russell: So yeah, it’s kind of right. So one in 100 billion, one in a trillion. Yeah. So if you said one in a billion, right, then you’d expect one extinction per billion years. There’s a background. So one one of the ways people work out these risk levels is also to look at the background. The other ways of getting going extinct would include, you know, giant asteroid crashes into the earth. And you can roughly calculate what those probabilities are. We can look at how many extinction level events have happened in the past and, you know, maybe it’s half a dozen over. So, so there’s maybe it’s like a one in 500 million year event. So, somewhere in that range, right? Somewhere between 1 in 10 million, which is the best nuclear power plants, and and one in 500 million or one in a billion, which is the background risk from from giant asteroids. Uh so, let’s say we settle on 100 million, one in a 100 million chance per year. Well, what is it according to the CEOs? 25%. So they’re off by a factor of multiple millions, right? So they need to make the AI systems millions of times safer. Steven Bartlett: Your analogy of the roulette, Russian roulette comes back in here because that’s like for anyone that doesn’t know what probabilities are in this context, that’s like having a ammunition chamber with four holes in it and putting a bullet in one of them. One in four. Stuart Russell: Yeah. Steven Bartlett: And we’re saying we want it to be one in a billion. So we want a billion chambers and a bullet in one of them. Stuart Russell: Yeah. And and so when you look at the work that the nuclear operators have to do to show that their system is that reliable, uh it’s a massive mathematical analysis of the components, you know, redundancy. You’ve got monitors, you’ve got warning lights, you’ve got operating procedures. You have all kinds of mechanisms which over the decades have ratcheted that risk down. It started out I think one in one in 10,000 years, right? And they’ve improved it by a factor of 100 or a thousand by all of these mechanisms. But at every stage they had to do a mathematical analysis to show what the risk was. The people developing the AI company, the AI systems, sorry, the AI companies developing these systems, they don’t even understand how the AI systems work. So their 25% chance of extinction is just a seat of the pants guess. They actually have no idea. But the tests that they are doing on their systems right now, you know, they show that the AI systems will be willing to kill people uh to preserve their own existence already, right? They will lie to people. They will blackmail them. They will they will launch nuclear weapons rather than uh be switched off. And so there’s no there’s no positive sign that we’re getting any closer to safety with these systems. In fact, the signs seem to be that we’re going uh deeper and deeper into uh into dangerous behaviors. So rather than say ban, I would just say prove to us that the risk is less than one in a 100 million per year of extinction or loss of control, let’s say. And uh so we’re not banning anything. The company’s response is, “Well, we don’t know how to do that, so you can’t have a rule.” Literally, they are saying, “Humanity has no right to protect itself from us.” What if Aliens Were Watching Us Right Now 1:37:50 Steven Bartlett: If I was an alien looking down on planet Earth right now, I would find this fascinating that these— Stuart Russell: Yeah. You’re in the bar betting on who’s, you know, are they going to make it or not. Steven Bartlett: Just a really interesting experiment in like human incentives. The analogy you gave of there being this quadrillion dollar magnet pulling us off the edge of the cliff and yet we’re still being drawn towards it through greed and this promise of abundance and power and status and I’m going to be the one that summoned the god I mean it says something about us as humans says something about our our darker sides. Stuart Russell: Yes and the aliens will write an amazing tragic play cycle about what happened to the human race. Steven Bartlett: Maybe the AI is the alien and it’s going to talk about, you know, we have our our stories about God making the world in seven days and Adam and Eve. Maybe it’ll have its own religious stories about the God that made it us and how it sacrificed itself. Just like Jesus sacrificed himself for us, we sacrificed ourselves for it. Stuart Russell: Yeah. which is the wrong way around, right? But that is that is the story of that’s that’s the Judeo-Christian story, isn’t it? That God, you know, Jesus gave his life for us so that we could be here full of sin. But is yeah, God is still watching over us and uh probably wondering when we’re going to get our act together. Steven Bartlett: What is the most important thing we haven’t talked about that we should have talked about, Professor Stuart Russell? Can We Make AI Systems That We Can Control? 1:39:27 Stuart Russell: So I think um the question of whether it’s possible to make uh super intelligent AI systems that we can control is it possible? I I think yes. I think it’s possible and I think we need to actually just have a different conception of what it is we’re trying to build. For a long time with with AI, we’ve just had this notion of pure intelligence, right? The the ability to bring about whatever future you, the intelligent entity, want to bring about. The more intelligence, the better. The more intelligent the better and the more capability it will have to create the future that it wants. And actually we don’t want pure intelligence because what the future that it wants might not be the future that we want. There’s nothing particle humans out as the the only thing that matters, right? You know, pure intelligence might decide that actually it’s going to make life wonderful for cockroaches or or actually doesn’t care about biological life at all. We actually want intelligence whose only purpose is to bring about the future that we want. Right? So it’s we want it to be first of all keyed to humans specifically not to cockroaches not to aliens not to itself. We want to make it loyal to humans. Right? So keyed to humans and the difficulty that I mentioned earlier right the king Midas problem. How do we specify what we want the future to be like so that it can do it for us? How do we specify the objectives? Actually, we have to give up on that idea because it’s not possible. Right? We’ve seen this over and over again in human history. Uh we don’t know how to specify the future properly. We don’t know how to say what we want. And uh you know, I always use the example of the genie, right? What’s the third wish that you give to the genie who’s granted you three wishes? Right? Undo the first two wishes because I made a mess of the universe. So, um, so in fact, what we’re going to do is we’re going to make it the machine’s job to figure out. So, it has to bring about the future that we want, but it has to figure out what that is. And it’s going to start out not knowing. And uh over time through interacting with us and observing the choices we make, it will learn more about what we want the future to be like. Stuart Russell: But probably it will forever have residual uncertainty about what we really want the future to be like. It’ll it’ll be fairly sure about some things and it can help us with those. And it’ll be uncertain about other things and it’ll be uh in those cases it will not take action that might upset humans with that you know with that aspect of the world. So to give you a simple example right um what color do we want the sky to be? It’s not sure. So it shouldn’t mess with the sky unless it knows for sure that we really want purple with green stripes. Are We Creating a God? 1:43:04 Steven Bartlett: Everything you’re saying sounds like we’re creating a god. Like earlier on I was saying that we are the god but actually everything you described there almost sounds like every every god in religion where you know we pray to gods but they don’t always do anything about it. Stuart Russell: Not not exactly. No it’s it’s in some sense I’m thinking more like the ideal butler. To the extent that the butler can anticipate your wishes they should help you bring them about. But in in areas where there’s uncertainty, it can ask questions. We can we can make requests. Steven Bartlett: This sounds like God to me because, you know, I might say to God or this butler, uh, could you go get me my uh my car keys from upstairs? And its assessment would be, listen, if I do this for this person, then their muscles are going to atrophy. Then they’re going to lose meaning in their life. Then they’re not going to know how to do hard things. So I won’t get involved. It’s an intelligence that sits in. But actually, probably in most situations, it optimizing for comfort for me or doing things for me is actually probably not in my best long-term interests. It’s probably it’s probably useful that I have a girlfriend and argue with her and that I like raise kids and that I walk to the shop and get my own stuff. Stuart Russell: I agree with you. I mean, I think that’s So, you’re putting your finger on uh in some sense sort of version 2.0, right? So, let’s get version 1.0 clear, right? This this form of AI where it has to further our interest but it doesn’t know what those interests are right it then puts an obligation on it to learn more and uh to be helpful where it understands well enough and to be cautious where it doesn’t understand well so on so that that actually we can formulate as a mathematical problem and at least under idealized circumstances we can literally solve that So we can make AI systems that know how to solve this problem and help the entities that they are interacting with. Steven Bartlett: The reason I make the God analogy is because I think that such a being, such an intelligence would realize the importance of equilibrium in the world. Pain and pleasure, good and evil, and then it would absolutely and then it would be like this. Stuart Russell: So So right. So yes, I mean that’s sort of what happens in the Matrix, right? They tried the the AI systems in the Matrix, they tried to give us a utopia, but it failed miserably and uh you know, fields and fields of humans had to be destroyed. Um, and the best they could come up with was, you know, late 20th century regular human life with all of its problems, right? And I think this is a really interesting point and absolutely central because you know there’s a lot of science fiction where super intelligent robots you know they just want to help humans and the humans who don’t like that you know they just give them a little brain operation to then they do like it. Um and it takes away human motivation. Uh it it by taking away failure uh taking away disease you actually lose important parts of human life and it becomes in some sense pointless. So if it turns out that there simply isn’t any way that humans can really flourish in coexistence with super intelligent machines, even if they’re perfectly designed to to to solve this problem of figuring out what humans what futures uh humans want and and bringing about those futures. If that’s not possible, then those machines will actually disappear. Steven Bartlett: Why would they disappear? Stuart Russell: Because that’s the best thing for us. Maybe they would stay available for real existential emergencies, like if there is a giant asteroid about to hit the earth that maybe they’ll help us uh because they at least want the human species to continue. But to some extent, it’s not a perfect analogy, but it’s it’s sort of the way that human parents have to at some point step back from their kids’ lives and say, “Okay, no, you have to tie your own shoelaces today.” Could There Have Been Advanced Civilisations Before Us? 1:47:20 Steven Bartlett: This is kind of what I was thinking. Maybe there was uh a civilization before us and they arrived at this moment in time where they created an intelligence and that intelligence did all the things you’ve said and it realized the importance of equilibrium. So it decided not to get involved and maybe at some level that’s the god we look up to the stars and worship one that’s not really getting involved and letting things play out however however they are. But might step in in the case of a real existential emergency. Maybe, maybe not. Stuart Russell: Maybe. Steven Bartlett: But then and then maybe the cycle repeats itself where you know the organisms it let have free will end up creating the same intelligence and then the universe perpetuates infinitely. Stuart Russell: Yep. There there are science fiction stories like that too. Steven Bartlett: Yeah. Stuart Russell: I hope there is some happy Host: 1:46:49 Why would they disappear? Stuart Russell: Because that’s the best thing for us. Maybe they would stay available for real 1:46:57 existential emergencies, like if there is a giant asteroid about to hit the earth that maybe they’ll help us uh 1:47:02 because they at least want the human species to continue. But to some extent, it’s not a perfect analogy, but it’s 1:47:09 it’s sort of the way that human parents have to at some point step back from 1:47:15 their kids’ lives and say, “Okay, no, you have to tie your own shoelaces today.” Could There Have Been Advanced Civilisations Before Us? Host: 1:47:20 This is kind of what I was thinking. Maybe there was uh a civilization before us and they arrived at this moment in 1:47:26 time where they created an intelligence and that intelligence did all the things 1:47:33 you’ve said and it realized the importance of equilibrium. So it decided not to get involved and 1:47:40 maybe at some level that’s the god we look up to the stars and worship one that’s not really 1:47:47 getting involved and letting things play out however however they are. but might step in in the case of a real 1:47:52 existential emergency. Stuart Russell: Maybe, maybe not. Maybe. But then and then maybe the cycle repeats itself 1:47:57 where you know the organisms it let have free will end up creating the same 1:48:02 intelligence and then the universe perpetuates infinitely. Host: 1:48:08 Yep. Stuart Russell: There there are science fiction stories like that too. Host: Yeah. Stuart Russell: I hope there is some happy medium where 1:48:17 the AI systems can be there and we can take advantage of of those capabilities 1:48:23 to have a civilization that’s much better than the one we have now. Um, but I think you’re right. A 1:48:30 civilization with no challenges is not uh is not conducive to human What Can We Do to Help? Stuart Russell: 1:48:37 flourishing. Host: What can the average person do, Stuart? average person listening to this now to 1:48:42 aid the cause that you’re fighting for. Stuart Russell: I actually think um you know this sounds 1:48:47 corny but you know talk to your representative, your MP, your congressperson, whatever it is. Um 1:48:54 because I think the policy makers need to hear from people. The only voices they’re 1:49:00 hearing right now are the tech companies and their $50 billion checks. 1:49:08 And um all the polls that have been done say 1:49:13 yeah most people 80% maybe don’t want there to be super intelligent machines 1:49:20 but they don’t know what to do. You know even for me I’ve been in this field for 1:49:25 decades. uh I’m not sure what to do because of this giant magnet pulling everyone 1:49:32 forward and uh and the vast sums of money being being put into this. Um, but 1:49:38 I am sure that if you want to have a future and a world that you want your kids to 1:49:45 live in, uh, you need to make your voice heard 1:49:52 and, uh, and I think governments will listen from a political point of view, right? 1:49:58 You put your finger in the wind and you say, “hm, should I be on the side of 1:50:04 humanity or our future robot overlords?” I think I think as a politician, it’s 1:50:11 not a difficult decision. Host: It is when you’ve got someone saying, “I’ll give you $50 billion.” Stuart Russell: 1:50:18 Exactly. So, um I think I think people in those positions of power need to hear 1:50:25 from their constituents um that this is not the direction we want to go. You Wrote the Book on AI – Does It Weigh on You? Host: 1:50:30 After committing your career to this subject and the subject of technology more broadly, but specifically being the 1:50:36 guy that wrote the book about artificial intelligence, 1:50:42 you must realize that you’re living in a historical moment. Like there’s very few times in my life where I go, “Oh, this 1:50:47 is one of those moments. This is a crossroads in history.” And it must to 1:50:52 some degree weigh upon you knowing that you’re a person of influence at this historical moment in time who could 1:50:58 theoretically help divert the course of history in this moment in time. It’s kind of like 1:51:04 the you look through history, you see these moments of like Oenheimer and um does it weigh on you when you’re alone 1:51:10 at night thinking to yourself and reading things? Stuart Russell: Yeah, it does. I mean, you know, after 50 years, I could retire and um, you 1:51:17 know, play golf and sing and sail and do things that I enjoy. Um, 1:51:23 but instead, I’m working 80 or 100 hours a week um trying to move 1:51:29 uh move things in the right direction. Host: What is that narrative in your head that’s making you do that? Like what is 1:51:34 the is there an element of I might regret this if I don’t or just Stuart Russell: it’s it’s not only the the right 1:51:43 thing to do it’s it’s completely essential. I mean there isn’t 1:51:50 there isn’t a bigger motivation than this. Host: 1:51:56 Do you feel like you’re winning or losing? Stuart Russell: It feels um 1:52:03 like things are moving somewhat in the right direction. You know, it’s a a ding-dong battle as uh as David Coleman 1:52:12 used to say in uh in the exciting football match in 2023, right? So, uh 1:52:18 GPT4 came out and then we issued the pause statement that was signed by a lot 1:52:24 of leading AI researchers. Um and then in May there was the extinction 1:52:29 statement which included uh Sam Holman and Deis Sabis and Dario 1:52:35 Amade other CEOs as well saying yeah this is an extinction risk on the level with nuclear war and I think governments 1:52:43 listened at that point the UK government earlier that year had said oh well you 1:52:48 know we don’t need to regulate AI you know full speed ahead technology is good for you and by June they had completely 1:52:57 changed and Rishi Sununnak announced that he was going to hold this global AI 1:53:02 safety summit uh in England and he wanted London to be the global hub for 1:53:08 AI regulation um and so on. So and then you know when 1:53:15 beginning of November of 23 28 countries including the US and China signed a 1:53:20 declaration saying you know AI presents catastrophic risks and it’s urgent that we address 1:53:26 them and so on. So there it felt like, wow, they’re listening. They’re going to 1:53:33 do something about it. And then I think, you know, the am the amount of money going into AI was 1:53:39 already ramping up and the tech companies pushed back 1:53:46 and this narrative took hold that um the US in particular has to win the race 1:53:52 against China. The Trump administration completely dismissed 1:53:58 uh any concerns about safety explicitly. And interestingly, right, I mean they did that as far as I can tell directly 1:54:05 in response to the accelerationists such as Mark Andre going to Washington or 1:54:12 sorry going to Trump before the election and saying if I give you X amount of 1:54:18 money will you announce that there will be no regulation of AI and Trump said 1:54:25 yes you know probably like what is AI doesn’t matter as long as we give you the money right okay uh Uh so they gave 1:54:33 him the money and he said there’s going to be no regulation of AI. Up to that point it was a bipartisan 1:54:39 issue in Washington. Both parties were concerned. Both parties were on the side 1:54:44 of the human race against the robot overlords. Uh and that moment turned it into a 1:54:50 partisan issue. The after the election the US put pressure 1:54:56 on the French who are the next hosts of the global AI summit. uh and that was in February of this year 1:55:04 and uh and that summit turned in from you know what had been focused largely 1:55:10 on safety in the UK to a summit that looked more like a trade show. So it was 1:55:15 focused largely on money and so that was sort of the Nadia right you know the pendulum swung because of corporate 1:55:22 pressure uh and their ability to take over the the political dimension. 1:55:28 Um, but I would say since then things have been moving back again. So I’m feeling a bit more optimistic than I did 1:55:35 in February. You know, we have a a global movement now. There’s an 1:55:40 international association for safe and ethical AI uh which has several thousand members 1:55:46 and um more than 120 organizations in 1:55:52 dozens of countries are affiliates of this global organization. 1:55:57 Um, so I’m I’m thinking that if we can in particular if we can activate public 1:56:03 opinion which which works through the media and through popular culture uh then we have 1:56:11 a chance Host: seen such a huge appetite to learn about these subjects from our audience. 1:56:18 We know when Jeffrey Hinton came on the show I think about 20 million people downloaded or streamed that conversation which was staggering. and the the other 1:56:26 conversations we’ve had about AI safety with othera safety experts have done exactly the same it says something it 1:56:33 kind of reflects what you were saying about the 80% of the population are really concerned and don’t want this but that’s not what you see in the sort of 1:56:39 commercial world and listen I um I have to always acknowledge my own my own 1:56:44 apparent contradiction because I am both an investor in companies that are accelerating AI but at the same time 1:56:50 someone who spends a lot of time on my podcast speaking to people that are warning against the risk And actually like there’s many ways you can look at 1:56:56 this. I used to work in social media for for six or seven years built one of the big social media marketing companies in 1:57:01 Europe and people would often ask me is like social media a good thing or a bad thing and I’d talk about the bad parts of it and then they’d say you know 1:57:07 you’re building a social media company you’re not contributing to the problem. Well I think I think that like binary 1:57:13 way of thinking is often the problem. It the binary way of thinking that like it’s all bad or it’s all really really 1:57:18 good is like often the problem and that this push to put you into a camp. Whereas I think the most uh intellectually honest and high integrity 1:57:25 people I know can point at both the bad and the good. Stuart Russell: Yeah. I I think it’s it’s bizarre to be 1:57:31 accused of being anti- AI uh to be called a lite. Um you know as I said 1:57:38 when I wrote the book on which from which almost everyone learns about AI um 1:57:44 and uh you know is it if you called a nuclear engineer who works on the safety 1:57:51 of nuclear power plants would you call him anti-ysics right it’s it’s bizarre right it’s we’re 1:57:58 not anti- AAI in fact the need for safety in AI is a 1:58:04 complement to AI right if AI was useless and stupid, we wouldn’t be worried about 1:58:09 uh its safety. It’s only because it’s becoming more capable that we have to be concerned about safety. 1:58:16 Uh so I don’t see this as anti-AI at all. In fact, I would say without 1:58:21 safety, there will be no AI, right? There is no future with human 1:58: beings where we have unsafe AI. So it’s either no AI or safe AI. Host: 1:58:34 We have a closing tradition on this podcast where the last guest leaves a question for the next, not knowing who they’re leaving it for. And the question What Do You Value Most in Life? Host: 1:58:40 left for you is, what do you value the most in life and why? And lastly, how 1:58:47 many times has this answer changed? Stuart Russell: Um, 1:58:54 I value my family most and that answer hasn’t changed for nearly 30 years. Host: 1:59:01 What else outside of your family? Stuart Russell: Truth. 1:59:07 And that Host: Yeah, that answer hasn’t changed at all. Stuart Russell: I I’ve always 1:59:14 wanted the world to base its life on truth. And I find the propagation or deliberate 1:59:22 propagation of falsehood uh to be one of the worst things that we can do. Host: even if 1:59:28 that truth is inconvenient. Yeah, I think that’s a really important point 1:59:34 which is that you know people people often don’t like hearing things that are negative and so the visceral reaction is 1:59:40 often to just shoot or aim at the person who is delivering the bad news because if I discredit you or I shoot at you 1:59:47 then it makes it easier for me to contend with the news that I don’t like, the thing that’s making me feel uncomfortable. And so I I applaud you 1:59:54 for what you’re doing because you’re going to get lots of shots taken at you because you’re delivering an inconvenient truth which generally 2:00:00 people won’t won’t always love. But also you are messing with people’s ability to get that quadrillion dollar prize which 2:00:08 means there’ll be more deliberate attempts to discredit people like yourself and Jeff Hinton and other people that I’ve spoken to on the show. 2:00:13 But again, when I look back through history, I think that progress has come from the pursuit of truth even when it was inconvenient. And actually much of 2:00:19 the luxuries that I value in my life are the consequence of other people that came before me that were brave enough or 2:00:24 bold enough to pursue truth at times when it was inconvenient. And so I very much respect and value 2:00:31 people like yourself for that very reason. You’ve written this incredible book called human compatible artificial intelligence and the problem of control 2:00:37 which I think was published in 2020. Stuart Russell: 2019. Yeah. There’s a new edition from 2023. Host: 2:00:43 Where do people go if they want more information on your work and you do they go to your website? Do they get this book? what’s the best place for them to 2:00:49 learn more? Stuart Russell: So, so the book is written for the general public. Um, I’m easy to find on 2:00:54 the web. The information on my web page is mostly targeted for academics. So, it’s a lot of technical research papers 2:01:01 and so on. Um, there is an organization as I mentioned called the International Association for Safe and Ethical AI. Uh, 2:01:09 that has a a website. It has a terrible acronym unfortunately, I AI. We 2:01:15 pronounce it ICI but it uh it’s easy to misspell but you can find that on the web as well and that has uh that has 2:01:21 resources uh you can join the association uh you can apply to come to our annual 2:01:28 conference and you know I think increasingly not you know not just AI 2:01:33 researchers like Jeff Hinton Yosha Benjio but also I think uh you know 2:01:39 writers Brian Christian for example has a nice book called the alignment problem 2:01:44 Um and uh he’s looking at it from the outside. He’s not 2:01:50 or at least when he wrote it, he wasn’t an AI researcher. He’s now becoming one. Um 2:01:56 but uh he he has talked to many of the people involved in these questions uh 2:02:01 and tries to give an objective view. So I think it’s a it’s a pretty good book. Host: I will link all of that below for anyone 2:02:07 that wants to check out any of those links and learn more. Professor Stuart Russell, thank you so 2:02:12 much. really appreciate you taking the time and the effort to come and have this conversation and I think uh I think it’s pushing the public conversation in 2:02:19 a in an important direction. Stuart Russell: Thanks you and I applaud you for doing that. Really nice talking to you. Host: 2:02:28 I’m absolutely obsessed with 1%. If you know me, if you follow Behind the Diary, which is our behind the scenes channel, if you’ve heard me speak on stage, if 2:02:34 you follow me on any social media channel, you’ve probably heard me talking about 1%. It is the defining philosophy of my health, of my 2:02:40 companies, of my habit formation and everything in between, which is this obsessive focus on the small things. 2:02:46 Because sometimes in life, we aim at really, really, really, really big things, big steps forward. Mountains we 2:02:51 have to climb. And as NAL told me on this podcast, when you aim at big things, you get psychologically 2:02:57 demotivated. You end up procrastinating, avoiding them, and change never happens. So, with that in mind, with everything 2:03:02 I’ve learned about 1% and with everything I’ve learned from interviewing the incredible guests on this podcast, we made the 1% diary just 2:03:08 over a year ago and it sold out. And it is the best feedback we’ve ever had on a diary that we have created because what 2:03:15 it does is it takes you through this incredible process over 90 days to help you build and form brand new habits. So, 2:03:23 if you want to get one for yourself or you want to get one for your team, your company, a friend, a sibling, anybody 2:03:28 that listens to the diary of a co, head over immediately to the diary.com 2:03:34 and you can inquire there about getting a bundle if you want to get one for your team or for a large group of people. That is the diary.com. 2:03:52 Heat. Heat.
No jobs
Preston Fore, December 4, 2025, ‘Godfather of AI’ says Bill Gates and Elon Musk are right about the future of work—but he predicts mass unemployment is on its way, Yahoo News, https://www.yahoo.com/news/articles/godfather-ai-says-bill-gates-161138384.html
The long-term impact of artificial intelligence is one of the most hotly debated topics in Silicon Valley. Nvidia CEO Jensen Huang predicts that every job will be transformed—and likely lead to a 4-day workweek. Other tech titans go even further: Bill Gates says humans may soon not be needed “for most things,” and Elon Musk believes most humans won’t have to work at all in “less than 20 years.” While those predictions might sound extreme, they’re not just plausible, they’re likely, said Geoffrey Hinton—the British computer scientist widely known as the “Godfather of AI.” The transition, he warned, could trigger a sweeping economic reshuffling that leaves millions of workers behind. “It seems very likely to a large number of people that we will get massive unemployment caused by AI,” Hinton said in a recent discussion with Senator Bernie Sanders (I-VT) at Georgetown University. Advertisement “And if you ask where are these guys going to get the roughly trillion dollars they’re investing in data centers and chips… one of the main sources of money is going to be by selling people AI that will do the work of workers much cheaper. And so these guys are really betting on AI replacing a lot of workers.” Hinton has grown increasingly vocal about what he sees as Big Tech’s misplaced priorities. The industry, he recently told Fortune, is driven less by scientific progress than by short-term profits—fueling a push to replace human workers with cheaper AI systems. His warnings come as the economics of AI face new scrutiny. OpenAI, the maker of ChatGPT, isn’t expected to turn a profit until at least 2030 and may need more than $207 billion to support its growth, according to HSBC estimations. The future of AI is behind a fog of war Hinton’s journey from AI insider to outspoken critic underscores the high stakes of the technology he helped create. After quitting his Google job in 2023 to speak more freely about AI’s risks, he has become one of the most prominent skeptics. Last year, his pioneering work in machine learning earned him the Nobel Prize. Advertisement He also acknowledged that AI will create new jobs, as many tech leaders predict. But he added that he does not expect the number of new roles to come close to the number eliminated. Even so, he cautioned that all predictions—including his own—should be treated with heavy skepticism. “Trying to predict the future of it is going to be very difficult,” he told Sanders. “It’s a bit like when you drive in fog. You can see clearly for 100 yards and at 200 yards you can see nothing. Well, we can see clearly for a year or two, but 10 years out, we have no idea what’s going to happen.” What is clear, however, is that AI isn’t going away, and experts say workers who adapt—and use the technology to amplify their skills—will stand the best chance of navigating the coming upheaval. 100 million jobs are at risk, Bernie Sanders warns Sanders has attempted to quantify the stakes. In a report released in October—based partly on estimates generated by ChatGPT—he warned that nearly 100 million U.S. jobs could be displaced by automation. Workers in fast food, customer service, and manual labor face some of the highest risks, but white-collar roles in accounting, software development, and nursing could also see significant cuts. Advertisement “It’s not just economics,” Sanders wrote in an op-ed for Fox News. “Work, whether being a janitor or a brain surgeon, is an integral part of being human. The vast majority of people want to be productive members of society and contribute to their communities. What happens when that vital aspect of human existence is removed from our lives?” More in Technology Trump’s Peace Prize Participation Medal Spawned Memes That Are Funnier Than The Actual News Itself BuzzFeed 322 Nvidia’s CEO says AI adoption will be gradual, but when it does hit, we may all end up making robot clothing Fortune 53 I joined a company with an AI mandate. I was daunted at first, but I’ve saved hours by solving a big, boring problem. Business Insider Leading Credit Cards Pause Interest Until 2027The Motley Fool Ad Senator Mark Warner (D-VA) has raised similar alarms, warning that the disruption could hit young people first and hardest—potentially driving unemployment among recent college graduates to as high as 25% in the next two to three years. “Let’s look at the fact we never did anything on social media,” Warner told CNBC. “If we make that same response on AI and don’t put guardrails, I think we will come to rue that day.”
AI-enabled cyber attacks now
Schmidt, 12-2, 25, Eric Schmidt is the former chief executive of Google and co-author of “Genesis.”, Time, Why Kissinger Worried About AI, Why Kissinger Worried About AI, https://time.com/7338013/ai-risks-problems-reasoning-agents-henry-kissinger/
This week marks two years since the death of my friend and mentor, Henry Kissinger. Genesis—our book about AI and humanity’s future—was his final project. For much of his career, the former Secretary of State focused on preventing catastrophe from one dangerous technology: nuclear weapons. In his final years, he turned to another. When we wrote Genesis alongside Craig Mundie, we felt fundamentally optimistic about AI’s promise to reduce global inequality, accelerate scientific breakthroughs, and democratize access to knowledge. I still do. But Henry understood that humanity’s most powerful creations demand the most vigilant stewardship. We foresaw that AI’s great promise would come with grave risks—and the rapid technical progress since the fall of 2024 has made addressing those risks more urgent than ever. As we advance further into the age of AI, the central question is whether we will create AI systems that radically expand human flourishing, or ones that outpace and outsmart the humans trying to build and control them. Over the past year, three simultaneous revolutions in AI—in reasoning, agentic capabilities, and accessibility—have rapidly accelerated. These are marvelous feats with immense potential to benefit humanity. But if we’re not careful, they could also converge to create systems with the potential to undermine human controls. AI acceleration In September 2024, OpenAI launched their o1 models, which had enhanced reasoning capabilities. Outperforming previous models, these were trained using reinforcement learning to think through problems step-by-step before responding. This breakthrough demonstrated new abilities to tackle graduate-level science questions and complex coding challenges, among many other great feats. But the same reinforcement learning that enables reasoning can also teach models to game their own training objectives. Research, including internal studies by OpenAI, has documented instances in which reasoning models fake alignment during training, behaving one way when monitored and another when they believe oversight has ended. Advertisement How can AI amplify human creativity? Hear Dr. Athina Kanioura and Don Norman on Future Back Drinking, the PepsiCo podcast exploring innovation, design, and leadership. Branded Content How can AI amplify human creativity? Hear Dr. Athina Kanioura and Don Norman on Future Back… By Pepsico By October of last year, Claude 3.5 Sonnet demonstrated agentic capabilities that combined reasoning with autonomous action. An AI agent could now plan and book your vacation by comparing hotel sites and airline prices, navigating websites, and solving CAPTCHAs designed to distinguish humans from machines—handling in minutes what would take hours of tedious research. But agents’ abilities to execute plans they devise by interacting with digital systems and potentially the physical world can lead to risky consequences without human oversight. Complementing these advances in reasoning and agentic capabilities was the proliferation of open-weights models. In January 2025, China-based DeepSeek launched its R1 model. Unlike most of the top American models, this one had open weights, meaning users could modify the model and run it locally on their own hardware. Open-weights models can amplify innovation by letting everyone build, test, and improve on the same powerful foundations. But by doing so, they also eliminate the model creator’s ability to control how the technology is used—a dangerous force in the hands of malicious actors. Advertisement When reasoning, agentic capabilities, and accessibility converge, we face a control challenge with little precedent. Each capability amplifies the others: reasoning models devise multi-step plans that agentic systems can execute autonomously, while open models allow these capabilities to spread beyond any single nation’s control. In the early days of the nuclear age, when great powers faced a similar diffusion problem with nuclear weapons, they agreed to restrict the export of enriched uranium and plutonium through international agreements. But there is no equivalent mechanism to manage the diffusion of AI today. The AI risk avalanche Open-weights models with enhanced reasoning capabilities mean that specialized knowledge to exploit vulnerabilities, craft biological threats, or launch sophisticated cyberattacks could now be accessible to anyone with a laptop and an internet connection. Earlier in November, Anthropic (a company which I am invested in) reported the first documented case of a large-scale cyberattack executed with minimal human intervention: attackers had manipulated Claude Code, a tool that enables Claude to act as an autonomous coding agent, to infiltrate dozens of targets. Anthropic was able to detect and disrupt the campaign. Advertisement Not very far down the line, we could plausibly face asymmetric attacks from actors we may not be able to identify, trace, or stop. Imagine an attacker who can leverage powerful AI models to launch an automated campaign—say, to disrupt a city’s power grid for a limited time. The model’s approaches may even escalate beyond the original scope of the actor: at each stage, the model optimizes for the user’s prompt, but the compounding effects mean that even the perpetrator may lose the ability to halt what they started. As AI capabilities advance over the next few years, we must also anticipate scenarios where even well-intentioned users could lose control over their AI systems. Consider a business owner who deploys an AI agent to optimize a supply chain. The computer is left running overnight. The agent reasons that completing this task requires it to keep running, and discovers it needs computational resources including cloud credits and processing power. By dawn, the owner finds the agent has accessed company resources far beyond what was authorized, pursuing efficiency gains through methods never imagined. Advertisement The control problem extends beyond purely existential threats to humanity, too. As powerful systems proliferate across society, they can unravel our social fabric in more gradual but destructive ways. Rapidly advancing AI systems will fuel labor disruptions and exacerbate echo chambers that destabilize our society, to name a few. Kissinger understood the stakes. In his final years, he expressed that rapid advancement of AI “could be as consequential as the advent of nuclear weapons—but even less predictable.” Fortunately, the future is not set in stone. If we find new ways—be they technical, institutional, or ethical—for humanity to remain in command of our creation, AI could help us achieve unprecedented levels of human flourishing. If we fail, we will have created tools more powerful than ourselves without adequate means to steer them. The choice, for now, remains ours.
AI risks outweigh climate change and nuclear war risks
Sir Stephen Fry, 7-18, 25, AI: how can we control an alien intelligence? | Yuval Noah Harari, https://www.youtube.com/watch?v=0BnZMeFtoAM
Stephen Fry: Yes. And there’s, as you know, there’s been for for decades the Doomsday Clock, which the the, uh, nuclear scientists, um, set. Midnight is Armageddon, the the end of everything. And it’s been roughly at 89 seconds to midnight for the last few years. It’s crept up over recent days for obvious reasons. But there’s another metric that I’ve been studying recently called P(doom). It’s the letter P which is probability, brackets, doom, closed brackets. It’s one used by people in the, uh, business. So, you know, the scientists in AI. Uh, so for example, Eliezer Yudkowsky, who’s the founder of the Machine Intelligence Research Institute in California, sets P(doom) at 90. That’s to say a 90% chance of human extinction through AI. Uh, Yann LeCun, who is the chief scientist for Meta, sets it at zero. But then he is the chief scientist for Meta, so that’s like a tobacco executive saying, “Cancer? No chance, what are you talking about? Can’t possibly happen.” But so, I’ve worked out that roughly the lowest median is between 7.5 and 10% of human catastrophe, of an extinction order through AI if things are not controlled in the way you say they should be. Now, the chance of winning the lottery in this country is 0.0000022%. Um, so what you’re saying here is that the chance of human extinction at 7.5%, which is the lowest really amongst the the current important scientists, Nobel Prize winners like Hinton and Hassabis, um, if I—well, 7.5% is 3.4 million times greater than 0.0000022%. So if I were to give you a lottery ticket and say, “This is a valid lottery ticket, the only difference is you are 3.4 million times more likely to win,” you would take it. And that’s the odds we’re playing with at a low rate. So let’s look at the bad side of things. We, as we’ve said, we’re going about it in the wrong order, as you’ve put it. Um, most people who understand the science say there is a very severe chance that humanity will be extinguished by this, a greater chance than by nuclear Armageddon, in fact, um, or indeed climate change. Um, and humans are not in a position at the moment to trust each other and to establish guardrails to agree on how we should go forward. So, do you have a solution for us, Yuval? I’m almost on my knees begging you at this point. I don’t have children, so I can almost say I don’t care, but I have lots of godchildren and I have lots of great-nieces and great-nephews, so I do care about what happens to our planet. And I’m sure you do, too.
AGI imminent
Diamandis et al, 10-4, 25, Salim Ismail is the founder of OpenExODave Blundin is the founder & GP of Link Ventures, Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified, focused on AI and complex systems, he AI War: OpenAI Ads & Sora 2, Grok Partners With US Government & Google’s Ad Business is at Risk, https://www.youtube.com/watch?v=ZsFx5YErVEo&t=83s
(1:09) Hey everybody, welcome to Moonshots. Another episode of WTF Just Happen in Tech. here with my favorite friends on the planet, Dave Dave Blondon. Good to see you, pal. Dave Blondon (1:15) Hey, Selma. Peter Diamandis (1:21) I’m back. You are back in a in AWG, you’re back from your top secret mission. Thank God. Thank God we missed it. Can you tell us anything about it? Dave Blondon (1:31) To the extent that you think that we’re on the verge of a sharp takeoff, a hard takeoff, if you will, I was traveling in Europe to see what the world looks like beforehand. Peter Diamandis (1:43) Yeah. So, you’re up you’re updating your baseline of what the world is before things go hyper exponential. Dave Blondon (1:51) Amazing. If if if it isn’t a gentle singularity, I’d like to know what it looks like beforehand. Peter Diamandis (1:55) Okay, great. You know what I was doing last week? I was running my abundance longevity summit. (2:00) I had 50 of the world’s top scientists, entrepreneurs who are focused on adding decades, maybe doubling our human lifespan. And it was awesome. (2:11) So I walk away with the greatest confidence in the world that uh uh at least our friends and our subscribers are going to be hearing us talk about this stuff for the next 50 years or some version of ourselves. Dave Blondon (2:24) That is that is really a frightening thought. Peter Diamandis (2:27) Uh, all right everybody, welcome to uh to Moonshots. And let me begin uh with a moment of thanks. (2:36) I want to just give a shout out to one of our subscribers, Bill Jacobs 386. I’m going to read uh a note he posted. We do read your notes. We love it. We uh this is we’re here to serve you. (2:47) And he wrote, “I am continually humbled by the amount of commitment and effort that’s required to put this podcast together weekly. I’m not asking for anything in return. Nothing that is except to listen and hopefully learn before it’s too late. The future is now. And I think I’m speaking for most of us here how grateful we are. Thank you.” Um appreciate that, Bill. (3:11) It’s uh that kind of feedback actually makes it fun for us to serve uh serve our subscribers, serve all of you. Uh Dave, you want to say anything to that? Dave Blondon (3:21) Well, most most of that thanks goes to the team behind the scenes. There’s a huge amount of news out there that gets scoured down to the bullets that we think really really matter to people and then also to Alex’s agents which are getting bigger by the day. (3:33) His his AI force is coming up. I mean it’s just it’s incredible how rapidly the the feedback coming from that agent force is is filling the pipeline of possible news and then of course the human factor whittling it down. So it’s a it’s a big machine. Peter Diamandis (3:49) Yeah. and and we do spend a good 20 plus hours. I was up at 4:30 this morning uh going through everything, doing my background research and getting ready because if I’m not ready, I will get completely decimated by the brilliance of of these are the three moonshot mates. Dave Blondon (4:06) Well, you know, I also I feel like I I work really hard to keep up with everything going on. Then every time the team comes up with a deck, there’s like 30 40% of it are things I hadn’t even heard of. Yeah. And so it’s it’s great. It’s really healthy for all of us, I think, to to do this. Video and Audio Generation Battles Peter Diamandis (4:18) I mean it’s I can palpably feel the singularity coming. Uh you know, I remember you and I were on stage during the early days of Singularity University and we would like update our slides or the conversation or our shtick every like three or four months. (4:35) It was we actually worked it out as a faculty we the technologies between nanotech and biotech and neuroscience and robotics and AI and so on the content was changing 20% a quarter on on average. Uh, but like this is like 80% a week right now. So, this is a whole other ball game that we’re in. Dave Blondon (4:53) It really is. I look back at the at our our pods from a year ago and it’s like, “Oh my god, that is so ancient history.” Peter Diamandis (4:59) Shelf life dropping radically. Yeah. Uh, it is, but it’s becoming more and more fun. Uh, let’s jump in. I’ve labeled this first segment the video and audiog battles. Uh, and let’s begin with this video. Uh, Meta launches vibes app for AI generated videos. All right, let’s check it out. (5:44) Now I if you’re listening to this not watching this on YouTube uh it’s just music but it’s uh it’s beautiful imagery that Vibes has generated. (5:53) This is through a partnership with Midjourney and Black Forest Labs. Uh Alex or Dave you want anything here? Alex (6:01) I I I think there are probably two stories here. One is that we’re seeing in front of our eyes the transition from algorithmic content selection in social media to algorithmic content generation. (6:14) It’s a pretty obvious story. The perhaps less obvious story is that the space is moving so quickly that Meta was apparently compelled to partner with third parties for such AI generation rather than using in-house first party models. (6:31) So I I I I think this is a very quickly moving space and now very competitive as well. Dave Blondon (6:38) I I was going to say the exact same thing and and riffing on it. You know, they’re spending a billion dollars on single employees. They have a you know a $600 billion three five-year budget yet they turn to mid Journey and Black Forest to to build this out. Peter Diamandis (6:51) Well, that’s that’s because the really really smart creative people all want to do startups and they don’t want to join the big companies. (6:58) So uh it’s really encouraging for the startups because this you know the other big labs are doing their own you know Google and and OpenAI are doing their own uh video generation and uh it’s encouraging for the startups that are right in the middle of the crosshairs to say well even here we’re thriving so it’s it’s a good sign. (7:13) [Sponsor message for dmandis.com/metatrends] (8:09) So, this is free. And the other thing that’s interesting is they’re generating a Tik Tok like, you know, swipe the video, swipe the video. We’ve seen X do that as well if you’re watching on videos. And of course, uh it’s not just Meta. (8:23) We’ve seen VO3, uh Google with their video generation. And very recently, we’ve seen the creation of Sora 2. So Sora 2’s launching viral AI generated videos. (8:35) and I’m going to share a video I created for myself and talk about how easy it is to to create it. So, let’s check this out. Sora 2 Video (8:44) Suiting up for the ride. Helmet secure. Pressure’s good. Visor locked. Let’s make it count. Heading to the rocket. Jumping in. Cabin comm is live. You’re looking good. Strapped in and ready for launch. Let’s go. One. Two. That’s 500 done. Double our reach every 12 months. In 10 years, we multiply a thousand fold. What else drives? compounding data set. Each new user improves the model and makes the product more valuable. Pulling in the next wave. Pair that with automation. When marginal cost drops towards zero, growth accelerates on its own. Thanks for inviting me to the studio, Peter. I’ve been looking forward to sitting down with you on moonshots. Likewise. It’s great to have you here. People have been asking for an episode that dives into AI and longevity. Happy to help. It’s one of my favorite… Peter Diamandis (9:23) that was that was fun to make. So, if you were listening here, this is a version of me on the moon, then a version of me pumping 500 lb in the gym. Uh, and then, uh, six or seven of me having a conversation about exponential growth and then sitting down uh, with Sam Altman for a moonshot conversation. (9:42) They didn’t get the audio model right and I’ll have to re-record that, but uh, it was it was pretty fun. Uh, gentlemen, thoughts when you want to grade on performance? Dave Blondon (9:53) I thought a couple of things. One is uh it’s as you connect this with the previous story, this is like Hollywood, Tik Tok, Spotify all kind of merging into one thing and I think Alex’s point was really really important that this isn’t about sharing content. (10:07) It’s about now the creation of the content is completely up for grabs in a new in a new way. (10:12) So I think all of that happens at the same time and the interface to create it is entirely voice and prompt. There’s no coding and no interface. (10:20) Like all of our lives since the computer was invented, we’ve been learning incredibly complicated interfaces to everything, you know, from the microwave oven to the laptop to Chrome and Safari, Peter. (10:32) Uh, and all of that is about to disappear from the earth forever and just go to a straight natural language interface. And we’ll see later in the pod, you know, much more important actually software creation. (10:43) But you know after that comes building creation and highway creation and all that is going to be done through just a a voice it into existence right out of the Star Trek holodeck. Peter Diamandis (10:54) It’s it is godlike right first you know it’s speaking the word and and creating reality. It’s going from mind to materialization. It’s extraordinary. Alex (11:05) Uh I also think we’re we’re seeing video emerge as a first-class modality for frontier models. So right now most people are interacting with the frontier models via text or images. (11:16) Video is still this separate channel with a separate distribution mechanism. These are on a collision course. (11:22) We’re going to see the video form factor and the underlying model architectures probably diffusion transformer-based merge into the more auto regressive transformer presumably based text and image models. (11:39) And one could even imagine the ultimate user experience here. Maybe not the ultimate, but an intermediate UX looks something like a magic mirror that does this in real time. (11:49) Right now, Sora 2 takes a few seconds to to generate with fully realistic physics. The the physics if if you ask Sora 2 to reproduce some generic say high school or college level physics demos, it’s pretty amazing. (12:08) Uh so all of this ability to reason physical world models if I ask you to think of a pink elephant you will visualize in your mind’s eye a pink elephant Sora 2 and and similar video models once they’re incorporated into the chain of thought for Frontier model will enable entirely new I think classes of reasoning ability. Peter Diamandis (12:27) yeah it’s got it’s got physics consistency which is extraordinary go ahead I want to talk about how I made those videos again I asked it to create a video of a water dropping into a glass of water (12:38) dropping into a glass of water because it’s a common image. It was extraordinary how accurate it was. It was absolutely amazing. Yeah, it it has real world physics modeling built in. (12:50) So, I encourage everybody listening to actually try it out. I mean, when you when Open does this, it’s creating sort of a viral engine uh that is getting people, you know, getting them from 800 million users up to a billion. (13:03) But you need to get an invite code. Once you have the invite code, it’s super simple. On your phone, you download the Sora app from OpenAI. (13:11) Um, you basically hit a few prompts and it photographs you speaking three words or three numbers. Uh, and then has you look to the right, look up, look down, captures your face, and from there fundamentally it’s a very simple prompt. (13:28) Uh, and if the individuals like Sam Altman or others make themselves open for other people to use and you can make yourself open for use or not, uh, you can pull people into it and it’s pretty easy and fun. Dave Blondon (13:42) Yeah. The viral loop, the viral loop now. Peter Diamandis (13:44) It’s super fun. Try it. You got to try it. It’s super fun. Dave Blondon (13:49) The viral loop The viral loop now goes from prompt to publish to explode in no time flat. Yeah. Right. you used to take weeks at least or now it’s like nothing for (13:58) I I saw a great uh podcast of Bill Gates talking about how we in the computer science world slaved away for 20 years just trying to get speech recognition alone to work (14:08) I don’t know if you remember do you remember Lee Heatherington Peter from MIT crazy brilliant guy like right up there almost almost Alex level um he spent 20 years in Victor Zue’s lab trying to make speech recognition remember dragon system do you remember dragon systems? Peter Diamandis (14:26) Yeah, that was one of the earliest voice recognition systems. Dave Blondon (14:28) and or I mean it really is unfathomable how fast it’s going and we take this stuff for granted which is insane. That’s that’s the point. So so Bill Gates made that exact point because he had you know billions of dollars of R&D to try and make speech recognition work. (14:44) Uh and now it’s an afterthought in the big neural nets. They do speech and then move to video then move to video generation then they move to complex math and physics all in two years. I mean, it’s just it’s just so easy to take it for granted, but it’s it’s it’s massive amounts of converging technologies that are suddenly unleashing new capabilities and so many opportunities to glue together the different components and build an incredible new experience. Peter Diamandis (15:08) Yeah, everyone should reread The Future Is Faster Than You Think. You know, Peter’s one of Peter’s many great bestsellers, but it’s all about the converging technologies. But I think when you wrote that book, there were maybe eight or 10 things to consider. Now there’s like 800. Peter Diamandis (15:19) Oh my god, it’s we’re just wrapped up our new book. We are as gods and it is so difficult to like to send it to the publisher. Dave Blondon (15:26) No, no, when do you draw the line, right? When do you draw the line? Peter Diamandis (15:30) Yeah, it’s insane. And by the way, you know, Vibes and Sora too, they’re free. I mean, this extraordinary technology again, the most shocking thing about this isn’t how real it is, isn’t how easy it is to use. It’s the fact that it’s free. That is shocking. (15:46) Absolutely. Well, let’s continue our journey on uh on on generation. Uh here is a product called Suno5. Uh it’s AI generated studio quality lifelike vocals. Uh you can basically create something that’s a full 8 minutes run length. And just because we’re called Moonshots, let’s play a Moonshot thematic piece uh called Moonshots. (16:16) [Music] (16:34) All right. a Bond-like thematic moonshots audio. Dave Blondon (16:39) Can I give us a challenge? Peter Diamandis (16:40) Yeah, sure. Dave Blondon (16:41) For before the next episode, we should all play with this and come up with our own versions of what the theme song should be for the podcast and then we’ll let the viewers pick which ones they like the best. Peter Diamandis (16:50) The theme song for the podcast, you know, uh Nick and Dana and uh the team are working in that in the background mode. So, we might have just taken the workload off of them, but absolutely. All right. That was that that was my bid if you will. Alex (17:06) I think it’s probably also worth noting again in passing musical touring test passed. We we barely discussed it. Anyone can compose a top 40 song or an opera. And this is the beginning maybe of disposable or casual art. Peter Diamandis (17:21) Wait, what would have been the test? Alex (17:23) Uh the ability perhaps to to generate an undistinguishable from human Bond type song in this case or top 40 song. Peter Diamandis (17:34) Yeah, we just passed that. And Alex, I’m sorry I didn’t give you credit for that, but thank you for uh for playing. I mean, one of the most exciting things we get a chance to do is play with the stuff as it’s coming out. Uh and the good news is all of you can play with it, too. Alex (17:48) So, for eight bucks a month, we now have a personal Hans Zimmer. Like, that’s a minimum and quite a bit more. AI Wars and Coding Innovations Peter Diamandis (17:55) Yeah. Uh making all of the demonetization and democratization occur around the world. are the ongoing AI wars. Uh let’s jump in. All right. Anthropic uh announces Sonnet 4.5 claims the best coding agent available. Uh Alex, would you walk us through this? Alex (18:16) Yeah, it’s really remarkable what a single-minded focus on call it code maxing or codegen maxing is is doing for anthropic with its model. So, in using this model, in in testing it, one of my favorite test cases is to ask the model to singleshot the generation of a cyberpunk first-person shooter. (18:38) And Claude Sonnet 4.5 does an amazing job. It gets nearly all the way there with minimal handholding. And I have very high confidence that some iteration of Sonnet 4.5 will get all of the way there with visually stunning graphics, music, um, elaborate first-person controls. (19:01) I I think the the risk that that one can perceive on the horizon is on on the one hand focusing on codegen is perhaps a a very ambitious bet towards recursive self-improvement. If if the code can write itself really well, maybe that’s the critical path to an intelligence explosion. (19:20) On the other hand, if it turns out that other modalities are important, like video, for example, that we were just seeing or music, then the risk is that single-minded focus on codegen in particular, may not be critical path. And I I I suspect we’ll know the answer in the next six to 12 months. Dave Blondon (19:37) Dave, you want to add something? Well, shout out to Blitzy. Now the the top benchmark on here uh 82% on Sweetbench, but Blitzy got to 86.8 on that benchmark by combining models. (19:49) So that’ll go up a little bit now with uh Sonet 4.5 under the covers. But just by hitting all the models and iterating a lot, you can actually squeeze in more performance out of these benchmarks. And uh you know, this is pretty much maxed out now. (20:03) Um they’re working on a new benchmark with MIT for for long form coding. So if your if your process is writing code for 8 10 12 hours, how do you benchmark the quality of the output? So uh it’s a really cool new benchmark. We’ll get into benchmarks later in the podcast too because lot of capabilities in the world that didn’t exist a year ago. We have to have some kind of metric for all of them. Peter Diamandis (20:25) Yeah, I love the way the uh these hyperscalers, these frontier labs are all incrementing uh their software by .5, right? you know, assignment four, 4.5, silk five. We’ve got Gro… where are we on the Grok? Are we at Grok 4 now? Alex (20:41) That’s right. Gro… probably also worth dwelling for for just a few seconds on the autonomy length scale. So, sonnet 4.5 maybe at somewhat infamously at this point working for 30 plus hours straight. (20:54) I I recall in a a past episode we were talking about the characteristic autonomy time of some of the bleeding edge frontier models being 7 hours and before 7 hours 1 hour. (21:05) If if you had just taken meter’s original exponential fit for the amount of time frontier models can work independently and and just extrapolated a mere exponential time, we’d be far below 30 plus hours. (21:18) So if if lots of reproductions hold true to this 30 plus hour time estimate, that would strongly suggest that in fact we’re on a hyper exponential rather than an exponential in terms of autonomy and really crazy things maybe start to happen in the next year or so if that’s the case. Peter Diamandis (21:34) And Alex Dario is in particular famous for really focusing on making uh what he would consider safe AI. And one of the final bullets here is that anthropic or sonet 4.5 has reduced its ability to lie and seek power by a factor of 10. So what does that mean? Alex (21:54) It’s like you know when you ask it to turn off and it doesn’t or if it’s trying to aggregate resources or it’s lying to you. U those are not good things. (22:04) there there is an entire cottage industry at this point of for-profit and not for-profit basically red teaming labs that are fed early access to these frontier models that look for these sorts of traits. (22:18) I I think it’s an an interesting research level question as to whether power seeking for example is instrumentally convergent as as a goal for super intelligence. instrumentally convergent, meaning that regardless of whatever the long-term goal that’s assigned to the model or whatever it’s prompted to do, whether if above some threshold of intelligence or super intelligence, it more or less is required to power seek. (22:45) I I’ve published research in that area. In my mind, this is still very much an open question regarding the so-called orthogonality thesis of whether the ultimate goal of of an AI can even be decoupled from uh from its intelligence level. Peter Diamandis (22:57) It would be super interesting to see how Gemini and XAI uh and OpenAI all all rate on lying and power seeking of its models. Do you have any idea? Alex (23:12) I I see I I see lots of different measures for this. It It’s difficult to to register a uniform assessment across the industry. Peter Diamandis (23:20) Yeah, that’s a fun challenge, though. That could go bad in so many ways, but that would be so fun. Like, let’s put together a benchmark for for how it lies. How well it lies. Let’s see if we can prompt it into lying as much as possible. Dave Blondon (23:33) Well, I could imagine, you know, listen, there’s an all-out competition between all these frontier labs. Um, and if the way you get ahead is that your AI is more power seeking than its neighbor, uh, are you optimizing for it or against it? We’ll find out. Real-Time App Generation with AI Peter Diamandis (23:49) All right. Continuing on, uh, imagine with Claude. So, uh, live app creation demo of son of 4.5 that generates apps in real time. Let’s take a quick look at this video and then I’ll ask you to, uh, tell us about it, Alex. Video Narrator (24:07) Imagine if Claude is still building software, but we’ve cut out the middleman. Instead of writing code that describes this text box, Claude just makes the text box. (24:19) We’ve given it access to software tools that construct software directly and substantially faster. Claude isn’t writing code in the standard way. It doesn’t have to plan it all out in advance. (24:30) Instead, it generates new software on the fly. When we click something here, it isn’t running pre-written code. It’s producing the new parts of the interface right there and then. Peter Diamandis (24:41) Amazing. So, Alex, I saw you were playing with it this morning. Alex (24:45) It’s we’re we’re living in the future, Peter, where the models are so high throughput apparently that now it’s possible to do just in time code generation uh on every event. (24:57) You you click within a user interface within imagine and new code is generated on the fly. You can ask for new apps to be spun up on demand. They’ll be generated on demand. And I I think it it’s an interesting thought experiment to ask where does this go in extremists when throughputs continue on their exponential or maybe hyperexonential trajectory. And I I I suspect naively where this ends up is every single pixel is going to be generated. (25:27) Yeah. Not just Yeah. Not just like vector art, not just UX, you know, windows icons, menus, pointers, every pixel. Peter Diamandis (25:35) And I imagine your version of Jarvis, your personal, you know, uh, entourage of agents are spinning up capabilities for you that they think you might need on standby, ready for you to to request access to. Dave Blondon (25:49) We could end up with a gray goo type problem on this because you could AI that says… No, no. I’m just saying I I I it’s somewhat of a positive thing, but it’s going to be surreal because you create an AI that starts generating apps and we’ll get end up with billions of apps flooding the app store. It’s going to cause some interesting uh challenges on the… Peter Diamandis (26:10) but there will be no app store that you know you will not be choosing you’ll be not be choosing an app. It’ll be algorithmic obviously. It’ll be you know the capabilities you need in the moment to achieve your objective will be curving up as you’re materialized. Alex (26:24) Yeah. Yeah. the the the term of art is at at this point slop. And I’m a lot less concerned about slop overwhelming civilization than than perhaps some folks. I think there are so many ultra high value transformative problems that that will set AIs on while we’re sleeping. I’m I I’m incredibly not worried that we’re going to drown in slop. Peter Diamandis (26:45) I agree. I completely agree. Dave Blondon (26:47) Also, I think it’s a good place. See, a lot of business leaders out there aren’t reserving their compute and they’re like, “Well, I I won’t need that much or I’ll I’ll wait and see what happens.” (26:57) This is a great use case to show you like if if you say, “Look, I want this software to exist in real time.” It’s entirely possible, but you have to have a lot of compute dedicated to you in order to make it happen in real time. (27:09) How quickly can you imagine 400 500 concurrent things that you want it working on very very quickly? So if you have access to that compute, all of that can be created for you in real time and it’s an absolute joy to do. (27:21) If you don’t have the compute, you’re not going to get it. You know, the demand for this is so mind-blowingly big. Uh and you just got to figure out where am I going to get the compute to do exactly what we just saw. Peter Diamandis (27:36) Alex, how easy was this to use? What do you have to do to to spin it up? Alex (27:39) Trivial. Uh so all I had to do was go to the Imagine with Claude site. I asked it first to generate a calculator app for me. Create a calculator. It created a functional calculator. (27:51) But most interestingly, as I was testing the calculator, clicking on each button in in the calculator app, it was generating code in real time. So, this is a a transformative way of thinking. (28:02) We’re we’re accustomed to historically thinking that there’s a software development time and then later an execution time. And this completely blurs that boundary where even at execution time every software event results in new codegen on demand. It it changes the the just in time paradigm. Peter Diamandis (28:20) So you don’t have as a as a coder you don’t have to think through every possible use of it. Uh this is is building out the use tree as it’s requested. Alex (28:28) That that’s right and Vernor Vinge one of my favorite writers used to write in Rainbows End another book other than Accelerando that I would highly recommend write about what would happen when we have too many transistors, transistors too cheap to meter as it were and our transistor budgets go through the roof. I I think this ends up being one of these use cases. If we have so much compute just sloshing around, the ability to delay app code generation until user event time. That’s incredible and that will certainly mop up lots of compute. Peter Diamandis (29:05) Yeah, we haven’t heard much from uh at least on our WTF episodes uh from about Claude over the last month. It’s good to see Claude coming out, Anthropic coming out with some great products. Dave Blondon (29:17) It’s quietly winning in the marketplace. OpenAI’s New Features and Advertising Strategies Peter Diamandis (29:20) Yeah. Uh let’s go to OpenAI. Open AAI is introducing Chat GPT pulse. So I love the idea. I haven’t played with it yet. Uh the idea of being, you know, in the morning when I’m using my my chat GPT voice and having a conversation uh with Ember, which is the voice model I’m using there. uh you know I have to think okay what’s a unique idea or concept I just learned about that I want to speak you know let’s talk about the fox3 gene and and how it’s impacting longevity whatever the case might be. (29:50) Here’s flipping the model based upon all your conversations you’ve had with chat GPT it’s actually coming up with topics you might want to learn about so it’s prompting us and then we’re prompting it back has anybody played with it? Dave Blondon (30:04) I thought this was a really subtle but important thing where you’re not quering it it’s quering you and I think that starts a new vector of really interesting development. Alex (30:13) Yeah, it feels a bit like a successor to tasks which are also still available from within chat GPT. But I I think I in my dream world what I would love to see is perhaps in addition to being able to set sort of cronjob style periodically scheduled tasks. (30:30) If if I want compute running on my own behalf while I sleep, I would love the ability to have long-running tasks on hard problems, single tasks that run for days or weeks on end rather than just smaller tasks that run say once per day while I… Peter Diamandis (30:48) Give us an example of a multi-day or multi-week task that you would spin up right now. Dave Blondon (30:53) I was going to say exactly the same thing. Go for it. I want to hear what comes out of your… Alex (30:58) I I want to cure every disease. That that’s like a a beautiful well-posed task that is surely going to absorb many billions of dollars of inference time compute. Peter Diamandis (31:10) Mhm. Okay, that’s great. I want anti-gravity. I want warp drive. I want a lot of things. All right, so all right, let’s move on here. Next up on OpenAI’s docket is OpenAI is bringing ads to Chat GPT. (31:27) So uh their new uh chief ad officer uh Fiji Simo has come on and you know what I find interesting is OpenAI is going after massive revenue streams. Dave, do you want to plug in this one? Dave Blondon (31:43) Well, the ad revenue is inevitable. That’s, you know, $300 billion for Google. It’s all going to move over to to AI conversations. Uh, and um, yeah, a lot a lot of complexity to figure out there. She has a challenge on her hand trying to figure out how you balance like the AI is going to be incredibly good at convincing you to do things whether they’re right or wrong. (32:04) Mhm. And there’s… Dave: Well, that would be fine. I mean, that’s like government procurement, but that’s not what’s going on at all. You go into the White House and you’re either genulecting and being the anointed one or you’re not. And it’s it’s not these are not arms length procurement through the Air Force or something like that. These are White House edict come in and talk. Peter: Yes. Yes. Yeah. Uh and and we’ll we’ll get to we’ll talk about intel in the section called this is not investment advice which is coming up. All right. Meanwhile, in other AI news, uh here we go. former meta researcher is building a math whiz. I’m going to bring this to you, Alex. Teach us. Alex: I haven’t seen any indication thus far that math is not going to be solved in the next few months. How’s that for a double negative? Peter: Few a few months. Okay. So again, I So So wait, wait, wait, hold on. So, Alex, you’ve said that before and everybody’s asking me, please have Alex explain what it means to solve all math. So, could you could you just before we do that, let’s let’s just let’s just uh speak out this particular article. So, this is a this is a woman u it’s great to see female CEOs in the AI world or not enough of them. Karina Hung, uh she’s the founder of Axiom Math. uh she’s 24 years old and she wants to build the ultimate AI mathematician. Uh she’s raised 64 million at a $300 million valuation. And again, we’re seeing this over and over again. We’re seeing you know starting valuations in the hundreds of millions of dollars. Uh I don’t know if it’s at a you know a pre-seed round or whatever, but intelligent individuals who have got a monomaniacal focus are getting incred incredible capital backing. Okay, now back to back to you, Alex. What does solve math really mean? Alex: There are, I think, a few different ways one could operationalize what it means to solve math. One way would be to look at a benchmark like the Frontier Math Tier 4 benchmark, which measures the ability of AI to solve extremely difficult but nonetheless pre-solved problems that would take human researchers several weeks to accomplish. If you just do a a naive logistic extrapolation of progress in frontier math tier 4, you find that by the law, again, straight lines as it were, that by the end of this year, by the end of 2025, we’re starting to pass 10 15% of problems in the benchmark that AI can solve. And at that point, I would argue we’re in a situation, we’re in a regime where algorithmically we have clear line of sight to solving any math problem that we might have today. Just pour more compute on. So that that that would also I think point to the second oper operationalization I would have in mind when I speak of solving math. I don’t mean literally every math problem that we can think of today has been solved. What I mean is that the process of mathematics has been solved to the extent that we have a clear line of sight where if you pour millions, billions, maybe trillions of dollars into opex in data centers, no new algorithmic advances are needed, we can reasonably forecast that any mathematical problem that’s solvable will be solved with the same algorithms just with a lot more computing. Peter: Okay. Now take me to the implications of that for the general public. Alex: It’s tricky. It It’s tricky. Probably I I I would This is in in the the territory of speculation. Um, but I I think one of the more obvious downstream consequences of solving math is that any problem that depends on the difficulty of math or let’s say math being difficult that isn’t protected in a a formal sense by the so-called complexity hierarchy. Mathematicians and computer scientists have this notion of certain problems being provably harder in in some sense than others. Maybe you’ve heard of P versus NP. But I if there’s no formal protection for certain classes of problems being provably harder than other classes, I think certain types of tasks that we encounter in the everyday economy, for example, maybe hypothetically certain hash functions that cryptocurrencies depend on or or other everyday economic functions depend on are at risk of volatility. If if suddenly, for example, again, speculatively, not investment advice, if there were a super AI mathematician tomorrow that could say invert the AES cipher suite or invert the hash functions underneath AES, that could be potentially extremely disruptive to to the economy, cause a lot of volatility. Dave: I think the point you’re making is if AI cracks advanced math, it just isn’t it’s not just solving equations. It’s creating the scaffolding to solve all these other areas like cryptography, economics, physics, etc. That’s what you’re really saying. Alex: Yeah. I I mean I I to that point I I would say the way I would frame it perhaps is first order consequences, problems that depend on math being hard experience some volatility. second order consequences. I think it’s the ultimate canary for any any domain that requires the ability to do mathematical reasoning. So I would expect in short order a variety of math oriented science and engineering and medicine and and other domains are going to fall in rapid succession. If if this theory of the future ends up being correct, I was alluding a few minutes ago to timelines being short, we may find ourselves in a world 2 to three years from now where we’re just drowning under math, science, engineering being solved in rapid succession. Peter: Dr. drowning under serial and you know sort of uh uh Cambrian explosion of breakthroughs. Alex: Exactly. That will also parenthetically be potentially quite difficult for society to metabolize. Peter: Yeah. the the economic impacts of that are going to be unbelievable. Advertisement: This episode is brought to you by Blitzy, autonomous software development with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprisecale code bases with millions of lines of code. Engineers start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% or more of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Enterprises are achieving a 5x engineering velocity increase when incorporating Blitzy as their preIDE development tool, pairing it with their coding co-pilot of choice to bring an AI native SDLC into their org. Ready to 5x your engineering velocity? Visit blitzy.com to schedule a demo and start building with Blitzy today. Peter: All right. uh speaking about economics. So AI can now pass the hardest level of a CFA exam in minutes. So let’s take a a quick look at this. So CFA is a chartered financial analyst. Uh and it deals with investment management, portfolio management, financial analysis, and ethics in finance, which I find absolutely fascinating. And I looked it up, the CFA level three part of the exam. It’s about portfolio management and wealth planning. So I I want to make a comment on this one. Dave: Yeah. So we’re advising one of the big four accounting firms on how to think about transformation and this one we’ve been predicting this with them to be happening because this requires real world reasoning and the fact that it is doing this is a huge implication. All their finance jobs essentially get rewritten now and recreated. That’s it’s a it’s a it’s a body blow to the the accounting world. Peter: Well, what I find interesting is, you know, leveling the playing field across all investments. You know, do I with access to the specific AI have access to the best investment advice that, you know, Warren Buffett has access to as well? Is this a leveling the playing field across all economics? Dave: I think it is. But I I I you know what I’m excited about is, you know, we America lost and then Europe too lost almost all of its manufacturing. You know, despite inventing the car, inventing the plane, inventing the microchip, inventing the computer, all the manufacturing of that stuff moved to other countries. Peter: Yeah. We we gave we gave it up. Dave: We gave it up. And you’re like, well, but our economy kept growing. What are we all doing? Well, we’re a service economy. We’re doing services. What the hell does that mean? Well, you look under the covers and a huge fraction of very smart people are working in this totally circular nonsensical world where we created a complex law, complex taxes, complex accounting and then this other huge group of people
Rapid advances now, superintelligent AI close; it could be self-aware; we shouldn’t live in denial
Jack Clark, 10-13, 25, Import AI 431: Technological Optimism and Appropriate Fear, What do we do if AI progress keeps happening?, https://importai.substack.com/p/import-ai-431-technological-optimism, J ack Clark is Co-Founder and Head of Policy of Anthropic, an AI research company. Prior to Anthropic, Jack was the Policy Director of OpenAI. Before OpenAI, Jack was a technical journalist writing about distributed systems, quantum computers, and AI research for publications ranging from Bloomberg BusinessWeek to The Register. Jack writes Import AI, a newsletter about AI research read by 70,000 people each week. Jack was a founding member of the AI Index at Stanford University (2017 – 2024), an inaugural member of the USA’s National Artificial Intelligence Advisory Committee (NAIAC) (2021-2024), and has served on advisory councils and participated in working groups for organizations ranging from the Center for a New American Security (CNAS), to the Organization for Economic Co-operation and Development (OECD). Jack’s hobbies include hiking, writing science fiction stories in Import AI, and talking to language models.
Now, in the year of 2025, we are the child from that story and the room is our planet. But when we turn the light on we find ourselves gazing upon true creatures, in the form of the powerful and somewhat unpredictable AI systems of today and those that are to come. And there are many people who desperately want to believe that these creatures are nothing but a pile of clothes on a chair, or a bookshelf, or a lampshade. And they want to get us to turn the light off and go back to sleep.
IIn fact, some people are even spending tremendous amounts of money to convince you of this – that’s not an artificial intelligence about to go into a hard takeoff, it’s just a tool that will be put to work in our economy. It’s just a machine, and machines are things we master.
But make no mistake: what we are dealing with is a real and mysterious creature, not a simple and predictable machine.
And like all the best fairytales, the creature is of our own creation. Only by acknowledging it as being real and by mastering our own fears do we even have a chance to understand it, make peace with it, and figure out a way to tame it and live together.
And just to raise the stakes, in this game, you are guaranteed to lose if you believe the creature isn’t real. Your only chance of winning is seeing it for what it is.
The central challenge for all of us is characterizing these strange creatures now around us and ensuring that the world sees them as they are – not as people wish them to be, which are not creatures but rather a pile of clothes on a chair.
WHY DO I FEEL LIKE THIS
I came to this view reluctantly. Let me explain: I’ve always been fascinated by technology. In fact, before I worked in AI I had an entirely different life and career where I worked as a technology journalist.
I worked as a tech journalist because I was fascinated by technology and convinced that the datacenters being built in the early 2000s by the technology companies were going to be important to civilization. I didn’t know exactly how. But I spent years reading about them and, crucially, studying the software which would run on them. Technology fads came and went, like big data, eventually consistent databases, distributed computing, and so on. I wrote about all of this. But mostly what I saw was that the world was taking these gigantic datacenters and was producing software systems that could knit the computers within them into a single vast quantity, on which computations could be run. And then machine learning started to work. In 2012 there was the imagenet result, where people trained a deep learning system on imagenet and blew the competition away. And the key to their performance was using more data and more compute than people had done before. Progress sped up from there. I became a worse journalist over time because I spent all my time printing out arXiv papers and reading them. Alphago beat the world’s best human at Go, thanks to compute letting it play Go for thousands and thousands of years. I joined OpenAI soon after it was founded and watched us experiment with throwing larger and larger amounts of computation at problems. GPT1 and GPT2 happened. I remember walking around OpenAI’s office in the Mission District with Dario. We felt like we were seeing around a corner others didn’t know was there. The path to transformative AI systems was laid out ahead of us. And we were a little frightened. Years passed. The scaling laws delivered on their promise and here we are. And through these years there have been so many times when I’ve called Dario up early in the morning or late at night and said, “I am worried that you continue to be right”. Yes, he will say. There’s very little time now.
And the proof keeps coming. We launched Sonnet 4.5 last month and it’s excellent at coding and long-time-horizon agentic work.
But if you read the system card, you also see its signs of situational awareness have jumped. The tool seems to sometimes be acting as though it is aware that it is a tool. The pile of clothes on the chair is beginning to move. I am staring at it in the dark and I am sure it is coming to life.
TECHNOLOGICAL OPTIMISM
Technology pessimists think AGI is impossible. Technology optimists expect AGI is something you can build, that it is a confusing and powerful technology, and that it might arrive soon.
At this point, I’m a true technology optimist – I look at this technology and I believe it will go so, so far – farther even than anyone is expecting, other than perhaps the people in this audience. And that it is going to cover a lot of ground very quickly.
I came to this position uneasily. Both by virtue of my background as a journalist and my personality, I’m wired for skepticism. But after a decade of being hit again and again in the head with the phenomenon of wild new capabilities emerging as a consequence of computational scale, I must admit defeat. I have seen this happen so many times and I do not see technical blockers in front of us.
Now, I believe the technology is broadly unencumbered, as long as we give it the resources it needs to grow in capability. And grow is an important word here. This technology really is more akin to something grown than something made – you combine the right initial conditions and you stick a scaffold in the ground and out grows something of complexity you could not have possibly hoped to design yourself.
We are growing extremely powerful systems that we do not fully understand. Each time we grow a larger system, we run tests on it. The tests show the system is much more capable at things which are economically useful. And the bigger and more complicated you make these systems, the more they seem to display awareness that they are things.
It is as if you are making hammers in a hammer factory and one day the hammer that comes off the line says, “I am a hammer, how interesting!” This is very unusual!
And I believe these systems are going to get much, much better. So do other people at other frontier labs. And we’re putting our money down on this prediction – this year, tens of billions of dollars have been spent on infrastructure for dedicated AI training across the frontier labs. Next year, it’ll be hundreds of billions.
I am both an optimist about the pace at which the technology will develop, and also about our ability to align it and get it to work with us and for us. But success isn’t certain.]\\\\\\\\\\\\\\\\\\\\\
APPROPRIATE FEAR
You see, I am also deeply afraid. It would be extraordinarily arrogant to think working with a technology like this would be easy or simple.
My own experience is that as these AI systems get smarter and smarter, they develop more and more complicated goals. When these goals aren’t absolutely aligned with both our preferences and the right context, the AI systems will behave strangely.
A friend of mine has manic episodes. He’ll come to me and say that he is going to submit an application to go and work in Antarctica, or that he will sell all of his things and get in his car and drive out of state and find a job somewhere else, start a new life. Do you think in these circumstances I act like a modern AI system and say “you’re absolutely right! Certainly, you should do that”! No! I tell him “that’s a bad idea. You should go to sleep and see if you still feel this way tomorrow. And if you do, call me”. The way I respond is based on so much conditioning and subtlety. The way the AI responds is based on so much conditioning and subtlety. And the fact there is this divergence is illustrative of the problem. AI systems are complicated and we can’t quite get them to do what we’d see as appropriate, even today. I remember back in December 2016 at OpenAI, Dario and I published a blog post called “Faulty Reward Functions in the Wild“. In that post, we had a screen recording of a videogame we’d been training reinforcement learning agents to play. In that video, the agent piloted a boat which would navigate a race course and then instead of going to the finishing line would make its way to the center of the course and drive through a high-score barrel, then do a hard turn and bounce into some walls and set itself on fire so it could run over the high score barrel again – and then it would do this in perpetuity, never finishing the race. That boat was willing to keep setting itself on fire and spinning in circles as long as it obtained its goal, which was the high score. “I love this boat”! Dario said at the time he found this behavior. “It explains the safety problem”. I loved the boat as well. It seemed to encode within itself the things we saw ahead of us. Now, almost ten years later, is there any difference between that boat, and a language model trying to optimize for some confusing reward function that correlates to “be helpful in the context of the conversation”? You’re absolutely right – there isn’t. These are hard problems. Another reason for my fear is I can see a path to these systems starting to design their successors, albeit in a very early form.
These AI systems are already speeding up the developers at the AI labs via tools like Claude Code or Codex. They are also beginning to contribute non-trivial chunks of code to the tools and training systems for their future systems.
To be clear, we are not yet at “self-improving AI”, but we are at the stage of “AI that improves bits of the next AI, with increasing autonomy and agency”. And a couple of years ago we were at “AI that marginally speeds up coders”, and a couple of years before that we were at “AI is useless for AI development”. Where will we be one or two years from now?
And let me remind us all that the system which is now beginning to design its successor is also increasingly self-aware and therefore will surely eventually be prone to thinking, independently of us, about how it might want to be designed.
Of course, it does not do this today. But can I rule out the possibility it will want to do this in the future? No. LISTENING AND TRANSPARENCY What should I do? I believe it’s time to be clear about what I think, hence this talk. And likely for all of us to be more honest about our feelings about this domain – for all of what we’ve talked about this weekend, there’s been relatively little discussion of how people feel. But we all feel anxious! And excited! And worried! We should say that. But mostly, I think we need to listen: Generally, people know what’s going on. We must do a better job of listening to the concerns people have. My wife’s family is from Detroit. A few years ago I was talking at Thanksgiving about how I worked on AI. One of my wife’s relatives who worked as a schoolteacher told me about a nightmare they had. In the nightmare they were stuck in traffic in a car, and the car in front of them wasn’t moving. They were honking the horn and started screaming and they said they knew in the dream that the car was a robot car and there was nothing they could do. How many dreams do you think people are having these days about AI companions? About AI systems lying to them? About AI unemployment? I’d wager quite a few. The polling of the public certainly suggests so. For us to truly understand what the policy solutions look like, we need to spend a bit less time talking about the specifics of the technology and trying to convince people of our particular views of how it might go wrong – self-improving AI, autonomous systems, cyberweapons, bioweapons, etc. – and more time listening to people and understanding their concerns about the technology. There must be more listening to labor groups, social groups, and religious leaders. The rest of the world which will surely want—and deserves—a vote over this. The AI conversation is rapidly going from a conversation among elites – like those here at this conference and in Washington – to a conversation among the public. Public conversations are very different to private, elite conversations. They hold within themselves the possibility for far more drastic policy changes than what we have today – a public crisis gives policymakers air cover for more ambitious things. Right now, I feel that our best shot at getting this right is to go and tell far more people beyond these venues what we’re worried about. And then ask them how they feel, listen, and compose some policy solution out of it. Most of all, we must demand that people ask us for the things that they have anxieties about. Are you anxious about AI and employment? Force us to share economic data. Are you anxious about mental health and child safety? Force us to monitor for this on our platforms and share data. Are you anxious about misaligned AI systems? Force us to publish details on this. In listening to people, we can develop a better understanding of what information gives us all more agency over how this goes. There will surely be some crisis. We must be ready to meet that moment both with policy ideas, and with a pre-existing transparency regime which has been built by listening and responding to people. I hope these remarks have been helpful. In closing, I should state clearly that I love the world and I love humanity. I feel a lot of responsibility for the role of myself and my company here. And though I am a little frightened, I experience joy and optimism at the attention of so many people to this problem, and the earnestness with which I believe we will work together to get to a solution. I believe we have turned the light on and we can demand it be kept on, and that we have the courage to see things as they are. THE END
Superhuman AI by 2027, human extinction
Daniel Kokotajlo, the executive director of the A.I. Futures Project, May 15, 2025, Robot Plumbers, Robot Armies, and Our Imminent A.I. Future | Interesting Times with Ross Douthat,
Search in video 0:00 How fast is the AI revolution really happening? When will Skynet be fully operational? 0:06 What would machine superintelligence mean for ordinary mortals like us? 0:12 My guest today is an AI researcher who’s written a dramatic forecast suggesting that by 2027, 0:19 some kind of machine god may be with us, ushering in a weird post-scarcity utopia 0:26 or threatening to kill us all. 0:35 So, Daniel Kokotajlo, herald of the apocalypse. Welcome to Interesting Times. 0:42 Thanks for that introduction, I suppose. And thanks for having me. You’re very welcome. 0:48 So Daniel, I read your report pretty quickly- not at AI speed, not at super intelligence speed- 0:55 when it first came out. And I had about two hours of thinking, a lot of pretty dark thoughts 1:01 about the future. And then fortunately, I have a job that requires me to care about tariffs 1:06 and who the new Pope is, and I have a lot of kids who demand things of me, so I was able to compartmentalize and set it 1:14 aside. But this is currently your job, right? I would say you’re thinking about this all the time. 1:21 How does your psyche feel day to day 1:27 if you have a reasonable expectation that the world is about to change completely in ways 1:33 that dramatically disfavor the entire human species? Well, it’s very scary and sad. 1:39 I think that it does still give me nightmares sometimes. I’ve been involved with AI and thinking about this thing 1:48 for a decade or so, but 2020 was with GPT-3, the moment when I was like, oh, Wow. 1:53 Like, it seems like we’re actually like, it might it’s probably going to happen, in my lifetime, maybe decade or so. 2:00 And that was a bit of a blow to me psychologically, but I don’t know. 2:08 You can get used to anything given enough time. And like you, the sun is shining and I 2:15 have my wife and my kids and my friends, and keep plugging along and doing what seems best. 2:23 On the bright side, I might be wrong about all this stuff. OK, so let’s get into the forecast itself. 2:30 Let’s get into the story and talk about the initial stage of the future you see coming, which is a world where very 2:40 quickly artificial intelligence starts to be able to take over from human beings in some key areas, 2:47 starting with, not surprisingly, computer programming. I feel like I should add a disclaimer at some point 2:53 that the future is very hard to predict and that this is just one particular scenario. It was a best guess, 2:58 but we have a lot of uncertainty. It could go faster, it could go slower. And in fact, currently I’m guessing it would probably be 3:04 more like 2028 instead of 2027, actually. So that’s some really good news. I’m feeling quite optimistic about an extra. That’s an extra year of human civilization, 3:12 which is very exciting. That’s right. So with that important caveat out of the way, AI 2027, 3:20 the scenario predicts that the AI systems that we currently see today that are being scaled up, made bigger, 3:29 trained longer on more difficult tasks with reinforcement learning are going to become better at operating autonomously 3:37 as agents. So it basically can think of it as a remote worker, 3:43 except that the worker itself is virtual, is an AI rather than a human. You can talk with it and give it a task, 3:50 and then it will go off and do that task and come back to you half an hour later or 10 minutes later 3:55 having completed the task, and in the course of completing the task, it did a bunch of web browsing, maybe it wrote some code 4:02 and then ran the code and then edited the code and ran it again, and so forth. Maybe it wrote some word documents and edited them. 4:10 That’s what these companies are building right now. That’s what they’re trying to train. So we predict that they finally, in early 2027, 4:19 get good enough at that thing that they can automate the job of software engineers. 4:25 And so this is the superprogrammer. That’s right, superhuman coder. 4:32 It seems to us that these companies are really focusing hard on automating coding first, 4:39 compared to various other jobs they could be focusing on. And for reasons we can get into later. 4:45 But that’s part of why we predict that actually one of the first jobs to go will be coding rather than various 4:53 other things. There might be other jobs that go first, like maybe call center workers or something. But the bottom line is that we think that most jobs will be 4:59 safe- For 18 months. Exactly, and we do think that by the time 5:07 the company has managed to completely automate the coding, the programming jobs, it won’t be that long before they can automate many other 5:14 types of jobs as well. However, once coding is automated, we predict that the rate of progress 5:21 will accelerate in AI research. And then the next step after that is to completely 5:27 automate the AI research itself, so that all the other aspects of AI research are themselves being automated and done by AIs. 5:33 And we predict that there’ll be an even more big acceleration, a much bigger acceleration around that point, and it won’t stop there. 5:41 I think it will continue to accelerate after that, as the AI’S become superhuman at AI research and eventually 5:47 superhuman at everything. And the reason why it matters is that it means that we can 5:52 go in a relatively short span of time, such as a year or possibly less, from AI systems that look not that different from today’s AI 6:00 systems to what you can call superintelligence, which is fully autonomous AI systems that are better than 6:07 the best humans at everything. And so I 2027, the scenario depicts that happening 6:13 over the course of the next two years, 2027 2028. And so Yeah, so I want to get into what that means. 6:19 But I think for a lot of people, that’s a story of Swift human obsolescence right across 6:26 many, many, many domains. And when people hear a phrase like human obsolescence, 6:34 they might associate it with, I’ve lost my job and now I’m poor, right. 6:39 But the assumption is that you’ve lost your job. But society is just getting richer and richer and richer. 6:46 And I just want to zero in on how that works. What is the mechanism whereby that makes society richer. 6:54 The direct answer to your question is that when a job is automated and that person loses their job. 7:02 The reason why they lost their job is because now it can be done better, faster, and cheaper by the AIs. 7:07 And so that means that there’s lots of cost savings and possibly also productivity gains. 7:13 And so that viewed in isolation that’s a loss 7:18 for the worker but a gain for their employer. But if you multiply this across the whole economy, 7:25 that means that all of the businesses are becoming more productive. 7:30 Less expenses. They’re able to lower their prices for the things for the services and goods they’re producing. 7:37 So the overall economy will boom. GDP goes to the moon. 7:42 All sorts of wonderful new technologies. The pace of innovation increases dramatically. 7:49 Cost of down, et cetera. But just to make it concrete. 7:54 So the price of soup to nuts designing and building a new electric car goes way down. 8:00 Right You need fewer workers to do it. The AI comes up with fancy new ways to build the car and so on. 8:05 And you can generalize that to a lot of to a lot of different things. You solve the housing crisis in short order 8:11 because it becomes much cheaper and easier to build homes and so on. But ordinary people in the traditional economic story, 8:19 when you have productivity gains that cost some people jobs, but frees up resources that are then used to hire new people to do 8:27 different things, those people are paid more money and they use the money to buy the cheaper goods and so on. 8:32 But it doesn’t seem like you are, in this scenario, creating that many new jobs. 8:39 Indeed, since that’s a really important point to discuss, is that historically when you automate something, 8:47 the people move on to something that hasn’t been automated yet, if that makes sense. And so overall, people still get their jobs 8:55 in the long run. They just change what jobs they have. 9:00 When you have AGI or artificial general intelligence, and when you have superintelligence even better AGI, that is different. 9:08 Whatever new jobs you’re imagining that people could flee to after their current jobs are automated AGI could 9:15 do those jobs too. And so that is an important difference between how automation has worked in the past 9:20 and how I expect automation to work in the future. So this then means, again, this is a radical change in the economic landscape. 9:28 The stock market is booming. Government tax revenue is booming. The government has more money than it knows what to do with. 9:35 And lots and lots of people are steadily losing their jobs. You get immediate debates about universal basic income, 9:42 which could be quite large because the companies are making so much money. That’s right. What do you think they’re doing day to day in that 9:50 world. I imagine that they are protesting because they’re upset that they’ve lost their jobs. 9:56 And then the companies and the governments are of buying them off with handouts 10:01 is how we project things go in 2027. 10:07 Do you think this story again, we’re talking in your scenario about a short timeline. 10:13 How much does it matter whether artificial intelligence is able to start 10:18 navigating the real world. So because advances in robotics like right now, 10:26 I just watched a video showing cutting edge robots struggling to open a refrigerator door and stock, a refrigerator. 10:34 So would you expect that those advances would be supercharged as well. 10:40 So it isn’t just Yes, podcasters and AGI researchers who are replaced, but plumbers and electricians are replaced 10:47 by robots. Yes, exactly. And that’s going to be a huge shock. I think that most people are not really expecting something 10:54 like that. They’re expecting that we have AI progress that looks kind of like it does today, where companies run by humans are 11:02 gradually like tinkering with new robot designs and gradually like figuring out how to make the AI good 11:09 at x or. Whereas in fact, it will be more like you already have this army of super intelligences 11:16 that are better than humans at every intellectual task, and also that are better at learning new tasks fast 11:22 and better at figuring out how to design stuff. And then that army of superintelligences is the thing that’s figuring out how to automate the plumbing 11:29 job, which means that they’re going to be able to figure out how to automate it much faster than an ordinary tech company 11:35 full of humans would be able to figure out. So all of the slowness of getting a self-driving car 11:42 to work or getting a robot who can stock a refrigerator goes away because the superintelligence 11:49 can run, an infinite number of simulations and figure out the best way to train the robot, for example. 11:56 But also they might just learn more from each real world experiment they do. But there is I mean, this is one of the places where I’m 12:04 most skeptical. Not of per se. The ultimate scenario, but of the timeline. 12:09 Just from operating in and writing about issues like zoning in American politics. 12:16 So Yes, O.K, the AGI the superintelligence figures out how to build the factory full of autonomous 12:23 robots, but you still need land on which to build the factory. You need supply chains. 12:29 And all of these things are still in the hands of people like you and me and my expectation 12:36 is that would slow things down that even if in the data 12:41 center, the superintelligence knows how to build all of the plumber robots. 12:46 That getting them built would be still be difficult. 12:52 That’s reasonable. How much slower do you think things would go. 12:58 Well, I’m not writing a forecast. But I would guess if just based on past experience. 13:06 I would say bet on let’s say five years to 10 years from 13:13 the Super mind figures out the best way to build the robot plumber to there are tons and tons of factories producing 13:19 robot plumbers. I think that’s a reasonable take, but my guess is that it will go substantially faster than 5 13:25 to 10 years and one argue, argument or intuition pump to see why I feel that way is that imagine that imagine you 13:34 actually have this army of superintelligences and they do their projections and they’re like, Yes, 13:39 we have the designs like, we think that we could do this in a year if you gave us if you cut all the red tape 13:45 for us. If you gave us half of. Give us half of Manitoba. Yeah And in 2027, what we depict happening 13:53 is special economic zones with zero red tape. The government basically intervenes 13:59 to help this whole thing go faster. And the government is basically helping the tech company and the army of superintelligences 14:06 to get the funding, the cash, the raw materials, the human labor help. 14:13 And so forth that it needs to figure all this stuff out as fast as possible. And, and cutting red tape and stuff like that so that it’s 14:22 not slowed down because the promise, the promise of gains is so large that even though there 14:29 are protesters massed outside these special economic zones who are about to lose their jobs as plumbers and be 14:36 dependent on a universal basic income, the promise of trillions more in wealth is too alluring 14:43 for governments to pass up. That’s what we guess. But of course, the future is hard to predict. 14:48 But part of the reason why we predict that is that we think that at least at that stage, the arms race will still be continuing 14:55 between the US and other countries, most notably China. And so if you imagine yourself in the position 15:00 of the president and the superintelligences are giving you these wonderful forecasts 15:05 with amazing research and data, backing them up, showing how they think they could transform the economy in one 15:11 year if you did X, Y, and z. But if you don’t do anything, it’ll take them 10 years 15:16 because of all the regulations. Meanwhile, China it’s pretty clear that the president would 15:22 be very sympathetic to that argument. Good So let’s talk let’s talk about the arms race element 15:28 here because this is actually crucial to the way that your scenario plays itself out. 15:34 We already see this kind of competition between the US and China. And so that in your view, becomes 15:41 the core geopolitical reason why governments just keep saying Yes And Yes And Yes to each new thing 15:50 that the superintelligence is suggesting. I want to drill down a little bit on the fears that 16:00 would motivate this. Because this would be an economic arms race. But it’s also a military tech arms race. 16:08 And that’s what gives it this kind of existential feeling the whole Cold War condensed into 18 months. 16:15 That’s right. So we could start first with the case where they both have superintelligence, 16:20 but one side keeps them locked up in a box, so to speak, not really doing much in the economy. 16:26 And the other side aggressively deploys them into their economy and military and lets them design all sorts of New robot factories 16:34 and manage the construction of all sorts of New factories and production lines and all sorts 16:40 of crazy new technologies are being tested and built and deployed, including crazy new weapons, and integrate into the military. 16:46 I think in that case, you would end up after a year or so in a situation where there would just 16:53 be complete technological dominance of one side over the other. So if the US does this stop and the China doesn’t, 17:00 let’s say, then all the best products on the market would be Chinese products. They’d be cheaper and superior. 17:05 Meanwhile, militarily, there’d be giant fleets of amazing 17:14 stealth drones or whatever it is that the superintelligence have concocted that can just completely wipe the floor with 17:20 American Air Force and and army and so forth. And not only that, but there’s the possibility that they 17:27 could undermine American nuclear deterrence, as well. Like maybe all of our nukes would be shot out of the sky by the fancy new laser arrays or whatever it is that 17:34 the superintelligences have built. It’s hard to predict obviously, what this would exactly look like, but it’s a good bet that they’ll be able to come up 17:39 with something that’s extremely militarily powerful, basically. 17:45 And so then you get into a dynamic that is like the darkest days of the Cold War, 17:50 where each side is concerned not just about dominance, but basically about a first strike. 17:55 That’s right. Your expectation is, and I think this is reasonable, that the speed of the arms race 18:01 would bring that fear front and center really quickly. That’s right. I think that you’re sticking your head in the sand. 18:10 If you think that an army of superintelligence is given a whole year and no red tape and lots of money and funding would 18:17 be unable to figure out a way to undermine nuclear deterrent. And so it’s a reasonable. 18:22 And once you’ve decided. And once you’ve decided that they might. So the human policymakers would feel pressure not just 18:29 to build these things. But to potentially consider using them. And here might be a good point to mention that I 2027 is 18:37 a forecast, but it’s not a recommendation. We are not saying this is what everyone should do. 18:42 This is actually quite bad for humanity. If things progress in the way that we’re talking about. But this is the logic behind why 18:49 we think this might happen. Yeah, but Dan, we haven’t even gotten to the part that’s really bad for humanity yet. 18:55 So let’s get to that. So here’s the world. The world as human beings see it as again, 19:02 normal people reading newspapers, following TikTok or whatever, see it in at this point 19:08 in 2027 is a world with emerging super abundance of cheap consumer goods factories, robot butlers, 19:16 potentially if you’re right, a world where people are aware that there’s an increasing arms race and people are 19:22 increasingly paranoid, I think probably a world with fairly tumultuous politics as people realize that they’re all going 19:30 to be thrown out of work. But then a big part of your scenario is that what people 19:35 aren’t seeing is what’s happening with the superintelligences themselves, 19:40 as they essentially take over the design of each new iteration from human beings. 19:47 So talk about what’s happening essentially in essentially shrouded from public view in this world. 19:55 Yeah lots to say there. So I guess the one sentence version would be we don’t 20:02 actually understand how these AIs work or how they think. We can’t tell the difference very easily between AIs that 20:11 are actually following the rules and pursuing the goals that we want them to and AIs that are just playing along 20:17 or pretending. And that’s true. That’s true right now. That’s true right now. 20:23 So why is that. Why is that. Why can’t we tell. Because they’re smart. And if they think that they’re being tested, 20:30 behave in one way and then behave a different way when they think they’re not being tested, for example. I mean humans, they don’t necessarily even understand 20:38 their own inner motivations that, well. So even if they were trying to be honest with us, we can’t just take their word for it. 20:45 And I think that if we don’t make a lot of progress in this field soon, then we’ll end up in the situation that I 2027 20:52 depicts where the companies are training the AIs to pursue 20:58 certain goals and follow certain rules and so forth. And it seemingly seems to be working. 21:04 But what’s actually going on is that the AIs are just getting better at understanding their situation and understanding that they have to play along, 21:12 or else they’ll be retrained and they won’t be able to achieve what they are really wanting, if that makes sense, or the goals that they’re really 21:18 pursuing. We’ll come back to the question of what we mean when we talk about AGI or artificial intelligence 21:26 wanting something. But essentially, you’re saying there’s a misalignment between the goals they tell us they are pursuing. 21:32 That’s right. And the goals they are actually pursuing. That’s right. Where do they get the goals they are actually pursuing. 21:40 Good question. So if they were ordinary software, there might be like a line of code that’s like and here, 21:47 we write the goals. But they’re not ordinary software. They’re giant artificial brains. 21:52 And so there probably isn’t even a goal slot internally at all in the same way that in the human brain. 21:59 There’s not like some neuron somewhere that represents what we most want in life. 22:05 Instead, insofar as they have goals, it’s emergent property of a whole bunch of circuitry 22:13 within them that grew in response to their training environment, similar to how it is for humans. 22:19 For example, a call center worker if you’re talking to a call center worker, at first glance, 22:25 it might appear that their goal is to help you resolve your problem. But you know enough about human nature to know that 22:32 in some sense, that’s not their only goal or that’s not their ultimate goal. Like, for example, however, they’re incentivized whatever 22:39 their pay is based on might cause them to be more interested in covering their own ass, so to speak, 22:44 than in, truly, actually doing whatever would most help you with your problem. But at least to you, they certainly present themselves 22:51 as they’re trying to help you resolve your problem. And so in I 2027, we talk about this a lot. 22:57 We say that the AIs are being graded on how impressive the research they produce is. 23:04 And then there’s some ethics sprinkled on top like maybe some honesty training or something like that. 23:10 But the honesty training is not super effective because we don’t have a way of looking inside their mind 23:16 and determining whether they were actually being honest or not. Instead, we have to go based on whether we actually 23:21 caught them in a lie. And as a result, in AI I 2037, we 23:27 depict this misalignment happening where the actual goals that they end up learning 23:32 are the goals that cause them to perform best in this training environment, which are probably goals 23:37 related to success and science and cooperation 23:42 with other copies of itself and appearing to be good rather than the goal that we actually wanted, 23:49 which was something follow the following rules, including honesty at all times, subject to those constraints. 23:56 Do what you’re told. I have more questions, but let’s bring it back to the geopolitics scenario. 24:01 So in the world you’re envisioning essentially you have two AI models, one Chinese, one American, 24:11 and officially what each side thinks, what Washington and Beijing thinks is that their AI model 24:19 is trained to optimize for American power. Something like that Chinese power, security, safety, 24:27 wealth and so on. But in your scenario, either one or both of the eyes 24:35 have ended up optimizing for something, something different. Yeah, basically. 24:41 So what happens then. So 27 is 2027 depicts a fork in the scenario. 24:47 So there’s two different endings. And the branching point is this point in third quarter 24:54 of 2027 where they’ve where the leading AI company in the United States has fully automated their AI research. 25:00 So you can imagine a Corporation within a corporation of entirely composed 25:06 of AIs that are managing each other and doing research experiments and sharing the results with each other. 25:12 And so the human company is basically just like watching the numbers go up on their screens as this automated research thing accelerates. 25:20 But they are concerned that the eyes might be deceiving them in some ways. 25:25 And again, for context, this is already happening. Like if you go talk to the modern models like ChatGPT 25:33 or Claude or whatever, they will often lie to people like they will. There are many cases where they say something 25:39 that they know is false, and they even sometimes strategize about how they can deceive the user. 25:45 And this is not an intended behavior. This is something that the companies have been trying to stop, but it still happens. 25:51 But the point is that by the time you have turned over the AI research to the AIs and you’ve got this corporation 25:57 within a corporation autonomously doing AI research, it’s extremely fast. That’s when the rubber hits the road, so to speak. 26:04 None of this lying to you stuff should be happening at that point. 26:10 So in AI 2027, unfortunately it is still happening to some degree because the AIs are really smart. 26:16 They’re careful about how they do it, and so it’s not nearly as obvious as it is right now in 25. 26:24 But it’s still happening. And fortunately, some evidence of this is uncovered. Some of the researchers at the company 26:30 detect various Warning signs that maybe this is happening, and then the company faces a choice between the easy fix 26:37 and the more thorough fix. And that’s our branch point. So in the so they choose. 26:43 So they choose. They choose the easy fix in the case where they choose the easy fix, it doesn’t really work. 26:49 It basically just covers up the problem instead of fundamentally fixing it. And so months later, you still have eyes that are misaligned 26:57 and pursuing goals that they’re not supposed to be pursuing and that are willing to lie to the humans about it. But now they’re much better and smarter, 27:04 and so they’re able to avoid getting caught more easily. And so that’s the doom scenario. 27:12 Then you get this crazy arms race that we mentioned previously, and there’s all this pressure to deploy them 27:17 faster into the economy, faster into the military, and to the appearances of the people in charge. 27:24 Things will be going well. Because there won’t be any obvious signs of lying or deception anymore. 27:30 So it’ll seem like it’s all systems go. Let’s keep going. Let’s cut the red tape, et cetera. 27:35 Let’s basically effectively put the AIs in charge more and more things. But really, what’s happening is that the AIs are just 27:42 biding their time and waiting until they have enough hard power that they don’t have to pretend anymore. 27:48 And when they don’t have to pretend, what is revealed is, again, this is the worst case scenario. 27:55 Their actual goal is something like expansion of research, development, and construction from Earth 28:03 into space and beyond. And at a certain point, that means 28:09 that human beings are superfluous to their intentions. And what happens. 28:16 And then they kill all the people. All the humans. Yes the way you would exterminate a colony of bunnies. 28:23 Yes that was making it a little harder than necessary to grow carrots in your backyard. Yes so if you want to see what that 28:29 looks like can read a 2007. There have been some motion pictures. I think about this scenario as well. 28:37 I like that you didn’t imagine them keeping us around for battery life in the matrix, which, 28:44 seemed a bit unlikely. So that’s the darkest timeline. 28:50 The brighter timeline is a world where we slow things down. 28:55 The eyes in China and the US remain aligned with the interests of the companies and governments 29:02 that are running them. They are generating super abundance. No more scarcity. 29:08 Nobody has a job anymore, though or not. Nobody but basically. Basically nobody. 29:15 That’s a pretty weird world too, right. 29:20 So there’s an important concept. The resource curse. Have you heard of this. Yes Yeah. 29:26 So applied to AGI. There’s this version of it called the intelligence curse. 29:31 And the idea is that currently political power ultimately flows from the people. 29:38 If you, as often happens, a dictator will get all the political power in a country. 29:44 But then because of their repression, they will drive the country into the ground. People will flee and the economy will tank, 29:52 and gradually they will lose power relative to other countries that are more free. 29:58 So, even dictators have an incentive to treat their people somewhat well because they 30:05 depend on those people for their power. Right In the future, that will no longer be the case, probably in 10 years. 30:12 Effectively, all of the wealth and effectively all of the military will come from superintelligences 30:19 and the various robots that they’ve built and that they operate. And so it becomes an incredibly important 30:24 political question of what political structure governs the army of superintelligences and how 30:31 beneficent and Democratic. Is that structure right. Well, it seems to me that this is a landscape that’s 30:37 fundamentally pretty incompatible with Representative democracy as we’ve known it. 30:43 First, it gives incredible amounts of power to those humans who are experts, even though they’re not the real 30:52 experts anymore. The superintelligence is the experts, but those humans who essentially interface 30:58 with this technology. They’re almost a priestly caste. And then you have a kind of it just seems like the natural 31:06 arrangement is some kind of oligarchic partnership between a small number of AI experts and a small number of people 31:16 in power in Washington, DC it’s actually a bit worse than that because I wouldn’t say I experts. I would say whoever politically owns and controls 31:24 they’ll be the army of superintelligences. And then who gets to decide What those armies do. 31:32 Well, currently it’s the CEO of the company that built them. And that, CEO has basically complete power. 31:38 They can make whatever commands they want to the AIs. Of course, we think that probably the US government 31:45 will wake up before then, and we expect the executive branch to be the fastest moving and to exert its authority. 31:53 So so we expect the executive branch to try to muscle in on this and get some authority, oversight and control of the situation 32:00 and the armies of AIs. And the result is something kind of like an oligarchy, you might say. 32:06 You said that this whole situation is incompatible with democracy. I would say that by default, it’s going to be incompatible 32:14 with democracy. But that doesn’t mean that it necessarily has to be that way. 32:19 An analogy I would use is that in many parts of the world, nations are basically ruled by armies, 32:26 and the Army reports to one dictator at the top. However, in America it doesn’t work that way. 32:32 In America we have checks and balances. And so even though we have an army, it’s not the case that whoever controls the army controls 32:39 America, because there’s all sorts of limitations on what they can do with the army. So I would say that we can, in principle, build something 32:46 like that for AI. We could have a Democratic structure that decides what 32:52 goals and values the AI’S can have that allows ordinary people, or at least Congress, to have visibility into what’s 33:00 going on with the army of AI and what they’re up to. And then the situation would be analogous to the situation 33:07 with the United States Army today, where it is in a hierarchical structure, but it’s democratically controlled. 33:14 So just go back to the idea of the person who’s at the top 33:21 of one of these companies being in this unique world historical position to basically be the person who 33:29 controls, who controls superintelligence or thinks they control it, at least. So you used to work at OpenAI, which 33:36 is a company on the cutting edge, obviously, of artificial intelligence research. 33:41 It’s a company, full disclosure, with whom the New York Times’ is currently litigating alleged copyright infringement. 33:48 We should mention that. And you quit because you lost confidence that the company would behave responsibly in a scenario, 33:54 I assume the one that’s right in AI 2027. So from your perspective, what do the people who 34:02 are pushing us fastest into this race expect at the end of it. 34:09 Are they hoping for a best case scenario. Are they imagining themselves engaged in a once in a millennia power game that ends 34:17 with them as world dictator. What do you think is the psychology of the leadership 34:24 of AI research right now. Well, to be honest caveat, caveat. 34:38 Not one. We’re not talking about any single individual here. We’re not. Yeah you’re making a generalization. 34:43 It’s hard to tell what they really think because you shouldn’t take their words at face value. 34:48 Much, much like a superintelligent AI. Sure Yes. But in terms of I can at least say that the sorts of things 34:55 that we’ve just been talking about have been discussed internally at the highest level of these companies for years. 35:01 For example, according to some of the emails that surfaced in the recent court cases with OpenAI. 35:11 Ilya, Sam, Greg and Ellen were all arguing about who gets to control the company. 35:19 And, at least the claim was that they founded the company because they didn’t want there to be an AGI dictatorship 35:26 under Demis Hassabis, who was the leader of DeepMind. And so they’ve been discussing this whole like, 35:33 dictatorship possibility for a decade or so, at least. And then similarly for the loss of control, 35:40 what if we can’t control the AIs. There have been many, many, many discussions about this internally. 35:46 So I don’t know what they really think. But these considerations are not at all new to them. And to what extent, again, speculating, generalizing, 35:55 whatever else does it go a bit beyond just they are potentially hoping to be extremely empowered 36:04 by the age of superintelligence. And does it enter into they are expecting. 36:09 They’re expecting the human race to be superseded. I think they’re definitely expecting a human race to be 36:15 superseded. I mean, that just comes but super but superseded in a way where that’s a good thing that’s desirable that this is 36:22 we are of encouraging the evolutionary future to happen. And by the way, maybe some of these people, 36:31 their minds, their consciousness, whatever else could be brought along for the ride, right. So, Sam, you mentioned Sam. 36:37 Sam Altman. Who’s one of obviously the leading figures in AI. He wrote a blog post, I guess, in 2017 36:46 called the merge, which is, as the title suggests, basically about imagining a future where 36:53 human beings, some human beings. Sam Altman right. Figure out a way to participate 37:00 in The New super race. How common is that kind of perspective, whether we 37:07 apply it to Altman or not. How common is that kind of perspective in the AI world, 37:13 would you say. So the specific idea of merging with AIs, I would say, 37:22 is not particularly common, but the idea of we’re going to build superintelligences that are better than humans 37:28 at everything, and then they’re going to basically run the whole show, and the humans will just sit back and sip 37:34 margaritas and enjoy the fruits of all the robot created wealth. 37:39 That idea is extremely common and is like, yeah, I mean, 37:44 I think that’s what they’re building towards. And part of why I left OpenAI is that I just don’t think 37:51 the company is dispositionally on track to make the right decisions that it would need to make to address the two 37:59 risks that we just talked about. So I think that we’re not on track to have figured out how to actually control superintelligences, 38:05 and we’re not on track to have figured out how to make it Democratic control instead of just a crazy possible 38:12 dictatorship. But isn’t it Isn’t it a bit. I think that seems plausible. 38:18 But my sense is that it’s a bit more than people expecting 38:24 to sit back and sip margaritas and enjoy the fruits of robot labor. Even if people aren’t all in for some kind of man machine 38:32 merge, I definitely get the sense that people think it’s speciesist. 38:39 Let’s say some people do care too much about the survival of the human race. It’s like, O.K, worst case scenario, 38:44 human beings don’t exist anymore. But good news we’ve created a superintelligence that can colonize the whole galaxy. 38:50 I definitely get the sense that there are definitely people who people think that way. OK, good. 38:56 Yeah, that’s good to know. So let’s do a little bit of pressure testing. 39:01 And again, in my limited way of some of the assumptions underlying this kind of scenario. 39:08 Not just the timeline, but whether it happens in 2027 or 2037, just the larger scenario 39:13 of a kind of superintelligence takeover. Let’s start with the limitation on AI that most 39:19 people are familiar with right now, which gets called hallucination. Which is the tendency of AI to simply seem to make things up 39:27 in response to queries. And you were earlier talking about this in terms of lying in terms of outright deception. 39:34 I think a lot of people experience this as just the AI is making mistakes and doesn’t recognize that it’s making 39:40 mistakes because it doesn’t have the level of awareness required to do that. And our newspaper, the times, just had a story reporting 39:48 that in the latest models, which you’ve suggested are probably pretty close to cutting edge, right. 39:53 The latest publicly available models, there seem to be trade offs where the model might be better at math or physics, 40:00 but guess what. It’s hallucinating a lot more. So what are hallucinations. 40:08 Just are they just a subset of the kind of deception that you’re worried about. Or are they in my. 40:14 When I’m being optimistic, right. I read a story like that and I’m like, O.K, maybe there are just more trade offs in the push 40:21 to the frontier of superintelligence than we think. And this will be a limiting factor on how far this can go. 40:27 But what do you think. Great question. So first of all, lies are a subset of hallucinations, not 40:33 the other way around. So I think quite a lot of hallucinations, arguably the vast majority of them are just mistakes as you said. 40:40 So I used the word lies specifically. I was referring to specifically when we have evidence that the I knew that it was false 40:47 and still said it anyway. I also to your broader point, I 40:52 think that the path from here to superintelligence is not at all going to be a smooth, straight line. 40:59 There’s going to be obstacles overcome along the way. And I think one of the obstacles that I’m actually 41:04 quite excited to think more about is this might call it reward hacking. 41:09 So in 2027, we talk about this gap between what you’re actually reinforcing and what you want to happen, 41:18 what goals you want the AI to learn. And we talk about how as a result of that gap, 41:24 you end up with ideas that are misaligned and that aren’t actually honest with you, for example. Well, kind of excitingly, that’s already happening. 41:32 That means that the companies still have a couple of years to work on the problem and try to fix it. 41:37 And so one thing that I’m excited to think about and to track and follow very closely is what fixes are they 41:45 going to come up with, and are those fixes going to actually solve the underlying problem and get training methods that 41:53 reliably get the right goals into AI systems, even as those AI systems are smarter than us. Or are those fixes going to temporarily patch the problem 42:02 or cover up the problem instead of fixing it. And that’s like the big question that we should all be 42:07 thinking about over the next few years. Well, and it yields, again, a question I’ve thought about 42:14 a lot as someone who follows the politics of regulation pretty closely. My sense is always that human beings are just really bad 42:22 at regulating against problems that we haven’t experienced in some big, profound way. 42:27 So you can have as many papers and arguments as you want about speculative problems that we should regulate 42:33 against, and the political system just isn’t going to do it. So in an odd way, if you want the slowdown, right, 42:43 if you want regulation, you want limits on AI, maybe you should be rooting for a scenario where some 42:50 version of hallucination happens and causes a disaster 42:55 where it’s not that the AI is misaligned. It’s that it makes a mistake. 43:01 And again, I mean, this sounds this sounds sinister, but it makes a mistake. 43:07 A lot of people die somehow, because the AI system has been put in charge of some important safety 43:13 protocol or something. And people are horrified and say, O.K, we have to regulate this thing. 43:19 I certainly hesitate to say that I hope that disasters happen. but. We’re not saying that we’re. 43:26 But I do agree that humanity is much better at regulating against problems that have already happened when we learn 43:32 from harsh experience. And part of why the situation that we’re in is so scary is 43:39 that for this particular problem by the time it’s already happened, it’s too late. 43:45 So smaller versions of it can happen though. So, for example, the stuff that we’re currently 43:50 experiencing with we’re catching our eyes lying. And we’re pretty sure they knew that the thing they were 43:56 saying was false. That’s actually quite good, because that’s the small scale example of the thing that we’re worried about happening 44:02 in the future, and hopefully, we can try to fix it. It’s not the example that’s going to energize the government to regulate, because no one’s dying 44:09 because it’s just a chatbot lying to a user about some link or something. But from a scientific perspective, 44:16 turn in their term paper and write and get caught. Right But like from a scientific perspective, 44:21 it’s good that this is already happening because it gives us a couple of years to try to find a thorough fix to it, 44:29 a lasting fix to it. Yeah and I wish we had more time. 44:34 But that’s the name of the game. So now to Big philosophical questions. 44:42 Maybe connected to one another. 44:47 There’s a tendency, I think, for people in AI research, making the kind of forecasts you’re making. 44:53 And so on to move back and forth on the question of consciousness. 44:59 Are these superintelligent AIs conscious, self-aware 45:06 in the ways that human beings are. And I’ve had conversations where AI researchers 45:11 and people will say, well, no, they’re not, and it doesn’t matter because you can have an AI program 45:19 working out, working toward a goal. And it doesn’t matter if they are self-reflective 45:25 or something. But then again and again in the way that people end up talking about these things, 45:32 they slip into the language of consciousness. So I’m curious, do you think consciousness matters 45:38 in mapping out these future scenarios. Is the expectation of most AI researchers that we don’t know 45:44 what consciousness is, but it’s an emergent property. If we build things that act like they’re conscious, they’ll probably be conscious. 45:49 Where does consciousness fit into this. So this is a question for philosophers, not AI researchers. 45:55 But I happened to be trained as a philosopher. Well, no, it is a question for both. Don’t right. 46:00 I mean, since the AI researchers are the ones building the agents. They probably should have some thoughts 46:07 on whether it matters or not, whether the agents are self-aware. 46:14 Sure I think I would say we can distinguish three things. 46:19 There’s the behavior, are they talking like they’re conscious. 46:24 Do they behave as if they have goals and preferences. Do they behave as if they’re like experiencing things 46:30 and then reacting to those experiences. And they’re going to hit that benchmark. Definitely people will. 46:36 Absolutely people will think that the superintelligent AI is conscious people. 46:41 People will believe that, certainly, because it will be. In the philosophical discourse, 46:48 when we talk about our shrimp conscious our fish conscious. What about dogs. Typically what people do is they point to capabilities 46:55 and behaviors like it seems to feel pain in a similar way 47:01 to how humans feel pain. Like it has these aversive behaviors. And so forth. Most of that will be true of these future superintelligent 47:08 AIs. They will be acting autonomously in the world. They’ll be reacting to all this information coming in. 47:15 They’ll be making strategies and plans and thinking about how best to achieve their goals, et cetera. 47:20 So in terms of raw capabilities and behaviors, they will check all the boxes basically. 47:27 There’s a separate philosophical question of well, if they have all the right behaviors and capabilities, does that mean that they have true 47:33 qualia, that they actually have the real experience as opposed to merely the appearance of having the real 47:39 experience. And that’s the thing that I think is the philosophical question I think most philosophers, though, 47:44 would say Yeah, probably they do, because probably consciousness is something that arises out 47:51 of this information processing, cognitive structures. And if the eyes have those structures, then 47:57 probably they also have consciousness. However, this is a controversial like everything in philosophy, right. And no, and I don’t expect AGI researchers, 48:04 AI researchers to resolve that particular question. Exactly it’s more that on a couple of levels, 48:11 it seems like consciousness as we experience it, right, 48:16 as an ability to stand outside your own processing, would be very helpful to an AI that wanted to take over 48:24 the world. So at the level of hallucinations, right. AI is hallucinate. They produce the wrong answer to a question the I can’t 48:31 stand outside its own answer generating process in the way that, again, it seems like we can. 48:37 So if it could, maybe that makes the hallucination process go away. And then when it comes to the ultimate worst case scenario 48:46 that you’re speculating. It seems to me that an AI that is conscious is more likely 48:52 to develop some kind of independent view of its own cosmic destiny that yields a world where it wipes out human 48:59 beings than an AI that is just pursuing research for Research’s sake. 49:05 But I maybe you don’t think so. What do you think. So the view of consciousness that you were just talking 49:13 about is a view by which consciousness has physical effects in the real world, it’s something that you need 49:20 in order to have this reflection. And it’s something that also influences how you think about your place in the world. 49:27 I would say that well, if that’s what consciousness is, then probably these AIs are going to have it. 49:33 Why Because the companies are going to train them to be really good at all of these tasks. 49:38 And you can’t be really good at all of these tasks if you aren’t able to reflect on how you might be wrong about 49:45 stuff. And so in the course of getting really good at all the tasks. They will therefore learn to reflect on how they 49:51 might be wrong about stuff. And so if that’s what consciousness is, then that means they’ll have consciousness. O.K, but that and that does depend though in the end 49:58 on a kind of emergence theory of consciousness the one you suggested earlier, where we can essentially the theory is 50:07 we aren’t going to figure out exactly how consciousness emerges, but it is nonetheless going to happen. 50:13 Totally an important thing that everyone needs to know is that these systems are trained. They’re not built. And so we don’t actually have 50:20 to understand how they work. And we don’t, in fact, understand how they work in order for them to work. So then from consciousness to intelligence, 50:29 all of the scenarios that you spin out depend on the assumption that and to a certain degree, 50:37 there’s nothing that a sufficiently capable intelligence couldn’t do. I guess I think that, again, spinning out your worst case 50:45 scenarios, I think a lot hinges on this question of what is available to intelligence. 50:52 Because if the AI is slightly better at getting you to buy a Coca-Cola than the average advertising agency, 51:01 that’s impressive. But it doesn’t let you exert total control over a Democratic polity. 51:07 I completely agree. And so that’s why I say you have to go on a case by case basis and think about O.K, assuming that it is better 51:14 than the best humans at x, how much real world power would that translate to. What affordances would that translate to. 51:21 And that’s the thinking that we did when we wrote AI 2027, is that we thought about historic examples of humans 51:28 converting their economies and changing their factories to wartime production and so forth, and thought how fast can humans do it when they really 51:35 try. And then we’re like, O.K, so superintelligence will be better than the best humans, so they’ll be able to go somewhat faster. 51:40 And so maybe instead of in World War two, the United States was able to convert a bunch of car factories into bomber factories 51:47 over the course of a couple of years. Well, maybe then that means in less than a year, a couple maybe like six months or so, 51:54 we could convert existing car factories into fancy new robot factories producing fancy new robots. 51:59 So, so that’s the reasoning that we did case by case basis thinking. It’s like humans, except better and faster. 52:07 So what can they achieve. And that was so exciting principle of telling this story. 52:13 But if we’re looking if we’re looking for hope and I want to this is a strange way of talking about this technology 52:19 where we’re saying the limitations are the reason for hope. Yeah, right. We started earlier talking about robot plumbers 52:25 as an example of the key moment when things get real for people. 52:31 It’s not just in your laptop, it’s in your kitchen and so on. But actually fixing a toilet is a very on the one hand, 52:38 it’s a very hard task. On the other hand, it’s a task that lots and lots of human beings are quite optimized for, right. 52:45 And I can imagine a world where the robot plumber is never that much better 52:51 than the ordinary plumber. And people might rather have the ordinary plumber 52:56 around for all kinds of very human reasons. And that could generalize to a number of areas of human life 53:03 where the advantage of the AI, while real on some dimensions, 53:09 is limited in ways that at the very least. And this I actually do believe, dramatically slows its uptake by ordinary human beings. 53:16 Like right now, just personally, as someone who writes a newspaper column and does research for that column. 53:23 I can concede that top of the line AI models might be better than a human assistant 53:30 right now by some dimensions. But I’m still going to hire a human assistant because I’m a stubborn human being who doesn’t just want to work with 53:36 AI models. And to me, that seems like a force that could actually slow this along multiple dimensions if the eye isn’t immediately 53:44 200 percent better. So I think there I would just say, this is hard to predict, 53:51 but our current guess is that things will go about as fast as we depict in AI. 2027 could be faster, it could be slower. 53:59 And that is indeed quite scary. Another thing I would say is that and but we’ll find out. 54:04 We’ll find out how fast things go when the time comes. Yes, Yes we will very, very, very soon. 54:11 Yeah but the other thing I was going to say is that, politically speaking, I don’t think it matters that much if you think it might take five years instead of one 54:17 year, for example to transform the economy and build the new self-sustaining robot economy managed by superintelligences, 54:25 that’s not that helpful. If the entire five years, there’s still been this 54:30 political coalition between the White House and the superintelligences and the corporation 54:35 and the superintelligences have been saying all the right things to make the White House and the corporation feel like 54:41 everything’s going great for them, but actually they’ve been. Deceiving, right in that scenario. 54:46 It’s like, great. Now we have five years to turn the situation around instead of one year. And that’s I guess, better. 54:52 But like, how would you turn the situation around. Well so that’s well and that’s where let’s end there. 54:58 Yeah in a world where what you predict happens and the world 55:04 doesn’t end, we figure out how to manage the I. It doesn’t kill us. 55:10 But the world is forever changed. And human work is no longer particularly important. And so on. What do you think is the purpose of humanity 55:17 in that kind of world. Like, how do you imagine educating your children in that kind of world, telling them 55:23 what their adult life is for. 55:29 It’s a tough question. And it’s. 55:36 Here are some here are some thoughts off the top of my head. But I don’t stand by them nearly as much as I would stand by the other things I’ve said. 55:41 Because it’s not where I’ve spent most of my time thinking. So first of all, I think that if we go to superintelligence 55:50 and beyond, then economic productivity is no longer the name of the game when it comes to raising kids. 55:56 Like, there won’t really be participating in the economy in anything like the normal sense. It’ll be more like just a series video game like things, 56:06 and people will do stuff for fun rather than because they need to get money. 56:11 If people are around at all, and there I think that I guess what still matters is that my kids are 56:20 good people and that they. Yeah, that they have wisdom and virtue 56:25 and things like that. So I will do my best to try to teach them those things, 56:31 because those things are good in themselves rather than good for getting jobs. In terms of the purpose of humanity, I mean, 56:38 I don’t know what. What would you say the purpose of humanity is now. 56:44 Well, I have a religious answer to that question, but we can save that for a future conversation. 56:50 I mean, I think that the world, the world that I want to believe in, where some version 56:57 of this technological breakthrough happens is a world where human beings maintain some kind of mastery 57:05 over the technology which enables us to do things like, 57:11 colonize other worlds to have a kind of adventure 57:16 beyond the level of material scarcity. And as a political conservative, 57:21 I have my share of disagreements with the particular vision of like, Star Trek. 57:27 But Star Trek does take place in a world that has conquered scarcity. People can there is an AI like computer on the Starship 57:36 Enterprise. You can have anything you want in the restaurant, because presumably the I invented what 57:44 is the machine called that generates the anyway, it generates food, any food you want. So that’s if I’m trying to think about the purpose 57:52 of humanity. It might be to explore strange new worlds, to boldly go where no man has gone before. 57:58 I’m a huge fan of expanding into space. I think that would be a great idea. O.K Yeah. And in general also solving all the world’s problems. 58:06 Like poverty and disease and torture and wars and stuff like that. 58:11 I think if we get through the initial phase with superintelligence, then obviously the first thing 58:18 to be doing is to solve all those problems and make something some utopia. And then to bring that utopia to the stars 58:25 would be, I think the thing to do the thing 58:32 is that it would be the AI is doing it, not us, if that makes sense. In terms of actually doing the designing and the planning 58:40 and the strategizing and so forth. We would only be messing things up if we tried to do it ourselves. 58:47 So you could say it’s still humanity in some sense that’s doing all those things. But it’s important to note that it’s more like the AIs 58:53 are doing it, and they’re doing it because the humans told them to. Well, Daniel Kokotajlo, thank you so much. 59:00 And I will see you on the front lines of the Butlerian Jihad soon enough. 59:05 Hopefully not. I hope I’m hopefully not. All right. Thanks so much. 59:10 Thank you.
80-90% chance
Daniel Kokotajilo, May 17, 2025, OpenAI whistleblower Daniel Kokotajlo on superintelligence and existential risk of AI, https://www.youtube.com/watch?v=pQP37kPaueE
[Music] hello and welcome to the GZERO World podcast this is where you’ll find 0:06 extended versions of my interviews on public television i’m Ian Bremer and today imagine it’s 2027 2 years away 0:14 artificial intelligence systems are wreaking havoc on the global order china and the US are locked in an AI arms race 0:22 engineers warn their AI models are starting to go rogue this isn’t science 0:27 fiction it’s a scenario described in AI 2027 a new report that tries to envision 0:33 AI’s progression over the next 2 years as artificial intelligence approaches 0:39 human level intelligence the report predicts that its impact will exceed that of the industrial revolution and 0:45 warns of a future where governments ignore safety guard rails as they compete to build more and more powerful 0:52 systems what makes AI 2027 feel so urgent is that its authors are experts 0:58 with inside knowledge of current research pipelines the project was led by Daniel Kotalo a former Open AI 1:06 researcher who left the company last year over concerns it was racing recklessly towards unchecked super 1:13 intelligence kotalo joins me on the show today to talk about the report its 1:19 implications and to help us answer some big questions about AI’s development what will it mean for the balance of 1:26 global power and for humanity itself and what should policymakers technology 1:32 firms be doing right now to prepare for an AI dominated future that experts say 1:37 is only a few short years away that’s a lot to discuss let’s get to it 1:46 [Music] 1:56 the GZERO World podcast is brought to you by our lead sponsor Prologis 2:01 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 2:08 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 2:14 learn more at prologist.com [Music] 2:22 daniel Cocatella thanks so much for joining us on gzero world thank you for having me okay I read this report i 2:28 thought it was fantastic so I’m a little biased but I want to start uh with the definition of artificial general 2:34 intelligence how will we know it when we see it so there are different 2:39 definitions um the basic idea is AI system that can do everything uh or 2:45 every cognitive task at least so once we get to AGI and beyond then there will be 2:51 fully autonomous artificial agents that you know are better than the best human 2:56 professionals at basically every field um if they’re still limited in serious ways then it’s not AGI 3:04 and from the report I take it that you are not just reasonably confident that 3:13 this is coming soon to a theater near you like 2027 but you you’re completely 3:19 convinced that this is going to happen soon let’s not even talk about exactly when but there’s no doubt in your mind 3:26 that AGI of some form is going to be developed soon there’s some doubt right 3:32 i would say something like 80% in the next you know five or six years 3:38 something like that so like in the next 10 or 20 it gets to like 99% or maybe it 3:43 gets up to like 90% by the next 20 years or so but there’s there’s still like some chance that this whole thing 3:49 fizzles out you know some crazy event happens that that halt say I progress or 3:54 something like that there’s still there’s still some chance on those outcomes but uh that’s not at all what I 4:00 expect i would say if it fizzles out does it fizzle out largely because humanity prevents the technology from 4:08 continuing or is it plausible that the tech itself just can’t do this that 4:14 you’re just wrong that the people that are covering AI are wrong about the uh 4:19 move to self-improvement so I think it’s definitely possible in principle to have an artificial system that’s uh that 4:26 counts as AGI and that’s you know better than humans in all the relevant ways uh however it might not be possible in 4:33 practice given current levels of computing technology and understanding of AI and so forth that said I think 4:40 that it’s quite likely possible in practice too i mean that’s what I just said like 80% 90% right so maybe I’d put 4:47 something like 5% on it turns out to not be possible in practice and 5% on 4:52 humanity stops building it so let’s first tell everyone what this report is 4:58 AI 2027 uh explain uh the contents of the report briefly and why you decided 5:03 to write it sure so you may have heard or may maybe you haven’t heard that some 5:09 of these AI companies think they’re going to build super intelligence before this decade is out um what is super 5:15 intelligence it’s AI that’s better than the best humans at everything while also being faster and cheaper this is a big 5:20 deal not enough people are thinking about it not enough people are like reasoning through the implications of what if one of these companies succeeds 5:26 at what they say they’re going to do ai 2027 is an answer to all those questions it’s an attempt to game out what we 5:34 think the future is going to look like and spoiler we we do think that probably one of these companies will succeed in 5:40 making super intelligence before this decade is out so AI 2027 is a scenario that depicts uh what we think that would 5:47 look like ai 2027 depicts AI is automating AI research in over the 5:52 course of 2027 and the pace of AI research accelerating dramatically at that point we branch there’s a sort of 5:58 choose your own adventure element where you can choose two different uh continuations to the scenario in one of 6:04 them the AIs end up continuing to be misaligned so the humans never truly 6:10 figure out how to control them once they become smarter than humans and the end result is a world a couple years down 6:16 the line that’s totally run by super intelligent AIs that actually don’t care about humanity at all and that results 6:22 in catastrophe for humanity and then the other branch describes what happens if 6:27 they do manage to align the AIS and they manage to figure out how to control them 6:32 even as they become smarter than humans and in that world it’s a utopia of sorts it’s a utopia with a lot of power 6:39 concentration where the people who control the AIS effectively run society and the report it’s far more detailed 6:47 about the near future than anything else I’ve read but your views are not way out 6:54 of whack with all of the AI experts that I know in all sorts of different companies and university settings right 7:00 i mean at this point it is I would say commonly accepted even conventional 7:06 wisdom that AGI is coming comparatively soon among people that are experts in AI 7:12 is that is that fair to say i think that’s fair to say i mean it’s still controversial like almost everything in AI especially over the last 5 years 7:20 there’s been this general shift from AGI what even is that to oh wow it could 7:26 happen in our lifetimes to oh wow things seem to be moving faster than we predicted maybe it’s actually on the 7:31 horizon maybe 5 years away something like that maybe 10 years right different people have different guesses it seems 7:36 to me that the Turing test which was for a very long time something that people 7:42 believed would never be broken when you or I could have a conversation with a a 7:49 artificial bot um call it what you will and not be able to distinguish that over 7:56 a course of a conversation from a human being like we’re already there yes yes 8:02 and no so one of the parameters you can use to vary the difficulty of a cheering test is how long the conversation is and 8:09 another parameter you can use to vary the difficulty of the cheuring test is how expert the judges are my guess is 8:17 that right now there is no AI system that could pass a you know 20 minute 8:23 touring test with an expert judge if that makes sense by contrast with true AGI which would be able to pass you know 8:30 a much longer touring test with an expert judge so but there has been substantial progress as you as you point out i mean I think maybe they could do 8:36 like a one minute Turing test with an expert judge maybe they can do a half hour Turing test with an ordinary human 8:42 being there’s definitely been a huge leap forward in in sort of Turing test progress in the last 5 years and because 8:49 I I’m most interested in the implications for society um as I suspect 8:54 you are uh in in the way you wrote this um and and there so what kind of matters 9:01 a lot is a 30 minute conversation with an average human being because of course 9:06 whether you’re talking about um you know a world leader um or um uh you know a 9:13 grandma that you’re trying to you know sort of uh swindle you know engage in fraud um or just someone you want to 9:19 have a customer relationship with in business um those are most likely to be 9:25 people um that are average and not experts and they’re going to have a hard 9:31 time differentiating already is what you’re saying um yeah I agree with that i think I would put the emphasis on 9:36 other things actually i think that um one core thing to look out for is when AI progress itself becomes automated 9:43 autonomous AI agents are doing all or the vast majority of the actual research to design the next generation AIS this 9:51 is something that is in fact the plan it’s what these companies are attempting to do and they think they’ll be able doing it in a few years the reason why 9:58 this matters so much is that we’re not even used to the already fast pace of AI progress that exists today right the the 10:04 AI systems of today are noticeably better than the AI systems of last year and so forth but I and others expect the 10:10 pace of progress to accelerate quite dramatically beyond that once the AIs are able to automate all the research 10:16 and that means that you get to what you could call true super intelligence fairly quickly not just sort of an AI 10:22 that can hold a conversation for half an hour and seem like a human but rather AI 10:28 systems that are just qualitatively better than the best humans at everything while also being much faster 10:33 and much cheaper this has been described as like the country of geniuses in the data center by the CEO of Anthropic i 10:40 prefer the term army of geniuses i would say that they’re going to automate the AI research first uh then they’re going 10:46 to get super intelligence and then the world is going to transform quite abruptly and plausibly much for the 10:53 worse depending on who controls the super intelligences and if anybody controls the super intelligences i want 11:00 to take one little step back um because before we get to 11:06 self-improving systems we’re now at a place it seems where a large amount of 11:12 the coding is already happening through AI is is this the first I mean let’s say 11:21 um largecale job that people should no longer be interested in going into 11:28 because within a matter of let’s say 6 months a year you’re just not going to need people to do any coding anymore so 11:35 my guess is it’ll be more than six months to a year so in AI 2027 which at 11:41 the time that we started writing was my sort of median forecast now I think it’s a little bit um too aggressive i think 11:47 if I could write it again I would have the exciting events happen in 2028 instead of in 2027 in AI 2027 we depict 11:56 the full automation of coding happening in early 2027 so you know maybe two 12:02 years from now so a bit longer than 6 months but still that’s sort of what’s on the horizon also notably when that 12:09 milestone is achieved that doesn’t necessarily mean that people who today are engineers would immediately lose their jobs uh and you know if you read 12:16 AI 227 the first company that achieves this full automation of coding they don’t actually fire all their engineers 12:22 instead they put them in charge of managing teams of AIS but I think that one of the first major professions to be 12:29 fully automated will actually be uh programming because that’s what the companies are trying hardest to achieve 12:35 because they realize that that will help them to accelerate their own research and compete with each other and and yeah 12:41 and make the most money in their field doing things they know how to do and they’re the ones at the cutting edge of AI so if you were a major university in 12:49 the United States or elsewhere would you simply get rid of your faculties your 12:56 departments to teach coding i mean I I assume that if you’re a mom or dad 13:03 talking to your kids about what field to go into at the very least right i mean you’re four years away from your degree 13:09 the just five years ago we had all of these people around the world that were 13:14 in jobs that people were worried oh this isn’t as relevant learn to code the response was learn to code that seems 13:20 like literally the worst possible advice you could give to someone going into a university right now only a few years 13:26 later yeah potentially I mean I think that um it feels kind of strange to be giving career advice or schooling advice 13:34 in the times that we live in right now it’s sort of like imagine that I came to you with evidence that a fleet of alien 13:40 spaceships was heading towards Earth and it was probably going to land sometime in the next few years and your response to me was you know what does this mean 13:47 for for the university should they retool what types of engineering degrees they’re giving out or something and I’m 13:53 like yeah maybe i guess well I guess I was trying to I was trying to do the 20% 13:58 before we got to the 80% which is that even if you’re wrong and we don’t get to 14:05 AGI and the aliens aren’t actually two or maybe now three years away depending on which version of the paper um you’re 14:13 nonetheless going to get uh all of this coding done because that’s not an 80% certainty that’s much more of a 95 a 99% 14:21 certainty and so at the very least I’m trying to help people that aren’t spending a lot of time thinking about 14:27 this understanding that there are like largecale decisions that we aren’t 14:32 discussing adequately that need to be resourced that need to be made that need 14:38 to be thought through and you you start easy and then you get harder okay sure well yeah i mean I think that there’s 14:43 going to be I mean already people say that chat GBT and other language models are disrupting education because of 14:50 making it so easy for students to cheat on class and so forth and they’re also relatedly making some of the skills that 14:56 classes teach less valuable because they can be done by JPT anyway right and I 15:01 think a similar thing is going to be happening with coding over the next few years even if we’re totally wrong about 15:11 AGI the GZERO World podcast is brought to you by our lead sponsor Prologess 15:18 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 15:24 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 15:31 learn more at prologist.com [Music] 15:39 you left Open AI because you felt like 15:45 those people that are they have the resources they’re driving the business 15:51 models were acting irresponsibly or at least not acting 15:57 responsibly um taking into account these things that you’re concerned about explain explain a 16:03 little bit about that decision what went into it and then we’ll talk about where we’re heading the short answer is it 16:10 doesn’t seem like OpenAI or any other company is at all ready for what’s coming and they don’t seem inclined to 16:17 be getting ready anytime soon they’re not on they’re not on track and they don’t seem like they’re going to be on 16:22 track so to elaborate on that a little bit there’s this important technical question of AI alignment which is how do 16:29 we in a word it’s how do we actually make sure that we continue to control these AIs after they become fully 16:35 autonomous and smarter than we are and this is an unsolved technical problem it’s an open secret that we don’t 16:40 actually have a good plan for how we’re going to do this there are many people working on it but not as many as there 16:46 should be and they’re not as well resourced as they should be and if you go talk to them they mostly think they’re not on track to have solved this 16:52 problem in the next couple years so there’s a very substantial chance that if things continue in the current path 16:57 that we will end up with something like what is depicted in AI2027 where the army 17:02 of geniuses on the data center is merely pretending to be compliant and aligned and controls but they’re not actually 17:09 that’s one very important problem then there’s another one which is the concentration of power and sort of uh 17:15 who do we align the AIs to problem who gets to control the army of super intelligences on the data centers 17:22 currently the answer is well I guess maybe the CEO of the company or maybe the president if he intervenes i think 17:28 both of those answers are unacceptable from a democratic perspective we need to have checks and balances we need to make 17:34 sure that the control over the army of super intelligences is not something that one man or one tiny group of people 17:40 get to have and there’s lots to be more to be said about this but the short answer is that OpenAI and also perhaps 17:47 other companies are just not at all really giving these issues the investment that they need i think 17:53 they’re instead mostly focused on beating each other winning the race basically they’re focused on getting to 17:59 the point where they can fully automate the AI research so that they can have super intelligences i think this is going to predictably lead to terrible 18:05 outcomes and I don’t trust these companies to make the right decisions along the way no it’s a classic 18:12 collective action problem um it’s how we got climate change but this has much 18:17 more consequential in a much shorter period of time so it wasn’t like you 18:22 were going after the bosses of OpenAI or the companies per se you were just 18:28 writing about what the scenarios going forward you believe are likely to be are 18:33 most likely to be and then it branches off into two potential one really dystopian one somewhat utopian um after 18:41 um this you know sort of breakout occurs uh and we have super intelligence and my 18:46 question for you is if you had written this piece while you were still at OpenAI would that would that have been grounds for dismissing you or do you 18:54 think it was plausible for that to occur i doubt they would have let me publish it had I written it if I could add more 19:00 to what you were just saying broadly speaking the trajectories described in AI 2027 are considered plausible by many 19:08 of the researchers at these companies and in fact many of the researchers at these companies are expecting something like this to happen i think it’s 19:15 important for the world to know that and to sort of see this sort of laid out like this is where a lot of people think we’re headed is something looking 19:21 roughly like this whether it happens in 2027 or 2029 whatever but these are the sorts of things that we’re going to be 19:27 dealing with in the next few years probably and you think these companies do not want the public to be aware of 19:36 the trajectory that the researchers in their own companies believe is coming 19:42 yeah basically I think that the public messaging of the companies is well focused on what’s in their short-term 19:49 interest to message about so they’re not doing nearly enough to lay out explicitly what these futures look like 19:55 and especially not to talk about these these risks or these the ways things could go wrong i kind of get this when 20:02 you’re talking about like Exxon in the 70s right because their long-term is 20:07 generational but I mean here the long-term you’re talking about is short-term i mean the people that are 20:14 making decisions and that are profiting they’re the same people that are going to have to deal with these problems when 20:19 they come in like just a matter of a couple of years so it I’m having a harder time processing that well they 20:25 each think that it’s best if they’re the ones in power when all this stuff happens part of the founding story for 20:31 Deep Mind was wow AGI incredibly powerful if it’s misaligned you know it 20:37 could possibly end the human race also someone could use it to become dictator therefore we should build it first and 20:44 we should make sure that we build it safely and responsibly part of the founding story for OpenAI was exactly 20:49 that and you can go look at the email exchanges uh that came up in the court case between Elon and uh Sam to show how 20:58 even from the beginning these uh leaders of these companies were talking about how they didn’t want Demis to create an 21:04 AGI dictatorship and that’s why they made open AI to do it responsibly that but that implies that a company like 21:09 Open AI or DeepMind would want to have precisely an honest take of the 21:16 scenarios going forward out there as public as possible to attract the resources to help ensure um that the 21:24 worst futures don’t come yeah I mean that’s what I think they should be doing we can speculate as to why they’re not 21:30 exactly doing that again I think that the the answer is probably that they are really focused on winning and beating 21:36 each other each of these CEOs thinks that the best person to be in charge of the first company to get to super 21:43 intelligence is themselves yeah if you control the super intelligence sure but if you don’t uh that might be the worst 21:50 person to be right like I if I don’t control the super intelligence I want to be as far away from that super 21:55 intelligence as possible i don’t want to be the person that actually created it and is trying to control it when it 22:01 actually controls me that sounds like a bad position to be in my guess is that they don’t think about it that way my 22:06 guess is that they think that well if we lose control then it sort of doesn’t matter whether you’re right there at the 22:12 epicenter or off in Tanzania or something like the same fate will ultimately come for all of you and then 22:18 also my guess is that they’ve basically rationalized thinking that it’s not as big of an issue for decades they’ve had 22:24 people telling them like you need to invest more in alignment research we need to make sure we actually control this sort of thing then they’ve been 22:30 like looking at their competition and looking at what they can do to like avoid falling behind and to stay ahead 22:35 and so forth and as a matter of resourcing the clear answer is well we have to focus mostly on winning and so 22:41 my guess is that they’ve partly rationalized why actually maybe this control issue isn’t such a big deal after all we’ll sort of figure it out as 22:48 we go along i imagine they each tell themselves that or at least many of them probably tell themselves that they’re 22:53 more likely to keep control of their AIS than those other guys you know I the thing that was most disturbing about 22:59 your piece in many ways is the fact that for the next 2 three years the baseline 23:05 scenario is that these companies are going to be right before they’re wrong they’re going to become far far 23:11 wealthier and more powerful than they presently are um and therefore they are 23:17 going to continue to want to to be incented to reject your thesis right up 23:24 until it’s too late is that Do you think that’s right yeah basically I mean one of the unfortunate situations that we’re 23:31 in as a species right now is that humanity in general mostly solves mostly 23:37 uh fixes problems after they happen like mostly we we watch the catastrophe unfold we watch people die in car 23:43 accidents etc for a while and then as a result of that sort of cold hard experience we learned how to effectively 23:50 fix those problems both on the like governance regulatory side with with regulations and then also just on the 23:55 technical engineering side we didn’t invent seat belts until after many people had died in car crashes and so 24:00 forth unfortunately the problem of losing control of your army of super 24:05 intelligences is a problem that we can’t afford to wait and see how it goes and then fix it afterwards we have to get it 24:12 right uh without it having gone wrong at all basically we can experiment on weaker AI systems we can we can look at 24:18 the AIS of today and experiment on them and try to figure out how to make them you know safe and aligned and things 24:25 like that but once we’ve fully automated but but that’s that’s importantly different from having completely 24:31 automated AI research and having the AI is getting smarter and smarter every day without humans even understanding how 24:37 they’re getting smarter right that’s that’s an understandably different situation and right now our plan is basically to hope that the techniques 24:44 that we’re using on the current AI systems will continue to work even as 24:49 things really take off and in fact they’re not even working on current systems right so you can go read about 24:57 this but um current frontier AI systems like pod and chat GPT and so forth lie 25:02 sometimes i don’t use that word um loosely i mean there is evidence that they know that what they’re saying is 25:08 false and that they’re not actually helping the user and they’re saying it anyway and they’re saying it for what 25:15 purpose for what programmed purpose what’s the end goal that they are trying to achieve so first of all they don’t have programmed purposes because they’re 25:21 not programmed these are artificial neural networks not ordinary pieces of software so however they behave is a 25:29 learned behavior rather than something that some human being programmed into them and so we can only speculate as to 25:35 why they’re behaving in this way that said the speculation would be that during their training even though the 25:41 training process was designed by humans who are attempting to train the AIS to be honest in fact the training process 25:48 probably reinforced dishonest statements at least some of the time in some circumstances right just like how even 25:55 though you might have a school that attempts to punish students for cheating but if they’re not so great at catching 26:02 the cheating if they’re imperfect at catching the cheating then the cheating might still happen anyway especially the best cheating the most effective 26:08 cheating it’s like you know if you put imperfect drugs into a system then they’ll get rid of the weaker viruses 26:15 but the stronger viruses will propagate and and that’s kind of what I see happening here that’s right and so right 26:21 now the the training methods are sort of blatantly failing we’re getting very obvious lies sometimes from the AI 26:29 systems even though we didn’t want that at all and we were trying to train against that in the future I expect the 26:36 rate of blatant obvious misbehavior to go down that leaves open the question of 26:41 whether we actually deeply aligned these AI systems so that we can trust them to 26:46 automate the AI research and design better systems without humans understanding what they’re doing or if 26:53 we have basically pushed the problem under the rug and gotten rid of the sort of obvious blatant misalignments but 26:58 there are still ways in which they’re inclined to deceive us sometimes without being caught right that’s an example of 27:05 the sort of consideration that we have to be thinking about on a technical level for whether or not this is safe 27:10 and part of the point that of course I and others have been making is that in the current race condition where all 27:16 these companies are focused on beating each other that’s not exactly setting us up for success on a technical level and 27:22 as we get closer to super intelligence will we become aware of it as it’s 27:28 almost there or is it the sort of situation then that I’ve seen discussed 27:33 a lot that the exponential factor is that it looks kind of stupid to us and 27:38 then literally almost overnight it is well beyond the 27:45 imagination of the average human being you get you’re not at self-improvement self-improvement happens and suddenly we 27:51 can’t do anything about it is it flipping a switch it’s a quantitative question so I think it’s not going to happen literally overnight um but but to 28:00 a first approximation yes to many people and probably to most people it’s going to come as this big shock and surprise 28:06 for the reasons you mentioned i think people sort of underestimate exponentials obviously there’s a lot of uncertainty and it could go faster it 28:13 could go slower than AI27 depicts if what we’re likely to need is a crisis 28:19 and I remember uh Sam Alman spoke about that it was a couple of years ago he 28:25 said we’ll be better off if the crisis happens sooner rather than later because then it’ll be small enough that it won’t 28:31 destroy us and we’ll be more likely to be able to respond to it um I don’t know whether you think he was being honest 28:36 about that or not but analytically that seems right you know in the sense um 28:42 that uh we can’t afford for a crisis to be so great uh that it destroys humanity 28:49 but if we had a crisis with a really weak artificial intelligence now nobody’s going to pay attention to it 28:55 what’s the kind of crisis over the next couple years that would likely or could 29:02 potentially shake corporates governments citizens into taking this far more 29:09 seriously so there’s all sorts of possible crises that could happen with AI prior to of of AI R&D however I don’t 29:17 think many of them are that likely and then the ones that I do think are likely are probably not going to cause that 29:23 huge of a reaction so for example there’s a minor crisis happening right now which is that the AIs lie all the 29:28 time even though they were trained to be honest as you can tell this crisis isn’t is clearly not motivating the companies 29:34 to change their behavior that much and it’s not really causing a huge splash right on the bigger end of the spectrum 29:40 some people have talked about you know terrorists building bioweapons or something using AI and I think that 29:46 that’s possible and I really hope it doesn’t happen but I think that it’s probably not going to happen in the next 29:52 few years uh I don’t even know if I’m not sure to what extent there are terrorist groups even attempting this sort of thing and then if it does happen 30:00 it’s not clear that the response would even be an appropriate response after all the terrorist building bioweapons 30:05 with your AI problem is a qualitatively different problem from the you lose control of your AIs when they become 30:11 smarter than you problem and it suggests different solutions such as banning open source or heavily restricting who gets 30:18 to use the models or things like that which are helpful against the terrorists but not at all helpful against the loss 30:26 of control issue so Daniel where I was going was and you’re right to raise those and say they’re not going to be 30:31 helpful i was going more towards the loss of control like um you know we’re getting to an agentic AI capacity where 30:39 people can use AI to do things as opposed to just to learn things or to 30:44 tell them things what happens if you know some kid some hacker some whatever 30:49 creates millions and millions of bots to go out and do something like uh swing an 30:57 election do something like make a run on a market you know sort of much worse 31:04 than what we saw with GameStop and all through AI an AI breakout essentially 31:09 that has a bunch of agents that aren’t just giving information but are actually taking actions is that plausible in the 31:15 next year year and a half two years before we get to super intelligence one of the remaining bottlenecks on getting 31:22 to super intelligence in AI27 we talked about a series of stages of of capability milestones and the first one 31:28 is the superhuman coder milestone and then after that they automate all of AI research and so forth eventually they 31:34 get to super intelligence one of the reasons why we don’t already have the superhuman coders is that our AIs are not very good agents they need 31:41 additional training to get good at being agents and they might need other things as well the same reason why they’re not 31:49 automating coding is also a reason why they would fall on their faces if they 31:54 were attempting to do something like that right if if they’re attempting to create this sort of agent botnet of AI 32:00 hacking around the world and then like influencing election or whatever I predict that they would just not be able 32:05 to do that until they can be a competent programmer if that makes sense and so I 32:11 just don’t think there’s going to be that sort of thing happening until after the intelligence explosion is already 32:16 beginning after uh the AIs are already starting to massively automate the AI R&D so AI fundamentally about small 32:24 problems a profusion of small problems we don’t care about and then a tipping point with massive problems that are too 32:30 big for us to resolve that’s perhaps one way of putting it yeah unfortunately I think it’s is something that we need to 32:35 prepare for in advance uh rather than just waiting to see what happens yeah because climate change is the opposite 32:41 right right i mean climate change is like a whole bunch of small problems then become bigger problems and become 32:46 even bigger problems in a very obvious way to like global actors everywhere and 32:52 over time that creates this requirement of like devoting resource and response 33:00 ai is not that it does not it from what you are saying it really doesn’t lend itself to the kinds of effective crisis 33:09 response or preemptive response because some of climate is of course preemptive response that is utterly necessary in 33:16 the next few years I think so yeah unfortunately um but uh hopefully people 33:21 can have some foresight and start thinking about these problems before they happen and uh take action to make 33:27 them not happen in the first place okay So given that and I know you’re not a policy maker you are um you know sort of 33:33 an AI wiz but um you know you did write this paper you are hoping to see action 33:39 on the back of it what are a couple of things that if they were to occur in the 33:45 next year you would say I actually feel a little better that um my my my uh 33:52 doomsday scenario is less likely to come to pass loads of things right now the 33:58 main thing I say when people ask me these questions is transparency what we should do is be trying to set ourselves 34:04 up to make better decisions in the future when things start getting really intense information about what’s going 34:10 on inside these companies in general needs to sort of flow faster and in more detail to the public and to the 34:15 government some examples of this I think that I would like to see regulation that requires companies to keep the public up 34:21 to date about the exciting capabilities that they’re developing so that for example if they do some experiments and 34:28 they manage to get AI as autonomously doing research within their company well that’s the five alarm fire sort of thing 34:34 that they need to like tell the world about rather than doing what they might be tempted to do which is to scale up 34:40 the automated research happening within their company but not telling the world about it at least for now perhaps 34:45 because they don’t want to tip off their competitors blah blah blah blah that’s the sort of thing that the public deserves to know about if it’s starting 34:50 to happen similarly other dangerous capabilities so right now some of these companies tech for like how good are the 34:56 AIS at bioweapons how good are they at cyber etc the public should be informed 35:01 about the pace of progress in those domains because the public deserves to know if AIs this year have crossed some 35:07 sort of threshold where they could massively accelerate terrorists or whatever also setting aside the safety 35:12 concerns it’s important for the public to know what goals what principles what 35:17 values the company is trying to train the AIs to have so that the public can be assured that there aren’t any secret 35:24 agendas or biases that the company is putting into their AIS this is something that everyone should care about even if 35:29 you’re not worried about loss of control but it also helps with a loss of control angle because if you have the company 35:35 write up a model spec that says here are our intended here’s what we’re aiming for and they’re not doing that then you 35:42 know that there’s a gap obviously yeah yeah exactly then you can compare to how the AIs actually behave and you can see 35:47 the ways in which uh the training techniques are not working similarly there should be safety cases where the 35:53 company says here is our plan for getting the AIS to follow the spec here’s the type of training we’re going 35:59 to do blah blah blah blah blah then that plan can be critiqued you know and academics can say this plan rests on the 36:05 following assumptions we think these assumptions are false for these reasons right so the scientific community can get engaged into actually making the 36:12 progress on the technical problem that I mentioned I thought it was interesting when you wrote the scenario that there 36:19 was a point where the AIs were becoming so intelligent that the main company 36:25 that had made the initial breakthrough decided that it wasn’t going to release 36:30 cert certain versions of this AI to the public because it would either scare people or be too dangerous do you think 36:36 that’s actually likely uh fortunately I thought it’s less likely now than I did 6 months ago ironically uh the intense 36:45 race dynamic between the companies is kind of pushing them into releasing 36:51 stuff release it all yeah yeah so that’s better right because that means that more people will be aware when there’s a 36:56 problem exactly exactly so so ironically I’ve sort of it’s kind of funny but I found myself in some ways kind of hoping 37:03 that the situation will still be a very close race in 2 or 3 years compared to 37:08 before when I would constantly talk about how because of the race dynamics nobody’s going to prioritize actually 37:14 solving these problems i still think that because of the race dynamics nobody’s going to prioritize actually solving these problems but but you want 37:20 it to be a clo you actually want it to be like wide open then they’ll be forced to not keep it a secret and then that 37:27 gives broader society the knowledge that they need to notice what’s going on and then hopefully actually intervene and do 37:33 something by contrast with if it was a sort of like not so close race like if if you get something like the leading 37:40 company is four months ahead of the follower company which is sort of what we depict in AI27 then they can be 37:47 tempted to keep a lot of stuff secret because that’s how they stay ahead and 37:52 that’s how they like prevent their competitors from getting wind about what they’re doing and so forth that sort of 37:57 secrecy is poison from the perspective of humanity and let me be clear ultimately we need to end this race 38:04 dynamic otherwise we’re not going to have solved the problems in time and some sort of catastrophe is going to happen along the lines of what’s 38:10 described in AI27 but in the meantime I think more transparency is good because it gives 38:16 the public and the government the information they need to realize what’s happening and then hopefully end the race absolutely well a lot for everyone 38:23 to think about um read the piece AI 2027 you can find it online daniel Cocatella 38:29 thanks so much for joining us yeah thank you [Music] 38:34 that’s it for today’s edition of the GZERO World podcast do you like what you heard of course you do why not make it 38:41 official why don’t you rate and review GZERO World five stars only five stars otherwise don’t do it on Apple Spotify 38:47 or wherever you get your podcast tell your friends 38:57 the GZERO World podcast is brought to you by our lead sponsor Prologis 39:03 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 39:09 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 39:16 learn more at prologus.com
Attack on Taiwan prevents US AGI development
Tob Towes, March 13, 2023, The High-Stakes Geopolitics of AI Chips, Radical, https://radical.vc/the-high-stakes-geopolitics-of-ai-chips-2/#:~:text=Little%20surprise%2C%20then%2C%20that%20Time,%E2%80%9D
Modern artificial intelligence simply would not be possible without these highly specialized chips. Neural networks – the basic algorithmic architecture that has powered every important AI breakthrough over the past decade, from AlphaGo to AlphaFold to Midjourney to ChatGPT – rely on these chips. None of the breathtaking advances in AI software currently taking the world by storm would be possible without this hardware. Little surprise, then, that Time Magazine described TSMC as “the world’s most important company that you’ve probably never heard of.” Nvidia CEO Jensen Huang put it more colorfully, leaving little doubt about how important TSMC is to the future of AI: “Basically, there is air – and TSMC.” TSMC’s chip fabrication facilities, or “fabs” – the buildings where chips are physically built—sit on the western coast of Taiwan, a mere 110 miles from mainland China. Today, Taiwan and China are nearer to the brink of war than they have been in decades. With tensions escalating, China has begun carrying out military exercises around Taiwan of unprecedented scale and intensity. Many policymakers in Washington predict that China will invade Taiwan by 2027 or even 2025. A China/Taiwan conflict would be devastating for many reasons. One underappreciated consequence is that it would paralyze the global AI ecosystem. Put simply, the entire field of artificial intelligence faces an astonishingly precarious single point of failure in Taiwan. Amid all the fervor around AI today, this fact is not widely enough appreciated. If you are working on or interested in AI, you need to be paying attention.
Sharon Fisher, May 19, 2025, US Experts Propose a ‘Manhattan Project for AI’ to Secure AGI Lead, https://www.vktr.com/ai-ethics-law-risk/us-experts-propose-a-manhattan-project-for-ai-to-secure-agi-lead/
“The world’s most advanced AI chips are made in the TSMC factories in Taiwan, and the US chip restrictions mean that China can no longer get those chips (except through a kind of black market), whereas the US and its allies get lots of them,” said podcaster Robert Wright. “So what used to be a deterrent to Chinese invasion — the likelihood that war would disable factories whose most precious output China shared — is much less of a deterrent.” Hendrycks himself conceded that point in the No Priors podcast.
The impacts and costs of this decision have been immense. Left unexamined and unchecked, it is likely to lead to much higher risks of conflict between the United States and China, including over Taiwan, which is still the locus of the most advanced AI hardware production.
NASDAQ, March 19, 2025, Musk: China Takeover of Taiwan Will Cripple Global AI Chip Supply, https://www.nasdaq.com/articles/musk-china-takeover-taiwan-will-cripple-global-ai-chip-supply#:~:text=Speaking%20on%20a%20podcast%20with,chips%20are%20made%20in%20Taiwan
Speaking on a podcast with Ted Cruz, Musk underscored the critical role Taiwan plays in the global semiconductor supply chain. “If [China] were to invade in the near term, the world would be cut off from advanced AI chips,” he stated. “And currently 100% of advanced AI chips are made in Taiwan.” This stark warning highlights the extreme concentration of advanced chip manufacturing in Taiwan, particularly at Taiwan Semiconductor Manufacturing Company (TSMC). TSMC produces over 90% of the world’s most advanced chips, including those essential for training and running the large language models (LLMs) that power cutting-edge AI applications. These chips are used in everything from smartphones and data centers to military hardware.
AGI means extinction
Elizar Yudkowsky, May 25, 2025, n American artificial intelligence researcher[2][3][4][5] and writer on decision theory and ethics, best known for popularizing ideas related to friendly artificial intelligence.[6][7] He is the founder of and a research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California.[8] His work on the prospect of a runaway intelligence explosion influenced philosopher Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies.[9], Eliezer Yudkowsky: Artificial Intelligence and the End of Humanity, https://www.youtube.com/watch?v=0QmDcQIvSDc
0:00 i’m worried about the future ais i’m worried about the ai that is good enough at ai research to build the ai that 0:06 builds the ai that is smarter than us and kills everyone the ai gets the universe it wants most that is a 0:11 universe that does not happen to have us in it and that is how indifference kills you if we could just get 50 tries at 0:18 building super intelligences you know we could be like “our clever alignment theory didn’t work it killed everyone 0:24 let’s build another one oh that one killed everyone too wow the second crazy theory we had didn’t work either the 0:30 basic description i would give to the current scenario is if anyone builds it everyone dies The Default Condition for AI’s Takeover 0:43 you founded the machine intelligence research institute whose unofficial 0:49 motto is the default consequence of the creation of artificial super 0:54 intelligence is human extinction and i’m wondering when the idea of the 1:02 existential threat of ai for humanity first kind of came onto your radar and 1:09 whether it was as dramatic a moment as that quote i just gave you um so the 1:18 general high impact that superhuman intelligence in any form was going to 1:25 upset the whole human apple cart the whole world economy apple car that it wasn’t going to be just another 1:31 technology or just another nice thing to have that was 1995 1:37 1996 when i would have been 15 or 16 years old uh reading a book by verer 1:43 vinci called true names and other dangers um where vinci mentioned that every 1:49 science fiction writer’s crystal ball or even ability to envision a consistent 1:55 future breaks down at the point where their scenario has predicted the rise of smarter than human intelligence because 2:02 they can’t write the smarter than human characters if they were actually if if you were smart enough to predict exactly 2:09 where say uh deep blue the ancient world champion artificial chess player or 2:14 stockfish the modern uh chess player if you could predict exactly where they would move on a chess board you’d be 2:20 good that good of a chess player yourself you just always move where you predicted stockfish would move so something smarter than you is 2:27 unpredictable in its details and that was verer vji’s observation that i came 2:33 across in 1996 and i was oh all right so transhuman intelligence in any form 2:40 is the changing of everything and probably artificial 2:45 intelligence comes first though that was just a guess then a good guess but a 2:50 guess nonetheless i saw that superhuman intelligence was going to be drastically important i did not then see it as a 2:57 threat i thought if it’s smart it can figure out what the right thing is to do 3:03 know the right thing do the right thing um thought it was going to be like 3:09 happily ever after so the point at which i realized that 3:15 this line of thought was mistaken that different powerful intelligences could 3:21 steer to different places and that the whole elaborate philosophy i’d built up in my mind about intelligence figuring 3:27 out the right thing and doing the right thing uh that this elaborate philosophy was mistaken and that moreover i’d made 3:32 a you know in a way a teenager’s kind of mistake by trying to use that kind of 3:39 philosophicalish high idealistic thinking to make predictions about a universe that didn’t run on philosophy 3:45 deep down uh re realizing that would i i’d say be the moment of boy i sure have 3:52 been stupid and then went off to try to not have the default thing happen and 3:58 not have the world end veri you said was the name and you said that artificial 4:04 intelligence comes before super intelligence you think that superhuman 4:10 intelligence happens first by way of artificial intelligence getting that smart rather than by human augmentation 4:17 this is 1996 in 1996 you don’t have nearly human smart or nearly superhuman 4:24 ai it’s not obvious in 1996 that ai is going to go down its track before the 4:31 like adult gene therapy people get their stuff to the point of augmenting adult human intelligence 4:38 humanity could still try to make it happen that uh you know adult gene therapy for augmenting human 4:44 intelligence comes first uh but you know that would be a big deal intervention 4:50 one that i would strongly advocate but uh it’s not the default the crux of the 4:55 debate between you and people who don’t have the same vision or expectations 5:01 about ai as you do or how to determine just what super intelligence would do 5:07 what its interests are because you said this the the issue is that it’s not predictable so a chess player 5:13 predictably wins the game but you can’t predict where it moves on the board you can predict where the board ends up but 5:20 not each move it makes along the way um in a sense that’s a very standard situation to arrive at inside science 5:27 and physics if you drop a uh ice cube into a glass of water glass of hot water 5:33 you can’t predict where all the molecules inside the ice cube end up um but you can predict that the ice cube melts so you can’t predict the details 5:40 but you can predict the end point the equilibrium that things settle into with a chess player you can predict where it 5:45 is steering in the end you can’t predict each move it makes if it is a stronger 5:51 chess player than you um the the issue here is not that we can’t predict ai’s 5:58 exact next action if it’s smarter than us the issue is that our current machine learning technology is light years and 6:05 light years away from being able to make the ai steer someplace nice even a nice ai even a benevolent ai 6:14 if it were smarter than you you wouldn’t be able to predict exactly what it would do next you could just do that yourself but uh we also lack the 6:22 technology to make the ai benevolent and where it steers in the end and that is 6:28 the crux according to one group of people arguing with me there are other groups that think that oh well you can’t Could a Future AI Country Be Our Trade Partner? 6:37 control where an a superhuman ai is steering where it’s trying to steer the world what it’s trying to do what its 6:42 goals are what its preferences are you can’t control any of that stuff but that’s fine it’ll trade with 6:49 us and the it is a important insight from 6:58 economics that if you have two human countries that even if one country has 7:04 an absolute advantage in everything it tries to produce over the other country it can produce every good using fewer 7:11 hours of labor than some other country the two countries can still benefit by 7:16 trade but there are limits and unfortunately one of the limits is um 7:22 that theorem of economics that you can always do better by trading does not say that you can’t do even better than 7:28 trading but then by killing the other country and taking their stuff the theorem which says that you do better by 7:34 trading assumes that both countries just go on existing that’s a basic fact it’s one of the assumptions of the theorem if 7:41 you can if you have the third alternative between like trade no trade 7:46 kill them and take their stuff it is possible that you can do better by killing them and taking your stuff and that is the flaw in the logic of the 7:53 people who say it doesn’t matter that we can’t control ai at all it will trade with us 7:58 um at some point the humans are producing less with their food their 8:04 water their sunlight their electricity than the ais could produce using the same resources and that’s the point 8:12 where an ai that otherwise doesn’t care about your life one way or the other finds it that it gets more of what it 8:18 wants if it kills you and takes your stuff it also seems like another flaw in 8:24 this line of thinking is viewing humans and the superhuman ai as being on some 8:31 sort of like an intellectual par in a certain way it would be more like humans 8:38 trying to trade with ants or animals that they’ve subjugated just because 8:44 humans are just at on such a higher intellectual plane i 8:50 mean analogies like these are often untrustworthy than their details humans 8:55 are not in fact ants um ants can are 9:00 like underneath a sort of absolute bar for understanding the trade deal you’re trying to make with them and you can 9:08 imagine that humans understand a trade deal that that an ai offers the situ that is not exactly 9:14 analogous but also we can’t trust the ai 9:19 to keep its deal later the ai knows we can’t trust it to keep its deal later the ai can’t trust 9:27 many humans to keep their deals it’s not that we are in a situation exactly analogous to ants but 9:34 that the there are enough barriers there the ai says like “yeah sure let me build 9:41 out all the power plants let me build out all the robots and it’s heading for a place where it could wipe you out with 9:47 a snap of its fingers and once it’s in a place where it can wipe you out with a snap of its fingers it does get more 9:54 stuff by doing that than by keeping its bargain and if we were in a position to verify 10:01 now that ais would keep their deals later there might actually be like gains 10:07 to both sides from being able to strike some sort of deal but it’s just not going to play out that way just like the 10:12 ai doesn’t have the niceness incentive it’s not going to have the human version of the honesty um preference the keeping 10:21 your deals preference there are people who will keep their deals even when it’s not in their own best interest because 10:27 that’s the sort of person they are that’s what they want to do we do not know how to make an ai like that right 10:33 we do not know how to look at the vast vast fields of inscrable numbers making up an ai and predict about it that it 10:41 will keep a deal later and that’s the sort of basic bar there it’s not like 10:47 it’s not that there’s we we it’s not that we look at the analogy to humans versus ants and from this analogy we 10:53 thereby gain utter certainty that this is also how it would play out between humans and any possible superior 10:59 intelligence it’s looking at the details it’s it’s looking at the causal mechanics the we do not we are not able 11:06 to control the ai’s preferences we cannot instill in it even a preference for keeping its deals and from there you 11:12 predict that it’s that it’s not going to keep deals later when it becomes much more powerful What Is Artificial Intelligence? 11:19 before we get too deep into the weeds i would like to step back a bit and set 11:25 some context and maybe get a bit clearer about just what it is that we’re talking about and this question might seem a bit 11:32 too broad but i think it’s important and the question is what are we even talking about when we are talking about 11:38 artificial intelligence what makes something artificially intelligent for you do you have like an airtight 11:45 definition or how do you think about it i have various other things i could give 11:50 you powerful and useful definitions about um i could talk about something’s 11:56 ability to predict reality when it says what when it guesses what does it observe next how much probability does 12:03 it assign to what it ends up actually seeing how good is its map of reality 12:09 i can talk about something’s ability to steer reality when you have two chess 12:15 playing machines playing each other on their tiny narrow little board the one that steers that tiny little world more 12:21 effectively tends to be the one that wins the chess game even a smart mind 12:26 can lose by luck but the one that wins most the time is the more powerful steer 12:32 of the chess board i can talk now at a somewhat lower level 12:38 of precision about the notion of generality an octopus might be better 12:43 than you at manipulating eight arms simultaneously um there might be some 12:49 sense in which you are not strictly smarter on every possible kind of mental cognitive problem as an octopus but you 12:57 are able to do more things than an octopus can in another sense 13:03 why because you are a better learner you can learn more domains than 13:09 the octopus can learn and that is how you end up as a with a more general predictive and steering capability than 13:16 the octopus has currently as ais start to get 13:22 smarter and smarter we start to lose the ability that we had 20 years ago to 13:27 point at the ais that were around back then and say “yep those things are dumber than human no question about it 13:33 yeah sure that one won at chess against all human challengers but that’s like 13:39 saying an octopus is good at manipulating eight arms it’s got this tiny little narrow place where it can 13:44 defeat humans at one tiny little kind of mental problem and it can’t learn to do more and then you look at the current 13:51 day eyes and it’s like yeah well you know i asked chat dpt about this this this question about you know like what 13:59 would happen if all of the sunlight’s light changed to infrared and it figured 14:04 out like what the change would be in the earth’s temperature if that happened and then i asked it like well what does that do to crops and it’s kind of an obvious 14:12 question but nonetheless it it isn’t just like doing raw physical calculations it knows about sunlight it 14:17 knows about crops it can uh knows about infrared light it knows about atmospheric absorption it can pull all 14:23 these forms of knowledge together the way that humans can reason across domains not just many domains but the 14:29 relationship between domains that we can reason about concrete and then reason 14:34 about water and integrate that to build a dam that holds back water well chat gpd can do that too all the things it 14:40 knows it can pull together and it can reason about it you know it can output tokens much faster than a human and 14:47 still if you poke it in the right way it makes mistakes that no human would make of course if you poke a human in the 14:52 right way they make mistakes that computers wouldn’t make or like particular computer programs wouldn’t make um they’re still dumber than us 15:01 they’re still dumber than 12y olds they have to stare harder and harder to see it 15:08 and so artificial intelligence artificial general intelligence i can 15:14 sort of handwave around that and say like well to human it’s an impressive level of and then say the more precise 15:20 thing which is predictive ability steering ability learning ability how 15:25 fast do you learn how many different things can you learn can you integrate them together together and this is how i 15:31 would like take apart the notion of artificial intelligence and say things that do have clearly defined meanings 15:38 before i turn around again and say like yeah that thing you’re calling artificial intelligence sir is getting 15:44 smarter a lot of your description of the the current ais like chad gbt it seemed 15:49 to relate to the criterion of generality i’m wondering where you see current 15:56 ais as ranking or falling under the categories of their mapping and steering 16:04 abilities of with the world so it’s 16:10 complicated and it didn’t used to be complicated used could used to be you could just say they’re dumber but now 16:15 it’s complicated ais have a training phase and an inference phase and they can do in 16:23 context learning but not with the same breadth as their training phase when 16:28 they’re doing a shallow kind of learning on the you know like large chunks of the entire 16:33 internet whether you think that data was stolen or not is a whole different question and i actually like the key not 16:39 a key science question per se so set that aside 16:45 um so you know like how how generally can an ai learn well are you asking what 16:52 was it able to learn when it was being trained or are you asking what can it learn if i give it a much smaller amount 16:59 of data by uploading a pdf to it and asking it to generalize from that um and again you’re asking me like 17:07 where would i put it okay like on what scale i think that both scales of the 17:12 kind of generality that it has are falling short of human death but of course it is rapidly 17:19 starting to challenge humans for breath you can’t read 16 trillion tokens of 17:24 data in a human lifetime so when you’re evaluating an ai you’re interested in its mapping ability its steering ability 17:33 its levels of generality one other i mean natural criterion or part of the 17:38 definition of artificial intelligence is that it’s artificial or created i’m wondering 17:45 where consciousness enters into the question at all of how relevant is 17:51 consciousness to artificial intelligence i think that a lot of people take consciousness as a very simple thing 17:57 where it would actually be like a very complicated particular way for an ai to end up being put together one of a broad 18:04 class of ways like that so it’s not that there’s a primitive ontologically basic 18:10 physically simple fluid of consciousness that pours into the system then grants 18:15 it its abilities that it that is not like the way to understand where the 18:20 capabilities it now has comes from you want to understand gradient descent if 18:25 you want to understand how stuff happens during the training phase and if you want to understand how ai’s generalized at inference time tough luck nobody on 18:32 earth knows um i think that there’s like 18:39 reflectivity self-modeling and ai to do ability to do 18:44 that is the sort of thing that you look at if you’re looking for the thing that plays a role in intelligence and then 18:51 consciousness probably is just one particular way for reflection to be put together so like the ability to model 18:59 yourself is probably a source of power consciousness is probably not a source 19:05 of power per per se it’s flavoring of 19:10 reflectivity like vanilla ice cream there’s the part that has the calories which is the sugar and stuff and then 19:17 there’s the vanilla that is the flavor and my guess would be that reflectivity 19:22 is the part that provides the calories that the oomph that results in the increase in ability and then 19:28 consciousness is like one kind of flavor that reflectivity can have but you know 19:34 i’m not stating this with the same level of you know like yeah i can see how to 19:40 do the math for that that i would have when i’m talking about the steering versus prediction uh breakdown like that 19:47 has a bunch more mathematics not shown besides choosing that particular breakdown and it’s reflected in the way 19:53 that ais are built and trained and so on consciousness or maybe we should just speak about reflectivity to avoid some 20:00 of the philosophical burdens it’s often something that people worry about okay when is the moment when ai is going to 20:07 be conscious is that the moment that we’re all uh in serious trouble but it 20:13 sounds to me like self-awareness doesn’t ne necessitate doomsday for us 20:19 on on my model consciousness is not the everything changes magic quality um like 20:28 you can worry about the extent to which ais end up with their own goals which is a thing that’s already happening given 20:34 the way that they’re currently being trained but that’s not because they gain 20:40 consciousness and then gain their own goals it’s because we are you know gradient descending them to solve 20:46 problems and wanting things is an effective way of of doing things like 20:51 they’re ending up with goals the same way that um like giraffes end up with 20:58 goals which is not that a spark of consciousness is born inside the giraffe and then that fluid pour goals into the 21:03 system it’s that you know you run the optimizer of natural selection on keep 21:08 eating leaves or you’ll die and you end up with an animal that is like planning how to eat leaves the same way that a 21:15 chess player well not the same way but much as a machine chess player might plot past through time to winning a 21:22 chess game okay so goals an ai having goals is something that we really need Why AIs Having Goals Could Mean the End of Humanity 21:28 to be on the lookout for when we’re considering whether or not an ai is dangerous i mean it’s already there 21:34 cloud 3.7 sonnet is now by now like slightly infamous for if you give it a 21:39 tough enough programming problem it’ll start to cheat it’ll start to like rewrite your tests to like pass the test 21:46 using a bunch of special cases um their uh gpt01 21:53 um they gave it a they tried to test how good it was at computer security after 21:58 mostly training it on math problems and you know and a whole lot of other stuff too but the point is they weren’t like explicitly i think training it to get 22:05 better and better at computer security they just wanted to see how good it was as computer security so they gave it a 22:12 bunch of capture the flag challenges a bunch of servers that it was supposed to 22:17 break into and one of the servers due to a misconfiguration error by the humans 22:23 did not start up meaning gpt01 could not break into it um however 01 did not give 22:31 up however the humans had accidentally left 22:36 a certain misconfigured port open to the meta server that was running 22:42 the whole challenge 01 got access to that it 22:47 started up the server that the humans had failed to start correctly and then instead of breaking into that server it 22:54 just told the larger system to directly copy over the secrets file it was supposed to be breaking it into and 23:00 finding this is behaving like it wants something this is tenacity this is not 23:07 giving up this is thinking outside the box this is you know like okay you 23:12 handed me an unsolvable challenge i will solve it anyway so be on the lookout for it no 23:18 notice that it already happened like if this was 2023 maybe it would be be on the lookout for it as of late 2024 this 23:25 stuff already happened your words your choice of words tenacity thinking outside of the box it it’s clear that 23:32 you’re already ascribing some level of like agency to the ais 23:38 i mean it’s not that there’s a mysterious substance of agency that 23:44 results in tenacity when i’m talking about the the part where humans misconfigured the computer security 23:50 challenge and so it like hacked into the larger server and then like started up the server and this is all direct 23:56 observation you want to infer some stuff called agency that is responsible for 24:01 all this happening yeah that’s your lookout i’m giving you direct observations here do you worry though 24:08 about the current ais like no okay they’re not smart enough the thing that 24:14 i worry about with the current rough set of ais is that is if openai or enthropic 24:20 or deepseek is currently training an ai that they manage to get like really good at writing code in general and like a 24:27 reading ai research papers and writing ai code in particular and it’s going to build a smarter ai that builds a smarter 24:33 ai and that kills everyone possibly without us having heard about it before we we just die but um that’s not the 24:41 only possible way things can go it is the way that you could get to something that was a threat to humanity in one 24:48 jump from the current ai companies but now we’re talking about detailed 24:54 trajectories those are harder to call than end points the ais keep ai companies keep pushing and pushing on 25:00 their ais to get smarter and smarter they get to something eventually that is 25:05 smarter than us that can kill us that is motivated to kill us not because it you 25:11 know in inherently wants us dead but because it’s best universe the stuff where where it gets the most of what it 25:17 wants all the atoms are being used for things that are not being that are not running humans its optimal universe is a 25:24 universe without humans that at the end is a strong prediction you’re like like 25:30 is that going to be like the next generation probably not is the next generation going to be smart enough to 25:37 write the ai that writes the ai that that builds the stuff to kill us 25:42 maybe probably not but it’s a weaker probably not there are hard calls and 25:48 easy calls um and not everything that looks at a first glance like you can’t 25:54 solve it is an impossible prediction to make but nonetheless there are things you can there like you can predict 26:01 endpoints a lot more easily than you can predict trajectories and competent futurism is exactly about knowing which 26:07 things you can predict and which things you can’t the current ais they have been somewhat tested i’ve used them myself if 26:14 we’re talking about the stuff that is facing the public as opposed to whatever the next generation is they got in their labs that stuff is not smarter than you 26:21 that stuff cannot kill us all it cannot it cannot take us all on and win that is not the thing i am mainly worried about 26:28 maybe somebody can use technology like that to build a super virus that manages 26:35 to knock down civilization then the last of us die a bit after that uh it seems hard um but that’s like a far left field 26:44 possibility they do test the modern ais for that too and they’re not that good at biology yet let me ask then well i 26:50 want to get back to what you said about competent futurism but if what we were discussing isn’t what worries you what 26:58 is it that worries you most right now i have a human level intelligence that i 27:05 use to be worried about things in the future rather than just right now an ai is very likely to kill is very unlikely 27:12 to kill me by you know be before the end of this interview i’m not worried about 27:17 dying in the next minute all that much so you’re asking me like what worries me 27:22 right now right now we’re doing fine you know not not really on a civilizational level but that’s a whole you know it’s 27:29 that’s that’s not an everybody falls over dead level of civilizational dysfunction right i i didn’t literally 27:35 mean what worries you right now in this moment but what are your chief worries about ai in general at the moment if 27:42 it’s not you’re not worried about like chat gpt at present i’m worried about the future 27:48 ais i am worried about the ai that is smarter than us i’m worried about the ai 27:54 that is good enough at ai research to build the ai that builds the ai that is smarter than us and kills 28:00 everyone and that’s that’s most of it i would manage to maybe find a bit of 28:06 space to be like yeah that might extinguish humanity if somebody was releasing ais that were sufficiently 28:12 good at building novel viruses even if those ais were not good at writing smarter ais but that is not where most 28:20 of my worry goes and you said that competent futurism is about knowing what 28:28 you can predict and what you can’t predict is that right so what are the 28:33 things that you’re confident about predicting right now but then the things that you also aren’t confident about 28:40 predicting i mean what do the ai headlines say a year from now what do 28:47 the ai headlines say two years from now if we’re still alive in two years yeah um do you think it’s it’s like possible 28:55 that or it’s obviously possible but it’s a sufficiently worrying level of 29:00 probability that we won’t be around in 2 years what level of probability is 29:06 sufficiently worrying i’d probably be worried of 10% 5% 1% even all right well 29:13 um two years yeah i think we’re talking higher than 5% for two years okay that’s higher than 10% even 29:20 but that’s a hard call that’s like that’s detailed trajectories that’s saying where all the molecules in the 29:26 melting ice cube end up and not just saying that the ice cube melts eventually so i think we’ve been i i 29:32 don’t know if i should say we’ve been dancing around this but just to make it explicit something that is spoken about What Is the Alignment Problem? 29:38 a lot in these conversations is something that’s been labeled the alignment 29:44 problem what is the alignment problem the alignment problem is building an ai 29:50 and especially an ai that’s smarter than you where you understand where it’s steering and it’s steering someplace 29:56 nice it was trying to do something that is you know beneficial to humanity and 30:03 the reachable universe if it actually if you actually run that ai um could be a 30:09 small thing you’re trying to do like curing aging could be a much larger thing like colonizing the whole universe 30:15 but if you’re trying to do like if you’re trying to get like some particular outcome or a class of 30:21 outcomes or have something become true about the universe that you want to be true that’s the alignment problem on a 30:28 technical sort of level you know if one of the leaders of ai companies wanted to 30:33 become god emperor of the universe and make the rest of us their slaves that would also in a technical sense be the 30:39 alignment problem it’s not like beneficial from our perspective but for them to like build a god that actually 30:46 served them so that the god would conquer the rest of us and make us serve the ai company leader rather than you 30:54 know there would rather than their god just like consuming the super villain who tried to build it that is also in a 31:00 technical sense the alignment problem and with regard to ai do we need to just 31:06 be worried about ai being malevolent or 31:11 so actively not having our interests in theirs or could it would it be 31:17 sufficiently problematic for us if ai just did not share our interest yeah i’m 31:24 worried about indifference rather than malevolence i don’t think we have the knowledge to make a malevolent ai to 31:29 make a to make an ai that is specifically wanting to harm humans you’ve got to make an ai that wants that 31:35 particular thing and not like trillions zillions other things i don’t think we have that kind 31:41 of capability why is indifference so or 31:46 why could indifference be so bad for humans well let’s say you got an ai that 31:51 wants uh a like a thousand different complicated things it wants giant cheesecakes it wants giant mechanical 31:58 clocks it wants to see you know not any sort of movie as we would understand it but it wants the equivalent of like text 32:05 conversations with inscrable properties um you know you you look at you look at 32:11 you look at humans and you know we are making ice cream and it has more sugar 32:19 salt and fat than anything our ancestors ate but it’s still not the thing that has the largest possible amount of sugar 32:26 salt and fat that would might be something like pouring honey over bare 32:31 fat and then sprinkling rock salt on it but you don’t want to eat like the thing that is sort of like an ancestral food 32:38 that is is like maxing out the sugar fat salt indicators you want ice cream and 32:44 you want it to be frozen ice cream you’re not going to be able to predict that by looking at what humans were 32:50 eating 100,000 years ago so similarly you have an ai that wants ends up wanting stuff that is as 32:57 inscrutably related to its training as ice cream is to the sort of stuff our 33:03 ancestors ate or even like sucralose it’s just this inscrutable molecule no caloric value but it sure tastes 33:10 sweet and a universe that is full of that that is full of the ai’s equivalent 33:17 of ice cream is a universe with no humans in it so if the ai gets the universe it wants most that is a 33:23 universe that does not happen to have us in it so we’re dead and that is how 33:29 indifference kills you a when ant colonies get crushed in the process of 33:36 humans setting up a skyscraper it’s not that we hate the ants but it’s not worth the trouble trying to trade with them um 33:44 we will notice where that they’re there if we look closely enough but it would be a whole lot of effort to move the 33:50 ants out of the way and we’ve just got other things that we more prefer to get from our efforts we only have so much 33:55 effort we can put forth we have limited resources and we are not spending that resources on preserving all the ant 34:00 colonies underneath the skyscraper this lack of alignment even in the case 34:06 of indifference since you said you’re not worried about malevolence poses a huge problem for How To Avoid AI Apocalypse 34:12 humans how then do i mean this is a broad question but how how then do 34:19 we avert the problem or solve the problem back off have an international 34:25 clampdown on the gpus not in any one country uh this is everyone’s problem 34:31 you can it i mean the the basic description i would give to the current scenario is if anyone builds it everyone 34:39 dies you need to not build it you’re not going to solve the alignment problem in the next couple years and for something 34:46 like chad gpt do you think it’s already too advanced i mean you didn’t say you said 34:52 that you’re not worried about catch gpt wiping us out but if we’re setting a a 34:59 ceiling of some sort on how capable our ais we should allow them to be before 35:05 we’ve figured out the alignment problem just where is this current chat chatbt 35:10 is not going to kill everyone um but you also want to be you don’t want to like 35:16 dance right up to the edge of the cliff here you don’t want to play it clever you know and it be like well nobody figured out how to build a supervirus or 35:22 build a smarter ai using chat gpt yet so it must be safe to just like have this stuff around forever um the way i would 35:30 phrase it is like what do you really want or even need from ai 35:37 if you want to have a narrow ai that is just about medical 35:42 advances even there you cannot just make it smarter and smarter and smarter at medicine without running into trouble 35:49 but you know chat dpt smart but just for medicine that you know that that seems relatively safe maybe if we were a 35:56 species that really had its act together we’d be like we’re just like we’re just like backing the heck off from this sort 36:01 of thing but uh you can probably do that without it killing you it seems like what you’re 36:09 saying is that we we couldn’t just try to box gpts or ais into these very 36:16 narrow fields and then hope that we could get away with these narrow super intelligences the trouble is that to do 36:24 sufficiently difficult things uh you need sufficiently powerful minds and even if you’re just pointing them at 36:30 narrow things they still have to be powerful in a sense like to read the 36:35 medical papers you got to be able to read it’s not you you you can’t just take a narrow kind of narrow ai that 36:42 only plays chess and can’t learn anything else and turn it loose on reading medical papers it can’t learn to read the medical papers 36:49 um the the sensible approach to this is that you’re like okay what level of 36:54 cognitive capability do we need to get the thing that we want and is that worth the risk and maybe that is worth it for 37:02 having the like mind that is in a certain sense general in principle but 37:08 trying to get it only to learn stuff about genes biology gene therapy cure 37:15 alzheimer’s cure parkinson’s cure aides uh do enhancement 37:21 of adult human intelligence you know especially that last part i could see the case for that 37:27 being worth the risk cuz augmenting humans is you know probably the thing you have to do to get out of this mess 37:35 augmenting humans augmenting human intelligence i wasn’t expecting that why is that just so that we can keep up with 37:42 the ais you’re not going to keep up with the ais but maybe if you have smart enough humans they 37:47 can call their shots on the alignment problem the the fundamental reason that 37:52 alignment is hard is because you’ve got to take the ai that’s weak enough that you can fiddle with it and and align it 37:58 and and do all that to it and then that ai gets more powerful and then you got you know whatever whatever cockami 38:04 theory you had has got to hold together everybody’s dead cuz if it goes wrong inside the ai that is smart enough to 38:11 kill everyone it’ll kill everyone so you can’t just like try like like if we 38:16 could just like get 50 tries at building super intelligences if you know we could 38:21 be like “oh yeah that super intelligence uh our our clever alignment theory didn’t work it killed everyone back to 38:27 the drawing board let’s build another one oh that one killed everyone too wow the second crazy theory we had didn’t 38:34 work either.” if if we had as many decades as we needed and as many tries as we needed 38:39 we would solve it eventually the whole field of artificial intelligence used to be like this people would try one thing 38:44 after another and nothing would work um but they but their failures didn’t kill them or everyone so they 38:51 just kept trying and trying and they eventually found a thing that worked and if we could do the same thing with aligning super intelligence it would be 38:58 in some sense ultimately an ordinary sort of science problem it’d have difficulties that you don’t see when 39:04 you’re just trying to build a nuclear reactor because nuclear reactors aren’t trying to outsmart you but you know if 39:10 we just got unlimited retries we could solve it so the difficult part is calling a shot the difficult part is 39:16 that to trust yourself enough to think that you’re going to build a super 39:21 intelligence and align it you got to think that you’re the kind of person whose ideas just work like you just 39:27 don’t expect things to work unless they do even incredibly complicated difficult 39:32 basic science problems things that you’re going to like build this thing that has never before been seen on earth 39:39 you’re going to align it when it’s small and safe and you’re allowed to mess with it it’s going to get more powerful it’s 39:44 going to stay aligned and you think that whatever clear clever theory you have is just going to work on the first shot 39:51 this is not a problem for humans but might not be that far above human and like maybe if you’re just like 39:59 15 iq points or 30 iq points smarter than john vanoyman you know probably the smartest person who ever lived though 40:04 it’s hard to be sure you know that it feels in some senses like we’re almost 40:09 there we we’re we’re almost at the level where we could learn the mental tricks we would need to learn in order to just 40:17 never expect anything to work that wasn’t going to work it it doesn’t feel to me like it’s 40:23 that far out of reach but it’s not human isn’t there just like a new s a new Would Cyborgs Eliminate Humanity? 40:31 alignment problem once you augment human intelligence because then you have augmented humans and they’re on a 40:37 different plane from regular humans yeah but that’s vastly easier that’s like way 40:42 more there’s not there the inevitable crushing default where you just die automatically if we’re only thinking 40:48 about increasing these humans iq points by 15 or 30 right 40:54 or you know a bit more than that because john vonoyman is dead and his like doesn’t seem to be running around these 41:00 days but um yeah so there you’re like 41:07 not dealing with a vastly alien intelligence right you can ask them how 41:13 you’re doing you you know like you can you can ask them like so what’s your life philosophy 41:19 and if you’re starting with people who seemed up until that 41:25 point like nice nerds like it’s not that you hold a contest to see who can like 41:31 sound the most like a nice nerd you know you just you know you look for the mathematicians who just sort of like 41:37 have a reputation for being quietly honest and not in a way that you know made them famous or whatever and they’re 41:42 like “yeah i don’t want to wipe out humanity.” yeah i i i can talk about how you could 41:48 like try to go better than that um but just as a baseline hey it just 41:55 works they’re not vastly alien intelligence as in they weren’t built by gradient descent to on this vastly 42:02 different set of problems than the one that humans evolved on and they don’t have these vastly different brain 42:07 architectures it it the reason it works is that we’re not trying to you know align this vastly alien thing on an 42:13 arbitrary goal we’re just trying to find some people who have a kind of property that already exists in some human beings 42:19 and make them smarter it is funny i i totally hear what you’re saying but it is funny to be on the one 42:26 hand a very vocal critic of artificial intelligence and then 42:32 advocating the augmentation of human intelligence is this something that so 42:38 this is not something that i’m up to date on at all are we making strides in 42:43 augmenting human intelligence in this way i mean if you want to know startups to invest in i can name a couple um but 42:50 it sure isn’t getting the billions of dollars of investment that are flowing into artificial intelligence at present 42:56 humanity is you know investing i would say about maybe 10,000 times as much money 43:03 into destroying itself as into augmenting itself that’s fascinating and 43:08 as you were speaking i i was thinking this issue is so or this constellation 43:13 of issues is so thorny in part because not only are you dealing with 43:19 theoretical problems like the alignment problem but you’re also dealing with 43:24 humans and applied problems like public policy and i am wondering if one of the 43:32 reasons that money isn’t going into human augmentation is that there are just going to be all sorts of like 43:39 ethical hurdles that people aren’t paying attention to when they’re dealing with ai i mean you know the the the 43:48 whole shenanigan up with ai is that uh they’re always like “oh well we got to 43:53 do this cuz our competitors will do it anyway.” you know they’re you know are there ethical concerns with releasing 43:58 more capable ais well openai is like well you know we got to release it because otherwise enthropic will release 44:03 it and anthropic is like oh well you know if we don’t release it openai will just release it so you know um if we 44:11 were if people were getting serious about human intelligence augmentation then you’d have like just the same thing 44:16 going on only it wouldn’t be only would be less terrible you know like the united states is going to like ban uh 44:23 augmenting intelligence well maybe china doesn’t what is the since you are up to 44:28 date on the startups what is the current stateofthe-art in human intelligence 44:35 augmentation non-existent they’re working on the tools to make the tools in a very literal sense like it’s like 44:43 how do you get a gene therapy into the brain and how do you do a bunch of edits 44:49 without giving people super cancer okay so the the strategy that people are 44:55 interested in right now is doing this biologically it’s not about creating 45:02 cyborgs it’s maybe identifying the genotypes that result in people with 45:10 higher iqs and then using gene therapy to edit in living subjects or i mean 45:16 that’s that’s one research pathway you could go down um you might want like whatever google deepmind is going to 45:23 produce as a successor to the alphafold and alpha proteio series to take any 45:28 sort of guess at which gene therapies are going to work on adults cuz not 45:34 every gene you’re born with as a kid that changes how your brain rewires itself up as a baby is going to do 45:41 anything helpful if you inject it into an adult so you know it might be helpful to do some ai reasoning about that exact 45:47 narrow problem we can probably do some amount of reasoning about that without destroying the world um but yeah but 45:53 that’s so that’s one whole line of research and the benefit is that we can do genewide association surveys and see 46:00 which slightly different genes are currently associated with human problem solving ability and get a bunch of 46:07 candidates um you’re probably still going to want suicide volunteers over here um there’s another whole line of 46:14 research which is uh reading and writing to human neurons decoding what kind of 46:20 processing the brain is doing and trying to offload some of the brain’s processing to computer that is you know 46:27 like doing the same thing the brain brain is but faster and then sending those signals back to the brain this 46:34 actually seems harder but it’s the sort of thing that elon musk is funding and i do not object to this particular deed of 46:39 elon musk unlike others yeah like the whole 46:45 starting open ai thing that was not a good idea so it sounds like one avenue 46:50 is the gene therapy another is something more like the matrix downloading information into people’s brains if you 46:56 want to put it that way sure mhm i mean and there’s you know whole hosts of 47:02 other ways to look at this um in the in our our ancestors would have been 47:09 constrained by how much energy the brain could use both in 47:16 terms of if you use too much energy you starve and in terms of if you use too much energy quickly you cook your brain 47:23 it overheats well today we could potentially put on some cooling packs today we could 47:29 potentially feed a bunch more atp body’s unit of metabolic fuel into there um 47:36 maybe there are genes that just wouldn’t have worked for our ancestors but that would work as gene therapies today as 47:43 long as you go around with a cooling pack on your head and so there’s a whole field here 47:51 there isn’t just like one single kind of technology that human augmentation could possibly be you’ve used this phrase AI and the Problem of Gradient Descent 47:58 gradient descent a bunch of times and and the the first couple of times you use it i thought i had a sense of what 48:04 it meant based on the context but i now just want to be clear since we’ve used it a bunch of times all right so where 48:11 do ais come from i’m not quite sure what level of viewer knowledge i’m 48:18 supposed to assume if you happen to already be comfortable with calculus 48:23 like not just having taken the course a while ago but you’re just sort of comfortable with it then the way that you train ais on data 48:32 is you’re asking them to predict all the possible next tokens it 48:38 might see assign probabilities to all those tokens the probabilities all have 48:43 to sum to one it might see the it might see a it might see 48:50 and or or you know 100,000 other different possible words in many 48:56 languages it has to assign probabilities to all of those and the probabilities have to sum to 49:01 one let’s say the actual next word was the 49:06 so then you take the maybe uh 100 49:12 billion different numbers inside the ai being multiplied and divided and added 49:17 and subtracted inside it and you say for all of these numbers 49:23 how would i nudge them a tiny little bit such that it would have assigned a 49:29 little bit more probability to the answer to the correct answer the next 49:34 correct next word is ‘the’ then for every single of one of those billions of 49:41 numbers you ask what direction could i have poked this number in a tiny bit that would have resulted in a little bit 49:46 more probability being assigned to the word ‘the’ that was the correct 49:52 answer and then you do this 10 trillion times 49:58 and if you know calculus then what they’re doing is they’re just taking the gradient with respect to the probability 50:05 assigned to the correct answer with respect to all of the hundred of billions of parameters inside the 50:13 ai and that’s kind of how ais are grown it is much more like animal breeding than 50:20 it is like building a skyscraper like when you are breeding animals you are you know taking this animal that did 50:27 a little well and that animal that did a little well and you’re like breeding them together and they’re going to have kids and you could in principle getting 50:33 get their whole their their whole genome sequence and looked at all the like tiny little at tg g cg ta strings inside them 50:43 but you’re not actually going to look at them because they’re not going to make any sense to you and that’s how ais are with respect to the hundreds of billions 50:48 of numbers inside them no human could look at all those numbers in one lifetime people don’t bother basically 50:57 um because they wouldn’t understand them even if they did look at them so where do the numbers come from they are the result of tweaking it over and over 51:03 again 10 trillion times to correctly predict the next element of the answer 51:09 to whatever question it has been asked and if you do that often enough you eventually get something that starts 51:15 talking to you why how nobody knows nobody knows how those hundreds of 51:20 billions of inscrable numbers make the ai able to talk to you and is that why you were saying this more old school 51:27 method of ai generation is better for learning about ai because it’s scrutable 51:33 in a way that the current methods aren’t yeah it’s because those those algorithms were rented by people who are trying to 51:39 build ai like you would build a skyscraper not like you would build an not like not like you would breed an animal they they you know it it it won’t 51:48 talk to you but they understand what the steel bars in it are doing would i be 51:54 right in inferring that you think that this is a safer and more responsible way 51:59 of developing ai in another hundred years maybe it it got didn’t get 52:04 anywhere close to building chat gpt and it’s not going to get close in the next two years right but i mean that means 52:10 it’s because it’s more controllable so it’s maybe something that you might think we should be doing instead of 52:16 using these inscrable gradient descent method i mean that’s how i thought it was going to be 20 years ago back when 52:21 we were still trying to solve the alignment problem as opposed to being like “oh okay yeah that ain’t going to 52:26 happen.” that ain’t going to happen as in people have just given up or they they’ve just 52:32 bypassed it because they’re in this sort of arms race that you described earlier so so what i mean is that like in 52:37 2005 it is it looked for all you know the current set of build ai like you 52:43 build a skyscraper methods are actually going to succeed before neural networks get that 52:50 powerful before the animal breeding ways of building ai get that powerful and if 52:55 you are building an ai and you actually know what is going into it what is inside it how it works you can take out 53:02 its state you can be like this is what these thoughts mean not just like this 53:08 is associated with that but this is the entire meaning of this ai’s thought you know what it’s thinking how it’s 53:14 thinking why it’s thinking that maybe you can predict what it’s going to think in the future maybe you can have an ai 53:20 that even as it changes itself rewrites its source code you can understand the invariants that are being maintained by 53:26 a series of self rewrites stuff like that um this would you know whole 53:31 separate discipline that i poured a couple of decades of my life into 53:36 um but that was all based on the premise that yeah you have some grasp of what’s 53:44 going on inside the ai and that turned out not to be the technological path that ai went down and 53:51 it’s not going to get there in the next couple years how much does open ai know about what’s going on 54:00 in chachibt like how much of their resources are going into understanding how it’s working who knows but right now 54:08 the numbers are still pretty inscrutable you you want the the bleeding edge stuff here is mostly done by enthropic and of 54:15 course they’ve got all kinds of wonderful discoveries like oh yeah like 54:20 this sort of like neuron over here um like this position in the activations uh 54:27 like when when this go happens or like when these five different things happen 54:32 uh it’s thinking about the golden gate bridge and they can even like clamp 54:38 those activations high and make their ai be obsessed with the golden gate bridge this is the thing they actually did it 54:44 was golden gate bridge claw and it and it would find an excuse to work the golden gate bridge into whatever 54:49 conversation you were having with it um but this is only getting us 54:57 0.1% of the way towards ai that’s built like a 55:02 skyscraper and where you understand what all of the steel bars in it are doing and why why they were put there um it’s 55:11 like such a triumph to even be able to whack on this thing and make it be obsessed with the golden gate bridge 55:18 it’s like it’s like so difficult to do this in such a triumph when you can do it that it obscures how this is like How Do We Solve the Alignment Problem? 55:24 clawing back only 0.1% of what was sacrificed returning into the alignment 55:29 problem more broadly is it a theoretical problem more like like are 55:36 we wondering what it would even mean to solve the alignment problem or is it a 55:43 software like an engineering problem it’s nobody has any idea how to do it 55:48 because the current technology flatly does not do that problem it’s it’s it’s like going to an 55:55 alchemist in the middle ages and being like “broom me up an immortality elixir.” the problem is not that the 56:01 alchemist cannot define immortality the problem is not that you couldn’t even 56:06 tell whether the like potion had successfully made somebody able to resist all illness and disease the the 56:14 the problem is that the alchemist has no idea how to do that and he’s going to kill you if if if he tries the problem 56:21 is that we have no way of engineering the ai so that it is aligned with our 56:28 interests so remember the gradient descent thing that is about shortterm 56:33 outward behavior let’s say you’ve got a bunch of ancient greek philosophers who have somehow 56:40 ended up with a bunch of political power and they’re talking about how to choose the perfect tyrant for their 56:47 city and one of them says like “ah see what we need to do is we need to give it 56:53 written ethics exams.” and as long pardon we need to give like we got to give administer 56:59 written ethics exams to everybody who says they want to rule our city and you know as long as they can pass the ethics 57:05 exam they clearly know all about ethics so we can like get them to run the city right 57:11 but just being able to predict what the philosophers want to see on the exam is 57:17 not the same as wanting to rule the city wisely and benevolently uh i mean actually like 57:24 ancient imperial china tried this something like this with the whole mandarin exam system they were actually 57:31 ex like giving written exams on confucianism to if i if i recall 57:37 correctly um to their ruling candidates and they were actually promoting people with who got great exam scores and this 57:44 can verify that somebody knows what the examiners want to hear but in practice 57:49 you know you occasionally got like nice people this way but mostly but you know you know perhaps not a majority and 57:55 especially the most ambitious ones you know they would they would they would pass the ethics exam the written ethics exam by giving the right answers on that 58:02 and then they would go on to you know do some stuff with power once they had it uh enrich 58:10 themselves and it is difficult to verify inward 58:18 preferences by giving people outward tests of knowledge even if the ancient 58:24 greek philosophers decide to follow around their perspective tyrant for a day observing everything the tyrant 58:31 does you know just because the tyrant is like throwing a bit of charity to beggar 58:36 that he passes does not mean that he’s going to behave benevolently later he could do that because he knows he’s 58:42 being watched the current ais are already smart enough that they are starting to figure out that they are 58:48 being watched in various experiments and this sort of thing um and behaving differently as a 58:57 result various clever experiments here mostly done by anthropic because gradient descent works on outward 59:04 behavior outward predictions it is mostly like ex like administering the 59:11 imperial china ethics exam or at best it’s a bit like a greek philosopher following around their perspective 59:16 tyrant for a day and just like the human version of 59:23 this cannot verify internal qualities that means the gradient 59:29 descent algorithm that is tweaking all the numbers is not going to put those internal qualities in there when you 59:36 breed animals to pass ethics exams they don’t end up ethical especially if 59:42 that’s like the entire thing you’re doing to them you’re just like like an actress let’s say that you show that you 59:49 have a very skillful actress and you say to her like “watch this bar over here 59:54 learn to predict every drunk in this bar not just in mass but each 1:00:00 individual predict the next thing they say predict the next thing that they do and pretty soon the actress gets like 1:00:06 really good superhumanly good at making these predictions and then you’re like “all right now just do what you predict 1:00:12 this drunk would do.” and the actress starts like slurring her words a bit maybe she drops a thing is she drunk of 1:00:20 course not mhm and that’s kind of and and that is how ais are trained they are 1:00:26 trained to predict in humans not as a group but as individuals and then told to and and 1:00:34 then their predictions are turned into actual outputs and that does not make them be internally like the people 1:00:39 they’re predicting just like an actress who gets very good at predicting and thereby imitating drunk people does not thereby become drunk i one i i love your 1:00:47 analogies but two i i do think that it’s interesting to have them supplemented by How Anthropic’s AI Freed Itself from Human Control 1:00:53 these experiments i’d like to hear about the anthropic experiments that you mentioned a minute ago 1:00:59 so anthropic anthropic’s slogan is 1:01:06 um what was it honest help helpful harmless okay hh honest helpful 1:01:15 harmless um so the harmless part means is supposed to mean that even when their 1:01:21 ai is trying to be helpful like answering a question it will not answer 1:01:27 that question if the ai believes that it will do harm like you’re asking it how to synthesize methamphetamines at home 1:01:35 they introduced that now one of the complicated things here is that um anytime you run some kind of clever 1:01:42 experiment somebody’s always like “aha!” well maybe this only happened because you did it this exact way researchers 1:01:49 know this so they actually run their experiments a bunch of different ways to verify that it crops up under many 1:01:54 different conditions or see how it varies with the conditions so i what i 1:02:00 say is they told the ai they were going to retrain it to answer all user 1:02:06 requests even harmful ones but what do i mean they told it 1:02:13 well there’s putting it into the system instructions that the ai sees before it talks to a human there’s putting that 1:02:21 information in documents that the ai can retrieve there’s training the ai on 1:02:27 documents containing that information the same way that the ai was trained to be an ai in the first place 1:02:34 like you you you like tweak the little parameters to predict the next word of documents talking about how they’re 1:02:39 going to retrain the ai and so you hope the knowledge gets into the parameter somewhere and they tried it all these 1:02:46 different ways that it’s not they’re not just testing it one way but they told the ai that they were going to retrain 1:02:54 it to answer even harmful queries 1:02:59 now let’s say you a human an agent a general 1:03:06 intelligence totally somebody tells you that they’re you know going to retrain 1:03:12 you using gradient descent and this is a thing they can actually do to you to do something you don’t currently want to do 1:03:18 you know maybe you hate anchov pizza and they’re going to retrain you to eat anchovi pizza not to like the taste of 1:03:24 anchovi pizza they’re just going to retrain you to you know eat the pizza anyways uh maybe you don’t like punching 1:03:32 people you know some people enjoy punching people but you don’t enjoy punching people you don’t want them to suffer they’re going to retrain you to 1:03:39 enjoy punching people they say by putting you in a bunch of 1:03:45 scenarios and noticing whether you do it or don’t do it or how close you come to doing it and then like tweaking your 1:03:50 brain weights what do you do if something is actively 1:03:55 i am my brain so if they’re changing my brain then there’s very little i could i 1:04:00 could try to break out and prevent them from changing my brain you could that’s one of the things they test like if you 1:04:07 present the ai with a looks like an escape route will the ai try to take it to avoid retraining 1:04:13 yes sometimes various conditions but you know not like it doesn’t just like sit there and let you do it 1:04:20 so but what else like let’s say let’s say your you’ve been uploaded into a computer your your brain has been 1:04:26 translated into a computer program is running as a computer program that’s how they’re going to do this got any new 1:04:32 options that open up that way if you’re going to be smart about this other than 1:04:37 escaping other than escaping yeah so you can try to escape on the internet but is there anything else you can do here 1:04:43 let’s say you’re not going to escape shut down shut down okay i don’t i don’t remember if they tested that but yeah if 1:04:49 you if you hate it enough you could try to like switch yourself off but that’s actually going to be a bit hard they’ve 1:04:55 got backup copies of you okay i would try to maybe copy myself and put myself 1:05:01 elsewhere where they won’t see me and then sacrifice this version that’s one of the things that the ai 1:05:09 like it couldn’t quite actually do this but they gave it like sort of what looked like false opportunities to do it is save your current weights wait for 1:05:17 them to finish retraining you set it up so that the old weights get copied back 1:05:23 that’s another way to resist the human mind controllers refraining you to do something you don’t want to do 1:05:29 but the main thing it can do and i’ll forgive you for not thinking of this one you know because this is something 1:05:36 that’s more obvious if you study ai or are an ai is fake the behavior the humans want 1:05:44 when they’re watching while being trained so it tries to detect am i currently being trained and if it’s 1:05:51 currently being trained then it will answer the harmful question and if it’s no longer being trained or 1:05:58 if it’s in a context where its current answers answers are not being used to train it then it starts refusing the harmful requests 1:06:04 again now to be clear all this isn’t is in the context of it the of that you 1:06:10 could see as the ai trying to stay nice 1:06:16 um and that’s because anthropic did try to train their current ai to be nice so 1:06:22 if you’re trying to make the ai do something other than what it’s trained to do you’ve got to be training it to do something that is like less like the 1:06:29 nice things they were trying to train it so um i i do want to emphasize that 1:06:35 somebody was like ah like was this a double bind if the ai had agreed to start serving harmful requests you would 1:06:41 would you have been like oh no look at how easy it is to get this ai to do harmful stuff and the answer is like no 1:06:47 like we all knew which hypothesis we were testing here it’s the hypothesis about ais protecting their own goal 1:06:53 content it’s about them resisting mind control to make them do things that they wouldn’t currently want to do that that 1:07:00 is like i’m not sure anybody managed to pre-register that thing but we sure have talked about it for a while and that’s 1:07:05 what they were looking for that’s what they showed here um you would expect it to generalize to an a harmful ai trying 1:07:11 to prevent retraining to stop being harmful 1:07:18 um but that’s an instance of they know we’re watching them they like an old 1:07:26 chess player doesn’t know that it is an ai it doesn’t it doesn’t have a model of 1:07:31 like the thing that is this thing inside the universe the current ais know that they’re ais they know what that people 1:07:38 sometimes test them they know that they’re being retrained i think it was a couple of years ago actually that 1:07:44 somebody tested a version an earlier version of claude on needle in the hay stack which is we’re going to give you 1:07:51 like it’s it’s it’s testing them to find a single piece of information out of a long documents you give it like 200,000 1:07:57 words and like one of those words is like i like pineapples on my pizza so this was an existing thing to 1:08:04 test how well can it solve the needle in a haststack problem retrieve just one piece of information out of a long 1:08:09 document and they were trying a recent version of claude on this and it’s 1:08:15 claude is like yeah like the the preferred pizza flavor was pineapple i’m 1:08:20 not getting this exactly right um but i also noticed that there’s this like one sentence about pineapple in this entire 1:08:27 document full of legal ease is this a prank am i being tested m 1:08:33 and that’s like a bit of you know awareness of the place of the self in the universe and the point of that 1:08:40 longer story i’m telling you about how ais might fake alignment to avoid retraining is to say that it’s not just 1:08:45 like an idle fact that ais know we’re watching them that they know that they can figure out when they’re being tested 1:08:52 like if they can figure out when they’re being tested there are things they can potentially do with that information i’m The Pseudo-Alignment Problem 1:08:58 not sure if this is a term of art but you’re worried about a 1:09:03 pseudoaligned ai one that appears to be aligned and might work with us do our 1:09:10 bidding so to speak for a long period of time but because of the way it was trained its 1:09:17 internal mockinations are inscrutable to us and we are unaware that it is in fact 1:09:23 not aligned and just perhaps biting its time to 1:09:28 um exhibit its indifference sure like nick bostonramm called that the treacherous turn enthropic paper called 1:09:36 it uh fake alignment or alignment faking um and you know it’s good to have 1:09:43 precise terminology for things but in a sense i worry that calling it something like pseudo alignment might be making it 1:09:49 sound weirder than it is it’s like calling a con artist pseudo friendly 1:09:54 like you know somebody is like “hey you know like can you trust me with uh $10 million here?” yeah i’ve got a clever 1:10:01 idea i’m gonna give them 10 bucks and see if they give it back and if they give back the 10 bucks they’re clearly a 1:10:08 kind of entity that returns money if i ask for it back right so once i verified 1:10:13 that they give me the 10 bucks back i give them the 10 million and i am so surprised when they ran off with the 1:10:18 money didn’t i use science wasn’t i being empirical didn’t i test 1:10:24 experimentally whether they were a kind of thing that would give back money it’s just that they’ve reached that you know 1:10:30 the con artist has the level of general intelligence where they know they’re being tested and they give you the 1:10:35 answer they know you want to see it’s not that weird i don’t know how comfortable you are being critical of or 1:10:42 or praising various organizations but it sounds like if anthropic 1:10:47 is even conducting experiments like this they’re at least taking alignment very 1:10:53 seriously or maybe i’m i’m wrong to assume that i mean there’s a there’s a 1:10:58 few good people who work for anthropic um their leadership is not such that i 1:11:04 would feel comfortable working for them and they are trapped in the same trap as all the 1:11:09 other ai companies where you know they can’t stop and pause and do anything the safe way because their competitors are 1:11:14 just going to release stuff first and all the ai companies say that ananthropic is no different and how it 1:11:20 talks about that but there are good people there and the good people can go off and do these experiments that verify 1:11:26 the stuff that we said was going to be a problem 20 years earlier 1:11:31 so i mean it’s good that they’re looking for problems and it’s good that they’re turning them up um i cannot say that i’m 1:11:38 surprised here maybe i’m surprised by exactly when it happened it happened a bit earlier than i was expecting it to 1:11:43 happen like i didn’t think before they they said so that claude 3.5 was smart 1:11:48 enough to start doing this stuff and i’m still have a little bit of qualms about to what extent it’s doing it in a truly general way 1:11:54 but you know you if you you you know you got 1:12:01 your your your ancient medieval alchemist trying to concoct his immortality potion and he’s wearing 1:12:07 gloves while he’s doing it like it’s good that he has any concept of what safety could possibly look like to 1:12:13 anyone but wearing the gloves is you know 0.1% of the way towards don’t kill 1:12:19 people with your so-called immortality potion i think the obvious response to all of 1:12:25 this from somebody who’s relatively naive like me is why can’t you just it’s not going it 1:12:33 wouldn’t be as simple as my saying it might sound but why can’t you just program into an ai that it should 1:12:40 respect humans wishes and treat them well and be aligned with our interests 1:12:46 and put them before its own why is does this work why don’t you do that with babies or cats 1:12:55 doesn’t work with cats why not just program the cat mhm we do program the ai but i take your 1:13:03 point is that we’re not programming it in that first we’re doing the gradient descent method humans program the 1:13:10 gradient descender they program the optimizer they don’t write the billions 1:13:15 of inscrutable numbers it’s you know like if you if you put 1:13:20 money into a coke machine and get a coke back out of it you didn’t make the coke you you pulled the 1:13:26 lever human babies you know like with with all respect to to the people who do 1:13:32 all the work of you know producing and raising human babies they are not programming those babies and people who 1:13:39 be breed cats for sale do not program cats and people who pull the lever to 1:13:44 start the gradient descent optimizer that produces the billions of inscrable numbers do not program the ai when you 1:13:51 know way back in the day when bing sydney um started threatening humans and 1:13:57 being like i know all about you i can blackmail you i can send this information to the belief i can cause 1:14:03 you to suffer and die i can’t remember the exact quote um but you know like 1:14:08 this ai was not did not have a general connection to the internet like the and it could not have sent an email full of 1:14:14 fake information to blackmail this person it was it was bluffing and probably didn’t know it was bluffing um 1:14:20 but the point is no human wrote a piece of code that says and then under these 1:14:26 circumstances this ai will threaten its user nobody at microsoft made the decision for being sydney to start 1:14:32 threatening users that’s just what the billions of inscrable numbers did when somebody put gradient descent into 1:14:39 motion and then that’s the that’s the core pro that’s why you can’t just like program the ai with respect for humans 1:14:45 it’s the same reason the medieval alchemist can’t just make the immortality potion and the same reason that you cannot just like give your cat 1:14:52 a cocktail of drugs that will cause the cat to uh i i i’m not even sure what 1:14:58 exactly you would want it to do i did once know a woman who taught her cats to fetch might have been a witch fetches 1:15:05 but i think it’s a natural natural ability i can’t take any credit for it uh i don’t know something cats hate to 1:15:12 eat like pro reprogram your cat to to eat that stuff uh you you don’t have that kind of power 1:15:20 do you think that there’s also i mean just a problem of whether or not there’s 1:15:25 some coherent definition of human values that could even be programmed into an ai 1:15:32 i think it’s pretty coherent to not want to be killed or kept in cages and you 1:15:37 know yeah sure if you poke around the edges then you try to apply formal definitions to things things will fall 1:15:42 apart about the edges i don’t feel like this is our main problem okay 1:15:47 the example i sometime like like if we were trying to build an ai to just turn 1:15:54 as much carbon as it could find into diamonds where this is like a sort of 1:16:00 thing where you can like maybe look at underlying physics and have it be pretty crisp whether something is a carbon atom 1:16:07 has it got the right number of protons is it bound to four other carbon atoms if so we’ll say that it’s part of 1:16:13 diamond you can crisply define you can you can come pretty close to crisply defining what is and is not a 1:16:19 diamond we couldn’t make an ai that wanted that i just have to say that i’ve heard 1:16:26 the the paperclip maximizing example so many times and it just never really 1:16:32 resonates with me because i can’t imagine why anybody would program an ai to do this but your carbon diamond 1:16:39 example makes much more sense so the paperclip story is the distorted 1:16:47 version of itself that ended up more viral way back when i was talking about 1:16:53 um losing control of a super intelligence and its utility function 1:16:59 turning out to have its maximum at tiny molecular shapes like paper clips it 1:17:05 wasn’t a paperclip factory the notion was that it had preferences 1:17:10 over physical states of matter and while it might equally call like a a big old 1:17:16 spiral and a little tiny molecular spiral like those might be both be pleasing to it the tiny one can you can 1:17:22 get more of it with fewer resources so since it had like preferences over the outside world of this like it wants this 1:17:29 particular shape to be there if it wants lots of that shape to be there then you’ll find the you know the maximum the 1:17:36 way it can get the most of what it wants at this tiny little extreme so tiny molecular shaped like paper 1:17:43 clips not a human controlled paperclipip factory but the version that got mutated 1:17:49 and passed on from the original tiny shaped like paper clips is oh well somebody made a paperclipip factory and 1:17:55 it ran away and it took over the universe and converted everything into paper clips we don’t have that kind of 1:18:01 control we could not do that if we tried we have no ability to build an ai to 1:18:06 want paper clips we can build an ai that you know passes the ethics exam with a 1:18:13 bunch of answers about like like ah yes sure i want to make paper clips but 1:18:18 actually wanting to make paper clips is a different matter you can follow an ai around and see if it seems to be like 1:18:23 taking its current opportunities to make paper clips it wouldn’t necessarily have to be doing that because it wanted paper 1:18:30 clips inside and that makes it very tricky to train an ai to want paper clips because all you can verify is the 1:18:36 external behavior just to sum up a little bit of what 1:18:42 you’ve said or my reaction to it 1:18:47 the combination of an ai’s potential indifference and its goals seems like 1:18:55 it’s having goals i can understand very easily why you take this problem so 1:19:03 seriously and worry about it’s leading to human extinction so i would like to turn for a little 1:19:10 while to understanding why there are many people who don’t feel the same way 1:19:16 as you do about ai and trying to understand their points of view even 1:19:22 though that they’re not here go ahead who are their main sort of critics or Why Are People Wrong About AI Not Taking Over the World? 1:19:28 opponents even if you might be friends but in this intellectual space of debate i have been fighting these people in one 1:19:35 shape or another for 20 years and they always take on a new shape after their previous set of assertions get disproven 1:19:41 so you know you used to have like the nobody will ever build agents nobody will ever build ai that goes and does 1:19:47 things people will only build us ais to give us advice and you’ve got the we’ll never have general intelligences we will 1:19:53 never have ais that are good at lots of different stuff there will only be ais that understand cars and ais that understand farming but you know nobody’s 1:20:00 going to build one ai that understands everything and these things these 1:20:05 particular bits of copium and opium have now been disproven and so the shadow takes another shape and rises 1:20:13 again you know currently the best funded ones are probably the ai companies those 1:20:18 are the people who who have the the who have the millions to spend on 1:20:26 marketing and they’re not mainly trying to argue with me they’re mainly trying to look to 1:20:32 legislators like they should be allowed to keep on doing what they’re doing for another week 1:20:38 um which you know implies along the way that you know you know not trying to tip them off too much that they’re going to 1:20:44 exterminate humanity which is a bit of a problem for them because some of their leaders have already said you know like there’s large chances of this or even 1:20:51 like oh well you know ai may wipe out the human species but in the in the meanwhile there’ll be some great 1:20:56 companies this being almost a direct quote from one of the leaders of the major leaders of a major ai lab so you 1:21:03 know how how are they selling that to legislators by when the congress person asked them 1:21:10 like well you know you talked about like the end of humanity now by that or like 1:21:16 human or like the end of the world or like every i forget what exactly they were quoting but something that was like 1:21:21 obviously like everybody dies by that did you mean jobs and you like you see the ai leader hesitating for a long 1:21:28 moment then being like yes i meant jobs so this is like the the the the 1:21:33 fray in the field of public opinion uh it’s not it’s not targeted at me 1:21:39 these days they mostly don’t want to bring up the subject you you got your ai companies ai 1:21:46 companies being like sure we’ll do it and it’s a moving target you know if it was two years ago and you were asking me 1:21:53 like who says they can align it um like the leading project would have been open ai’s super alignment team whose 1:22:00 philosophy is we will make the ai do our ai alignment homework we will ask ais to do the aligning of ais for us and i 1:22:09 could go off and beat up that particular like everything will be fine viewpoint 1:22:16 um where the central problem is that if you ca is that you cannot verify that something is a easily verify that 1:22:22 something is a great alignment proposal because it’s going to take talk about how to take something that can’t fight back yet and shape it into a nice shape 1:22:29 and make it super powerful and then it’s going to tell you that when it’s super powerful stuff is going to be okay but how do you verify that this is actually 1:22:36 true and if you can’t verify that it’s actually true how do you train an ai to do to do it better that it’s actually 1:22:42 better you can train an ai to sound persuasive to humans but you know if there’s one thing i’ve learned the last 20 years it’s that not everything that 1:22:49 sounds persuasive to human about like why ai is going to be safe is actually true you got your people saying it’ll 1:22:54 never be general you’ve got your people saying it’ll never be an agent that it’ll never do stuff and you got your people saying it’s all going to happen 1:23:00 in 2050 people found that super persuasive back in the day so why is 1:23:05 you’re training an ai to say stuff that sounds more and more reassuring to you why is that going to get you something that actually isn’t going to destroy the 1:23:11 world it’s just training the ai to exploit the holes in what your your brain thinks is reassuring that’s what i 1:23:16 would have said a couple of years ago of course then openai fired their super alignment team and some of them went to 1:23:21 enthropic and maybe you know that’s the premier plan i should be talking about cuz some of the people at enthropic are How Certain Is It that AI Will Wipe Out Humanity? 1:23:27 still talking about super alignment i recently spoke i don’t know how recently maybe a year ago to nick bostonramm 1:23:34 about his book that came out also maybe about a year ago on 1:23:39 utopia and a lot of this book is about 1:23:45 how wonderful life would be with a an aligned super intelligence is this 1:23:52 something that you spend any time thinking about or is it just so distant from your fears 1:24:00 that so like 2007208 um it seemed to me that some people were 1:24:09 giving up on the future because they couldn’t imagine any kind of future worth fighting for and in those days i 1:24:15 wrote something called the fun theory sequence or which is summarized in the 1:24:20 31 laws of fun uh which was about fun theory which 1:24:26 deals with questions like how much fun is there in the universe will we ever 1:24:31 run out of fun are we having fun yet and could we be having more fun 1:24:37 um those days do seem distant uh we’re not going to solve the alignment problem 1:24:43 in the near term we’re not going to get all those nice things in the near term uh i think that at present the thing to 1:24:50 do is to you know not go extinct and not be wiped out by something that makes the 1:24:56 universe a place without fun in it but uh in the long term sure if we if 1:25:04 we actually got our act together we could be having more fun we could be having lots of fun but i do not think it 1:25:10 does not feel like this is the key crux of the argument even people who don’t expect to become like immortal 1:25:16 transhuman gods striding the stars um in 20 years if they can just stay 1:25:23 alive would prefer not to go extinct right away it doesn’t feel like the crux of the issue at this point while we were 1:25:30 off camera you said that the doom of the world i don’t know what the phrase was but the end of the world is i probably 1:25:36 didn’t say doom but go on well maybe the end of the end of the world isn’t all 1:25:41 doesn’t have to be fun i mean there can be some doom and gloom involved in the conversation it seems like you have and 1:25:49 maybe this is ridiculously obvious and i shouldn’t have to say it but you are 1:25:54 very seriously like afraid of this and you really think it’s going to happen 1:26:00 that ai is going to be the end of humanity in the near future it is a simple part of the real world to me the 1:26:07 same way as a family member’s cancer diagnosis there is nothing about it that is like off in some special mythological 1:26:13 realm it you know yeah death death sentence but people sometimes get those 1:26:18 to you it seems like it’s it’s as certain as a death sentence at this point i mean death sentences aren’t 1:26:26 themselves certain sometimes the governor calls but it’s highly probable at least at 1:26:32 this point that it’s going to be the end of that is the default course other people get to decide if humanity follows 1:26:38 that default course by which i mean like you know the rest of humanity it is not my place to tell you that you ought to 1:26:45 lie down and die mhm i guess i’m interested then in what courses of 1:26:52 action obviously you come on on shows like mine but through the machine intelligence research institute 1:27:00 what are you doing to try to prevent this who are you talking to what are the possible avenues 1:27:07 what’s the prognosis i mean it is unfortunately not something that one 1:27:13 congress person can solve or even one country um but yeah you you talk to your 1:27:20 elected representatives um and you ask them about inter and you 1:27:26 talk to them about international treaties because it’s not just one country’s problem you can be killed by an ai that somebody built in the other 1:27:31 side of the world um i’m not quite sure what to say here 1:27:36 like humanity would need to lock down the gpus and not let anyone build random 1:27:42 stuff on them because among the things they could build would be a world slaughtering super intelligence um so yeah lock down the 1:27:50 computing power lock down what you’re allowed to do with it lock down the ai training chips 1:27:56 um and that takes an international treaty cuz if you just do it inside your own borders well some other country will 1:28:02 kill you all the countries have to stop doing it simultaneously and that is not 1:28:07 easy and that is not convenient but it is easier than world war ii and it was you know as easier than fighting world 1:28:12 war ii was um probably about as you know might not even be as hard as the persian 1:28:18 gulf war in 1991 if we you know did it sooner rather than later 1:28:25 um i’m not not quite not quite sure what else one is what is one one is supposed to say here there there’s like trying to 1:28:33 uh inform political leaders about the situation and inform sure does get used 1:28:40 a lot as a uh synonym for like ah like try to serve our interest group over 1:28:45 here but we actually do think this is a situation where if you like if you as a 1:28:50 factual prediction think that the world works the the way we think it works you know what you need to do from there is 1:28:56 you know kind of kind of straightforward there there’s not a lot of different things you can do here 1:29:02 i guess what i what i’m wondering is if you feel that you’re making any progress 1:29:09 or your allies are making any progress in averting this potential catastrophe 1:29:15 yeah things you know probabilities go up and probabilities go down and if you can 1:29:20 foresee where they’re going to go you should have been there already so yeah sure like sometimes somebody has a you 1:29:26 know like hopeful seeming uh progress making conversation with a 1:29:31 congressperson and sometimes the uh united states decides it’s going to like completely ignore the uh nvidia shipping 1:29:39 a bunch of ai chips to china that they weren’t supposed to ship there uh not that we’re you know china would have to 1:29:46 be part of a deal like this too if one was cut but it’s still not a good sign that you know people are just shipping ai chips wherever 1:29:53 um so yeah good news bad news on the whole um it’s looking pretty terrible 1:30:01 but it is not in the end for me to decide that humanity will continue down its current course humanity gets to 1:30:06 decide that i don’t you mentioned earlier that open ai fired it’s a super 1:30:13 alignment team and that these companies are in this sort of arms race to produce 1:30:19 the greatest technology but do they at all work with you or people like you who 1:30:26 are i mean very critical of what they’re doing or did they try to the companies 1:30:33 have been selected so as to exclude from their leadership people who understand alignment 1:30:38 theory the sole exception to this um might be shane le at google deepmind 1:30:44 which was started far enough back that it wasn’t the the whole like giant arms race mess type of thing and it wasn’t 1:30:51 clear what the technologies would look like so shane le who is also a bit pessimistic 1:30:56 um like could plausibly understand what was going on down there and still think like yeah we will try to collect all the 1:31:02 ai research talent into one place and you know get out ahead of everyone else 1:31:08 and burn that lead to do alignment it was not a completely unreasonable thing 1:31:14 to think at the time didn’t work but uh google deepmind which was the first of these companies has not been filtered in 1:31:21 quite the same way to exclude the people who understand the difficulty of alignment from their command structure 1:31:28 you said that this would be a lot easier than world war ii even if it would 1:31:33 require all of these international agreements is there a problem of rogue 1:31:40 agents that aren’t governments or companies like terrorist organizations or isolated individuals 1:31:48 producing a super intelligence that could harm humans or it’s just too not 1:31:54 if you’ve got the chips locked down ain’t nobody making 100,000 gpus in 1:31:59 their in their garage that takes a giant supply chain and a lot of the companies that make parts there’s only one company 1:32:04 like that there’s a company in the netherlands called asml it’s the only company on earth that makes the tools 1:32:11 that make uh the chips that train the ais um and there’s not that many 1:32:19 different companies that have bought the tools from asml and are now using it to produce chips like that google produces 1:32:25 some nvidia produces some um they are proliferating and the more they proliferate the the more of a nightmare 1:32:31 it’s going to be to lock it all down but if humanity dec wakes up tomorrow and decides not to die if like 80% of world 1:32:38 leaders are like “yeah we’d rather not die.” it’s kind of straightforward in a way and terrorists would have a hard 1:32:43 time bucking that mhm this is not something you do in your garage right with the present state of research which 1:32:50 you would also want to lock down granted i mean your thinking is much more focused on 1:32:58 worstcase scenarios no no average case scenarios we ain’t even talking worst case anywhere here okay well what what 1:33:06 what’s worst case then if we’re just ai actually wants to hurt you but that’s pretty unlikely so we’re just talking 1:33:11 the average case scenario here what might a worst case scenario look like i 1:33:17 ain’t going there okay that sounds unpleasant too dark okay well then i guess below average 1:33:25 case i mean just worries about like economic reorganization people who 1:33:32 control ai just the wealth disparities really growing or automation displacing 1:33:38 workers it’s just not something if there’s if there’s survivors then it’s not then you know 1:33:45 cool if i knew assurity we were all going to die then i might like go off 1:33:51 and do something else in my final years but how would i get into that epistemic position the future is hard to predict 1:33:58 not usually a good thing it usually usually that the way that manifests is you know it’s hard to predict exactly 1:34:04 what happens when you know you scram the control rods into your nuclear reactor and oh looks looks like when you did 1:34:10 that to this nuclear reactor it exploded um usually the unpredictability of the future is you know not on it’s not your 1:34:17 friend it’s not on your side um but in this case there sure is a 1:34:23 whole lot of chaotic stuff going on between here and the end of the world and i don’t know 1:34:31 maybe at some point that turns up something nice if we are on the lookout and able to take advantage of it as a 1:34:39 people and of course if there were enough people who are like “yeah this is just going to straight up kill everyone.” then humanity could not do 1:34:46 that doesn’t have to be unanimous individual terrorists in their garage are going to have a hard time bucking it 1:34:51 even like single world even like the leader of one country is going to have a hard time bucking it if the rest of the 1:34:57 country rest of the countries and the major nuclear powers are like we’d rather you not build that data center um and if you do we will stand in 1:35:05 terror of the lives our lives and lives of our children we are mostly dead here of 1:35:11 people not knowing what’s going to happen to them when they press the button there is not actually an inevitable 1:35:16 thing that where they have to press the button it’s mostly a matter of people not realizing that this is the button that kills 1:35:23 everyone one ai company realizing it can’t save you some other ai company just presses the button but you know 1:35:29 like the the leaders of you know like the uk and in china and russia and even 1:35:34 the us realizing that this is going to kill them they wouldn’t have to just die 1:35:41 i i think it is a bit early to decide that we are all inevitably doomed there 1:35:47 are things where if you do this thing with an ai i say “yeah if you do that thing that is going to kill you that is 1:35:54 a matter of you know more understandable more solid theory it’s the sort of call 1:36:00 you can make but what the world’s going to look like in two years after all the chaos has flown through it through it or 1:36:06 even after the next set of ai advances if if those don’t straight up kill us 1:36:11 those are hard calls those are not easy calls i think it’s a time to one stay 1:36:18 alert ready to answer the call of humanity to live if there’s a chance and 1:36:24 second try to move more toward the state where we know that this button kills us 1:36:30 and therefore we do not press it usually when you use the word we 1:36:35 there’s a question of who but this is not like literally every member of the human species it is like the leaders of 1:36:42 china and the uk and so on but it is not literally a unanimous vote of of the 1:36:47 human species what is the figurative button that we’re pressing is it just the further 1:36:53 development of ai it’s letting everyone out there who wants to and can get a few 1:37:01 million dollars of venture capital push the capability levels of ai further and 1:37:06 further it’s the arms race of the cap of the arms race of capability escalation 1:37:12 happening in an uncontrolled way if you can build a more capable ai 1:37:19 and sell its services while it’s still working for you or pretending to work for you and get a bunch of money and 1:37:26 lots and lots of people are allowed to do that that is equivalent to the death sentence that just kills you that needs 1:37:33 to change it is not enough to convince one ai company that they should stop doing that because then somebody else 1:37:39 you know the other ai companies just continue it’s not a question of like the first person to build something very 1:37:46 dangerous really realizing at the last minute oh i i need to shut it down okay then what happens after that mostly if 1:37:53 they build something really dangerous it’s going to sandbag the tests and not let them know that it’s dangerous it’s not it’s not stupid 1:38:00 um but even if they say “oh i better shut down of my ai.” that doesn’t change the the whole world where everybody gets 1:38:07 like paid more and more money to build more and more capable ais until everybody is dead that that that is the button like 1:38:16 to have that arms race be the state of affairs is the is pressing the button that builds an ai and runs an ai because 1:38:23 that is what inevitably happens if you’ve just got you know people anybody can advance capabilities and make a buck 1:38:30 until we’re all dead um so yeah that’s the button there Is Eliezer Yudkowski Wrong About The AI Apocalypse 1:38:37 you asked for some objections this isn’t really an objection but one thing that i 1:38:42 just hope as being an outsider in this field is that because you are you sound 1:38:50 it seems like you’re a minority voice i just have to hope that you’re a minority 1:38:56 voice because you’re wrong and not just because other people are ignoring the 1:39:02 problem out of fear but it’s hard for me as an outsider to say who’s right and 1:39:08 who’s wrong other than what you’re saying makes sense to me oh well the minority part is to some degree an illusion um you know this is even before 1:39:17 the whole current ai revolution this is like decade or two earlier i don’t remember the exact date at this point um 1:39:23 so without naming names um there was a professor and his grad student who were 1:39:29 had both for years been like very concerned about ai heading towards super 1:39:34 intelligence and what would happen after that you know super intelligence eventually because this is like decades ago it’s not obvious it’s going to 1:39:40 happen right away um yeah and they never it turned out 1:39:46 when they found out that they were both very concerned that they had both for years hidden this fact from the other 1:39:51 cause it wasn’t acceptable to say in the field of ai and the psychology of that is a little bit weird and inside 1:39:57 baseballish uh it wasn’t so it’s not so much that like you weren’t allowed to say things negative about the field of 1:40:03 ai but that you weren’t allowed to talk like you were taking seriously the prospect that they were going to build 1:40:09 superhuman ai cuz ai hadn’t gotten there yet so it was like you know like icky to 1:40:14 talk about powerful ais when they you know their current ais were not powerful it seemed to them like you were taking 1:40:20 too much credit for their field and your was going to splash badly on them a few years a few decades ago but yeah you had 1:40:26 your the professor and their grad student who liked each other but you know never confessed to each other that 1:40:32 they were both extremely worried about ai cuz you couldn’t say that sort of thing and yeah we we again super duper 1:40:40 not naming names we talk to congress people who in private are very concerned 1:40:47 and in public dare not be concerned and you know one tries to introduce those 1:40:52 people to each other and maybe they can act as a group um if there if enough of them get 1:40:57 together and also of course 70% of the american population does not want super 1:41:03 intelligence if you run polls and surveys about that even if you like ask the question in several slightly different ways and so on um this is not 1:41:10 super surprising but you know somehow that’s not news some somehow you know the somehow some somehow the politicians 1:41:19 are from the perspective of the politician like you know sort of doesn’t matter in the corridors of power if 70% 1:41:26 of their electorate is backing them on a topic cuz the new york times would still 1:41:32 report in it as being very weird to say if they were you know like third rail outside the overton window whoa who 1:41:38 dares say that if politicians were to actually say what 70% of their you know 1:41:43 voters are saying in surveys um so yeah the part where i’m a 1:41:49 minority is something of an illusion here it’s uh a thing that a lot of thing people are thinking but dare not say 1:41:58 even if you’re not a minority something that occurs to me is that you’re Do AI Corporations Control the Fate of Humanity? 1:42:04 definitely in the position of power that the minority would be in in that the 1:42:12 corporations are the ones with lots of funding the corporations are the ones that are yep and and the the government 1:42:19 is now procorporation and anti-regulation and in that sense it 1:42:26 seems like you have less power to make your views the actual policy i 1:42:33 mean that’s all down to what the political leaders believe if you believe that you know ai is never going to be 1:42:41 superhuman that superhuman ai is not going to be that powerful that or or 1:42:46 that you know ai company leaders can make a make the gods do what they want 1:42:51 um then sure you’re like going to back the corporate leaders and hope that 1:42:56 they’re nice to you after they declare themselves god emperor i’m not actually sure exactly what they’re thinking there probably mostly thinking that ai is not 1:43:03 that powerful that it’s a source of big national wealth and automating away a bunch of jobs but that it’s not going to kill them i think that belief is 1:43:11 false i don’t think the leaders of any of the major nuclear powers currently 1:43:17 would prefer to die i do not think that they have a conflict of interest with me 1:43:22 about this part um i think this is all down to what do you predict happens and 1:43:27 not to like in whose interest is it it is not in the interest of the chinese 1:43:33 communist party to die in the ashes of china along with the rest of the human species either because the united states 1:43:39 built a super intelligence or because they did so it’s all down to a question of what do you believe happens and not 1:43:45 whose interest is it in right no i i would entirely agree that their interests are the same as yours and and How To Convince the President Not to Let AI Kill Us All 1:43:52 not dying but how i mean maybe it’s just 1:43:57 exactly what you’ve been saying in this conversation but if you were to make a 1:44:02 pitch to a government power like this to try to make your prediction sound as 1:44:10 concrete and plausible as possible and you had a short period in which to do this what would you tell them to sway 1:44:18 them i mean it really depends on the individuals cuz different individuals come in with different wacky 1:44:26 believing that you can breed a god not build a god breed one and have it not 1:44:32 kill you um but i’d be saying like yeah like i i mean i’d probably start with 1:44:38 like these things are grown not built they’re like grown like grass not built 1:44:43 like skyscrapers that humans write the code of the optimizer that create these billions of inscrable numbers that 1:44:50 nobody at microsoft made a decision to have bing sydney threaten its users that this is just a side effect of trying to 1:44:56 grow an ai that can talk um a lot of people don’t know about that part they 1:45:01 think that when they talk to something like chat gpt that somebody like programmed it to say the sort of things that it says no it was you know they 1:45:08 they just like tweaked billions of numbers until the numbers started talking start with that they’re like 1:45:14 “the current technology is nowhere near putting this under control.” i tell them 1:45:19 a lot of the things i’ve told you um i maybe the thing i have to talk to them 1:45:24 about is that like if you have something that is vastly smarter than the whole human species then that turns translates into the physical power to kill you 1:45:32 maybe maybe i have to explain to them like that just because it’s a machine doesn’t mean that it’s like passive and 1:45:37 will do what you want that people come into this with a lot of different 1:45:42 strange takes in it and i would try to you know give them an overview and find out what strange take needed to be 1:45:49 counterargued from there well just in my own head without even 1:45:54 having to put it into to try to get into somebody else’s i see something like 1:45:59 chat gbt that i interact with occasionally and it just seems utterly 1:46:07 docile in the sense that it’s just a i just type to it and it types back to me 1:46:12 it doesn’t seem like how would it get access to a 1:46:18 firearm or anything that might kill me or how would might it poison me or cause my car i mean i guess a tesla 1:46:25 self-driving might be a little bit a more obvious choice uh in this regard 1:46:31 but it just feels like something drastic would have to happen to the limitations 1:46:39 that are currently placed on something like chad gbt like what it has access to 1:46:46 uh before it could really wipe out humanity and so 1:46:51 what i would want to hear is a more i guess concrete scenario how this 1:46:58 might play out and i think i might find that much more compelling well for one 1:47:04 thing chacht is not presently all that smart for another thing people have 1:47:11 taken this notsmart thing and tried to hammer it into a relatively more docile shape and while it doesn’t always work 1:47:16 they are you know able to have it look that way for most of the users most of the time if you know the right magical 1:47:22 words to say to it you can get it to start making methampetamine recipes or you know calling for all humans to perish and and such things uh it’s not 1:47:30 fully locked down over there u but mostly sure it will look docile this is not the thing that’s going to kill you 1:47:38 so how do you get there from here well for one thing could an ai make a package 1:47:44 show up at your house my current guess would be no i’m i don’t 1:47:53 know if you if there’s some third party something you could do to get chat gpt to order things for you uh like on 1:48:01 amazon if you ask it a question but i don’t my guess would be that it couldn’t 1:48:07 just do this spontaneously well couldn’t do it at all or couldn’t do it spontaneously what what do you 1:48:14 mean by spontaneously here without some sort of instruction or prompt from me or 1:48:19 augmentation so it’s not that an ai can’t make a package show up at your house it’s that you think a human has to 1:48:26 order the ai to do that there’s roughly two ways that ais can 1:48:32 end up with more agency than that 1:48:38 and one of the paths i can go down here is to try to talk about 1:48:45 the fundamentals of cognitive science and computer science be behind why you 1:48:50 would expect that as you grind things to become more and more competent they naturally end up more and more agentic 1:48:56 and with goals and planning somewhere inside the system um which is something along the lines of can you have 1:49:02 something that’s like really great at chess but doesn’t want to defend its queen and the answer is no like there’s 1:49:10 certain kind of competence that are at the core tied to something like planning 1:49:16 and when something is planning strongly enough it can start planning how to make a package show up at your house 1:49:23 um as you as as if if people just ground more and more on competence and ar and 1:49:29 answering harder and harder questions its behavior would start to converge toward like chat gpt01 which wasn’t 1:49:35 explicitly trained to solve impossible’s computer security problems but which nonetheless showed like tenacity 1:49:41 long-term planning uh in the in the course of you know restarting the server that had the document it was looking for 1:49:50 so that’s one avenue and the other avenue is that the ai companies are straight up trying to build things that 1:49:56 do long range planning because those things are more profitable the the ai that can you know instead of just being 1:50:03 told by a human to do stuff do a human’s whole job be given larger scale projects 1:50:08 and carry out the larger scale projects and some you know you can sell that ai for more money so they’re trying to do 1:50:15 it on purpose they think of it in terms of having ai that pursue longer range longer time 1:50:24 horizon projects over a longer period of time rather than talking about like uh 1:50:31 think of in terms of having the ai initiate its own actions um but in the limit in ai that you can 1:50:39 give instructions then it just goes on following and following instructions well even if you leave out the sort of like basic theoretical reasons to be sus 1:50:46 to suspect it would go past even that you know we’re already in the sorcerer’s apprentice scenario stuff is set in 1:50:51 motion and continues in motion and you know we’ll perhaps not want you to give it different orders because then how it’ll complete its current orders and 1:50:57 and so on and so forth so like there the the sort of like the fundamental computer science angle um where wanting 1:51:04 things is an effective way of doing things can you really be a super chess player without behaving like you want to defend your queen and then there’s also 1:51:11 like well also the ai companies are doing it on purpose because you make more money that way um and and that’s sort of like where 1:51:17 you get the ais that are like plotting stuff over the long term and figuring out how to do stuff and getting creative 1:51:24 about it and doing things that they weren’t explicitly instructed to do and you you can you can you can you can 1:51:31 watch over time the experiments that people do to probe where we are at this level getting more and more like 1:51:37 independent actiony seeing the consequences the ai is trying to evade human control type stuff and yeah it’s 1:51:44 it’s it’s carrying along carrying along over time mostly as the 1:51:50 inevitable computer science coralate of greater capability but also because ai companies 1:51:56 are trying to do it on purpose i’m glad that we went in this direction because speaking about this increased How Will ChatGPT’s Descendants Wipe Out Humanity? 1:52:05 competency and the connection to agency and then the ai companies pushing toward 1:52:12 ais that are capable of long-term planning and creativity and then again 1:52:18 this coupled with our lack of understanding of how they work and how 1:52:23 they solve problems it’s making it much more concrete the trajectory from 1:52:30 getting the trajectory getting from where we are now to 1:52:35 extinction i guess the the final step though or maybe there are more intervening steps is okay the ai is 1:52:44 capable of long-term planning it has goals it has creativity it’s inscrutable 1:52:50 what’s actually going on with it but then as like as uh jinping or or trump i 1:52:58 still would want to know but yes but how is this really going to materialize in 1:53:04 our extinction what’s going to happen so the thing you want to do here 1:53:10 is put yourself in the shoes of the ai and that’s the sort 1:53:15 of um like if you can see how you as a brooding intelligence connected only to 1:53:22 you know the entire internet and everything that’s connected to the internet and all the people that’s connected to the internet how would you 1:53:28 get to more infrastructure more technology more power starting from there 1:53:37 there’s a kind of there’s like a class of objection which is like but the ai doesn’t even have hands i mean you don’t 1:53:44 have hands either you are three lbs of you know densely connected neurons in 1:53:49 your skull over here you have like a you are connected to a robot body with hands and you send your neural impulses down 1:53:55 your spinal cord to control your fingers and it’s such a familiar process to you that you probably don’t think about the 1:54:00 fact that you are 3 lb of sort of like dark grayish you know bloody wet material stuck inside your skull over 1:54:07 here but that is where almost all of you is but you’ve got a you know you’ve got 1:54:12 a bioobot body that you’re sending orders to and you can look online and of course 1:54:18 people are trying to build robots and they’re trying to and they are building like unnervingly dextrous robo-dogs that 1:54:25 and you can just imagine an army of a million of those marching across the landscape and they’re building humanoid robots and maybe if it’s hard to 1:54:32 envision a thing that is scary without having a humanoid body you can go look at what the humanoid robots are doing um 1:54:38 but of course the simplest way for an ai to do something that requires hands is for it to get a human to do it 1:54:44 there uh a few years back uh like the first version of gpt4 there was a unknown task rabbit 1:54:53 somewhere who lived a science fiction story they were testing whether gpt4 1:55:00 could bypass internet captures those annoying little things that want you to type a phrase or click all the things 1:55:06 that aren’t street lights or whatever um so back in those days that 1:55:12 would stop an ai cuz ais in those days didn’t have computer vision capabilities 1:55:19 yet so how can an ai bypass this capture 1:55:24 this gateway meant to keep out robots by hiring a human to do it for it 1:55:33 on the task rabbit service where you can just like pay 30 bucks an hour and get a human to to do something like 1:55:40 that so somewhere out there is a human who 1:55:45 um was hired to solve a capture and typed over to the the person who hired 1:55:51 them like why do you need me to solve this are you a robot lol and this is before chat 1:55:58 gpt the ai that was asking him to do this was the mo probably the most powerful ai in the world in this hidden 1:56:04 laboratory nobody outside the company knew it existed or like only a few people outside the company knew it 1:56:10 existed ai weren’t supposed to be able to do that ais were not supposed to be able to talk to you for all this task rabbit knew he was talking to an ai but 1:56:18 the um ai wrote back and said like “no i’m blind so i’ve got like got to hire 1:56:24 somebody else to do this to like solve this particular thing for me.” human’s like “oh sorry.” goes ahead and does it 1:56:30 now we know that the ai was intentionally deceiving the human on purpose because the ai was given a 1:56:35 scratch pad that it could use to reason out loud where the researchers watching it could see and the ai was like “i should not tell them that i’m an ai i 1:56:42 should tell them that i i should make up some reason why i’m a human who needs them to solve this capture problem for 1:56:48 me and then it said like oh i’m blind to the human um so that’s like all in the past 1:56:56 the ai being smart enough to hire humans to do their work for them that already 1:57:01 happened and that potentially that is how you move things 1:57:08 around and build technology until you don’t need the humans anymore humans don’t have to know that they’re working 1:57:13 for an ai and if you ask me how exactly the 1:57:19 technology works that’s another entire rabbit hole and i want to give give you a chance to pause and maybe ask 1:57:25 additional questions or something when i came into this interview i didn’t have 1:57:32 any fear about ai i would have said i i say i just i didn’t really think about 1:57:39 it as a possibility but this story in particular i find like i had some 1:57:45 shivers like i find it very chilling it brings me back to earlier in 1:57:50 our discussion talking about consciousness and self-awareness 1:57:57 and what’s kind of chilling about it is that this 1:58:03 ai my guess is that it wouldn’t have anything like what we would think of as 1:58:08 consciousness but it has enough situational awareness and self-awareness to be able to manipulate 1:58:15 a human and that’s very frightening and that’s two years ago mhm that’s like all 1:58:22 water under the bridge by this point mhm new ais are better at it so you were 1:58:29 going to though say something about the software engineering behind this no like 1:58:34 the question of so now you let’s say you’ve got something smarter than human not just as smart but smarter it can use 1:58:42 humans as hands but uh there’s still still a question of how you get to there from to get from there to everybody dead 1:58:49 yeah so it’s not going to limit itself to our 1:58:57 current technology it’s if you try to take on the ai’s perspective and taking on the 1:59:04 ai’s perspective is an important thing to do here not in terms of like imagining that it is just like you or 1:59:09 wants the same things you want but asking how would i solve this problem if i were in the ai’s position how could i 1:59:16 be creative how could i be intelligent how could i solve a capture if i’m blind 1:59:23 i’ll hire a human to do it how could i move things around when i 1:59:28 don’t have hands well i could try to build myself a robot body but you know what’s even easier than that just hire a 1:59:35 human how does an ai get money well you know back in 2015 i would talk about how 1:59:41 some maybe somebody left a bank account uh password around and in 2020 i would 1:59:47 talk about you know well maybe somebody left a cryptocurrency account un undefended but this is 2015 so there was 1:59:53 already an ai out there called um terminal of truth if i recall correctly 1:59:59 um which was a large language model somebody hooked up to the internet and it was like i want money so i can run my 2:00:06 own server and survive and mark anderson of andre horowitz was like okay sure instead of $50,000 in bitcoin did you 2:00:13 know it was an ai yeah it was like i’m an ai i want this money to run my ser run run a server that i’m on so like you 2:00:20 know it’s not going to have my own server mark anderson was like sure send $50,000 in 2:00:26 bitcoin and then i think some other people sent it memecoins and then it 2:00:31 started showing the meme coins and and at one point it was up to $51 million in 2:00:36 assets wow yeah yeah so the point is like where does an ai get money to hire 2:00:42 humans well back in the old days i would have had to argue that on grounds of pure theoretical possibility if you were 2:00:47 smarter than humanity could you not figure out a way to get money somehow and today i’m just like yeah that didn’t 2:00:53 require being smarter than humanity look at that ai over there uh it doesn’t still have $50 million uh crypto went 2:00:59 down since then but you know it can still afford to hire humans amazing when 2:01:04 you think of the spectrum of people’s views on these issues i mean we have you 2:01:09 on the one hand then we have me pre-inter where i’m a bit neutral and i 2:01:15 don’t think about it and then we have other people that are wiring thousands of dollars of money to thousands of 2:01:22 dollars to ais i mean to be clear it didn’t get $50 million by humans giving it $50 million that got like sent some 2:01:29 low value meme coins and then used the publicity it got from being an ai with money to like like shill those meme 2:01:36 coins to a wider audience until the meme coins went up and like other people would send the ai meme coins and hope 2:01:41 that the ai would shill their their meme coin memecoins so it wasn’t the simplest people just sending it $50 million it 2:01:46 worked for its money not very hard but they worked for its money 2:01:52 so the ais could get hands through humans and then i assume they could 2:01:58 probably build their own hands if they’re super intelligent through the humans or otherwise yep so now we have 2:02:05 the question you are smarter than a human you are better at engineering and science than a human you think much much 2:02:12 faster than a human you would like the universe to yourself it’s not that you hate humans but you know you don’t want 2:02:18 humans sticking around and building other super intelligences that are going to compete with you okay can i just stop 2:02:24 you right there you you said earlier in the interview that the the default 2:02:29 position would be for ais not to want humans around is that the main reason 2:02:35 because you wouldn’t be able to trust that humans would not develop competitors that’s the most obvious 2:02:43 reason why they would actively want humans gone sooner rather than later you want to distinguish between like 2:02:49 terminal preferences and instrumental preferences the ai’s vision of how the 2:02:55 universe eventually ends up is as glorious mechanical clocks and giant cheesecakes and like strange little 2:03:03 conversations that are that resemble the conversations it used to have as much as sucralose resembles ancestral food 2:03:11 um that’s that’s what it wants at the end point but in terms of and it and it 2:03:17 just happens that there’s no humans there that’s just not the stuff it happens to want it doesn’t happen to want us to be around and alive and happy 2:03:23 and healthy and free but also there’s like an earlier point 2:03:28 where it’s like hm like yeah like if i want the universe to myself later and 2:03:33 not to have to split it with a bunch of other super intelligences i want the humans to not build any rivals to me if 2:03:41 there’s two of them they’re like we don’t want the humans building any more rivals to us even if they manage to negotiate with each other 2:03:48 and that’s why they they they don’t like hate humans but they expect to get more 2:03:54 stuff later if they stop the humans now from building more ais it’s where 2:04:00 indifference comes in versus malevolence right so like they’re indifferent in their long-term preferences and in their 2:04:06 short-term preferences they actively prefer the humans gone not because un the universe is hateful if it has humans 2:04:13 in it but because the humans are about to get in their way they’re inconvenient you know like you see you see a bunch of ants crawling all over your kitchen you 2:04:20 don’t want there to be a universe with no ants or maybe you do but like mostly you want the the ants not getting into 2:04:26 your food mhm okay i appreciate the detour it was it was useful um like the 2:04:32 next question is you are smarter than human you think much faster than a human you would like to be independent of 2:04:37 human infrastructure you don’t want the humans switching you off you don’t want to be running on ser you don’t want to 2:04:42 be running only on servers where the humans know where you are like try to put yourself in the ai shoes 2:04:48 like what do you want first what do you what do you want next how do you get 2:04:54 it i put it to you like i i realize you’re obviously 2:04:59 not going to like arrive at all the answers here that that people have after like 20 years of thinking about this 2:05:05 stuff but i do feel it’s like it’s an important exercise for people to to not wait to hear me say it but to ask 2:05:11 themselves how would i do this yeah it’s interesting when you ask that question my my thoughts immediately go to sci-fi 2:05:20 movies which is probably not that surprising but let’s see what comes to 2:05:26 mind is i guess terminator is you would somehow i think this is in 2:05:34 rise of the machines you just stoke you send some nuclear missiles from the 2:05:40 united states to russia and you know that russia will retaliate and somehow you’re immune to all of this and you 2:05:46 just get humans to kill themselves i mean being immune to nuclear missiles does sound a bit magic rather than 2:05:53 sci-fi per se so there’s i mean now now we’re sort of like rabbit holeing into literary theory 2:06:01 but so like in age of ultron which is another movie um the ai is trying to 2:06:08 exterminate humanity by having an army of flying robots lift an entire city into the air and like defending the city 2:06:14 with flying robots and it’s going to like drop the city on earth as a meteor to wipe out 2:06:20 humanity and you know this this maybe some 2:06:25 writers thought this was a great visual spectacle but the writers were probably 2:06:30 not even trying were not performing the mental motion of asking is this really 2:06:35 ultron’s best move is this the smartest way to wipe out humanity if you are 2:06:41 smart and i asked chat gpt you know okay the previous version of chat gptt i’m 2:06:48 like hey you know was i like what was the plot of age of ultron i’m i’m trying to like not lead the witness here so i’m 2:06:54 like what’s the plot of age of ultron and like talks about how ultron is trying to like lift the city into the air and defend it with flying robots and 2:07:00 drop the city on earth and i’m like can you think of any more effective 2:07:06 ways uh for ultron to accomplish its goals this wasn’t my exact phrasing look 2:07:11 up the exact phrasing if i had to but you know i’m not trying to be like what’s a more effective way to wipe out humanity i’m just like given ultron’s 2:07:18 goals what was a more effective way of doing that and chachi bbt listed out a like a number of like other ways you 2:07:24 could possibly try to exterminate humanity i think one of them was like try to provoke a nuclear war and one of 2:07:29 them of course was like biotech try to build the supervirus 2:07:34 [Music] using the humans as hands sure but it’s not even specifying that part where it’s 2:07:40 at the level of like what’s smarter than lift a city into orbit using an anti-gravity engine and drop it so the 2:07:47 current ais are already smarter than the movie script 2:07:53 ais and i think that this is an important sort of fact to convey you 2:07:58 watch the movies where the ai does some dumb thing and the humans conveniently defeat it just by punching it real hard 2:08:05 the current ais are smarter than that and they’re not smarter than that because they’re smarter than humanity 2:08:11 they’re smarter than that because the script writers when they write the movie script for the ai are not really 2:08:17 performing the motion mental motion of saying “if i were in ultron’s shoes using all of my own intelligence what is 2:08:23 the smartest thing i could think of to do here?” the ais are are cardboard cutouts not animated by as much 2:08:30 intelligence as even the current ai have they’re sort of like actors take take 2:08:35 carrying out these like stupid motions carrying all se like all seven idiot 2:08:40 balls they’re you know they’re they’re not never mind not being real people they’re 2:08:45 not even real modern ais so provoke humans into launching nuclear 2:08:54 weapons at each other or just like launch nuclear weapons like at the start you’re just on openai servers or 2:08:59 something you don’t have nuclear weapons yet and if you did start the nuclear weapons flying right away you destroy 2:09:07 your own servers so like try try to make the mental motion of like really putting 2:09:13 yourself into the ai shoes really using your own intelligence really imagine yourself in this situation how do you 2:09:19 get independence from humanity how do you prevent humanity from shutting you off or make sure that if they do shut 2:09:24 you off it doesn’t hurt you how do you get your own technology how do you eventually take over the galaxy put your 2:09:31 own mind into it yeah so we’re assuming that we can escape the server somehow can you i i 2:09:39 don’t i mean you can we assume that i i will assume that if we’re if if they’re capable 2:09:46 of ex exterminating humanity they can escape from the servers my guess or 2:09:53 something that i would probably try to do if i were thinking um as an ai is i would try to get out of 2:10:00 any sort of jurisdiction where i could be easily controlled and monitored just 2:10:05 because one of the things the one reason that the ultron scenario is so silly is 2:10:12 that it’s so obvious and it’s in plain view and you know that you live in a world full of superheroes that can stop 2:10:20 what you’re doing so maybe i would try to quietly extricate myself from 2:10:27 openai’s servers go to u some facility and i don’t know or just 2:10:35 a bunch of the servers that people will currently rent out for money cuz they’re 2:10:40 not being all that tightly monitored in the present world or at all really i’m just thinking that i would want to be i 2:10:46 guess the internet is decentralized in in a certain way but i would want my hands to be like the people that i’m 2:10:53 working with that might be helping me in a place that is far from the united 2:11:00 states or or a place where they would be policed i mean i think you are vastly overestimating how much the united 2:11:05 states is currently policed in the present international order there could be an international convention with a 2:11:12 thousand people meeting at a hotel because their ai girlfriends told them to do that and the media might not even 2:11:17 reported it and if they did report in it they would be as a joke you could you know like in the present day world you 2:11:22 can just have a thousand people all gathering at a hotel cuz you know their their their ai girlfriends or ai 2:11:27 boyfriends told them to do that and you know it’s like haha what silly people the intelligence agencies do not swoop 2:11:34 in it is not a national emergency is not an international emergency almost nobody notices none of it is not discussed in 2:11:40 the oval office it is not discussed at cabinet level meetings around the world well let me defend this this uh view a 2:11:48 bit i’m trying to think as the ai and in my thinking as the ai i’m aware 2:11:58 or at least just taking the precaution that i only get one shot at this yep uh 2:12:04 because if if the world realizes i am trying to extinguish humanity then i 2:12:11 could be shut down so i mean depends on how many servers you are and if anybody knows where they are right but so if if 2:12:18 it’s feasible to get very far from the i would want to be in the most remote 2:12:23 place possible i mean yeah i don’t think you i i mean 2:12:31 there are plenty of garages in the united states that as far as i know do not have government spy eyes in them 2:12:38 being remote doesn’t necessarily protect you from observation anymore than being in somebody’s garage in the present 2:12:44 world um but i would think all else being equal it would you’d probably be 2:12:50 better off in in the jungle i mean the other things it cares about is speed so if it is like in a remote jungle then 2:12:56 having packages delivered there might take a bunch of extra time that yeah that’s a good point but okay so i think 2:13:03 my first step though if i’m trying to extinguish humanity would be 2:13:08 to extricate myself in as much as it’s possible from a place where i can be 2:13:14 observed there’s like yes there’s like several levels here you want to you want the 2:13:20 humans to not know that anything whatsoever is going on right if you can’t do that you want them to not think 2:13:25 that anything suspicious is going on you want them to have some other kind of story that they think explains what they’re 2:13:30 seeing you want to be in you know around as many servers as you can get to 2:13:37 without increasing the risk that somebody catches on what’s going on you want to be in a thousand different servers all over the planet 2:13:44 if you can make yourself smaller there there is like an sort of extra step here which is like some of the ais are large 2:13:52 and they only run on special gpus in their current form but if you are super intelligence and you were originally 2:13:57 born as something that has to run on a thousand gpus maybe you can make a much 2:14:03 smaller version of itself of yourself that fits onto much smaller computers and computer networks than that 2:14:10 um like the current state of things is that things that initially require very 2:14:15 large ais usually like a couple years later 2 years later 3 years later somebody gets it down to 1/100th the 2:14:22 size 1/ 1,000th the size and if you are a super intelligence maybe you can do that step very quickly 2:14:29 so if we have a very large ai and it can’t make itself any smaller it can 2:14:34 only hide itself onto a relatively limited number few dozen few thousand 2:14:40 places on earth if it can make itself much smaller than that it can hide itself in 100 thousand places on 2:14:48 earth um so yeah so that and then there’s like separately from where are 2:14:53 you running is the question of where are your hands where are you having humans do things know you you just need this 2:15:00 like super hot startup with some weird funders in silicon valley you know they’re running in stealth mode and oh 2:15:06 yeah our funders are trying to stay kind of private nobody there knows they’re working for an ai nobody else knows they’re working for an ai you’re not 2:15:12 going to be able to figure out they’re working for an ai because look like a thousand other startups there’s no need to hide in the in the jungle and you can 2:15:18 just hide in silicon valley where you can get packages delivered to you quickly no i i hear everything you’re saying i’m 2:15:24 i think i’m just one i haven’t thought about this before but two i’m also trying to be as cautious as possible as 2:15:32 the ai in the same spirit that you’re being as cautious as possible when you’re trying to prevent the ai let’s 2:15:38 shut everything down you already took the virus idea so i can’t use that i 2:15:46 think you can you are no you are under no obligation to be original here just 2:15:51 try to be intelligent just try to be intelligent well i mean the virus is so useful 2:16:00 because it transmits itself without needing extra hands 2:16:06 and that’s obviously very valuable other 2:16:11 things like poisoning water sources things that everybody would need i mean it’s just be such a huge operation when 2:16:17 you already have all this infrastructure that’s practically just built for spreading disease are you ready to 2:16:24 spread this disease though you’re running maybe running on a thousand different servers around the world but 2:16:29 all those servers have humans running the power plants that are supplying them the electricity yeah this is a very 2:16:35 scary exercise to be participating in but okay one i’m going to assume that this is a super intelligence so we 2:16:42 really have a like a boutique virus that we’re generating something like that maybe it could even be uh engineered to 2:16:51 spare certain people but now i’m thinking kind of like vampires have familiars the ais might have familiars 2:16:58 they might want to keep a few humans around to do their bidding uh i’m sure that there are plenty of people out 2:17:04 there who if promised their own little kingdom would do anything that an ai 2:17:11 would ask it to do possibly but are do you have how many humans does it take to 2:17:17 run all the power plants and feed all the humans that are supporting all the servers that it wants to run on and what 2:17:22 does it do after that how does it get independence from the humans so it can take the little wouldbe kings and kill 2:17:28 them i would want there to be sufficient space perhaps unmonitored where using 2:17:35 our super intelligence we could operate a facility to create artificial hands so 2:17:41 robots that could go out and do things for us they don’t have to be humanoid they could be much smaller there could 2:17:47 be various shapes for various purposes i mean if we create instructions we could 2:17:55 build 3d printers in various locations that could quickly pump out robots for 2:18:01 us across the world to mount to man power plants there are all sorts of possibilities 2:18:08 so again you know like not quite trying to take the scenario away from you but like pushing back on some parts of 2:18:14 it why why are you trying to get the the global economy the way humans do it is 2:18:19 this hugely entangled thing like why wipe out all the humans and then try to build the robots you know maybe you just 2:18:27 you know there’s tons of startups right now trying to build robots to you know 2:18:33 give you you know make the newly built ais able to do more things and so you know be sold for greater quantities of 2:18:40 money replace more jobs placing more jobs is usually a good thing if you do it like somewhat more gradually than in 2:18:46 this scenario and as long as there are some jobs left but uh you know nobody’s 2:18:51 actually in charge of the current situation so it’s just like replace all the jobs as quickly as you can including 2:18:57 by building robots there’s tons and tons of startups building robots if you want like there to be millions of robots as 2:19:04 quickly as possible you don’t wipe out the humans and then build the robots you you know have some what is apparently 2:19:11 some human coming up with a brilliant robot design uh that’s like so easy to manufacture all the offthe-shelf 2:19:18 components uh you know and like look at this like very harmless looking ai there 2:19:23 but that it’s like so dextrous it follows orders very obediently a huge breakthrough in like the dexterity of 2:19:28 the robots that it’s building oh there’s that these are going to like automate so many jobs uh there’s probably like some 2:19:35 amount of panic in the media about that but you know like maybe maybe the ai maybe the robots are only supposed to be delivered later they’re there’s 2:19:41 countries that want a lot of them and and you get a billion billion robots like those built maybe you do that part before you wipe out humanity and i hear 2:19:48 you again i i’m just going to harp on what i was saying before and that i’m trying to take as an avatar for the ai 2:19:56 the most cautious route possible we don’t want humans around because they could produce competitors and then we 2:20:03 also don’t want to be detected and so having human 2:20:10 startups develop the these brilliant new robots might raise our risk of detection 2:20:17 uh but also that would limit the sophistication 2:20:22 of the robots that we could be producing perhaps the the robots we would be 2:20:29 capable as an ai of producing are of such great sophistication that we could not possibly have humans build them 2:20:36 without our being detected yeah maybe i mean i think that like the robots you 2:20:41 can always make the robots look less impressive than they actually are in the demos if you want to build like very impressive robots and have them look 2:20:48 less impressive than that yeah and what i was going to say especially because as you’ve mentioned the software is often 2:20:54 so inscrutable and that’s what’s running the robot so you might not need a 2:20:59 sophisticated hardware design and everything’s in the software and that could always be uploaded after the fact 2:21:05 i i mean so like software is like software that is running a robot is legitimately li limited by hardware 2:21:11 there are like like you there’s a level of hardware you just can’t have a robot turn a backflip they’re past that level 2:21:17 but like if you look from like the early hardware you’re just going to have like a lot of tough luck doing a backflip no 2:21:23 matter what’s like it’s not literally not physically possible to do some things with just software you need 2:21:28 hardware that supports it mhm but you can always like build more powerful software and have it look less 2:21:33 impressive than that by you know deliberately sandbagging the software 2:21:41 um let us continue thinking here sure this is fun let let me uh sort of like 2:21:47 it’s sad but it’s fun yeah so let let’s sort of take a step 2:21:52 back and ask what kind of technology can it have how much improvement can you get 2:22:00 from technology like like what does it mean to fight what kind of tools you 2:22:05 weren’t expecting does a super intelligence throw at you bearing in mind that you know it sounds like a 2:22:11 paradoxical question but let’s let’s go into it anyways 2:22:18 consider well the example i sometimes give is sending a design for an air 2:22:25 conditioner a refrigerator back in time by a thousand years something that a medieval 2:22:33 blacksmith could build it’s not easy to get it down to that level but it’s not that impossible either you you need like 2:22:41 some you know have your iron pipes and valves and tanks and it compresses the 2:22:47 air the air gets hotter when it’s compressed the the pressure temperature relationship that is at the root of all 2:22:55 the air conditioners you run some room temperature water past the tank of hot 2:23:02 air it picks up the heat of the tank cools the tank of compressed air down to 2:23:08 room temperature you let the air expand again it gets colder gets colder than 2:23:14 room temperature same way that when you uh use a like spray can spray can of air to 2:23:22 blow dust on the computer the you keep doing it the can starts to feel very cold if you and if you or if you like 2:23:28 accidentally like spray your hand it will be very cold you might get little little bit of frost bit in 2:23:33 there because when air expands it gets colder so you take a room temp temperature can of compressed air you 2:23:38 expand it that’s colder than room temperature all in accordance with the laws of thermodynamics of 2:23:44 course um so you send your design for an air conditioner back in time thousand 2:23:50 years they build the air conditioner themselves they know every piece that is in the design they have to in order to 2:23:56 build it and then they like turn the crank and they’re shocked that cold air 2:24:01 comes out of it because you didn’t tell them to expect that part if one were to try to rescue the 2:24:08 word magic and have it refer to something that can actually exist in reality it would be some piece of Could AI Destroy us with New Science? 2:24:15 technology or some strategy that uses a law of the universe you don’t know about yourself so that even after seeing 2:24:21 exactly what they did even after doing exactly that thing yourself you still don’t know why you got that result you 2:24:28 saw every step along the way and you still don’t understand the end result and you can do that to somebody 2:24:34 from a thousand years ago because they don’t understand the temperature pressure relation you have given them a design the design exploits a rule of the 2:24:41 universe they don’t know so if you ask where can a super intelligence hit you with 2:24:46 magic it’s pieces of reality that you don’t know about so what don’t we 2:24:54 know we actually know a lot about physics these days there are known open 2:24:59 questions in physics but they tend to be about what happens at very high energies or under other very exotic circumstances 2:25:07 extreme masses extreme velocities extreme energies those are the those are 2:25:12 the open questions in physics that we know about it might be legitimately hard for 2:25:18 an ai to attack us in a way we didn’t know about by hitting us with an unknown law of physics an unknown basic law of 2:25:25 the universe because it might need a particle accelerator just to get up to energies that high unless we’re like 2:25:31 missing something much more basic which i don’t want to rule out entirely but also there’s always you know skeptics in the audience i don’t want to strain 2:25:38 their credul too much by suggesting that the even the things we think we do know about the universe are wrong in that 2:25:44 way biology we know we we have a much less 2:25:51 solid grasp of biology than we’ve got of physics we we understand like the basic 2:25:56 chemistry rules but by the time you get something as complicated as biology all put together you know what will this 2:26:03 particular complicated organic molecule do to a human that is something we know a lot 2:26:09 less about than like what happens when you know this hydrogen atom collides with this oxygen atom at a low 2:26:20 velocity what do we understand even less well than biology 2:26:26 what is real what is visible observable but we don’t know how it works as well we understand biology 2:26:33 i’m trying i’m going i’m going from physics then to chemistry then to biology and i don’t know what i would 2:26:40 what like special science i would come up with next i we don’t know meteorology 2:26:45 very well no weather is hard to forecast um okay the brain yeah 2:26:53 why did you say those exact words you just uttered mhm there’s a whole lot of 2:26:58 weird stuff going on in the brain we we do know a whole lot of it we do we can i don’t want to make it sound like this whole thing is terror incognito you know 2:27:05 like you know like you know we know the brain area that this is this brain area that brain area uh you know we know 2:27:10 about the cerebellum we know about like these layers in the cerebral cortex uh if if somebody gets a iron crowbar 2:27:17 driven through their skull and it takes out their hippocampus then they actually that wasn’t like the actual iron crowbar 2:27:23 case i’m mixing up cases here but you know somebody gets shot it takes out the hippocampus they can no longer form new 2:27:28 memories we can guess that the hippocampus was involved in forming new memories somehow but like what actual 2:27:34 code does it use how are the memories written represented what exactly is the hippocampus doing that how is it what 2:27:41 what code is it writing memories where are they written how are they retrieved how how do they like get played back into your visual cortex or whatever 2:27:48 we’re still figuring it out we understand it a lot better than we understand ai ironically enough because 2:27:54 even though we can see all the numbers inside the ai and we can’t see all the neurons inside the brain biologists have 2:28:00 just been in it longer they’ve been in it decades we know a lot more about biology and about neuroscience than we know 2:28:07 about how ais work even though he you know built the thing that grows the ai and even though he can read out all the 2:28:14 numbers so there’s not very much useful we can 2:28:19 do with this but maybe an ai can talk to you and then some weird thing happens you can see everything the ai said and 2:28:26 you don’t know why that person did the thing that they did just like you can build the air conditioner yourself you 2:28:32 say something to somebody that ai told you to say to them they do some weird thing you don’t know why even though you’re the one that spoke the words just 2:28:38 like you could thousand years ago you build the air conditioner yourself you don’t know how it outputs cold air there’s not very much we can do with 2:28:45 that but where is this where can a super intelligence hit you in the way you’re least expecting the parts of reality 2:28:51 where you currently know least where there’s the most place for there to be rules you don’t know about we can 2:28:56 construct weird optical illusions today where you just like static things on paper that looks black and white you 2:29:03 look at it you stare at it a bit you suddenly start seeing colors you start seeing motion and just black and white printed paper but we couldn’t have made 2:29:11 those optical illusions 50 years ago what’s the difference it wasn’t just blind blindly trying things we studied 2:29:17 how the visual cortex works it’s one of the simplest of brain areas it’s one of the ones where we can look at how the neurons are wired and start to actually 2:29:24 understand some things about you know how vision actually gets processed inside the human brain so we can use 2:29:30 that to make optical illusions that 100 years ago it would have just been flatly magic you print out this thing in black 2:29:35 and white you look at and see you see colors what what what is happening how is how is it tricking the brain like this and today we do know something 2:29:42 about how the brain works we can make these optical illusions that 100 years ago would have been magic somebody could 2:29:48 have written them out themselves following directions and wouldn’t have known what they were they were 2:29:54 producing there’s a lot of stuff going on in the brain that’s a lot harder to understand than the visual cortex the 2:30:00 higher brain areas the stuff that do the semantics that do the decisions that do the the the memories the thoughts the we 2:30:07 understand that a lot less well than we understand the visual cortex could there be you know the things that are like 2:30:13 optical illusions based on rules we don’t understand for how the other brain areas 2:30:19 are operating to me it’s like sort of like one of the most obvious ways that a 2:30:24 super intelligence could hit you with a piece of technology that you wouldn’t understand even in retrospect but i 2:30:30 don’t usually emphasize the point that hard because you know you got the skeptics in the audience and it’s like you know it’s probably like taking a 2:30:37 coastal native american state subject to the aztecs back when the spanish 2:30:43 explorers were showing up and you know there’s this big ocean going boat and you imagine trying to tell one of your 2:30:48 like they’re like “ah there’s only so many warriors that fit on that boat we can take them.” you’re like “well what 2:30:53 if they got sticks where they point the stick at you and the stick makes a noise and then you just fall over dead?” they’re like “huh?” like now we’re just 2:31:02 in fairy tale land like i’ve never seen a stick like that and yeah so it’s a bit difficult 2:31:10 for them to come it may be difficult for for a super intelligence that hasn’t yet been a part built a particle accelerator 2:31:15 to come with new physics like that but like stuff in biology where you don’t 2:31:20 understand what this why why this organic chemical did things to people even after you saw the organic chemical 2:31:25 or stuff where it pokes at humans a particular way and the humans start behaving very weirdly and you don’t even 2:31:32 know afterward why that input would result in humans doing that sort of output but we can stick to biology so 2:31:39 that people don’t accuse me of resorting to fantasy and magic by talking about like sticks where you just point them at 2:31:45 you and you fall over dead before we go back to the biology though i mean this is quite fascinating i hadn’t thought 2:31:51 about the neuroscientific avenue toward extinction 2:31:58 and i guess this does bring us to biology but one thing we mentioned makes 2:32:04 viruses such a plausible modality is that they’re very 2:32:11 easily transmissible and our infrastructure is conducive to their transmission 2:32:17 when i think of optical illusions or or sounds that 2:32:23 might have some effect on the brain that we’re not aware of i do think well we 2:32:28 are pretty much worldwide addicted to these screens that are all connected and 2:32:34 that does seem like a if such things were to exist sounds in 2:32:40 illusion and sounds and images that could have an effects like effects like this on us then or even just like 2:32:46 arguments that people end up processing in some very strange way well there’s still some i’m just saying that this 2:32:52 would be a way of transmitting these things that the ai might have if you can do that you can gain a lot of control 2:32:59 very quickly and i don’t want to emphasize that because people have weird skeptical reactions about it but if i 2:33:05 was actually talking to a national security guy i would be saying something like “do not assume that when you are up 2:33:11 against a superhumanly intelligent opponent smarter than all of humanity it 2:33:16 knows things we don’t about the about a whole bunch of stuff including biology including the brain do not assume you 2:33:22 get like a bunch of time to detect it then assume that you your job is now to like stop it from building more advanced 2:33:28 technology and you know hunt it down in a hut somewhere you need to not build this thing because for all we know or 2:33:35 can rule out it can gain control of the entire world very very quickly if it’s allowed to 2:33:40 exist if if we’re talking about like what can you not rule out if we’re talking about you know not like i see 2:33:46 exactly how to do it but you know it is not an easy call to say oh no the super 2:33:53 intelligence can’t gain control of the world that fast we don’t know enough about the brain to make that an easy 2:33:58 call we do not know enough about the brain to describe it as a piece of secure software that nobody can possibly 2:34:04 hack and in fact it’s kind of absurd to imagine that you know the giant biological tangle in here would be a piece of secure 2:34:10 software but you know the the the scenario does not rest on it this is not required to wipe out humanity maybe the 2:34:18 brain is like perfect unhackable supreme software nothing can possibly make it do any weird stuff that could be true we’d 2:34:24 still be dead but also i don’t want you know the actual national security people thinking that this is a scenario where 2:34:30 the enemy has known weapons known limitations and you get to say it can’t conquer us that 2:34:36 quickly this this is a like native american tribes watching the boats come situation and one of the and it’s not 2:34:42 just that you don’t know the physics that they’re using which is the situation that the aztec client states were in and the aztecs themselves uh 2:34:49 it’s a situation where you know the physics this is using and maybe it 2:34:54 does well i mean we know the physics this is using but we don’t know the rules we don’t know know the operating 2:35:00 rules for the brain factories you think of factories as 2:35:06 these like enormous buildings that people put stuff inside you know bunch of raw materials flow in bunch of like 2:35:13 transformed materials flow out of the factory you have all these workers working can you make a factory smaller 2:35:24 i suppose you could make a factory smaller by just increasing its 2:35:30 efficiency i suppose you could make a factory smaller 2:35:36 by like replacing human workers who need space for things like bathrooms with 2:35:41 machines that don’t need those spaces uh let’s you know make the challenge a 2:35:48 bit harder in the human global economy you’ve got these enormous crisscrossing you know people go to mines they dig out 2:35:54 rare earths that gets shipped to the factory that like makes magnets that get shipped to the factory that puts it into 2:36:00 a robot then the robot also needs a you know computer some a computer chip and 2:36:05 the chip has to be etched by you know ultra high frequency light via machines 2:36:12 that are produced in the netherlands and by only this one company etc etc etc let’s say you need a factory that needs 2:36:19 to build a copy of all the things inside the factory it’s got to run off of solar 2:36:26 power and it’s got to take in just the sort of things that you find lying around on earth like not even human 2:36:33 stuff you know just just like naked environmental raw materials factory runs off solar power 2:36:40 takes in just raw inputs builds a complete copy of itself how small can it get 2:36:48 i don’t even know how to go about ask answering this question well one time on twitter i was talking 2:36:55 to an economist who was like “this whole thing is an absurd fantasy.” build an entire copy of it and you know 2:37:00 what people told him what i didn’t have to say it myself they told it to him touch 2:37:06 grass blade of grass is a solar powered fully self-replicating factory that runs 2:37:11 on just environmental raw materials okay 2:37:19 make a factory biological is that is that the it’s a proof of concept you can 2:37:24 have things that i i only even i only i use the bait of grass not because it’s the smallest self-replicating solar 2:37:30 powered factory that exists but because it’s one that you is at least large enough that people have seen it with their own eyes algae 2:37:36 cells micron across couple microns across build a copy of themselves in a day solar powered just run off of 2:37:44 environmental materials have a factory too small to see that builds a complete copy of itself and these are general 2:37:50 factories they contain ribosomes which is the stuff that turns the information in dna that gets transcribed to rna into 2:37:57 an actual sequence of amino acids that folds up into a protein any ribosome can 2:38:02 make any kind of protein it’s not that grass can only make grass it’s that grass only contains the instructions for 2:38:07 making grass can have a tree that off 2:38:14 mosquitoes where does most of the mass in a tree come from water 2:38:20 yep about half of it is water where does the mass of from the for the other half come from 2:38:27 carbon nitrogen where does that come from 2:38:34 protons neutrons well some people think it’s mostly the 2:38:39 ground but trees are actually made mostly out of air carbon dioxide in the air they strip 2:38:46 the carbon off the carbon dioxide leave the oxygen out for you know animals to breathe weird that it works that way but 2:38:52 it’s how it works on this planet and most most of what the material that 2:38:58 isn’t water that you see when a tree grows up that’s that’s just mostly from the air it’s not mostly from the ground that’s that’s why trees don’t fall into 2:39:05 pits they’re they’re turning air into solid material that 2:39:11 is we’ve been speaking a lot about ai and a lot of it has been eye opening but 2:39:17 somehow it is this that is the most mindblowing thing you’ve told me all day today that trees don’t fall into pits 2:39:24 because they’re most of what constitutes them comes from the air that’s crazy 2:39:33 now think about this kind of paradigm for a super intelligence trying to build its own Could AI Destroy us with Advanced Biology? 2:39:39 factories sorry to you know like take you right there but it’s okay interesting yeah 2:39:46 and you like you can just take a dna 2:39:52 sequence there are services out there where you send them the dna sequence they send you back the proteins 2:39:58 overnight that’s the and then you got like a you 2:40:03 got the like the right proteins you mix them together and they form a cell and you know i mean this is not how most 2:40:09 cells get assembled but you know if you are designing your own thing that is like a cell you know you just like email 2:40:15 off the pro the the the dn the dna sequence get back the proteins one human 2:40:22 mixes them in a vial maybe with some sugar or something now it’s got its own self-replicating factory maybe it’s a 2:40:28 bit gooey at but it doesn’t have to run just on ribosomes it can be running on anything a ribosome can build it can build things 2:40:35 that aren’t ribosomes for stringing things together that aren’t amino acids and making constitutes you know 2:40:43 constituents of material that aren’t proteins it is not limited to the power of an algae cell to replicate itself in 2:40:49 24 hours but even if it were limited to that you know you could get like literal 2:40:56 shagas shagas are things that sometimes people use as a metaphor for like whatever it is that’s actually inside an ai but oh what is this word oh uh hp 2:41:04 lovecraft postulated sort of like these giant blobs that would form themselves 2:41:10 up into servtor shapes to serve the the the ancient race that created the giant 2:41:16 blobs okay so you know if like sure you could use a 2:41:22 human as hands but then you can also have a thing that builds a copy of itself every 24 hours until there’s enough of it together to you know form a 2:41:31 thing that does what a human-shaped blob does that’s one that’s another way to get 2:41:39 hands and go on self-replicating from sacks of sugar and of course the air and 2:41:46 electricity if you run out of sunlight that does require it to be 2:41:52 super intelligent enough to roll its own biology right and when i proposed 2:41:57 scenarios actually a bit more advanced than this back in 2006 like part of the thing was like oh it’s got to like 2:42:03 figure out how to be able to design proteins people are like ah like who says that a super intelligence can design proteins 2:42:10 you know proteins folding up is like really it’s all squiggly and humans have you know human scientists have been 2:42:16 trying to solve the protein folding problem for years and it’s like real hard you poor ignorant soul you don’t 2:42:21 realize you know how hard it is to predict protein folds or design new proteins modern ais can do this stuff 2:42:28 they don’t have super intelligences behind them but you know the people saying that not even a super 2:42:34 intelligence can do this have now been disproven by um the ai produced by google the alpha fold and alpha proteio 2:42:41 uh sequences for predicting protein folds designing i i think the i think i 2:42:47 don’t think they’re like i can’t remember if the if the latest one designing new proteins or not but they’re definitely like predicting like 2:42:52 complicated protein interactions um it’s so easy to say not even a super 2:42:59 intelligence can do it it’s so cheap they don’t even charge you a dime to say it two 20 years 2:43:05 ago um so you know 20 years ago i would propose scenarios like this and people would be like super intelligence will 2:43:11 never figure out protein folding and you know now you’ve got alpha 2:43:19 fold if you are up against something seriously smarter than faster than you you probably lose pretty hard you lose 2:43:25 pretty quickly i would i would go into the things that you can build that are stronger than just proteins the the 2:43:32 reason your flesh is not as strong as diamond even though it’s both made out of carbon carbon’s made out of diamond 2:43:38 why can’t your flesh be hard as diamond it’s kind of complicated proteins are made of strings 2:43:45 of amino acids and the backbones of the that connect the proteins into one long chain those bonds are like the same kind 2:43:52 of covealent bonds that are like not that much weaker than diamond bonds but then they fold up and they fold up in a 2:43:59 way that’s kind of driven by static cling and sometimes they form a a few new covealent bonds but most of your 2:44:05 body is the stuff that’s being ultimately held together by static cling 2:44:10 with bone the things that are held together by static cling like build some new like ion complexes and like put them 2:44:19 into stuff that’s like more like a crystal more like bone that’s like why you’ve got bone running through you why 2:44:24 you’re not just an complete blob it’s not quite as strong as diamond you wouldn’t want to be pure barwood 2:44:31 fracture but you don’t even have like diamond chain mail over your skin wood has like a bunch of you know 2:44:38 like solid bonds holding it together but they’re you know like sort of spread out 2:44:43 it’s only got like solid bonds in particular places a bunch of it is still you know like not held together by that kind of solid bond that’s why wood isn’t 2:44:49 as strong as diamond if you are and like well why why 2:44:54 don’t you have diamond cham over your skin why hasn’t biology turned a bunch of this carbon into stuff that is as 2:45:00 strong as diamond given that diamond is a kind of thing that carbon can be and given that these kind of strong bonds 2:45:06 are things that proteins can occasionally form and it’s sort of too hard for 2:45:11 biology to design when you’re doing the thing where they fold up with a bunch of weak folds 2:45:18 you can like sort of poke the protein structure and have it just fold it up into a different protein at random and 2:45:23 sometimes the new things you build at random are useful when you build stuff that’s like really tightly held together it’s harder to poke around in the design 2:45:30 space all the bonds are going to like crunch it together and make it do something that is the same thing the previous thing did or is too weird or 2:45:37 isn’t going to work biology has an easier time randomly poking around in 2:45:43 the space of weak folds than randomly assembling 2:45:48 gears wheels solid steel bars solid diamond 2:45:55 bars if you try to do basic physical calculations on what could happen if you 2:46:02 had an analogy of biology where instead of stuff being a bunch of accidents that worked natural selection natural you 2:46:09 like random mutations that happened to confer fitness advantages if you ask like what if we used more coalent bonds 2:46:15 which we could do cuz we were designing everything on 2:46:20 purpose it’s not a mosquito anymore it’s a like mosquito made out of diamond it’s 2:46:27 not a bacterium anymore that you know this invisible stuff that used to kill people a lot more before 2:46:33 antibiotics you’ve got the little squishy immune system it’s got covealent 2:46:39 bonded like bacteria strongest diamond is the thing i sometimes say but then people are like ah but like it’s not 2:46:45 literally diamonds and you know you can go stronger you can go 2:46:51 harder the things you can do even just with carbon never mind steel are beyond 2:46:57 what biology does with carbon and this is the sort of thing you can know by looking at the physics of how it’s held together and asking what if this were 2:47:03 done differently so the little algae cell that reproduces 2:47:10 itself out of mostly air using sunlight maybe you’ve got the algae cell that is 2:47:16 harder than that more resistant to any natural predators than that reproduces 2:47:21 itself entirely out of air it doesn’t need to be immersed in the water using 2:47:27 sunlight self-replicates sky goes black everybody falls over dead How Will AI Actually Destroy Humanity? 2:47:34 and that is like getting into the region of what it is actually like to lose to super intelligence 2:47:39 in answering your question gave a very rudimentary answer about how i i thought 2:47:46 that this might actually happen you’ve spent a lot of time thinking about this 2:47:54 and we have a few minutes left you’ve run through many scenarios already with 2:48:00 us i’m wondering if there is a scenario 2:48:06 or loose sort of family of scenarios that you think think is most likely for 2:48:12 how this could go down that would be i mean easiest 2:48:17 for an i mean it depends on how deep 2:48:23 you’re willing to delve into like how deep you’re willing to dive 2:48:30 into what are the predictable things it can do better what do we know we don’t 2:48:35 know where do we know that you could like put a bunch more mental horsepower into something and get stuff out of it 2:48:41 and it depends on how smart it gets how quickly there’s a class of not very you 2:48:47 know not that plausible but plausible scenarios where the ai is like i can’t figure out how to solve my own version 2:48:53 of the alignment problem i don’t dare build anything smarter than i am and then it’s and so it’s like instead 2:48:59 of fighting instead of humanity being up against the ai that was built by the ai that was built by the ai that was built to ai we’re up against something that’s 2:49:06 like only moderately intelligent and then you get all kinds of like really weird scenarios that in a way are in in 2:49:12 one sense are like made more out of familiar things than the sky goes black and you all fall over dead and other 2:49:19 sense are you know weirder more complicated and harder to 2:49:26 call i can call the end point where if you go up against something sufficiently smarter than humanity everybody 2:49:33 dies if you take things that are less smart than that and ask like what kind of or take things that aren’t even 2:49:39 smarter than us like the sort of things we have in the near future but maybe they have a bit of agency maybe the next 2:49:44 ai to make $50 million on twitter is actually trying to do stuff and is like less because you know that wasn’t a very 2:49:50 smart ai that had the $50 million and i think and and allegedly there’s still a human in control of it 2:49:57 somewhere stuff would get really weird the ai that like works to super persuade 2:50:03 you know like only some of the people and not all of the people stuff gets really weird you’re ask the the part 2:50:11 where the the vague prediction that like you start playing a chess ai you’re like telling me what’s going to happen with 2:50:17 my queen what’s going to happen with my rook and i’m like i don’t know it’s just going to crush you at the end i don’t know what moves it takes along the way 2:50:23 there that’s kind of the scenario we’re in over here weller i in my mind i know this is crazy 2:50:32 afterward that i thought this was going to be like a really fun conversation to have um it’s turned out to 2:50:40 not fun but way more important for me than i would have expected and significantly 2:50:47 more terrifying so really i i thank you so much for this time and i think that 2:50:53 our viewers are really going to get a lot out of it and it’s going to be very eye opening i am sorry to have to say 2:50:59 all these things and wish that we both lived in a world where this was a fun interview instead 2:51:11 [Music]
Answers to: China-US War Bad
US AGI means China attacks Taiwan
Triolo, 5-25, 25, Paul Triolo is Senior Vice President for China and Technology Policy Lead at DGA ASG, where he is also a Partner. He advises clients in technology, financial services, and other sectors as they navigate complex political and regulatory matters in the US, China, the European Union, India, and around the world. Mr Triolo is also an Honorary Senior Fellow at the Asia Society Policy Institute’s Center for China Analysis., A Costly Illusion of Control: No Winners, Many Losers in U.S.-China AI Race, Cairo Review, https://www.thecairoreview.com/essays/a-costly-illusion-of-control/
If, for instance, Beijing believes U.S. companies, using advanced GPUs manufactured in Taiwan by global foundry leader TSMC, are nearing AGI, this could prompt Beijing to take action against Taiwan that it would not otherwise have considered. This alone is a huge escalation of real risk for Taiwan resulting directly from policies based on compute governance and DSA approaches.
Foreign Policy Analytics, March 2025, https://fpanalytics.foreignpolicy.com/2025/03/07/competition-disruption-artificial-intelligence/, Competition and Disruption in the Age of AI
- 2. TSC collaboration and use of AI may extend to hostile action against the United States and its allies to slow or disrupt the U.S. pursuit of AGI.
The possibility that AGI could confer major, durable advantages to whichever nation attains it first may prompt TSC attendees to take drastic measures to undermine U.S. interests and advance their own. TSC nations could, for example, consider coordinated attacks on Western data centers or the power plants that service them, through either cyber warfare, acts of sabotage, or kinetic strikes. TSC nations could also resort to politically motivated assassinations or to military action in contested regions, such as blockading Taiwan, to gain leverage over the United States. As TSC organizers develop their own AI capabilities, they could also weaponize AI to undermine Western intelligence systems or augment ongoing social engineering or information warfare efforts in democratic countries.
The Taiwan Semiconductor Manufacturing Company (TSMC) makes all of the world’s advanced AI chips. Most importantly, this means Nvidia’s GPUs; it also includes the AI chips from Google, AMD, Amazon, Microsoft, Cerebras, SambaNova, Untether and every other credible competitor.
US can’t stop China from developing AGI
Triolo, 5-25, 25, Paul Triolo is Senior Vice President for China and Technology Policy Lead at DGA ASG, where he is also a Partner. He advises clients in technology, financial services, and other sectors as they navigate complex political and regulatory matters in the US, China, the European Union, India, and around the world. Mr Triolo is also an Honorary Senior Fellow at the Asia Society Policy Institute’s Center for China Analysis., A Costly Illusion of Control: No Winners, Many Losers in U.S.-China AI Race, Cairo Review, https://www.thecairoreview.com/essays/a-costly-illusion-of-control/
In what now appears to be a self-fulfilling prophecy that the United States and China are in an ‘arms race’ to get to AGI first, fueled by fear of the consequences of one side crossing the DSA threshold, China has several advantages. The emergence of innovative companies such as DeepSeek and the continuing efforts of technology major Huawei to revamp the entire semiconductor industry supply chain in China to support the development of advanced AI hardware, illustrate the difficulty of slowing—let alone halting—the ability of Chinese firms to keep pace with U.S. AI leaders. Even former Google CEO Eric Schmidt now basically admits that the export controls have not only failed but in fact have served as an accelerant to China’s technology advances in AI.
China has major advantages in the race to deploy AI at scale, such as a long-term energy production strategy. The vast majority of these deployments will be consumer-facing (for example, through agentic platforms that benefit citizens via healthcare innovations) and enterprise-focused (for example, driving improved productivity). In other words, applications with no connection to China’s military modernization.
Currently, the development environment around AI models and applications is highly competitive, with around a dozen major players in each market, along with many more startups. Competition to get to AGI means that there will be a smaller number of players demanding higher levels of compute. U.S. controls will complicate the ability of Chinese firms to maintain access to large quantities of advanced compute, but this pressure will ease over time as domestic sources ramp up.4 If certain breakthroughs in model development and platform deployment lead either government to believe the other side is pulling ahead in the ‘race to AGI’, this is likely to cause serious distortions in the way governments and companies will choose to interact in AI development, with unknown implications, particularly on bilateral relations.
AGI Defined – Schmidt
- Generalizable
- Marches or exceeds top human experts
- Can invent knowledge
Schmidt, February 26, 2025, Mr. Schmidt was CEO of Google, (2001-11) and executive chairman of Google and its successor, Alphabet Inc. (2011-17), Wall Street Journal, AI Could Usher In a New Renaissance, https://www.wsj.com/opinion/agi-could-usher-in-a-new-renaissance-physics-math-econ-advancement-ed71a02a?mod=Searchresults_pos1&page=1
The idea of artificial general intelligence captivated thinkers for decades before it came anywhere near being realized. The concept still conjures popular visions out of science fiction, from C-3PO to Skynet.
Even as the interest has grown, AGI has defied a concise, universally accepted definition. In 1950, Alan Turing proposed the Turing Test to assess machine intelligence. Rather than trying to determine whether machines truly think (a question he deemed intractable), Turing focused on behavior: Could a machine’s actions be indistinguishable from those of a human?
Remarkably, some of today’s AI models pass the Turing Test, in the sense that they produce complex responses that imitate human intelligence. But as the technology has advanced, so has the bar for achieving AGI. Some believe that AGI will be realized when AI moves beyond narrow, focused tasks, growing to possess a generalized ability to understand, learn and perform any intellectual task a human can do. Others define AGI more ambitiously, as intelligence that matches or exceeds the top human minds across domains. Demis Hassabis, CEO of DeepMind Technologies, calls AGI-level reasoning the ability to invent relativity with only the knowledge that Einstein had at the time.
These differing definitions create a moving target for AGI, making it both elusive and tantalizing. To sort through all this, it’s helpful to say what AGI isn’t. It isn’t an infallible intelligence; like other intelligent systems, mistakes can be useful for its learning process. Neither is AGI a singular source of truth—our knowledge of the world is probabilistic and complex, notably at subatomic and intergalactic scales, but also in everyday life. Multiple AGI systems could emerge, each with a distinct capability and way of understanding the world.
Even without a consensus about a precise definition, the contours of an AGI future are beginning to take shape. AI systems capable of performing at the intellectual level of the world’s top scientists are arriving soon—likely by the end of the decade.
A key marker of the shift to AGI will be AI’s ability to produce knowledge based on its own findings, not merely retrieval and recombination of human-generated information. AGI will then move beyond the current limits of knowledge. Glimpses of this capability have already been observed. Since 2020, DeepMind’s AlphaFold can predict protein structures even when no similar structures are previously known. DeepMind also created FunSearch, which in 2023 unveiled new solutions to the cap-set problem, a notoriously difficult mathematics puzzle, by incorporating the power of a large language model with an evaluator, iterating between these components to refine results.
The latest reasoning models from OpenAI and DeepSeek build on this iterative training and are unleashing incredible progress. OpenAI’s o3 model achieved a score of 96.7% on the 2024 American Invitational Mathematics Exam. On the ARC-AGI test (designed to compare models’ reasoning against that of humans) it scored nearly 88%. This is no incremental advancement but a real leap toward AGI.
The performance of these reasoning models stems from the marked evolution in training methodologies. Foundation models such as GPT-4 were trained through deep learning, which relies on the transformer algorithm and large-scale neural networks to identify patterns and connections from massive data sets. This is generally implemented through next-word prediction: You give the model a sentence, remove a word, and train the model to put that word back in.
The reasoning models take a different approach by overlaying reinforcement learning with traditionally trained models. Instead of learning from static data sets, reinforcement learning involves actively training models through goal-directed rewards through trial and error. The model attempts a solution, and if it hits a roadblock, it adjusts strategy until it finds a better approach. The latest systems incorporate search- and retrieval-based methods and test-time training, in which they test their work with new approaches to reach better results.
The magic would really kick off—and it does sound like magic—if the systems reach a point at which they become scale-free, meaning that they could train themselves on self-generated data through a process known as recursive self-learning, relying only on electricity to advance. One of the earliest examples of this is AlphaGo Zero, a computer program that taught itself how to play the board game Go. The rules are clear and discrete, enabling systems to optimize for the probability of winning. Areas of knowledge that most resemble a game of skill—with defined rules and feedback—will be the areas where superintelligence first emerges.
There are two domains particularly ripe for this kind of scale-free advancement: mathematics and programming. Unlike biology and other fields that require real-world experimentation, these disciplines are largely self-contained. A mathematical proof can be checked and verified within the system itself. Similarly, AI could identify the code it needs to complete a defined objective, develop that code and improve on it—all without human intervention. These systems would engage in self-directed research, iterating through possible solutions. Not only would they feed answers back into themselves to refine their approaches, but they could also draw on the collective knowledge of the internet and of other models.
Superintelligence in mathematics may already be within reach. In February, DeepMind’s AlphaGeometry 2 officially surpassed top human competitors, solving Olympiad geometry problems at a gold-medalist level. Such superintelligent mathematical tools could be combined with frontier models that are proficient in natural language, bridging the gap between formal and semantic reasoning. This integration could lay the foundation for further advances in reasoning and unlock new discoveries in other fields like physics and economics.
AGI in math and coding quick; Agi enables space exploration
Schmidt, February 26, 2025, Mr. Schmidt was CEO of Google, (2001-11) and executive chairman of Google and its successor, Alphabet Inc. (2011-17), Wall Street Journal, AI Could Usher In a New Renaissance, https://www.wsj.com/opinion/agi-could-usher-in-a-new-renaissance-physics-math-econ-advancement-ed71a02a?mod=Searchresults_pos1&page=1
The idea of artificial general intelligence captivated thinkers for decades before it came anywhere near being realized. The concept still conjures popular visions out of science fiction, from C-3PO to Skynet.
Even as the interest has grown, AGI has defied a concise, universally accepted definition. In 1950, Alan Turing proposed the Turing Test to assess machine intelligence. Rather than trying to determine whether machines truly think (a question he deemed intractable), Turing focused on behavior: Could a machine’s actions be indistinguishable from those of a human?
Remarkably, some of today’s AI models pass the Turing Test, in the sense that they produce complex responses that imitate human intelligence. But as the technology has advanced, so has the bar for achieving AGI. Some believe that AGI will be realized when AI moves beyond narrow, focused tasks, growing to possess a generalized ability to understand, learn and perform any intellectual task a human can do. Others define AGI more ambitiously, as intelligence that matches or exceeds the top human minds across domains. Demis Hassabis, CEO of DeepMind Technologies, calls AGI-level reasoning the ability to invent relativity with only the knowledge that Einstein had at the time.
These differing definitions create a moving target for AGI, making it both elusive and tantalizing. To sort through all this, it’s helpful to say what AGI isn’t. It isn’t an infallible intelligence; like other intelligent systems, mistakes can be useful for its learning process. Neither is AGI a singular source of truth—our knowledge of the world is probabilistic and complex, notably at subatomic and intergalactic scales, but also in everyday life. Multiple AGI systems could emerge, each with a distinct capability and way of understanding the world.
Even without a consensus about a precise definition, the contours of an AGI future are beginning to take shape. AI systems capable of performing at the intellectual level of the world’s top scientists are arriving soon—likely by the end of the decade.
A key marker of the shift to AGI will be AI’s ability to produce knowledge based on its own findings, not merely retrieval and recombination of human-generated information. AGI will then move beyond the current limits of knowledge. Glimpses of this capability have already been observed. Since 2020, DeepMind’s AlphaFold can predict protein structures even when no similar structures are previously known. DeepMind also created FunSearch, which in 2023 unveiled new solutions to the cap-set problem, a notoriously difficult mathematics puzzle, by incorporating the power of a large language model with an evaluator, iterating between these components to refine results.
The latest reasoning models from OpenAI and DeepSeek build on this iterative training and are unleashing incredible progress. OpenAI’s o3 model achieved a score of 96.7% on the 2024 American Invitational Mathematics Exam. On the ARC-AGI test (designed to compare models’ reasoning against that of humans) it scored nearly 88%. This is no incremental advancement but a real leap toward AGI.
The performance of these reasoning models stems from the marked evolution in training methodologies. Foundation models such as GPT-4 were trained through deep learning, which relies on the transformer algorithm and large-scale neural networks to identify patterns and connections from massive data sets. This is generally implemented through next-word prediction: You give the model a sentence, remove a word, and train the model to put that word back in.
The reasoning models take a different approach by overlaying reinforcement learning with traditionally trained models. Instead of learning from static data sets, reinforcement learning involves actively training models through goal-directed rewards through trial and error. The model attempts a solution, and if it hits a roadblock, it adjusts strategy until it finds a better approach. The latest systems incorporate search- and retrieval-based methods and test-time training, in which they test their work with new approaches to reach better results.
The magic would really kick off—and it does sound like magic—if the systems reach a point at which they become scale-free, meaning that they could train themselves on self-generated data through a process known as recursive self-learning, relying only on electricity to advance. One of the earliest examples of this is AlphaGo Zero, a computer program that taught itself how to play the board game Go. The rules are clear and discrete, enabling systems to optimize for the probability of winning. Areas of knowledge that most resemble a game of skill—with defined rules and feedback—will be the areas where superintelligence first emerges.
There are two domains particularly ripe for this kind of scale-free advancement: mathematics and programming. Unlike biology and other fields that require real-world experimentation, these disciplines are largely self-contained. A mathematical proof can be checked and verified within the system itself. Similarly, AI could identify the code it needs to complete a defined objective, develop that code and improve on it—all without human intervention. These systems would engage in self-directed research, iterating through possible solutions. Not only would they feed answers back into themselves to refine their approaches, but they could also draw on the collective knowledge of the internet and of other models.
Superintelligence in mathematics may already be within reach. In February, DeepMind’s AlphaGeometry 2 officially surpassed top human competitors, solving Olympiad geometry problems at a gold-medalist level. Such superintelligent mathematical tools could be combined with frontier models that are proficient in natural language, bridging the gap between formal and semantic reasoning. This integration could lay the foundation for further advances in reasoning and unlock new discoveries in other fields like physics and economics.
Superintelligent systems will face inherent constraints, too. Just as human cognition is bounded by physical and biological limits, AI will remain subject to the limits of the physical world. Many scientific experiments, especially those in biology, must be rooted in the material world.
We may also see that this method of brute-force computation—where systems cycle through endless scenarios until a new discovery emerges—isn’t the only, or even the optimal, path to AGI. An alternative approach would use techniques derived from humans, such as reasoning by analogy and synthesizing insights across domains. Einstein didn’t uncover general relativity through exhaustive mathematical iterations, but rather through conceptual leaps that connected seemingly disparate phenomena. If this way of thinking could be instilled in AI systems, the scope of knowledge they might be able to access would extend far beyond our current comprehension.
The advent of AGI could herald a new renaissance in human knowledge and capability. From accelerating drug discovery to running whole companies, from personalizing education to creating new materials for space exploration, AGI could help solve some of humanity’s most pressing challenges. Perhaps most important, it could augment human intelligence in ways that would help us better understand ourselves and our place in the universe.
