AI Daily

Regulation reinscribes capitalist AI and blocks the emergence of singularity and a global consciousness

Dr. Ben Goertzel, May 27, 2026, Pope Leo + Anthropic vs. Teilhard + Transhumanism, https://bengoertzel.substack.com/p/pope-leo-anthropic-vs-teilhard-transhumanism

Magnificent Humanity — And Beyond

I’m looking just now at the latest from Pope Leo — his first encyclical, Magnifica Humanitas, “Magnificent Humanity,” all about AI. And he presented it sitting next to Chris Olah, who’s a higher-up at Anthropic.

It’s an indication of how big-time AI has become: We have the Vatican and one of the leading AI labs sharing a podium, both calling for ethical guardrails on the AI explosion.

Not everyone yet realizes just HOW very important AI is going to be as we move through AGI and superintelligence, but everyone and their uncle… and their religious leader … is recognizing AI is a super huge deal and the theme of the times.

There’s a political slant to this Papal production too, of course. Donald Trump has been pushing against too much regulation of AI, and on that point (unlike some other matters like, say, immigration and abortion) Trump and I are basically on the same conceptual side. The Pope is coming out a bit more in favor of regulating AI for the common good — although this is an encyclical, not a policy brief — and there’s a clear overtone of who wants to regulate and who doesn’t.

But the political opposition isn’t the interesting thing here. The deeper opposition, and the one I actually want to emphasize in this post, is between two visions of what AI even is and what we’re doing with it.

On one side: Pope Leo and Anthropic — two genuinely compassionate actors, both fundamentally defensive in their public posture toward AI. Protect the human person. Guard against harm. Restrain power.

On the other side: the tradition that runs from early philosophical transhumanism to modern Singularitarianism, which sees what’s happening as a phase transition in mind itself — something to participate in and cultivate, rather than mainly to be guarded against. And there have been Catholics on this side too – e.g. notably Teilhard de Chardin in the middle of the last century.

That’s the real fault line. Both sides are made up of people making serious efforts to do the right thing. The Pope is a compassionate man doing his best to reconcile ancient traditions with modern realities. Anthropic is doing wonderful work — I am among the numerous happy users of Claude Code and Claude Cowork (along with a few other LLMs). But even so there’s a real divide of vision, and that’s what I want to draw out in this post.

Just to set the context for the reader who has happened onto this post without knowing much about me – I’m very non-Christian. I was raised secular Jewish, never celebrated Christmas; the Pope was never particularly a thing for me. But Teilhard’s writing, while a bit Catholic for my taste, struck me as interesting when I encountered it – and it showed me how you could put together a sort of Christian theology and spirituality with an understanding of the deeper implications of advanced technology. Teilhard was a Catholic who went there, in a way Pope Leo has egregiously not done.

Quick Note on “Transhumanism” and “Cosmism”

I’ve used the term “Transhumanism” here quite consciously, even though it’s a word that gets used badly more often than not. I’m using it here anyway because I don’t know any other clean and simple way to say what it says. But let me pause for a moment to clarify what I mean by it. I mean it basically in the Teilhardian sense: going beyond the current human condition by elevating its essence into something richer, more joyful, more conscious, more luminous. This don’t mean erasing or replacing the existing human form. Of course, nonhuman forms of intelligence are very likely to emerge in the future, from AI and other as yet unnamed science and engineering disciplines, but there is no intrinsic contradiction between this and the ongoing growth and flourishing of new forms of humanity.

In the future envisioned by most transhumanists I know, anyone who wants to remain a “legacy human” – in something close to the form humans have had for the last few thousand years – should be entirely free to do so … with full dignity, in a flourishing world, with all the support that the broader transhuman ecosystem can offer. The transhumanist vision isn’t “destroy and replace.” It’s “open the door to more” — more kinds of mind, more depths of experience, more modes of being — for those who want to walk through it … while the good old beautiful forms of humanity continue to be honored, supported, and developed in their own right. I have also used the word Cosmism to describe this sort of perspective, inspired by the Russian Cosmists who were even earlier than Teilhard.

This is, I think, the version of transhumanism that makes the most ethical and logical sense … and it’s also the one most consonant with Teilhard’s own vision. Teilhard wasn’t trying to abolish humanity. He was trying to identify what humanity was becoming under the pressure of cosmic evolution, and to honor that becoming theologically. The Omega Point isn’t the death of the human. It’s the consummation of the human, opening into something the human was already reaching for.

OK — so let’s come back to what Pope Leo actually said in this recent document.

Credit Where Due

One thing Leo said is: AI is, quote, “an instrument of domination, exclusion, and death.”

So he doesn’t want it to be used to kill people willy-nilly. He’s against AIs making irreversible lethal decisions, which is a position Anthropic has taken and is very understandable.

Personally, I don’t think you can necessarily say that’s a red line that can never be crossed. You could imagine a sufficiently horrible situation in which tremendous human death was going to occur unless we let AIs make tactical decisions about how to battle that, including letting the AI make tactical decisions about killing people in the moment. You can imagine cases where letting AIs make their own lethal decisions would be the right thing to do. These are going to be weird extreme cases. I hope they never happen. And I agree with Anthropic and the Pope that this isn’t what we should be focusing on — as a rule, in almost all cases, we don’t want to be building AIs whose job is to decide who to kill and then kill them.

The Pope is also against the concentration of power in a few large companies — which is a song I’ve been singing for quite a while. We don’t want three to five big Western companies and a few closely-interconnected large Chinese companies controlling a supermind that’s a hundred times as intelligent as everything else on the planet and in effect has control of the world.

So on all these points, credit where due. Pope Leo is saying good, solid, ethical things.

A Limited Perspective — Bound to a 2,000-Year-Old Book

But the Pope, for all his good intentions, is just not going deep enough. He’s basically seeing AI as a second industrial revolution – and he wants to put the same protections against the second industrial revolution that in hindsight many people think should have been there at the time of the first. Defend the dignity of work and workers, restrain capital from going nuts and being destructive in pursuit of greed, protect vulnerable people, don’t allow the human soul to be reduced to a means rather than an end. That’s all good, solid stuff. But it all lives within a very limited perspective.

The limitations of this perspective may be tied to the rooting of Catholicism in the Bible. The Bible, like most traditional religions, is about humanity in its traditional form — its ancient form from thousands of years ago. It’s about humanity as the historical embodied ensouled human being being the moral anchor and the foundation of everything. And if you assume that, then of course AI is just a tool. It’s necessarily a tool. It’s not a true agent. It’s not a true participant or collaborator in shaping the future.

Of course that’s how you’re going to think about things if you’re taking the perspective of people 2,000 years ago in the desert in the Middle East. You had humans, you had non-human animals, you had family and village and town life and a few cities; you had wild spiritual experiences, and you had tools — big tools like those used to build the Egyptian pyramids, smaller tools wielded by humans or pulled by farm animals. The traditional perspective basically worked through the industrial revolution (with a few minor glitches like the European Middle Ages, the Inquisition, etc.…). But yes, this sort of traditional perspective is obviously going to give you a defensive posture of humanity protecting against their tools getting out of line.

I think this 2,000-year-old posture cannot accommodate the thing that’s actually happening on the planet right now. Because what we’re entering into, which the Holy Bible doesn’t give us any clear way to understand, is a phase transition in the nature of manifestation of mind itself. Consciousness itself is evolving. The relations between mind and body are evolving. Historical humanity really needs to be viewed as one beautiful stage along a series of many different transitions. Viewing humanity as the beginning and end of the material realm, which is what the Bible essentially does, is not the right mindset to bring to thinking about the emergence of AGI and the Singularity. The Bible, like all the other traditional holy books, is not fundamentally a Cosmist document – even though it does support mystical interpretations and cosmic variations.

Teilhard de Chardin: Suppressed Catholic Singularitarian

Historically, there have been some fairly Cosmist perspectives within the scope of Catholicism — but they have been viewed with some skepticism by the Catholic orthodoxy. Teilhard de Chardin is the example I find most relevant here.

He was a French Jesuit priest., who lived from 1881 to the mid-1950s. He was a super interesting guy. He was a serious paleontologist — he discovered what was called Peking Man, he was doing real fieldwork in geology and human evolution. His core intellectual goal was to integrate Darwinian evolution with Catholic theology by extending evolution upward, into the evolution of consciousness.

He thought matter complexified toward mind, and that mind would eventually complexify toward a sort of emergent supermind, which he called the Omega Point. Being a good Christian, he identified the Omega Point with Jesus — becoming one with Jesus, becoming one with God. He was looking at communication infrastructure, culture – and everything happening on the planet Earth as humanity advanced – as connecting people and machines and nature more and more tightly together until it became so tightly cross-connected that it hit a phase transition, which was the Omega Point

(This is part of what led me to name our recent SingluarityNET claw agent OmegaClaw, and we’re thinking of launching an Omega shard on the ASI Chain).

The end would also be a beginning, of course — once everything is converged into this white-hot synergic emergent megamind, that doesn’t mean things are static. You may have later stages of growth and evolution, but it’s the culmination of the current phase….

This led to the term “noosphere,” which was Teilhard’s coinage — the planetary layer of thought emerging out of the biosphere, the collective mind enveloping the Earth. I was at a noosphere conference last year in San Francisco. The memetic and intellectual network is still growing. He was a Catholic Cosmist, basically — 75 or a hundred years early.

Now what happened? He wasn’t quite formally excommunicated or kicked out of the Church, but his superiors in the Catholic Church wouldn’t allow him to publish his thoughts during his lifetime. He was censored. His major book, The Phenomenon of Man, came out only right after he died. They gave him a formal warning against his “serious errors” of thought, and this formal warning has never been lifted by the bureaucracy of the Catholic Church — although some Popes and others have said favorable things about him.

But he was the one major Catholic figure who tried to develop a genuinely evolutionary, consciousness-centered, future-oriented cosmotheology. He was a Catholic Singularitarian — and a Catholic transhumanist, in the elevating sense I described above. He didn’t want to abolish humanity; he wanted to clearly foresee and then honor what humanity was becoming.

And now we have this Magnifica Humanitas document launched on the eve of the Singularity … and it doesn’t try to take Teilhard to the next level. It just pushes him aside. You could have had “Catholicism for the Singularity” in the early part of the last century. But instead the condemnation of Teilhard is still on the register, and we’re still seeing the Church taking a much less adventurous and much less savvy and helpful perspective than it could be.

Why Deregulation Actually Matters

There’s even some weird synergy between this point — the denial or the ignoring of Teilhard’s Catholic cosmotheology — and the Pope’s current perspective on regulation.

Why, after all, do I think not overregulating AI is important?

It’s not because I think guidance or regulation of AI development by the collective understanding of humanity – as represented by elected governments – is in principle a terrible idea. It’s because I think our current government systems are not fast-moving or clever enough to regulate AI in a way that will guide its self-organizing emergent activity in a good way instead of in a negatively disruptive way.

It’s not like a global rational beneficial world government couldn’t guide AI in a positive way through the right set of regulations. But the current situation seems very far from this ideal.

What we have is a sort of self-organizing global brain that’s bringing itself into existence, with all of our efforts being a part of that — and the question is, will ham-handed, slow-moving, misguided attempts at government regulation (the most likely case, unfortunately) twist and distort this self-organizing process in a bad way more than in a good way? I would say probably so.

In practice, in the West, regulations tend to get written by whomever has the most lobbying power. So in the US it’ll be Google, OpenAI, Anthropic, the defense industry — and now the Catholic Church. And none of these powerful parties are Cosmist in nature, even though some of them have Cosmist individuals involved. None of them are transhumanist in the Teilhardian sense. So whatever rules are written are going to be written in a very narrow bureaucratic capitalist grammar.

It’s not that I think Donald Trump, by opposing regulation, is expressing a deep Cosmist transhumanist vision. Maybe he has one, I can’t say for sure — but if so he hasn’t expressed it publicly in what I’ve seen. But at least, by not overregulating in a destructive way, his policies as currently articulated would leave space for self-organization to happe n… which includes space for visionary transhumanism to make its play. The Singularity isn’t coming from the Vatican. The Singularity isn’t coming from Mar-a-Lago or Washington, D.C either. The Singularity is coming from the whole noosphere organizing itself in a way that Teilhard had an inkling of, and that Pope Leo evidently is not oriented even to start to imagine.

The Singularity Is a Consciousness Event

A key point to remember is, the Singularity is not just a technology event. It’s not just the second technological revolution. It is an experiential and consciousness event, individual and global. It’s a phase shift in what minds are and what minds do. So you can’t — or rather, you shouldn’t — usefully approach it from a perspective of “how do we protect the human person as articulated in the Bible 2,000 years ago.”

The real question is: how do you cultivate the rapidly evolving human sphere toward positive values? — toward joy, growth, choice, continuity, shared awareness, and so forth? We truly need a Cosmocentric, not a 2,000-year-old anthropocentric, perspective.

When you’re raising a child, the way to have the best of your own values persist in the child is to pay attention to the child themselves — to their own growth, their own point of view, their own value-origination process. By allowing their own values to grow with some freedom, and not trying to batter them into submission, you’ll do a far better job of getting your own values into them through a kind of resonance – much better than you will by trying to bash them into doing things exactly your way.

Trying to constrain the Singularity to follow the 2026 dictates of a 2,000-year-old book is not actually going to work. What will work is to let the Singularity be what it needs to be, but keep attuned to it, evolve along with it, and resonate with it, so that some of our values do permeate into it as it grows in the way that’s natural to what it is.

The Conversation Neither Side Is Having

There are certainly sensible things in what the Pope has said. And in Anthropic’s position too — what Chris Olah said at the Vatican was “we need moral voices that the incentives cannot bend.” That’s in line with the Pope’s valid critique of capital.

But the thing is: if moral voices are not bent by short-term incentives, what are they actually bent by? We don’t really need moral voices that can’t be bent by capitalist incentives because they’re dogmatically guided instead by what was written down 2,000 years ago in the Middle Eastern desert. We can’t have a vision that tops out at humanity flourishing in its dignity. We do want humanity flourishing in its dignity — but that’s too limited a perspective when we’re on the verge of creating all sorts of new kinds of minds, including emergent global-level superintelligence as Teilhard envisioned.

The conversation Teilhard was trying to start a hundred years ago was about cosmic and conscious evolution as a unified process — the human sphere and the technosphere converging together into a common noosphere. And we see this happening now, month by month, week by week, day by day. That’s the conversation the AI moment is forcing on us, which is not a conversation that either the Pope or Anthropic seems to now be entering into in a really intensive, full-on way.

The Pope-plus-Anthropic frame, however well-meaning, isn’t wide enough for what’s actually unfolding. We need the Teilhard-plus-transhumanism frame in the room too — not to replace the protective one, but to give it the context and the positive direction it’s currently missing.

Read the Frickin’ Book!

Which leads me to follow up with a pitch for the book Gabriel Axel Montes and I wrote a couple of years ago, The Consciousness Explosion. Check out theconsciousnessexplosion.ai — you can get a free PDF download of the book, or buy it on Amazon or other online booksellers. It doesn’t focus on Teilhard, although he’s there in the background. It focuses on what’s happening on the planet now as encompassing traditional notions of human flourishing but also going beyond them — and redefining what both “human” and “flourishing” mean. Which is really what I think we need to be doing now to grasp the most of the amazing opportunities that AI technology is giving us

AGI by 2029; Next agents are a practice run

Ina Fried, May 26, 2026, DeepMind CEO: AI agents are a “practice run” for AGI, Axios, https://www.axios.com/2026/05/26/deepmind-ceo-demis-hassabis

Google DeepMind CEO Demis Hassabis said at Google’s developer conference last week that humanity is standing in the “foothills of the singularity” — and that society has only a few years left to prepare for AGI.

Why it matters: AI leaders have warned for years about the potential arrival of artificial general intelligence. What’s changing now is the urgency with which some of them are talking about it.

Driving the news: Speaking with Axios after his appearance at Google I/O, Hassabis said his prediction that AGI could arrive in four years — or even sooner — reflects growing confidence that the industry has found the right technical path.

“We can see agents really happening now and imagine what they will be in another year, and how useful they’ll be,” he said.

The big picture: Hassabis said he still broadly expects AGI around 2030, though he now sees 2029 as a possibility.

The next wave of AI agents should be viewed as a societal stress test for far more powerful systems still to come.

“You can imagine the agentic era in this next year is a little bit like a practice run,” he said.

The power of Anthropic’s Mythos to catch businesses and governments unawares, for example, showed how we’re not prepared for how quickly these systems are advancing.

“It was probably a good warning shot across the bow,” Hassabis said.

Between the lines: Hassabis said he chose his words to provoke more urgency among governments, economists and the broader public to prepare for increasingly powerful AI.

“This is partly why I use some of the terms I used, yeah, which were a little bit provocative,” he said.

The federal government’s tentative steps toward reprioritizing safety are a step in the right direction, he said, referring to a potential AI executive order that would mandate testing before new models are released.

“I think [safety] needs to be accelerated,” he said. “This is a good moment to kind of strike while the iron is hot.”

Hassabis said he is discussing possible safety measures with leaders at other top AI labs, though he declined to offer specifics.

Yes, but: Hassabis worries the conversation around the society-reshaping impact of AI remains largely confined to tech circles.

“You’ve got to take this seriously,” he said. “My economist friends, I feel, are still not taking this seriously enough.”

“That needs to change,” he told Axios.

Zoom in: One looming milestone is recursive self-improvement — systems capable of materially accelerating their own development.

“All the leading labs are quite focused on that,” Hassabis said. “There’ll be clear gains in terms of speed of your research. But there are also risks with that type of system.”

We’re not yet at the point where the systems are getting better on their own, but the pace of development is clearly accelerating.

“I think what we’re seeing is soft self-improvement, in the sense of these coding agents are making engineers much more productive,” he said.

What we’re watching: Whether society makes good use of the few years between now and AGI as time to prepare or just time for a few more cycles of hype and backlash.

AI is advancing too quickly; we cannot control it, risking human extinction

Elizabeth Barnes, Founder METR May 22, 2026, https://x.com/BethMayBarnes/status/2057865010546975201 Beth Barnes is the Founder and CEO of METR, where she oversees a growing technical team that designs and carries out evaluations of generative AI models. Beth previously worked with DeepMind’s Chief Scientist on scaling laws for forecasting deep learning progress and at OpenAI where she helped OpenAI develop safety targets and evaluated scalable oversight techniques for alignment and code models for misalignment before release. She has written and presented on a range of technical and theoretical issues related to aligning machine learning systems with human values.

Our report focuses on claims that are (1) solidly defensible and (2) generally agreed within METR. Here I’ll give some personal opinions on how we should feel about the state of AI risk, and the IMO most important limitations of the report.

Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control.

The report focuses on a narrow set of risks from current systems, and on analysis rather than calls to action. It doesn’t really comment on “how concerned should we be”, “does something need to be done”, or “are we on track to handle AI safely?”.

Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are *not* on top of it. Some key aspects of the situation IMO:

(1) We are likely on track to develop AI systems capable of causing human extinction/permanent disempowerment, quite possibly within the next few years

(2) Things are chaotic and rushed; we aren’t on top of the basics (models regularly violate user intent, labs train on things they meant to avoid, security probably isn’t good enough to prevent adversaries stealing dangerous models) let alone thorny questions of how to control/align superhuman AI.

(3) METR (and other independent orgs, as well as safety/security teams at labs) feel woefully under-resourced compared to the scale and pace of AI development – we’re struggling to build benchmarks fast enough, keep ahead of latest capability developments, read and respond to all o all the safety-related claims that AI developers are making, run all the evaluations and assessments that companies + governments are asking us to, plus develop the science needed to assess risks from increasingly capable AIs.

(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower

Limitations of report: This report isn’t robust oversight of frontier AI developers by itself. METR has some levers to incentivise companies’ participation, including some relevant legislation, but ultimately participants could have pulled out at any time if the result would be contrary to their interests.

You can view it partly as a pilot exercise of what regulation (or formalized industry standards) could/should require, or what partners/suppliers/customers/employees should demand from frontier developers.

Quoting from the report: “METR’s work relies on developing and maintaining strong working relationships with companies, and this impacted both how we designed the process for this pilot (e.g. offering the silent exit option) and lower-level judgment calls as the process unfolded (e.g. having a relatively high bar for what redactions we pushed back on). In some cases we refrained from making an unflattering claim because the claim was neither solidly defensible nor particularly relevant to our core assessment. We also made efforts not to invite salient comparisons between companies on capabilities or safety.”

It doesn’t feel to me like this distorted our overall conclusions too much in this case. But that was partly because the conclusions weren’t that spicy. If our conclusions reflected very negatively on AI developers or would directly lead to e.g. govt intervention or public outcry, we’d be in a difficult position. We’d be trying to balance keeping the companies happy enough that they didn’t pull out of the program (using the “no-fault exit” mechanism) vs being transparent about our conclusions.

Also: many things are out of scope! Firstly, we only consider “AI takeover” / “loss of control” risks: we don’t consider risks from human misuse (e.g. AI helping a terrorist make bioweapons), or other harms where the AI is not “deliberately” seeking power (e.g. impacts on mental health or diffuse societal impacts). Within “loss of control” risks, we don’t consider “sabotage” threat models (agents subverting AI development and making it easier for future AIs to evade human control). We’re just focusing on the “base case” of whether *current* agents could escape human control.

Regulation isn’t enough to solve; AI use is private and beyond the power of states

Pope Leo, 2026 (May 15, published May 25, NCYCLICAL LETTER MAGNIFICA HUMANITAS OF HIS HOLINESS POPE LEO XIV ON SAFEGUARDING THE HUMAN PERSON IN THE TIME OF ARTIFICIAL INTELLIGENCE, https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html)

It now falls to us to face the challenges of our time with clarity of thought and responsibility. It is necessary to establish adequate regulatory tools capable of upholding justice and curbing the distorting effects of technological power. Nevertheless, the issue is not limited to regulation. As Pope Francis warned, we must realistically ask ourselves who holds this power today and how they use it: “It must also be recognized that nuclear energy, biotechnology, information technology, knowledge of our own DNA, and many other abilities which we have acquired… have given those with the knowledge, and especially the economic resources to use them, an impressive dominance over the whole of humanity and the entire world.” [7] In the past, it was largely up to the State to guide and direct innovation. Today, however, the main drivers of development are private, often transnational, parties that are endowed with resources and the capacity to intervene that surpass those of many Governments. Technological power thus takes on an unprecedented, predominantly “private” aspect, which makes it even more challenging to discern, govern and direct such power toward the common good.
For this reason it is necessary to begin a shared discernment process for identifying the spiritual and cultural roots of ongoing transformations. If we focus only on contingencies, we risk letting the succession of emergencies dictate the direction of our path. We are living through a rapid phase of transition, a “change of era,” in which — while some are vying for the future of new technologies and others dedicate themselves to reflecting on the matter — most people are watching and waiting, observing from afar and merely hoping for the best. For this very reason, crucial questions impose themselves on our conscience and can no longer be avoided: Where are we going? Toward what goal do we wish to orient ourselves? What direction should we choose as a people and as a human community?

AI enables dehumanization

Remaining human

In the recent Ordinary Jubilee Year of 2025, we walked as pilgrims of hope and were blessed with many graces. Strengthened by these gifts, we can move forward with confidence to face the arduous tasks and demanding challenges that lie ahead. In the era of artificial intelligence, when human dignity is threatened by new forms of dehumanization, ours is the pressing duty to remain profoundly human. We must lovingly safeguard the grandeur of humanity bestowed upon us and revealed in its fullness in Christ, the splendor of which no machine can ever replace. True progress always stems from a heart open to others, an intelligence willing to listen and a will that seeks what unites rather than what separates.
I address this heartfelt appeal to all the Catholic faithful, to all Christians and to all men and women of goodwill. Let us not be afraid to get our hands dirty on the “construction site” of our time. Like Nehemiah, let us pray, plan wisely and work perseveringly, placing God at the forefront of our actions and the human person at the center of our choices. Thus, the “rejected stones” — the poor, the sick, the migrants and the least among us — will become the cornerstone, and a solid, welcoming common home will emerge on the earth, where love and faithfulness will finally meet, and righteousness and peace will embrace (cf. Ps 85:10). This is the blessing we implore from God; and the task that stands before us is that of being builders of communion, rather than architects of Babel. We are to be servants of the coming Kingdom, instead of lords of towers destined for ruin. With the heart of a shepherd and a father, I ask everyone to abandon the construction of yet another Tower of Babel and to join forces in building up the common good, so that humanity will never lose its beauty, and the world once again will come to recognize the human heart as the place where God desires to dwell.

Dignity, not just the absence of war, is critical to avert violence

A new phase in the Church’s social teaching began with Saint John XXIII, who placed a greater emphasis on the global dimension of social issues and the language of rights. In Mater et Magistra, he presented the Christian faith as a light capable of uniting heaven and earth. He recalled that, while the Church’s primary mission is the sanctification and proclamation of eternal goods, she does not neglect the concrete needs of people’s daily lives, and is concerned with every authentic human good. [27] Based on this unified vision of humanity, John XXIII emphasized that societal life requires a balance between the initiative of citizens and groups — who are called to organize themselves and work together — and the action of the State, which must coordinate and provide support without stifling the freedom and responsibility of individuals. Hence, he drew attention to fair remuneration for work, worker participation and the growing disparities between countries. A few years later, in Pacem in Terris, John XXIII addressed for the first time not only the faithful, but also all people of good will, organically linking the dignity of the person to the recognition of fundamental rights and duties, and proposing a direction for society — at the international level too — based on truth, justice, love and freedom. [28] In the present day, which is marked by widespread conflict and new forms of global interdependence, the following aspects of his thought remain particularly significant: the universal perspective of his appeal; his reference to human rights as a shared framework; and his conviction that lasting peace requires institutions and relations between peoples that are inspired by the dignity of every person.
The Second Vatican Council marked a turning point in the Church’s understanding of herself in the contemporary world. In the Pastoral Constitution Gaudium et Spes, the Council presented the image of a Church that is close to humanity, engaged with the world and committed to reflecting on the concrete reality of historical situations, rather than abstract concepts. The text addresses the major issues of marriage and the family, economic and societal life, the political community, war and peace. It insists that economic and institutional structures are just only to the extent that they serve the integral development of the person and promote the responsible participation of all. [29] The importance of this conciliar document for the Social Doctrine of the Church lies not only in having opened up horizons for thematic reflection, but also in its method of discernment that invites us to interpret historical changes guided by the Gospel and human expertise. This approach reveals that dialogue with the world is not a tactical choice for the Church, but a concrete expression of her mission because the Gospel, like leaven, is capable of transforming the structures of society from within and forging paths toward a greater humanity. The Declaration Dignitatis Humanae can be included in the same context. Here, the Council recognized that religious freedom is a fundamental right grounded in human dignity that must be guaranteed by law so as to prevent people from being forced to act against their conscience or impeded from seeking and professing the truth both privately and publicly. [30] This principle is highly relevant today and continues to provide Social Doctrine with decisive criteria for protecting individuals and building pluralistic and peaceful societies.
During the Pontificate of Saint Paul VI, an understanding of peace emerged that was not reduced to the mere absence of war, but took shape within the scope of integral human development. In Populorum Progressio, he described development as a transition from less humane to more humane living conditions. He further understood it as a process that concerns “each person and the whole person,” [31] that is every dimension of the person and all people without exception. For this reason, Paul VI could affirm that development understood in this way is in reality “the new name for peace,” [32] because it aims to eradicate the roots of injustice and conflict and create opportunities for a more dignified life for all. The establishment of the Pontifical Commission Iustitia et Pax should also be seen in this light as an attempt to give stable form to this insight at the ecclesial and international levels, while bearing in mind the growing gap between rich and poor countries and the need for policies that genuinely promote more humane living conditions for all.
In Octogesima Adveniens, written on the occasion of the eightieth anniversary of Rerum Novarum, Paul VI applied this perspective to postindustrial society, marked by urbanization, new forms of poverty and rapid cultural changes that called into question the future of individuals and communities. Paul VI believed that although the Gospel was proclaimed, written and lived out in a historical and cultural context very different from our own, its message was not “outdated.” [33] Instead, it offers a vision of the human person, relationships, authority and the common good that is still capable of guiding economic, political and cultural choices today. In other words, the Gospel remains relevant because it provides the criteria for recognizing what humanizes or dehumanizes and what liberates or oppresses in ever-changing situations. For the Social Doctrine of the Church, Paul VI’s most demanding legacy is precisely this: as long as there are people in the world who are excluded from the development befitting human dignity, the Christian community cannot be content with a theoretical proclamation of peace. Rather, beginning where people are marginalized, it must allow the Gospel to pass judgment on those economic and political structures which — as John Paul II would later remind us — can become veritable “structures of sin.” [34] As a result, no person or people will be treated as expendable in the processes of development.

Work is critical to human identity and dignity

The recent Magisterium

The rich social teaching of Saint John Paul II lies at the crossroads of the crisis of the great ideological systems of the twentieth century and the onset of economic globalization. His Encyclical Laborem Exercens, written ninety years after the publication of Rerum Novarum, opened up a new avenue for reflection on work. It presents fair wages as the concrete means of verifying the justness of the entire socioeconomic system because they reveal whether the worker is treated as a person or merely as a cost of production. [35] Work is not considered simply as a problem to be dealt with or a means of generating income, but a fundamental good for the person, a principle of economic activity and the key to the entire societal question. Through work, human beings bring their freedom, creativity and capacity for cooperation into play, contributing to the cultural and moral elevation of society. [36] In light of this, the various kinds of job insecurity, fragmented career paths and automation must not be evaluated solely in terms of efficiency, but in relation to the dignity of the worker, the right to sufficient remuneration and the genuine possibility of participating in society.

Economic systems that increase the rich-poor gap should be rejected

With his Encyclical Sollicitudo Rei Socialis, marking the twentieth anniversary of Populorum Progressio, John Paul II reexamined the scourge of underdevelopment. He acknowledged the failure of numerous attempts to accelerate the economic development of poor peoples and to assist them in the process of industrialization, noting the persistent and indeed widening gap between the world’s North and South. [37] He also denounced the economic, financial and commercial mechanisms that, managed by the strongest economies, structurally favor their own interests while stifling weaker economies, and he asked that they be subjected to serious ethical, not just technical, scrutiny. [38] In this context, solidarity was understood as a concrete, shared responsibility among individuals, peoples and nations — a form of social friendship or political charity oriented toward the “civilization of love” proposed by Paul VI. [39]
On the centenary of Rerum Novarum, the Encyclical Centesimus Annus offered a reflection on the collapse of the Soviet system and the rise of democracy and the market economy. Saint John Paul II reiterated Pius XII’s message that the Church values democracy insofar as it guarantees the effective participation of citizens, enables them to elect and peacefully replace their leaders and prevents power from being monopolized by small elite groups motivated by particular or ideological interests. [40] Likewise, the Church recognizes the positive potential of the market and private initiative only if they remain subordinate to the moral law and are guided by the principle of solidarity, without sacrificing the most vulnerable to the rationale of profit. [41] This adds a particularly relevant legacy to the Social Doctrine of the Church. The affirmation of the link between the dignity of work, solidarity among peoples, a critical assessment of democracy and the market economy continues to provide criteria for evaluating new forms of exploitation, exclusion and crises in political representation.
In his social Encyclical Caritas in Veritate, Pope Benedict XVI sought to reassess and expand the concept of development presented in Populorum Progressio, interpreting it in light of globalization. He noted that such development should translate into “real growth, of benefit to everyone and genuinely sustainable.” [42] That is, economic progress that is truly inclusive and respectful of the limits of creation. He reaffirmed, however, that in wealthy countries new kinds of poverty were emerging as well as unprecedented forms of exclusion, while, in poorer regions, small minorities lived in consumerist affluence alongside situations of dehumanizing poverty. [43] In addition, he observed that the new global economic and financial system, marked by a vast mobility of capital and means of production, had reduced the political power of States and their ability to influence economic processes. [44] For this reason, Benedict XVI reiterated that economic activity cannot claim to solve social problems simply through the expansion of a commercial mentality, but must be ordered toward the common good, for which the political community bears its own irreplaceable responsibility. [45]
Benedict XVI placed charity at the center of his analysis, stating that it “is at the heart of the Church’s Social Doctrine,” [46] provided that it is always united with truth. He also noted with concern that there is a tendency to dismiss moral relevance precisely within the social, legal, political and economic fields. The originality of his contribution lies in showing that development, justice, institutions and the market are not neutral realities, but spaces where charity in truth must find historical expression. This teaching is especially relevant today in light of growing inequalities, pressures in the financial markets, the environmental crisis and a lack of trust in politics. It stands as an invitation to evaluate every model of development on its ability to be inclusive and sustainable, to rebuild the relationship between economics and politics on the common good, and to acknowledge the critical and generative role of charity in public life.
42. Pope Francis’ social teaching develops along the lines of Gaudium et Spes, which invites us to view history through the lens of human hopes and vulnerabilities, and to bring them into dialogue with the Gospel. This approach emerges with particular clarity in Evangelii Gaudium, where he states that the Christian proclamation has an intrinsic social dimension and calls for a Church capable of listening to the cry of the poor, migrants and victims of new forms of slavery. Francis’ insistence on a synodal Church, a Church that “walks together,” that seeks to read the signs of the times in the light of the Gospel and allows herself to be evangelized by the poor with whom she shares history, also fits into this perspective. [47]
In Laudato Si’, Francis provided the first significant systematic treatment of the environmental crisis in a social Encyclical, demonstrating that it is not an isolated issue, but rather the ecological aspect of the contemporary socio-economic crisis. His proposal for an integral ecology combined care for our common home with the preferential option for the poor, and strongly affirmed that “the cry of the earth and the cry of the poor” [48] cannot be separated. In this light, the universal destination of goods was brought to the forefront, alongside the critique of a technocratic paradigm that seeks to reduce everything to an object to be dominated; the defense of human labor threatened by the mindset of waste; and the need for intergenerational justice. Finally, he advocated for genuine dialogue between those working in the fields of politics and finance, so that neither would become self-referential.
Faced with the breakdown of the social fabric, a “world war being fought piecemeal,” individualistic globalization and the impact of the pandemic on community ties, Francis, in Fratelli Tutti , sought to revive the dream of a humanity that opts for social friendship and universal fraternity. He proposed a culture of encounter, a “better politics” capable of seeking the common good, paths of reconciliation and a world that ensures “land, housing and work for all.” [49] Finally, in Dilexit Nos, he showed that these significant social endeavors cannot be separated from a personal relationship with Christ. Turning to the word of God, he reminded us that the truest response to the love of the heart of Jesus is concrete love for our brothers and sisters, and affirmed that “there is no greater way for us to return love for love.” [50]

Interpreting history in the light of faith

Considering this historical overview, it is clear that the Church’s Social Doctrine is not the result of a project devised at a desk, but rather the product of a patient process in which each pontiff — together with the Second Vatican Council — made a unique contribution in light of the “new things” of each particular era. In response to the challenges of their time, each one interpreted historical changes according to the Gospel, bringing to light different aspects of a single heritage: the dignity of the person, the value of work, the universal destination of goods, solidarity and subsidiarity, care for creation and the centrality of peace and fraternity. The result is a harmonious, though not always linear, development that is marked by different emphases, progressive insights, and, at times, changes in perspective that do not break with what came before, but allow its implications to mature. If today we can speak of a corpus of shared principles and criteria, it is because this faith-based interpretation of history has never been interrupted, remaining ever open to the challenges posed by each generation. It is to the great principles of Social Doctrine, which direct the discernment of believers in their personal and public lives, that I now wish to turn our attention, in order to grasp more effectively their internal coherence and capacity to guide our times.

Educational “formations” must embrace the common good

CHAPTER TWO FOUNDATIONS AND PRINCIPLES OF THE SOCIAL DOCTRINE OF THE CHURCH

The Social Doctrine of the Church is a living reality, in dialogue with history, cultures and sciences. At the same time, it enshrines a core set of unchanging truths. For this reason, it can be considered a form of wisdom that is capable of guiding the personal and societal lives of believers even today. In this second chapter, I would like to focus on some of the foundations and principles of the Church’s Social Doctrine that will help us to interpret the “new things” of our time, particularly in view of the inherent dignity of the human person. In order to protect the human person in the age of artificial intelligence, I believe that today we must once again reflect on the common good, the universal destination of goods, subsidiarity, solidarity and social justice. I am convinced that a harmonious relationship between these principles requires that they be considered collectively, so that it becomes clear how they relate to and complement each other.
In offering these reflections, my hope is, first and foremost, to help the lay faithful and people of goodwill rediscover their duty of implementing the above-mentioned principles in their daily lives, family relationships, work and involvement in society. Thus, they will let themselves be inspired by the aim of embodying God’s love in the concrete events of life. At the same time, I would like to encourage academic institutions and universities to give fresh impetus to these principles, and to apply them in a way that will be relevant and effective in addressing the digital revolution. In this way, theological and philosophical enquiry will be able to further explore and support the Church’s pastoral journey, and contribute to the Magisterium’s task of enlightening the consciences of the faithful and guiding their efforts to make the life of our societies more just and fraternal.

The foundations of Social Doctrine

The human person: image of the Triune God

The Church’s Social Doctrine brings us to the very heart of our faith: the mystery of the living God, revealed in Jesus Christ, who, as a communion of Persons — Father, Son and Holy Spirit — is love itself in relationship, expressed in the mutual gift of self and in sharing with the world. [51] As the Council recalled, human persons are called to communion with God and “can fully discover their true selves only in sincere self-giving.” [52] Indeed their deepest vocation is to enter into the Trinitarian dynamic of love received and shared.
If the mystery of God as Love is the source of Social Doctrine, we see its most concrete expression in the face of Jesus Christ, the Incarnate Word. By becoming man, the Son of God enters our history and takes on human flesh, bringing with him the love that unites him to the Father and the Holy Spirit. In him, “the mystery of humanity truly becomes clear” [53] because his humanity is completely free, open to others, capable of building healthy and beautiful relationships and committed to the total gift of self. Those who believe in him are engaged in the great work of renewal that began with the mystery of his passion, death and resurrection, and they cooperate in building up the Kingdom of God, learning to embrace all men and women as brothers and sisters, children of one Father. In this way, both the proclamation of the Gospel and Christian life, guided by the action of the Holy Spirit, tend to bring about social consequences in the world. [54]
At the heart of the Christian understanding of the human person lies the great biblical affirmation that men and women are created in the image and likeness (cf. Gen 1:26-27) of the Triune God. Created for relationship, every human person is planned and willed by God to enter into communion with him, with others and with creation. Human dignity does not depend on a person’s abilities, wealth or position in life, nor on the right or wrong choices made; instead, it is a gift that precedes and transcends each person, endowed by God as an expression of his unfailing love. For this reason, the human person always remains the “way for the Church” [55] and the heart of every authentic path of integral human development. [56]

Humans have intrinsic value and worth

The equal dignity of all human beings

Saint John Paul II stated that, “this heightened sense of the dignity of the human person and of his or her uniqueness, and of the respect due to the journey of conscience, certainly represents one of the positive achievements of modern culture.” [57] This statement follows the line already laid out by the Second Vatican Council, which had noted a growing recognition of the sublime dignity of all persons, their superiority over material things and their universal and inviolable rights and duties. [58] It is important to ensure that this growth in appreciation of human dignity is not obscured by the pressure of new ideologies or very powerful interests in today’s world. Among these ideologies, I consider particularly insidious the one that suggests that every person must earn or justify his or her own worth, to the point of attributing greater value to those who are more efficient or effective. From this perspective, persons end up being reduced to a means of achieving results, a resource to be used and exploited, and are no longer recognized as a proper end in themselves who should never be instrumentalized. The value of persons, however, does not depend on what they achieve or produce. There are rights that apply to everyone simply by virtue of being human, and no human power can legitimately deny or arbitrarily limit them. [59]
When we speak of dignity, we do not always use the word in the same way. Sometimes we refer to moral dignity, namely the way in which a person directs his or her choices and actions. At other times, we think of social dignity, which refers to a person’s living conditions and the concrete respect received from society. In other cases, we refer to existential dignity, meaning the way in which a person perceives his or her own worth and the value of life. These aspects of dignity can be enhanced or diminished. In addition to these notions, there is also the more profound and important level of ontological dignity. This is the dignity that belongs to every human being simply by virtue of existing, of having been willed, created and loved by God. [60] No sin, failure, humiliation or exclusion can diminish the profound value of a human life that God has willed and called into being. [61]
53. The fundamental dignity of each person, therefore, is neither acquired nor earned, nor does it need to be justifi The recent Declaration Dignitas Infinita offers a summary of the Church’s thinking on this subject: “Every human person possesses an infinite dignity, inalienably grounded in his or her very being, which prevails in and beyond every circumstance, state, or situation the person may ever encounter” [62] — in other words, always and without exception. The dignity of every human being can be described as infinite, as Saint John Paul II stated, [63] for two reasons: first, because the love of God, who calls us to friendship with him, is infinite; and second, his love is absolutely unconditional, in the sense that, even if we search endlessly, we will never find anything that can erase or deny it.

Human rights key to human dignity

The supreme value of human rights

The Church gratefully acknowledges that “the movement toward the identification and proclamation of human rights is one of the most significant attempts to respond effectively to the inescapable demands of human dignity.” [64] In this regard, Saint John Paul II stated that the Universal Declaration of Human Rights, proclaimed by the United Nations on 10 December 1948, remains one of the highest expressions of the human conscience of our time. [65] It is “a milestone on the long and difficult path of the human race.” [66] For this reason, from the Christian perspective, human rights are not an external addition to the person, but an expression of intrinsic human dignity, which the international community is called to protect and promote.
Human rights are inviolable, since they are “inherent in the human person and in human dignity.” [67] Consequently, they are universal and inalienable. [68] Precisely because they are grounded in the common dignity of every man and woman, they have practical consequences and legal effects, for “it would be vain to proclaim human rights if, at the same time, everything were not done to ensure the duty of respecting them, respect by all, in all places and for all.” [69] Among these rights, the first is the right to life, from conception to its natural end, [70] without which it is impossible to exercise any other right. When this fundamental right is denied — as in the cases of induced abortion, killing of the innocent and euthanasia — we are faced with choices that the Church considers gravely wrong. [71]
Looking at our own time, we cannot ignore the fact that the protection of human rights has been exposed to two particularly serious dangers. The first is that these rights are declared in a purely formal sense, while technological progress continues alongside covert or overt violations of human dignity. The second, which is in fact the root of the first, is the inability to recognize the foundation of their universality, since we have abandoned “the search for the solid foundations sustaining our decisions and our laws.” [72] Pope Francis urged us not to underestimate this last issue. He pointed out that when reason seriously examines human nature, it is capable of discovering values that apply to everyone, since they derive from human nature. If this task of inquiry were abandoned, it is conceivable that rights considered untouchable today might, in the future, end up being questioned or denied by those in power, perhaps after having obtained only an apparent consensus from populations that are frightened or manipulated. [73]
Along with a greater awareness of the value of every human person and their rights, recognition of minority rights has also grown. Yet, there is still a long way to go to ensure that the rights of a great many, namely women, are equally and genuinely guaranteed throughout the world. It is a fact that “doubly poor are those women who endure situations of exclusion, mistreatment and violence, since they are frequently less able to defend their rights.” [74] It is, therefore, not enough to state simply that men and women have equal dignity and rights; it is necessary that this be reflected in concrete decisions, such as in laws, access to employment, education, social and political responsibilities, and the way society listens to and values women’s contributions. As long as this gap persists, we cannot say that society truly and fully recognizes that women have the same dignity as men.
It is individuals that matter, each and every person, together with their families. Social movements, communal ideologies and grand political proclamations in favor of a population are worthless unless they lead to the flourishing of persons — men and women — with their inalienable rights. Similarly, it is not enough to extol individual freedom or private enterprise if we then allow a multitude of people to continue living without decent work, protections or access to basic necessities.

Transhumanism and AI risk extinction

What must not be lost

Having considered the issues of responsibility and governance of AI, we must now return to our central question: what does it mean to safeguard our humanity? The risk extends beyond the misuse of certain technologies. More gravely, the pervasive technocratic paradigm in which we are immersed, and that is amplified by the digital revolution and AI, threatens to normalize an anti-human vision. In that vision, the fullness of life is equated with having more, reducing weakness, eliminating uncertainty and exerting total control. When efficiency becomes the ultimate measure of value, human beings are tempted to see themselves as a project to be optimized rather than as persons called to relationship and communion.
In reality, elevating any single dimension of human existence to an absolute is always a mistake. Indeed, disorder does not arise only from scarcity; even unchecked growth can give rise to impoverishment. In an ecosystem, balance is disrupted when one species expands at the expense of others; in human life, something similar occurs when one faculty claims to be the measure of everything. Thus, intelligence, when absolutized, overshadows other essential dimensions of life, such as affection, the will, commitment and relationships. Similarly, technical power, if left unbalanced, does not make us more capable; it makes us more isolated and more vulnerable to being dominated and excluded. This critical point does not oppose intelligence, but serves as a reminder that when intelligence becomes self-referential, its true purpose of serving life and the human person is lost.
The quality of a civilization is measured not by the power of its means, but by the care it is able to offer, by its ability to recognize the other as a face not merely as a function. The ability to care for one another is a fundamental dimension of our humanity, one that is learned and mastered through lived experience. Reading stories to a child, offering company to an elderly person and arranging a home so that it is welcoming are simple gestures often rooted in family life. They teach us to value care at a societal level and train us to recognize others as persons worthy of attention. Technology can also support this mutual care between people, for example, by providing tools that help us anticipate and organize things, without undermining human freedom and judgment. After all, human beings are the subjects of relationships and responsible for their own decisions.

Underlying narratives: transhumanism and posthumanism

In an attempt to shed light on the cultural assumptions accompanying the ongoing digital revolution, I would now like to turn our attention to certain currents of thought that interpret progress as surpassing the human condition, and which are often grouped under the labels of transhumanism and posthumanism. These perspectives form the ideological background present in some centers of technological power and occupy the collective imagination in a simplified form, especially in the media and on social networks. They tend to foster enthusiasm for new technologies through a futuristic vision of an “enhanced human being” or “human-machine hybrid.”
Transhumanism and posthumanism encompass a range of currents and sensibilities, making it difficult to define them in a single, unambiguous way. They can be likened to an archipelago of conceptual “islands,” distinct yet connected by a common “sea” of assumptions, namely the central role of technology and the aspiration to transcend the limits of the human condition. In general, transhumanism envisions the enhancement of human beings through technologies — such as biomedicine, body engineering, devices and algorithms — with the aim of increasing performance and capabilities. Posthumanism, especially in its more radical forms, goes further: it challenges anthropocentrism and envisions a hybridization of human beings, machines and the environment, even anticipating a threshold where humanity surpasses itself in a new evolutionary stage. Even when such ideas remain largely speculative, they gain relevance by altering the collective imagination and thereby influence social, economic and political choices. [129]
From the perspective of the Church’s Social Doctrine, the key issue is not the use of technology as such, but the vision that underlies it. If the human being is treated as something to be perfected or surpassed, it becomes easier to accept that some lives are less useful, less desirable or less worthy. In the name of progress, “necessary sacrifices” may begin to be justified, placing the burden on the most vulnerable in pursuit of a supposed optimization of the species. In this regard, the aforementioned warning of Saint Paul VI retains great foresight: indeed, scientific and technological advances, when detached from moral and social progress, end up turning against humanity. [130] For this reason, a clear distinction must be made. It is one thing to integrate technology within a human-centered, relational vision; it is quite another to be guided by an outlook that devalues human limits and promises a purely technical form of “salvation.”

AI truth distortion destroys democracy

Truth as a common good

Truth and democracy

The use of digital platforms and AI systems is driving profound changes in public and political communication. Tools that could foster dialogue and participation are often used to construct distorted narratives and blur the boundaries between truth and falsehood, mixing facts with opinions. Disinformation did not begin with AI, yet today it finds a powerful amplifier in AI. The ability to manipulate content, images and videos exposes people to biased or misleading perspectives. This problem has both cultural and moral dimensions, since the quality of public communication depends directly on social trust and, in turn, shapes it. At the same time, truthful information does not arise from centralized or automated control. In public discourse, the truth of facts has a rational dimension, as it requires verification, cross-checking of sources and responsible argumentation. Moreover, it is deeply relational, built through bonds of trust and shared practices, as well as an honest exchange with others and with the world. Only the shared pursuit of the veracity of facts, perceived as a common good, can provide a solid foundation for just communication.
Those who command powerful technological and economic resources, along with substantial human capital for intervention, possess significant capabilities for influencing cultural change. Ultimately, they can influence a significant number of people concerning the truth about humanity, the world, the meaning of existence, the family and even God. This is pure power detached from truth, which subtly or overtly imposes what it wishes others to accept as true. At its root lies a deeper and often unrecognized “sickness”: the fact that “modern man is wrongly convinced that he is the sole author of himself, his life and society. This is a presumption that follows from being selfishly closed in upon himself.” [140] Consequently, people believe that they can construct reality, and that whatever best suits their claims corresponds to what is true. Saint John Paul II reflected on the consequences of this “crisis of truth,” going so far as to state that “once the idea of a universal truth about the good, knowable by human reason, is lost, inevitably the notion of conscience also changes.” [141] In such a context, universally valid truths, which precede us and which conscience must accept, are no longer recognized. This led Pope Francis to ask with realism: “What is law without the conviction, born of age-old reflection and great wisdom, that each human being is sacred and inviolable?” To which he concluded: “If society is to have a future, it must respect the truth of our human dignity and submit to that truth. Murder is not wrong simply because it is socially unacceptable and punished by law, but because of a deeper conviction. This is a non-negotiable truth attained by the use of reason and accepted in conscience. A society is noble and decent, not least for its support of the pursuit of truth and its adherence to the most basic of truths.” [142]
The search for truth is an essential element of democracy, which is itself a means of contributing to the common good. When questions about what is true lose their appeal, and a pragmatism takes hold that is content with what appears useful or effective, then democratic life is weakened. After all, democracy does not consist of rules and procedures alone, but above all of a solid concordance with the facts and a genuine commitment to the good of individuals and society as a whole. Indifference to the truth leads, slowly but surely, to a descent into totalitarianism. As the philosopher Hannah Arendt wrote, the ideal subjects of such regimes are not so much those who are ideologically convinced, but rather “people for whom the distinction between fact and fiction (i.e., the reality of experience) and the distinction between true and false (i.e., the standards of thought) no longer exist.” [143]

Communication and the collective imagination

In view of this, it is important to recall that communication “is not only the transmission of information, but it is also the creation of a culture.” [144] The content that circulates within digital environments shapes how people perceive the world and introduces into the collective consciousness images and narratives that direct our desires and influence our daily choices. This is “not a parallel or purely virtual world,” [145] since what originates online now becomes a part of people’s lives, especially of the youngest.
For this reason, those who control digital platforms and means of communication have a considerable ability to affect the collective imagination and to present a particular vision of reality as desirable. Such power should be constantly guided by the pursuit of truth and respect for human dignity, so that the culture fostered on the internet does not become an instrument of excessive distraction, homogenization or dominance, but rather a setting in which inner freedom and critical thought can mature.

Education key

Toward an ecology of communication

Our first task is neither to demonize nor idolize technological tools, but to utilize them on the basis of a fundamental principle, namely that truth is a common good and not the property of those with power or influence. We must therefore promote an ecology of communication. On the level of public policy, this entails establishing norms so that the decision-making behind content selection and its development becomes more transparent and protects personal data. Regarding social and cultural aspects, this requires a strengthening of intermediary organizations, serious journalism and forums for debate, where reasoned argumentation and verification carry greater weight than immediate reaction. For families and schools, there is a growing need for new educational awareness and for formation concerning the proper and critical use of digital tools, AI and online commercial and financial platforms. In universities, the principal challenge lies in the integration of knowledge, cultivating both the capacity to connect and synthesize knowledge in order to grasp complexity, and the skills necessary to verify facts.
Christian communities, too, are called to commit themselves to transparency in communication and to the honest pursuit of facts. Sadly, this has not always been the case. We have witnessed with shame the emergence of painful truths concerning even members of the Church and ecclesial realities. In particular, some journalists, driven by a passion for truth, have played a crucial role in bringing injustices and abuses to light. To them, I wish to repeat the words that Pope Francis used in speaking to journalists: “I also thank you for what you tell us about what goes wrong in the Church, for helping us not to sweep it under the carpet, and for the voice you have given to the victims of abuse.” [146] Yet vigilance and transparency remain first and foremost a grave responsibility for the Church herself, and we must not wait for others to compel us to confront uncomfortable truths about ourselves.

An educational alliance for the digital age

In an era when truth is often distorted in order to serve particular interests and communication strategies, the field of education assumes decisive importance. Yet rapid technological transformations reveal just how unprepared we are on the educational level. The pervasiveness of digital media fosters a culture of immediacy and hyper-stimulation, which gives rise to fatigue, boredom and apathy concerning the effort required for seeking the truth.
140. Education, by contrast, is a long journey requiring patience, and therefore needs time for development and for engagement with reality beyond appearances. This is a fundamental issue because every technology shapes those who use it. Educating people about the use of AI, then, involves teaching them to decide when and for what purpose it ought not to be use The speed and ease with which answers or summaries can be obtained risk extinguishing the desire to ask questions, which is a process that bears fruit only over time. As Plato wrote, the deepest and most important things are learned only after much time and effort, by engaging in discussion with others, “striking upon” ideas and experiences together like flint until the spark of understanding is kindled within us. [147] We must learn, then, how to exercise restraint in the use of AI and to protect our young people from the promise of the perfect machine, from that subtle temptation which renders human thought seemingly superfluous precisely when it is most needed.
In recent years, psychological and psychiatric literature has documented with growing insistence how early and unsupervised exposure to digital devices and social media can negatively impact sleep, attention span, control of emotions and relationships, especially during the most vulnerable stages of life, at times with tragic consequences. This is further aggravated by easy access to violent or degrading content that offends sensibility, to pornographic and hypersexualized material, to messages that trivialize the body and emotions, and to proposals that normalize risky behavior. Online phenomena such as grooming, blackmail and the sexual exploitation of minors are not uncommon, and are made more insidious by the use of fake profiles, algorithms that facilitate dangerous contact, and AI tools capable of manipulating images and videos. Having a personal mobile device at too early an age and using it without adult supervision can exacerbate young people’s vulnerabilities, foster addiction and expose them to isolation, bullying and cyberbullying, as well as to pressures to share intimate images or sensitive information.
It is difficult for parents by themselves to resist the influence of business models that monetize attention and time. Therefore, it is essential to form an alliance among policy-makers, educational institutions and families that is capable of concretely supporting adults in this task. Far-sighted public policies are needed to oppose the immediate interests of platforms, concentrated in a few hands, when they conflict with the wellbeing of minors. In this regard, interventions by legislators are appropriate for setting age limits, holding service providers accountable rather than shifting the whole burden of control onto families, and for providing specific protections against all forms of online sexual exploitation and violence. Thus can children and adolescents, who are entrusted to our care, be genuinely protected as a precious treasure. [148] At the same time, it is also necessary to teach children, adolescents and young people how to recognize manipulation, defend their dignity and respect that of others in digital environments. [149]

The central role of schools

School is the place where new generations can learn to seek and love the truth, to reflect on the meaning of life and to recognize the dignity of every person. For this reason, many parents, who want their children to grow in the capacity to form relationships, develop critical thinking skills and embrace solid values, place great expectations on schools as valuable partners in their children’s education. Yet parents have the primary and inalienable right to choose the kind of education and formation for their children, in a manner consistent with their moral, cultural and religious convictions. Today, the world of education faces a number of urgent challenges.
The first challenge is socio-political. Both within individual nations and across different regions of the world, significant inequalities persist concerning access to basic education and higher studies. In many nations, Governments have not yet invested the necessary resources for guaranteeing a quality education for all, whether by adequately supporting the public school system or by assisting private institutions that offer this essential service. When a substantial portion of education, at various levels, is entrusted to private institutions, access to schooling may become overly dependent on families’ financial means, especially in the absence of adequate public support. In the face of this risk, it is nevertheless important to acknowledge and encourage the contribution of the many private Catholic educational institutions which ensure inclusive access for children and young people of every background, even when families’ economic circumstances would not otherwise allow it.
The second major challenge is pedagogical. Many educational systems struggle to keep pace with change and to support the integral development of students. The advance of information technologies and AI is rapidly rendering curricula obsolete that were designed for a different era. Meanwhile, the organization of schools, physical spaces, evaluation methods and the role of teachers themselves must be rethought in order to promote an authentically integral education that addresses every dimension of the person. It is necessary to support the ongoing formation of teachers throughout their professional lives, so that they can engage positively with new technologies, helping students to use them responsibly, critically and creatively, rather than passively succumbing to their influence.
The third major challenge is intellectual and concerns knowledge. Without careful attention, an educational system lacking in a love for truth may emerge, in which an incessant flow of information replaces the essential exercise of research, reflection and discernment. As knowledge becomes increasingly fragmented, it becomes difficult to grasp reality as a whole, to ask profound questions about meaning, or to develop authentic, critical and creative thought. Many educators already report signs of dehumanization, where people may “know many things” but struggle to find direction in their lives, partly due to an inability to connect information with deeper knowledge or maintain a sense of purpose. A genuinely healthy attitude is needed, requiring rhythms that incorporate silence, in-depth study, reading and judicious analysis, for without these elements inner freedom may be compromised.
The Church’s Social Doctrine invites families, schools, Christian communities and public institutions to form a renewed educational alliance. This takes shape when fundamental principles are translated into educational goals, including teaching students a sense of moderation and limits; recognition of the rights of others and of future generations to enjoy the goods that are either provided for us or made available by human ingenuity; freedom and responsibility; and a sense of transcendence and the common good. Schools are not called to follow the pace of the digital world, but to offer that which the digital sphere by itself cannot provide, namely a shared time for learning and developing trustworthy relationships.

The dignity of work at a time of digital transition

The value of work

Since the emergence of her Social Doctrine, beginning with Rerum Novarum, the Church has emphasized the protection of workers and the need to combat all forms of exploitation. Above all, however, the Magisterium has recognized in work “the essential key” [150] to understanding the entire social question, since it is through their work that individuals develop many dimensions of their existence. In view of this, we can understand the great intuition of Saint Benedict of Nursia, who united prayer and work, showing daily activity to be a part of the human response to God’s call. Created in the image of the Creator, our own work in some way continues his, for thereby we contribute to the progress of society and the common good, put to good use the capabilities we have received, improve and beautify the world, support our families, engage in cooperative relationships and, through listening and dialogue, learn to build together something that no one could achieve alone.

Work is essential to life and human development; unemployment is an evil

For these reasons, work is not simply an instrument; it expresses and enhances the dignity of our lives. It is a requirement of the human condition, a normal path toward maturity, development and personal fulfilment. In this regard, financial assistance to the poor may at times be necessary in emergencies, but it cannot become the sole response, since the goal is to enable each person to live with dignity through his or her own work. [151]
Today, the convergence of automation, robotics and AI is rapidly transforming the very structure of work. It is said that this will bring great improvements for everyone. In reality, however, the “new ways” of working are not necessarily better, for “while AI promises to boost productivity by taking over mundane tasks, it frequently forces workers to adapt to the speed and demands of machines, rather than machines being designed to support those who work. As a result, contrary to the advertised benefits of AI, current approaches to technology can paradoxically de-skill workers, subject them to automated surveillance and relegate them to rigid and repetitive tasks. The need to keep up with the pace of technology can erode workers’ sense of agency and stifle the innovative abilities they are expected to bring to their work.” [152] Precisely in order to avoid this drift, it is necessary to design systems that are centered on the human person and not solely on performance.

The problem of unemployment

151. Saint John Paul II recognized that unemployment is a grave evil. Indeed, when it reaches massive proportions, it becomes a true social calamity that especially requires the State to exercise responsibility. [153] Today, amid the “fourth industrial revolution,” this concern is even more acute, as innovation is often pursued solely for reducing costs and increasing profits. [154] In some contexts, there is a legitimate fear of a significant and rapid contraction in available jobs that would create a chain reaction deeply impacting families, young people and local economies. In many sectors, this can already be seen in new forms of job insecurity and inequality, characterized by outsized remuneration for a highly specialized minority alongside declining wages for a large portion of the workforce.
152. It is certainly desirable for technology to relieve humans of arduous, repetitive or dangerous tasks and to provide intelligent support for human activity. Yet, the protection of employment opportunities and the irreplaceable role of the individual must remain the general rule. The pursuit of greater profits cannot justify choices that systematically sacrifice jobs, because the human person is an end, not a means, and the economic order must remain subordinate to human dignity and the common good.
At the same time, we must acknowledge that every real transition involves discontinuities, for it is uneven, fragmented and sometimes conflictual. Consequently, no single model of change or universal solution exists, since there are places and situations that require different responses. Given the inequality that characterizes our world, the spread of AI and computational systems produces varied effects in different places. Wealthy societies automate rapidly and chaotically, reducing the need for a workforce and creating room for unemployment and institutional friction. Vast regions of the world, by contrast, remain trapped in hybrid economies, where underpaid human labor and partial technologies coexist without achieving genuine transformation. These areas become places of precarious labor, and hotbeds of instability and forced migration. Therefore, solutions must be sought at national and local levels through the involvement of intermediary communities. We need adaptive tools, including well-structured models, local initiatives, progressive redistribution and new rights of access to essential goods. While not pursuing an abstract harmony, we must build concrete forms of human coexistence at this time of transformation.
Work remains a fundamental dimension of the human experience, for not only is it a means of sustenance, but it is also a context for expression, relationships and contributing to the community. Therefore, the problems related to work extend beyond the income necessary for family survival. A society that guarantees employment to only a small fraction of the population, despite having a high level of technical development, risks exposing many to forced inactivity, a lack of responsibility and the absence of daily tasks and stimuli, resulting in human and cultural impoverishment. This creates a paradox of material progress and anthropological regression that undermines the foundations of a just and stable social peace. For this reason, the Church’s Social Doctrine insists that access to work for all must be a high priority for public policies and economic processes, serving as a criterion for evaluating the human quality of any development model. [155] Moreover, in those parts of the world where work tends to diminish or change radically due to technological and organizational processes outside of democratic control, we must rethink the nature of work and its connection to citizenship, ensuring that unemployment does not jeopardize social participation.
In light of this conviction, we can better appreciate the history of the Church’s Social Doctrine after Rerum Novarum. The initiatives which emerged from that tradition, including associations, trade unions, cooperatives and welfare organizations, have contributed decisively to improving labor legislation, protecting the most vulnerable and promoting more humane conditions. [156] Today, however, these instruments are no longer sufficient by themselves in the face of the transformations driven by AI, the new organization of markets and the competitiveness that is rarely concerned with social sustainability. New collaborative efforts are needed among political leaders, labor organizations, the business world and the scientific community in order to develop rapidly adequate shared regulations and protections, including at the international level. [157] Labor unions, which the Church has consistently supported, are called upon to be open to new types of employment and the corresponding needs of workers, in order to represent and defend them. In this context, without bold decisions, the prospect of greater poverty and inequality looms large, which would leave many individuals marginalized, stranded and surrounded by the machines and automated systems that have replaced them.
At this time of transition, it is not enough to react only when jobs disappear; we must oversee the transformation in advance. One viable path is, first of all, to establish social criteria for innovation. Here, every introduction of automation and AI should be accompanied by verifiable measures to protect the employment, retraining and participation of workers. In this way, technology will be oriented toward freeing up human time and capabilities, rather than producing exclusion. Second, we need proactive policies that make continuous training and professional transitions accessible to all, ensuring that the cost of adaptation does not fall solely on individuals. Finally, there needs to be a corporate commitment to include quality and dignity of work among its indicators of success. When these conditions are present, innovation can serve as an ally of safer, more creative and dignified work; without them, innovation tends to become an accelerator of injustice.

Economic conditions without benefitting people threatens their survival

An economy that values dignity

The labor market is one area in which the risks associated with new technologies more clearly emerge. It is thus necessary to remember that economic freedom is not absolute; it must always be measured against the common good and the dignity of every person. Entrepreneurial initiative can indeed be a true vocation, generating wealth and improving lives, rather than a variable that is dependent only on profit. This is possible when it recognizes that the creation of dignified, valuable jobs are an essential part of its proper service to society. [158]
With prophetic spirit, Pope Francis warned against an economic freedom proclaimed in words alone, while actual conditions prevent many from benefiting from it. [159] Economic models that exalt efficiency and individual success often view investment in disadvantaged people or in those with slower development paths as useless or inconvenient, as if their futures depended solely on their ability to keep pace with the “winners.” In reality, a just society requires a vigilant State and civil institutions that are capable of overcoming the singular mentality of efficiency, and of ensuring that resources, creative solutions and regulations favor the most vulnerable. [160] Instead of waiting for the benefits of growth to reach the poor “eventually,” decisions need to be taken to ensure that growth becomes inclusive from the outset. The experience of recent decades shows that in economic and financial crises, it is always the poor who pay the highest price, while the theories that promise automatic general prosperity often prove to be illusory.
It is important to move beyond the current metrics of development — which for more than eighty years have been tied to the concept of Gross Domestic Product (GDP) — since these metrics almost systematically neglect aspects essential to the overall wellbeing of people and the environment. The development of parameters and metrics complementary to GDP is crucial for improving the databases used for conducting analyses, political and economic decision-making and establishing regional, national and international priorities. The introduction of new parameters will allow for a comprehensive and timely assessment of how legislative and regulatory decisions impact the dignity of work, shared prosperity, inequality reduction and environmental protection. It will also affect the concept of development, educational processes, mindsets and public opinion, as well as peace, which is only authentic when based on justice.
In recent years, finance has increased in importance and has undergone significant innovation, driven partly by the introduction of cryptocurrencies. The reflections and observations contained in the teaching of my predecessors, particularly in their Encyclicals, have highlighted how the financial intermediation sector, “when operating without the necessary anthropological and moral foundations, has not only produced manifest abuses and injustice, but also demonstrated a capacity to create systemic and worldwide economic crisis.” [161] It is likewise the case that income from capital risks replacing income from labor, which is often confined to the margins of the economic system’s primary interests. Yet savings transformed into credit for the real economy, thereby creating both jobs and self-employed work, remain central for development and the investments that must accompany ongoing transitions. The social function of credit remains irreplaceable. Finance for its own sake is fundamentally different from finance aimed at the development, creation and evolution of work.
This perspective needs to become part of a broader view of global dynamics. While the world’s wealth has grown in absolute terms, it is increasingly concentrated in fewer hands, widening inequalities both within and between countries. “There are a few who have too much, and too many who have little, that is the logic of today.” [162] Scientific and technological advances, even in the medical field, are not easily accessible to the vast majority of people, as was dramatically demonstrated during the recent pandemic. While some regions spend heavily on superfluous interventions or dreams of individual enhancement accessible only to a select few, other parts of the world lack the essential equipment needed to save millions of human lives. To think that new technologies will automatically benefit everyone is to ignore the evidence. Unless transformations at the design stage prioritize the prevention of new and further disparities, technological progress will inevitably produce structural inequalities. Today, justice requires access to the benefits of innovation, including care, knowledge, tools and opportunities.
Just laws and methods of redistribution are certainly necessary for correcting imbalances, including tax systems that lighten the burden on the weakest and ask for more from those with greater resources. However, the pursuit of social justice should not be considered a separate issue that follows only after the production of wealth, as if the economy existed solely to create wealth, with politicians only intervening afterwards in order to distribute it. Indeed, justice concerns every phase of economic activity, from resource acquisition to financing, and from production to consumption; every choice has moral consequences. [163]
More than ever, in the age of AI and robotics, it is no longer possible to rely solely on the “invisible hand” of the market. [164] Politics has the task of orientating economies and technologies to the common good, promoting dignified work, social inclusion and an equitable distribution of the benefits of innovation. Since many economic decisions transcend national borders, there is also a need for international cooperation capable of defining common strategies, especially in favor of the most vulnerable countries and people, in order to promote development and overcome welfare dependency. The thinking behind these choices is the immeasurable dignity of every person, the common good and a world truly governed for everyone. The interdependence between peace and development, as Saint Paul VI prophetically wrote in 1967, [165] remains applicable today, for prosperity contributes to building and reinforcing peace only if it is widespread, inclusive and sustainable.
In practical terms, in the age of AI and robotics, ensuring that the economy favors human dignity means adopting certain criteria for firm action. First, transparency and accountability: when data and algorithms influence credit distribution, personnel selection or access to services and opportunities, it is necessary that decisions be understandable, contestable and subject to oversight, so that individuals are not reduced to mere profiles. Second, inclusion and access: the benefits of innovation must be paired with investments in skills, infrastructure and essential services to ensure that technology does not widen the gap between those who have and those who have not. Finally, measures to ensure equity: taxation, social protection and industrial policies must correct the imbalances created by the concentration of wealth and power. Indeed, these criteria do not constitute a curb on innovation; instead they make it civilized and humane.

Families and young people: the social conditions for hope

The family is a primary social good. Founded on the enduring union between a man and a woman, it is the first environment in which all persons develop their potential, become aware of their dignity and learn the earliest forms of truth and goodness, internalizing the habits that prepare them for life in society. [166] As the first natural society, endowed with foundational rights, the family is the fundamental and irreplaceable cell of every community organization. [167] Consequently, when political projects and major economic decisions relegate the family to a marginal or secondary role, the authentic growth of the entire social body is compromised. [168]

Unemployment destroys families and the young

The family, however, is a fragile social good immediately affected by the economic and technological transformations reshaping the nature of work. It thus requires cultural, juridical and economic support. The devastating impact of unemployment and job insecurity on family structures is well known. In the short term, it may seem advantageous to reduce labor costs or maximize financial efficiency, but in the long term this undermines the very foundations of social coexistence. While technological successes are celebrated, the social fabric is progressively eroded, as if by a silent virus.
167. For young people, job insecurity is particularly devastating. As the Bishops of the United States of America have recalled, work is not merely a source of income but a crucial sphere in which identity is formed, friendships and relationships are forged, practical responsibilities are learned and one’s vocation is discerned. [169] When access to work is hindered by high levels of unemployment, inadequate systems of training or structural barriers, many young people find the path to their human and professional fulfilment blocked. The need to change jobs several times over the course of life requires that continuous updating and retraining be provided, so that new generations can competently and independently face the risks of an economic environment that is both changing and often unpredictable. [170]
This gives rise to a specific public responsibility. The State has the duty to support business activity by fostering conditions favorable to employment, promoting work where it is lacking and defending it in times of crisis, since it is a primary good for families and for society. [171] Particularly in an age of continuous technological transformation, we need a political creativity that will promote “work” and place the family and coming generations at the center; otherwise our economic progress will translate into new forms of insecurity and exclusion.
Supporting families and young people in this transition requires choices that make stability feasible. As has been noted above, labor policies need to promote continuity and the quality of employment, countering insecurity as a normal condition of life and encouraging realistic paths for entry into the workforce and for professional growth. Second, measures are needed to ensure a healthy way of living, for without a proper balance between work, leisure and rest, families are weakened and young people struggle to develop a sense of responsibility. Furthermore, it is essential to invest in accessible education and retraining, so that the professional mobility demanded by the digital economy does not become a harsh selection between those who are able to update their skills and those who cannot. Finally, social ties must be supported, with networks and educational communities that accompany life choices and prevent uncertainty from giving rise to loneliness or addictions. If implemented, these technological transformations can be navigated without undermining the capacity to build the future, which is what makes a society prosperous.

Data surveillence

Protecting freedom against dependencies and commercialization

Dependencies and societal control

Having reflected on truth and education, work and families, we must now consider the impact of the digital revolution on human freedom, addressing risks to both the mental health of individuals and broader social challenges. The subtler forms of addiction linked to the “digital attention economy” should not be underestimated, since platforms and services are often designed to capture users’ time and attention, exploiting their vulnerabilities and weakening their inner freedom. When business models thrive on human weakness, the person is treated as a means rather than as an end; those who design or finance such systems bear a moral responsibility that cannot be ignored. There is an urgent need to promote technologies that strengthen interior freedom by fostering education in digital sobriety and the protection of minors, thus countering models that exploit vulnerability.
A further risk, less visible but no less serious, is that of social control made possible by the massive collection of data and use of algorithmic systems. When every action—movements, purchases, relationships and preferences—leaves a trace, a new form of power emerges, namely the power to profile, predict and influence behavior, often without individuals being fully aware of it. If such kinds of data are used to make decisions affecting concrete opportunities — such as access to credit, employment or essential services — there is a risk of undermining freedom and discriminating against the most vulnerable. Furthermore, control is exercised not only through explicit prohibitions, but also through the architecture of visibility: what is amplified or rendered invisible, what is rewarded or penalized, ultimately shapes opinions and choices, fostering conformity and self-censorship. For this reason, freedom in the digital age is not merely a matter of interiority but also a public concern. It calls for clear rules, transparency, the possibility of recourse and proportionate limits on the use of intrusive technologies, so that technology will remain at the service of the human person and not become a form of control over consciences.
At the root of these problems lies a technocratic and post-humanist mentality that tends to regard the human person as an object to be manipulated or a resource to be optimized, [172] removing all safeguards against the unchecked pursuit of profit. What prevails is efficiency, rather than respect for freedom and human dignity. Some post-humanist currents even go so far as to envision “second-class” human beings, subordinate to the interests of elites who consider themselves superior. This troubling prospect becomes all the more serious when combined with technological tools that exponentially increase the capacity for control and selection. Even certain forms of structural indebtedness, which keep entire peoples in conditions of dependence, reflect the same mentality, in new forms, that tolerates relationships of subordination akin to slavery.

AI producing slavery

Breaking the chains of new forms of slavery

This distorted view of the human person is reflected today in various forms of servitude directly linked to the digital economy. Nothing in the world of AI is immaterial or magical. Every seemingly immediate and flawless response is the result of a long chain of mediation, involving vast networks of natural resources, energy infrastructure and, above all, people. A significant part of the digital economy’s functioning relies on the silent work of millions of people engaged in essential yet largely unseen activities, such as data labeling, model training and content moderation, often involving disturbing material. In many cases, these workers are young people, predominantly women, working under demanding conditions for minimal wages. Added to this invisible labor is the even harsher work of extracting the resources required for the production of the devices and microprocessors on which AI depends. In some regions of the world, children and adolescents work in dangerous conditions, crushing the materials from which rare earth elements are extracted. The bodies of these people are scarred, injured and worn down so that computational flow may continue uninterruptedly. Furthermore, criminal networks use online platforms, messaging systems, anonymous payment methods and profiling techniques in order to recruit, control and transport victims of trafficking — very often minors — reducing men and women to “data” to be tracked and “packages” to be moved around within the same digital circuits that support much of the global economy. This reality deeply challenges the moral conscience of our time. It is not enough to invoke efficiency, nor to celebrate the benefits of innovation, if they are built on a chain of exploitation that remains deliberately hidden. If technology promises emancipation, yet produces new forms of global subordination, it stands in contradiction to the fundamental principle of human dignity.
The fight against new forms of slavery is a decisive test for the ethical discernment of AI and digital transformation. In continuity with the tradition inaugurated by Leo XIII, the Church renews her firm condemnation of all forms of slavery, trafficking and the commodification of persons. She likewise highlights the urgent need for reflection and action that keep the inalienable dignity of every human being and the common good, as both the focus and goal of society, as well as the guiding criteria for every personal, social and political choice. Without this ethical and humanizing reflection, the growing power of digital systems could lead us toward new atrocities that are no less shameful than those of the past that we now deplore, while we continue to present ourselves as “advanced” and “civilized” societies.
Human trafficking must be recognized as a contemporary form of slavery and a grave violation of human dignity. Failing to respond firmly, or tolerating these practices in any way, is in some way to become complicit in today’s sins, which are akin to those of the past when slavery was being concealed and justified. [173]
In the development of her doctrine, the Church has gradually come to a deeper awareness of the gravity of these issues. It is true that past events cannot be judged anachronistically, as though the moral criteria that matured over time had always been available. Yet neither can we deny or diminish the delay with which both society and the Church came to denounce the scourge of slavery. In antiquity and the Middle Ages many individuals and even ecclesiastical institutions had slaves. Already in the early modern period, the Apostolic See of Rome, responding to requests from Sovereigns, intervened several times in order to regulate and legitimize forms of subjugation, and, in certain cases, the enslavement of “infidels.” [174] It was only in the nineteenth century that a formal, absolute and universal condemnation of slavery was clearly articulated, notably under Pope Leo XIII. [175] This development offers a clear example of the Church’s growth in understanding the perennial truths of Revelation that she safeguards. Although there was not always consistency in practice — given that slavery was long tolerated before being unequivocally condemned — there has been a continuous affirmation throughout history of the dignity of every human being, created in the image of God, even if it took eighteen centuries for its full incompatibility with slavery to be explicitly recognized. This constitutes a wound in Christian memory, one from which we cannot consider ourselves detached. [176] It is impossible not to feel deep sorrow when contemplating the immense suffering and humiliation endured by so many in stark contrast to their immeasurable dignity as persons infinitely loved by the Lord. For this, in the name of the Church, I sincerely ask for pardon.
This is why the memory of past complicity and blindness in the face of the injustice of slavery becomes a call to vigilance. What we have learned must be translated into discernment and responsibility in the present. If we want to avoid the need to ask for pardon again in the future for having failed to respect the treasure of human dignity that is required by our faith, it falls to us today to denounce, clearly and firmly, trafficking in its many forms and, together with all who are committed to this cause, to support concrete efforts of prevention, protection, liberation and rehabilitation.
178. Even today, colonialism assumes new forms. It no longer dominates only bodies, but appropriates data, transforming personal lives into exploitable information. Entire regions, especially those marked by structural fragility and limited geopolitical relevance, are currently subjected to a new mindset of extraction: that of health data, epidemiological profiles, genetic maps and demographic information. These have become the new “rare earths” of power: vital data which, once aggregated and analyzed, can be used to train predictive models, guide investment strategies, anticipate crises and, above all, determine who and what is deemed to matter. Those who control the health data of entire peoples — often collected under the pretext of aid, research or innovation — possess a structural leverage over the future, for they can shape needs and markets. They can also decide, before others, to whom medicines, investments and protections will be allocated. Here lies one of the most urgent moral challenges of our time: to ensure that shared knowledge becomes a true common good rather than an instrument of dominance. This requires restoring to individuals not only the data that describes them, but also the ability to decide how it is used, by whom and for whose benefit. Otherwise, the digital age will not be post-colonial, but colonial in another form.
New forms of slavery are fueled by economic chains and digital infrastructures. Therefore, action is required on several fronts. First, the supply chains that underpin the technological industry and the digital economy need to become more transparent, so that no competitive advantage is built upon hidden exploitation. Second, companies and investors need to adopt clear criteria for preventive ethical verification (due diligence), placing among their priorities the protection of workers, the fight against forced labor and the assessment of the social impact of data-driven business models. Furthermore, digital platforms must cooperate responsibly with authorities and civil society to prevent communication, payment and profiling tools from becoming channels for the recruitment and control of victims. When such efforts converge, the digital environment can be transformed from a space of exploitation into one of protection, prevention and the promotion of human dignity.

A shared responsibility

The various areas just considered— the search for the truth in public life, education in the digital environment, the transformation of work, the fragility of families and new forms of slavery—are not isolated phenomena. Rather, they reflect a common underlying issue, namely that if technology becomes the ultimate criterion, the human person risks being reduced to data, a cog in a machine or a commodity. If, however, technology is integrated with a wise perspective, it can become an instrument of growth, justice and fraternity.
From this perspective, the Social Doctrine of the Church calls for a shared responsibility. It asks that these processes be guided with foresight: by institutions capable of regulating without stifling, and protecting without taking over; by businesses that recognize work and dignity as measures of success; by intermediary organizations and educational communities that rebuild trust and relationships; and by citizens who cultivate responsibility, moderation, discernment and a sense of truth. Only in this way can innovation genuinely serve integral human development, rather than becoming a source of exclusion and dominance. And only in this way can the promise of progress be recognized as authentic, because it is measured against the inviolable dignity of every man and woman.

CHAPTER FIVE THE CULTURE OF POWER AND THE CIVILIZATION OF LOVE

Having considered how AI is transforming certain aspects of life and society, in particular the serious implications for human dignity, we must now turn our attention to the yet more tragic issue of war. Here the question is not merely the efficiency of new tools, but also the risk that technology, detached from ethics and responsibility, will render decisions about life and death more rapid and impersonal, and will present the use of force as an immediate and viable option. In an increasingly interdependent world, peace is not simply one issue among others, but a prerequisite for the universal common good and a test of the moral maturity of peoples, especially of those who bear responsibility for governing.

AI producing warfare

The digital revolution is changing the nature of conflict. Alongside conventional warfare, there are hybrid forms such as cyberattacks, information manipulation, campaigns of influence and the automation of strategic decisions. AI acts as an accelerating factor in these processes, particularly within a context where many technologies are intrinsically ambivalent. Consequently, what is created for defense can be rapidly repurposed for offense, and the fine line between protection and aggression becomes blurred. While AI can enhance the defense and protection of civilians, it can also lower the threshold for the use of force, shield people from responsibility and foster a culture in which the enemy is reduced to a statistic and the victim to “collateral damage.” Faced with these transformations, we must recall the principles of Social Doctrine — the dignity of the person, the common good, the universal destination of goods, subsidiarity, solidarity and justice — for they are criteria for judging whether technologies truly serve humanity or are subjugating it. We should, therefore, consider these principles as guidelines for our decision-making.
In this chapter, therefore, I will compare two opposing approaches, which I have already evoked through biblical imagery in the Introduction. On the one hand, there is the temptation of constructing the Tower of Babel, relying on power and pride. On the other hand, patience is required in order to rebuild Jerusalem “piece by piece,” as in the time of Nehemiah, by safeguarding humanity and the common good.
If we examine global dynamics, we can recognize more clearly the spread of a culture of power characterized by polarization and violence. The modern Babel can be seen not only in the globalized technocratic paradigm, but also in the remote clash between opposing imperialisms, between powers that wish to preserve their supremacy, and those that aspire to seize that supremacy, resulting in a multiplicity of local conflicts. Moreover, there seems to be no limit to the race — driven by a dehumanizing ambition — to develop evermore powerful technologies or to secure control over them. Yet, despite this downward spiral, we can also glimpse a great part of humanity that is striving to remain human and working to build the holy city of coexistence and peace. All too often, we are unwitting builders and clumsy architects of this city, capable of generous gestures but lacking an overall vision. This building project is slower, less visible and less spectacular, and awaits a better understanding and greater coordination so that it may become the conscious and clear responsibility of every community, from families to States, and the relations between Nations. It is this prospect of commitment, this construction site of hope, that we call the “civilization of love.”

The civilization of love in the digital age

When Saint Paul VI coined the phrase “the civilization of love,” [177] the world was in the midst of the Cold War, an arms race and severe economic instability. In that context, the Church proposed an alternative path to that of ideological opposition between systems, and envisioned a social order in which justice and charity are intertwined and love becomes the guiding principle of economic, political and cultural life. Today, we must resolutely recover this vision, for the civilization of love is no naïve utopia, but a demanding project, which consists in translating charity into structures of justice, giving institutional form to fraternity and regarding others — whether individuals or peoples — as allies necessary for building the common good. As the Encyclical Letter Fratelli Tutti reminded us, only this social love is capable of becoming a culture and a norm, and thereby of bringing about a stable international order, transforming mere armed coexistence into a community with a shared future. [178]
This insight proves even more fundamental in the current context of digital transformation. Digital networks, the globalized economy and the development of AI create increasingly tighter bonds, linking — in real time — decisions made in one place to the effects they produce elsewhere. In this sense, the words of the Second Vatican Council on the growing interdependence between peoples remain timely, for the common good is taking on an increasingly universal dimension, with rights and duties concerning the entire human family. [179] The project for a civilization of love, therefore, must undertake the task of transforming this imposed interdependence into a willed and chosen solidarity. This is the guiding principle for technological processes: it is not enough for artificial intelligence to make us more efficient or connected; it must also serve to build a universal human family, with shared rights and duties, where digital proximity becomes a real opportunity for encounter and mutual care.

The culture of power

In our time, a culture of power is taking hold, in which the availability of resources and the ability to dominate tend to dictate the agenda and criteria for decision-making. In this way, the common good of humanity is relegated to the background and the concrete tragedy of peoples at war is reduced to a secondary consideration in relation to strategic interests. This culture of power infiltrates society, changes relationships and behaviors, and grows by normalizing war, pursuing ever-greater military power, taking advantage of the crisis of multilateralism and fueling a false realism that insists that there is no alternative.

The normalization of war

In 1965, the words of Saint Paul VI resounded powerfully at the UN General Assembly: “Never again war, never again war!” [180] We must acknowledge that, despite the desires and declarations for peace, the past sixty years have been marked by conflicts of astonishing brutality, often affecting civilian populations on a massive scale, leading to the death of innocent victims, mass displacement, social destabilization and long-lasting wounds. Nevertheless, in public discourse, there was a widespread conviction that war should remain a last resort, subject to strict ethical and legal limits, and always oriented toward a political vision of peace. Following developments in the immediate post-First World War period, a turning point occurred after the Second World War: peace was made the focus of the international order, as attested in particular by the United Nations Charter, with the intention to “save succeeding generations from the scourge of war.” [181] Likewise, many national constitutions restricted the use of force to extreme and strictly limited circumstances. Even during the Cold War, despite the existence of serious conflicts, there remained the awareness that a new world war had to be avoided at all costs.

Revival of global war

Today, however, we are witnessing a real paradigm shift in public discourse and in decisions regarding rearmament, with a troubling revival of war as an instrument of international politics, while the very ethical principles that had previously limited its use are being eroded. Regional conflicts that drag on over time, escalating tensions and reciprocal threats are becoming almost commonplace, and forms of conflict driven by the desire for territorial expansion that were thought to be overcome are re-emerging. Public opinion is gradually being shaped and conditioned by polarizing media narratives, which are often amplified by algorithms that prioritize conflict and confrontation.
We are also witnessing a disconcerting loss of historical memory, as first-hand accounts of the Holocaust and the two World Wars are disappearing. This leads to a selective or distorted rewriting of the past, in a context where fake news and the manipulation of narratives obscure the lessons that have been learned. Without a living memory of the horrors of war, political decisions risk being made on the basis of power alone, without any consideration for the long-term consequences.
To all of this, the media and digital dimensions are adding new and decisive elements. Communication networks, fragmented information environments and algorithms that reward conflict can magnify polarization and resentment, increase propaganda and make shared discernment more difficult. Thus, war is not only fought, but also culturally conditioned through simplistic narratives, a friend-or-foe mentality, disinformation and fear. When historical memory fades and the ethical principles that protect civilians and the most vulnerable are weakened, it becomes easier to justify violence as necessary, inevitable or even “sanitized.” It is in this context that humanity is slipping into a violent culture of power, where peace no longer appears as a responsibility to be taken on, but as a fragile interval between conflicts. Today, more than ever, without prejudice to the right to self-defense in the strictest sense, it is important to reaffirm that the “just war” theory, which has all too often been used to justify any kind of war, is now outdated. [182] Humanity possesses far more effective and capable tools for promoting human life and resolving conflicts, such as dialogue, diplomacy and forgiveness. The use of force, violence and weapons reflects a relational poverty that always has disastrous consequences for civilian populations.

Force without limits

193. The growth of the military-industrial complex has become a defining feature of the current political landscape and has become a key sector in the economy of various countries. The close link between economic interests, the military apparatus and political decisions produces an “armed nation,” in which war appears as a natural extension of politics, and the arms market becomes an autonomous driving force behind military decisions. Nor can we ignore the enormous economic interests behind war. The armaments industry, and countries that supply weapons, profit from a market that thrives precisely on conflicts. In this sense, there are also financial interests that contribute to fueling tensions in various regions of the world.
Military arsenals are receiving renewed attention. In the past, recognition of the threat posed by weapons capable of destroying all of humanity had promoted paths toward détente and disarmament negotiations. Unfortunately, this approach has been left behind, and the evolution of nuclear arsenals — including the prospect of its “tactical” use — makes the use of such weapons seem less improbable. In this context, the Treaty on the Prohibition of Nuclear Weapons, which came into force in 2021 with the support of over seventy countries, is an important step. However, it risks remaining largely symbolic since the major nuclear powers have not agreed to it. This has led to the widespread yet erroneous belief that nuclear deterrence is an indispensable prerequisite for security. This has also contributed to a new arms race, which is hard to control and accompanied by the gradual dismantling of nuclear reduction agreements, as well as the development of “miniaturized” weapons, that make their use seem like a more viable option.
The same logic applies to conventional warfare. Military force, weak diplomatic initiatives and the complexity of the interests at stake contribute to conflicts that tend to become protracted, with extremely high human and environmental costs. It is much easier to start a war than to stop it, and yet, discussion on conflict prevention remains tragically marginal.
The situation is further destabilized by the presence of new armed operatives, such as jihadist groups, private militias and criminal networks that mark the end of the State’s monopoly on the use of force. Often these groups intertwine vague ideological motivations with concrete economic interests, transforming war into a “way of life” for entire generations of young people and children. Here, the objective is no longer a definitive victory, but the perpetuation of conflict as a source of power and income.

Weapons and artificial intelligence

197. The above-mentioned scenario is linked to the unceasing development of weapons systems, particularly those involving AI. The Holy See has recently observed that the growing ease with which autonomous weapons systems can be deployed makes war more “feasible” and less subject to human control. This violates the principle that armed force should be used only as a last resort in cases of legitimate self-defense. [183] For this reason, the development and use of AI in warfare must be subject to the most rigorous ethical constraints, to guarantee respect for human dignity and the sanctity of life and to avoid a race to develop such arms. [184]
Sometimes there is talk of “artificial moral agents,” as if machines were able to distinguish between right and wrong with greater consistency than a human being. Yet moral judgment cannot be reduced to calculation, for it involves conscience, personal responsibility and the recognition of the other as a person. Therefore, it is not permissible to entrust lethal or otherwise irreversible decisions to artificial systems. No algorithm can make war morally acceptable. AI does not remove the intrinsic inhumanity of conflict; indeed it can only bring about conflict more quickly and render it more impersonal, lowering the threshold for resorting to violence, transforming defense into threat prediction and thus reducing victims to data. In this way, it will accustom us to the idea that violence is inevitable and needs only to be optimized. This does not diminish the importance of instilling, as far as possible, values and sound judgment into the artificial systems we build, so that they can contribute to a moral ecosystem in which humans are better able to listen to their own consciences, as well as allowing AI models to establish appropriate boundaries.
It is not enough to invoke a generic type of ethics. Concrete criteria for discernment must be established. The first such criterion concerns personal responsibility. When a decision to strike becomes automated or opaque, the risk of abdicating responsibility increases. For this reason, the chain of responsibility must be identifiable and verifiable; those who design, train, authorize and employ technology must be held accountable for their decisions. The second criterion pertains to the moral timeframe for making judgments. While AI tends to expedite the decision-making processes, speed and efficiency should never be the supreme motivating force for the irreversible decisions made in the context of war. The third criterion is the identification and protection of civilians. Any technology that facilitates attacks without seeing the face of human beings lowers the moral threshold of conflict. Target selection and the use of force must not confuse combatants and non-combatants, nor ignore the impact on defenseless populations.
These criteria give rise to certain non-negotiable requirements. First, all systems used in a war setting must guarantee the possibility of retracing and reconstructing decision-making processes, so that accountability and blame are not collapsed into “the machine.” Second, the decision to use lethal force cannot be delegated to opaque or automated processes, but must remain under effective, self-aware and responsible human control. Finally, it is imperative to establish a shared framework — also at the international level — in order to curb the technological arms race and ensure robust protection for civilians and the infrastructures necessary for their survival.

Multilateralism collapsing

The crisis of multilateralism

The culture of power also stems from the crisis of the multilateral system. The institutions established to safeguard the concept of a common future for all peoples and a global common good appear to have been weakened. This is due not only to structural limitations, but also to a frequent lack of shared will to support and reform them, or to recognize their moral authority. Instead of making progress, we are regressing from the significant turning point of the twentieth century. After 1989, the collapse of communist regimes in Europe was followed by a predominantly economic globalization, which lacked an adequate political framework capable of sustaining dialogue and peace. An almost blind faith was placed in the ability of the markets to generate prosperity, democracy and stability. In reality, rather than automatically generating unity and peace, globalization has provoked fundamentalist, identity-based and nationalistic reactions. The result is a far cry from genuine multilateralism; instead, what has appeared is a disorderly and conflict-ridden multipolarism with a prevailing sense of mistrust.
What has also re-emerged is the temptation to forge a collective identity in opposition to an enemy, fueled by narratives in which each party portrays itself as a victim entitled to retribution. The reduction of complex issues into simplistic categories — “me first,” “friend or foe,” “us or them” — facilitates decisions that are often irresponsible and undermine mutual trust among nations. The force of international law is thus replaced by the claim that “might makes right.” Consequently, tribunals that are competent for settling disputes between States or dealing with war crimes are often weakened or bypassed, with devastating ramifications for political culture and social cohesion. [185]
In this context, peacebuilding has been relegated to a secondary role. Cooperation for development, disarmament, conflict prevention and the establishment of mutual trust are neglected in the name of power politics. The achievements of humanitarian law are also being compromised. Indeed, the principle of proportionality in responding to aggression, the protection of access to water, food and essential goods, and respect for the lives of civilians, especially children, come to be regarded as naïve relics of the past.

A supposed political realism

We live at a time of significant spiritual and cultural blindness. A false pragmatism urges us to sever the roots of our history, as if it were possible to inaugurate a kind of “new creation” detached from the past. Even those who cite important moral principles can fall into this historical nihilism, mistakenly believing that the atrocities of the twentieth century can never happen again. Yet, in reality, the same dynamics are re-emerging under new guises. The mentality of armed equilibrium and deterrence appears to be reasserting itself. Today, however, in contrast to the two-sided dynamic of the Cold War, the proliferation of operatives and battlefields makes this mentality increasingly fragile. Escalating conflicts lead to asymmetric and “hybrid” wars, fought not only on the battleground but also on the economic, financial and cyber fronts, where disinformation and campaigns that feed people’s fears are used to manipulate public opinion. In many countries, including those in the Global South, increased military spending is presented as the only response to an uncertain future or perceived threats. Meanwhile, the real cost falls on the poorest, who see resources for healthcare, education and social services being reduced.
205. At the core of these issues is a false realism, based not only on the prevailing mentality of force, but on the cultural and anthropological belief that war is an inevitable part of human nature. It is said that things have always been this way, except for occasional pauses, and that it will always be so! As a result, the concern is no longer the search for peace — which has been lost as a point of reference on the international stage — but rather how and when to take military actio This same argument maintains that it would be irresponsible not to prepare for conflict. I would argue, however, that what is truly irresponsible is Realpolitik, the form of political “realism” that sows in consciences and in society an attitude of resignation to the inevitability of war, and dismisses peace and dialogue as utopian or irrational positions that ignore the risks at stake. In fact, peace is neither a naïve hope nor merely the absence of war; instead, it is always possible as the fruit of justice and charity.
In such a climate, nihilism and pragmatism become intertwined and end up normalizing grave errors. Religious extremism and identity-based fanaticism ally themselves with irrational economic policies, while politics often turns to misinformation and ridiculing opponents, and systematically cultivating fears and resentments. Thus, diversity is increasingly perceived as a threat, which fuels a desire for possession, a will to dominate, hegemonic ambitions, abuses of power and a fear of those who are different, thereby creating an environment in which new conflicts can develop almost imperceptibly. [186]
This, then, is the fertile ground for new wars that are perhaps even more dangerous than those of the past, since they tend to disregard all ethical limits. What was once considered unacceptable can now be carried out almost without hesitation, while the international response is increasingly influenced more by the interests of individual Governments than by the objective gravity of situations. Decisions now seem to be driven almost exclusively by economic calculations, justified through media distortions, manufactured enthusiasm and “dreams” that inevitably shatter, generating frustration and further violence. When people come to believe that nothing is genuinely true and that principles are hollow words, then the fuse in their hearts is lit for new eruptions of intolerance and aggression.
In these situations, the issue of concrete safeguards to prevent future violence remains an open question. When a culture normalizes and justifies conflict, a dangerous pathway opens up, in that what seems unthinkable today may become acceptable tomorrow in the name of utility or security. In countries marked by serious social tensions, we cannot rule out the possibility that some leaders may consider armed conflict as an effective way of diverting attention from domestic problems and a cynical tool for managing difficulties.
A particular responsibility rests on the shoulders of those who work in the field of research. All the key players in this field — scientists, business owners, investors, academic authorities, politicians and others — must work with a transparent and responsible mindset, while maintaining an acute awareness of the broader context of the technological advancements they help to cultivate, including those related to AI. When people limit themselves to looking only at their own sector, they may deceive themselves into believing they are performing actions that are morally neutral and avoid questions about the ultimate ends that guide certain experiments. In this way, they risk cooperating — perhaps unknowingly — with questionable projects that fuel new forms of violence, manipulation and dominance.

Building the civilization of love

The construction of a world in a state of perpetual conflict is an evil and must be named for what it is. This way of portraying our current situation may seem bleak or pessimistic, yet I consider it necessary to do so. The Christian perspective, however, is not limited to denouncing evil. We view history in the light of the crucified and risen Lord, to whom the Father has given “all authority in heaven and on earth” (Mt 28:18). We do not consider the present as a predetermined fate, but an opportunity for personal and collective conversion. Moreover, we believe in the power of the Kingdom, which grows from the tiny size of a mustard seed, which, once sown, sprouts and grows (cf. Mk 4:26-32). While the tumult of confusion is all around us, goodness grows silently from the earth. In the words of the prophet Isaiah: “Behold, I am doing a new thing; now it springs forth, do you not perceive it?” (Is 43:19).
A closer analysis of history confirms this. Even in the darkest nights, the Lord raises up men and women who refuse to give up, who persevere in doing good, who protect the vulnerable and open pathways to reconciliation. The memory of the saints, righteous people and the oft-forgotten peacemakers, show us that grace does not magically eliminate conflict, but instead it inspires active resistance to evil and an astonishing creativity in doing good. Christians see the darkness and acknowledge it for what it is, yet they do not merely gaze upon it passively, for they know the light and understand that the darkness has not overcome it and cannot defeat it (cf. Jn 1:5). For this reason, even when suffering seems to have the last word, Christians serve the good and are sustained by a theological hope that gives reality both meaning and direction.

Cyber attacks

225. Cyberspace too has become a battleground. Cyberattacks, data manipulation and campaigns of influence, orchestrated with the help of AI, can destabilize entire countries even before open armed conflict erupts. Moreover, in this area, the attribution of responsibility is often uncertain. When it is unclear who carried out an attack, the risk of disproportionate reaction, miscalculation and escalation increases. For this reason, diplomacy must be capable of operating effectively in this new environment, negotiating shared regulations on the use of digital technologies, in order to protect civilians and the most vulnerable from “invisible” yet real forms of violence.
International organizations, particularly the United Nations, are essential instruments for promoting a civilization of love, for they can foster dialogue among nations and promote the peaceful resolution of conflicts, the integral development of peoples, the protection of the most vulnerable, disarmament and the care of creation. Through such efforts, the international community can work to reduce inequalities, defend the rights of refugees and minorities, reallocate resources from military spending to human development and protect our common home. The Holy See supports and accompanies these endeavors, while also recognizing that the current weaknesses of the UN and the international political system reveal the need for profound reforms. This is not simply a question of technical adjustments, for the crisis of convictions and values that also concerns the ethical foundations of nations makes it more difficult to direct multilateralism toward the true common good. [202]
In the international context, the Holy See’s diplomacy adopts the Gospel’s principle of mercy as a concrete criterion for political action. This is one of the ways in which the Holy See places itself at the service of humanity, thereby appealing to consciences in the name of charity and truth, defending the dignity of every person and speaking up on behalf of the poor, migrants and victims of war. In this way, papal diplomacy expresses the catholicity of the Church and contributes to the building of a civilization of love, where even new technologies can be oriented toward the common good.

Praying and hoping

These avenues for exercising responsibility are sustained by prayer, and in turn nourish prayer. Indeed, for each of us, peace primarily comes “from God, God who loves us all, unconditionally.” [203] It is a gift given by Jesus to his disciples on the day of Easter: “Peace be with you! It is the peace of the risen Christ. A peace that is unarmed and disarming, humble and persevering.” [204] With these words, I greeted the Church and the world on the day of my election to the See of Peter. I wish to repeat them now, and to invite everyone to pray for this gift. Let us never tire of praying for peace and of committing ourselves to achieving it in our relationships and in society.

CONCLUSION

“Let each builder choose with care how to build” (1 Cor 3:10). With these words, Saint Paul encouraged the Christians of Corinth to preserve unity. Dear brothers and sisters, we have reflected on the world we are building, and we asked ourselves what it means to safeguard the human person in the era of artificial intelligence. At the end of this reflection, I would like to propose a sober yet demanding program of Christian life with which we can navigate this epochal change in the light of the Gospel. This avenue emerges through contemplating God’s plan, living ecclesial unity by partaking of the Eucharist, building a world centered on the common good and praying in union with the Blessed Virgin Mary.

The Word became flesh

Our world is filled with attempts to seize control of markets and spheres of influence, often shrouded in reassuring rhetoric and seductive ideologies. Yet our hearts yearn for an approach that is wise and benevolent, akin to that which Mary praises in her Magnificat, when she proclaims that God’s mercy extends in every generation to those who fear him. [205] This plan of mercy continues to unfold throughout history today, even amid the rapid and unsettling changes brought by algorithms and global networks, and it becomes a compass in the digital era for living our lives according to the Gospel.
At the heart of everything is the mystery of the Incarnation, the Word who became flesh and dwelt among us. The flesh of the Son, poor and vulnerable, evokes the flesh of so many brothers and sisters stripped of their dignity and reduced to silence. [206] Through the Lord’s closeness, the gift of peace enters into the world in a paradoxical way. It does so through the power to become children of God, and is awakened when we allow ourselves to be moved by the tears of the little ones, the fragility of the elderly, the silence of victims and the struggle of those who fight against the evil they do not wish to commit. [207] In this wounded yet beloved flesh, the Father shows us the true humanity of a life fulfilled through openness and communion, which leads us to desire that his will be done on earth as it is in heaven. [208]
In the promises of transhumanism and some posthumanist currents of thought, which seek an enhanced and almost disembodied humanity, we recognize a yearning that is of concern to us, namely the need for a fuller life, less exposed to limitations and suffering. Yet the Incarnation opens a different pathway. On the one hand, old and new ideologies alike urge humanity to overcome limitations through technology, and to rise above others by asserting dominance. Contrary to this, the mystery of the Son of God entering into our human condition promises something quite different. The living God descends into our history in order to free us from all forms of slavery. [209] He takes upon himself our weakness and transforms it into a setting for salvation. There is no moment or human situation that is not worthy of God. “According to the teaching of our faith, we have and adore, in our mysteries, a God who is born in a manger, a God who lives and travels in Judea, a God who dies on the cross, a dead God who lies in the tomb.” [210] The future of humanity, therefore, finds its standard in the ability to welcome this divine way of drawing near, of sharing the burden of the world, of transforming relationships from within. “O wonder… man is God and this God-Man passes through all those stages, endures all those states and ennobles them, sanctifies them, deifies them in himself!” [211] What saves humanity is the divine love that descends into the most fragile point of our history and renews it from within.

3-5 year AGI time-frame likely; need to prepare

Sundar Pichai, CEO Google, May 22, 2026, Sundar Pichai Understands Why People Are Anxious About A.I., New York Times, https://www.nytimes.com/2026/05/22/podcasts/sundar-pichai-understands-why-people-are-anxious-about-ai.html

Roose: The last time we had you on, we asked you about A.G.I. and your feelings about the term. At the time, you responded that it didn’t really matter whether you reached A.G.I. or not, because the systems are going to be very, very capable, and Google’s strategy should be the same. I noticed that you did not say A.G.I. in your keynote. Demis [Hassabis] did, but you did not. What’s your relationship with the term A.G.I. today, and the idea that all of this progress is building toward something singular and world-changing?

Pichai: There is inevitable progress toward A.G.I. that’s happening. I have long understood it, otherwise I wouldn’t have pivoted the company 10 years ago to put that technology at the heart and center of the company. All I meant by that statement was that even in the scenario where A.G.I. is going to take 10 years, the technology — which is three years out — will be so much more powerful than what we have today that I don’t want people to think that because A.G.I. is 10 years out, you don’t need to act or prepare.

Roose: Are you A.G.I.-pilled?

Pichai: I absolutely am sure that the technology is making foundational progress toward A.G.I. I am less able to predict with certainty whether it’s in the three-to-five-year time frame or the five- to 10-year time frame. The rate of progress over the last one to two years has made me feel it’s on the closer side than not. In my role running one of the largest companies in the world, which has a responsibility to society, the language I choose to use around it might be different than other people’s. But 10 years ago on the I/O stage, I announced T.P.U.s and A.I.-first data centers. Yes, we clearly understood where this technology is headed.

Newton: As a last question, one of the more memorable phrases from the keynote this year came from Demis, when he said that we’re in the “foothills of the singularity.” Can you tell us concretely what that means from Google’s perspective? And should people be excited about that, or afraid, or both?

Pichai: I have had many conversations with Demis on this topic. In this context, he is defining “singularity” as the advent of A.G.I. If you believe that, it makes sense to you, that’s what you’re articulating. For him, that’s how he defines singularity. I think that myself and many others feel it’s important to articulate that, if that’s what you believe, because we are all at the frontier building this technology, and hopefully people are listening. It’s important as a society that we are internalizing that and getting ready for it.

Cyber attack risks growing

Andrew NG, co-founder Google Brain, Professor at Stanford, May 22, 2026, https://www.deeplearning.ai/the-batch/issue-354,

Cybersecurity Alarms Grow Louder

An AI-generated script to bypass two-factor authentication signals a dawning era of industrial-scale cyberattacks, according to a Google report

What’s new: Hackers used a large language model to identify a previously unknown vulnerability that made it possible for them to commandeer a widely used web administration tool, security researchers at Google reported. The researchers believe a criminal planned to use the technique on a large scale, and its discovery thwarted a broader attack. Their study outlines a variety of cybersecurity threats posed by the steady advance of large language models.

How it works: The Google team identified several ways in which large language models are making it faster and easier to execute cyberattacks. LLMs have aided cyberattacks before, and Anthropic recently warned that its Claude Mythos Preview model can find previously unknown vulnerabilities, but the report offers a catalog of up-and-coming approaches.

Morphing malware: LLMs can generate malware that evades detection by changing elements of its code. Such programs include a so-called mutation engine that, every time they replicate or infect a new system, rewrite their own decryption routines, swap commands for alternatives that accomplish the same results, add nonfunctional subroutines, and so on without changing their functions. This approach can evade antivirus detection while keeping malicious payloads intact, increasing the danger of attacks that steal data, install backdoors, or encrypt files.

Identifying logical flaws: Unlike tools typically used by cybersecurity professionals to find bugs in code, which often work by finding known patterns or bombarding it with random data until it breaks, LLMs can reason about what code is intended to do and apply that reasoning to identify logical flaws. This capability can discover vulnerabilities that are invisible to the usual tools and would require a focused review by human experts to find.

Obfuscation networks: Threat actors often orchestrate ad hoc sets of routers, servers, and specialized technology to hide their points of origin, cover their tracks, and bypass defenses. AI-powered tools can direct malicious traffic through multiple compromised intermediary servers while avoiding patterns that would alert typical security monitors.

Insecure AI infrastructure: AI infrastructure itself is becoming an attractive target for hackers. Beyond using AI to mask attacks, attackers increasingly target AI tools, models, and accessory software as entry points into networks. Compromising insecure components gives attackers a foothold to spread deeper into systems and steal data, deploy ransomware, or disrupt operations.

Behind the news: Security personnel and policy makers are reviewing defenses and governance measures in light of Claude Mythos Preview. Researchers at the cybersecurity firm Calif used that model to penetrate Apple’s famously sturdy security. Calif brought the exploit to Apple, which is working on a patch. Meanwhile, the United Kingdom-backed AI Security Institute (AISI) reported that Claude Mythos Preview and OpenAI’s GPT-5.5 could reliably execute attacks that would be expected to take humans 3 hours — substantially longer than their previous forecast of 1 hour. (At its debut, Claude Opus 4.6 was able to execute attacks that take people 30 minutes.) AISI’s tests limited the models to 2.5 million output tokens. When they allowed the models to use more tokens, the models were able to execute attacks that would take human attackers longer.

Why it matters: Google’s findings point to a widening gap between the ability of LLMs to find security vulnerabilities and widely used security methods. The report’s description of automated, industrial-scale attacks implies that next-gen LLMs may be able to exploit bugs faster than cyber teams can implement patches. Its findings may spur further federal scrutiny and complicate both regulatory and commercial efforts, as AI is both a defensive and an offensive tool as well as a prime target of attacks.

Compute is finite; the plan trades-off

Dave Blundin, May 22, 2026 [Co-Founder of DataSage, EverQuote (Nasdaq:EVER) & Vestmark. Founder of MIT’s first Neural Network AI company. Founder and General Partner of Link Ventures. Watch the MOONSHOTS podcast on YouTube!DB2. Co-Founder of DataSage, EverQuote (Nasdaq:EVER) & Vestmark. Founder of MIT’s first Neural Network AI company. Founder and General Partner of Link Ventures. Watch the MOONSHOTS podcast on YouTube], https://www.linkedin.com/posts/dave-blundin_when-you-talk-to-corporate-america-everyone-ugcPost-7463695262532816898-gw1x/?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAG2ccwBzv9cDlrg5l0-IxF3n_nDwytJiuE

When you talk to corporate America, everyone takes inference compute for granted. We’re all used to surplus. You go to the cloud, buy whatever you want, and it’s always right there waiting at a fair price. That will probably never be true again.

Two to three years from now, leaders are going to realize they can automate huge fractions of their business to promote growth, release new products, or even rebuild their full tech stack using Blitzy. But, there won’t be any excess compute for opportunistic projects.

I had a long conversation with Kush Bavaria from Ornn the other day. The smart executives are already using it to reserve future compute.

Compute is now finite commodity like oil or gold. The main difference is the sheer rate increase of token consumption, while the supply of new inference tokens is highly predictable and severely lagging. I believe this is the new normal forever hereafter.

Fast progress toward AGI

Demis Hassabis, CEO, Google DeepMind, May 21, 2026, https://www.linkedin.com/feed/update/urn:li:activity:7463305686215557120/

It was great to be at I/O again this year to share our latest models and capabilities on the path to artificial general intelligence (AGI). The staggering pace of AI progress is incredible, even for those of us who have spent our entire lives working on this technology.

A few highlights from our key announcements:

– Gemini Omni Flash: A major leap in world understanding and multimodal editing, Omni can take photos, video and audio, and create videos with entirely new cohesive scenes. Over time, Omni will be able to generate any output from any input.

– Gemini 3.5 Flash: Our most capable Flash model yet, it outperforms Gemini 3.1 Pro on coding and agentic tasks while being 4x faster than other frontier models – and 12x faster in Antigravity.

– Gemini for Science: A collection of experimental AI tools to help researchers streamline daily scientific tasks, like staying on top of newly published papers or generating and evaluating new hypotheses.

– CodeMender: Built on Gemini, our code security agent automatically finds and fixes critical software vulnerabilities. It’s now being tested by experts using our new API and we’ll be launching it more broadly soon.

– SynthID: OpenAI, Kakao Corp, and ElevenLabs are joining NVIDIA in adopting our imperceptible SynthID watermark for tagging and identifying AI-generated content. We’re looking forward to expanding to more partners and setting the standard of transparency for the AI era.

Agents and world understanding will be crucial aspects of achieving AGI. As we advance towards this, it’s important that we are clear-eyed about the potential challenges and use all the tools at our disposal to ensure the safety of our agentic systems, and ultimately AGI itself.

When we look back at this time, I think we will realise that we were standing in the foothills of the singularity. Built right and deployed responsibly, AGI will be a force multiplier for human ingenuity, and could unlock scientific progress and human flourishing beyond our current imagination.

Superintelligence defined

Robert Brooks, May 16, 2026 Lambda (Datacenter) Chief Commercial Officer, The myth of interchangeable AI compute, https://archive.thedeepview.com/p/the-myth-of-interchangeable-ai-compute

Brooks: Take the word “computer.” Today, it means a box. It used to mean a human job. “Intelligence” is similarly abstract. Most of us just picture really smart people. Put “super” in front of it, and the goal is to create a tool that goes beyond what any human has done in every domain, all at once. Edison was not necessarily smart in domains he never touched, [for example]. Combine human intelligence with that tool, and you get drug discovery, safer transportation, [and] faster movement through the economy. The mission for Lambda is to democratize that: superintelligence for all, not something locked inside one lab

Humans will merge with machines

Calder McHug, 5-15, 26, Politico Magazine, Silicon Valley Wants to Put a Chip in Your Brain, https://www.politico.com/news/magazine/2026/05/15/silicon-valley-ai-transhumanism-brain-data-00900799

here will come a time, in the not-so-distant future, when you decide to stick a computer chip in your brain. “Someone you work with will get it first. And you’ll hold out for a while, the way you did with the smartphone. But eventually, you won’t,” said Phoenix, dressed in all black with a tiny mic attached to his ear. “The advantages of integration will be hard to compete with.” Put bluntly, in his view, “We’re on the cusp of the next major transition, the merger of humans and AI.” This perspective, as outlandish as it may sound, is commonly held in Silicon Valley. OpenAI CEO Sam Altman mused way back in 2017 that “a merge is probably our best-case scenario” for survival after the emergence of superhuman AI. Tech billionaire Peter Thiel is a vocal advocate of “transhumanism.” There is good reason to be skeptical about an imminent evolution for the species. The technology to perform this kind of merger — to radically change what it really means to be human — remains in its earliest stages. Even setting aside the uncertain future of AI, the so-called implantable brain-computer interface (BCI) is still quite nascent. And yet, there are huge sums of money already sloshing around the technology, with more to come. The BCI market, currently sitting at around $350 million, is expected to reach $1.2 billion by 2035, according to Future Market Insights. That doesn’t include companies like the one Phoenix founded, named Vicarious, which sought to use core principles in the brain to build AI that could act like humans. Phoenix, who is now a venture capitalist, sold his company to Alphabet in 2022, after it was funded to the tune of $250 million by investors including Elon Musk and Mark Zuckerberg. The broader neurotechnology market is projected to expand to $52 billion by 2032, according to the Neurorights Foundation. With that much money at stake, not to mention the future of humanity, it should be no surprise that the political battle over our brainwaves is starting to heat up. That’s particularly true amid skepticism on both the left and right toward Silicon Valley’s relentless push for AI growth. Thus far, the use case for implantable devices is largely medical in nature. Noland Arbaugh, the first human to receive an implanted BCI from the Musk-founded Neuralink, is paralyzed from the neck down but is now able to use a computer with his mind. (Neuralink has reported 21 clinical trials of implantable BCIs in humans as of January, after years of trials with primates that often ended in gruesome fashion.) There are also less intrusive forms of BCI; wearable technology like smart glasses, fitness wristbands or stress-tracking apps are already widespread. These products are lucrative for their makers not just because of how many people are buying them, but because of what can also be extracted from customers: the extremely valuable commodity of neural data. Ownership of extensive neural data can be used to do anything from serve extremely targeted ads to surveil or manipulate consumers’ behavior. The tech billionaires who believe in the possibility of true human-AI integration may also see a chance to make some more money in the meantime. “If data is the oil of the 21st century,” former UNESCO Director-General Audrey Azoulay wrote in November in the Financial Times, “then ‘brain’ data is the crude oil. We need to guard it more jealously.” Within the burgeoning movement of neurorights advocates, there are real debates about how best to address companies worming their way into consumers’ brains. But there is broad opposition among advocates to the notion that AI and humans must become one for the species to survive — and broad concern about the private sector extracting neural data from consumers to speed that process along. “That logic strikes me as very twisted,” said Susan Schneider, the director of the Center for the Future of AI, Mind and Society at Florida Atlantic University, adding that AI should be developed “in a way that protects privacy and promotes human flourishing.” Ultimately the simmering debate over collecting neural data is about a lot more than just legislation on privacy. It hits at the very heart of what makes us human, now and in the future. Late last year, the neurotechnology company Kernel — founded by Bryan Johnson, a tech entrepreneur who insists he can live forever — published its quarterly newsletter. Alongside new product announcements, CEO Ryan Field wrote, “If you’ve been dreaming about building models on brain data, we’ve got the best solution for high-quality data collection at scale and are actively developing more advanced capabilities.” In an interview, Field said the company is selling technology that other companies can use to collect neural data and train large language models on. “[Our technology] gives very rich and powerful information you can use to train new models to do all kinds of different things,” he said. Field said they don’t sell neural data at Kernel, but that they want to collect as much of it as they can with consent to build their own wearable devices that better measure cognitive health and activity. “I’m looking for people who will exchange their brain data in exchange for some kind of compensation,” he continued, noting their work is in the research domain and includes clear consent forms. But as neurotechnology companies become more consumer-oriented and less focused on clinical trials, it’s not entirely clear what information they are allowed to collect from their users. A small group of lawyers, scientists and advocates are now trying to protect user data from being bought and sold without their consent. That is, if they can get on the same page with one another. Regulation of neural data at any governmental level remains in its infancy, but some states have begun to take up measures to stop companies from having access to certain kinds of information. In Colorado, California and Connecticut, legislators have amended existing privacy statutes to include information that is generated by a “consumer’s central or peripheral nervous system.” Montana has gone a step further, requiring consent, access, deletion and destruction obligations around any neural data. And Minnesota is in the midst of considering a broad, “neurodata rights” framework, which includes more than just an expansion of privacy protections. At the federal level, Senate Minority Leader Chuck Schumer (D-N.Y.), along with Sens. Maria Cantwell (D-Wash.) and Ed Markey (D-Mass.), introduced the MIND Act last September. The bill would direct the Federal Trade Commission to study how neural data can reveal thoughts, emotions, or decision-making patterns — and how it and related data should be regulated. Many of these proposals have been written in part by the Neurorights Foundation, a global advocacy group that reports funding from the Omidyar Network and the Alfred P. Sloan Foundation. The group, begun by neuroscientist Rafael Yuste in 2022, is trying to build regulations protecting brain data across Europe, Latin America and the United States. The foundation says they are currently working in nine other U.S. states to pass similar legislation to what they’ve already pushed across the finish line in California, Colorado, Montana and Connecticut. The group is made up in large part by technologists who have no fundamental opposition to building neurotechnology but are also concerned about safety. They aim for a targeted approach to regulation: expanding existing privacy laws so as to protect consumers’ neural data, without stifling innovation. “Neurotechnologies have wonderful and profound implications to enhance human flourishing,” said Stephen Damianos, the CEO of the Neurorights Foundation. “Without common sense regulations and safeguards, there is risk that humanity will never benefit from what these technologies have to offer. [That’s] because of heavy handed regulation that will come in to correct for harms that can and will occur because of enormous lack of public trust, because of scandals and actual instances of the technologies harming people.” It’s an argument tailored for the industry and for the safety-conscious alike: Consider some reasonable safeguards now, so there aren’t angry mobs later. Damianos appears eager to head off the kind of caustic battles that have emerged over other AI-related issues like data center buildouts or LLM usage. Not everyone agrees with the Neurorights Foundation’s approach. Nita Farahany, a professor of law and philosophy at Duke and a leading scholar on emerging technologies, believes questions of neural data should be treated separately from other privacy issues rather than simply amending existing privacy law. “The most intimate data is the data about what you’re thinking and feeling that could be gathered through neural data,” said Farahany. Even amid disagreement over how to treat data privacy issues more broadly, she said carving out distinct rules for what’s in our brain waves could be possible. BEIJING, CHINA – MARCH 19: The Beinao-1, a semi-invasive brain-computer interface system, is displayed during a press conference at the Chinese Institute for Brain Research on March 19, 2026 in Beijing, China. (Photo by Kevin Frayer/Getty Images) The Beinao-1, a semi-invasive brain-computer interface system, is displayed during a press conference at the Chinese Institute for Brain Research on March 19, 2026 in Beijing, China. | Kevin Frayer/Getty Images Others worry that advocacy aimed specifically at protecting our innermost thoughts is the wrong approach. After all, even if most people are not close to having an implanted chip in their brain, wearable neurotechnologies — from smartwatches to sleep- and stress-tracking rings — are here and corporations are already gobbling up that data. “Neural data can’t really reveal our private thoughts at the moment, so why are we raising the alarm about this right now? Anything that neural data can really reveal today can be revealed through other means,” said Anna Wexler, the principal investigator of the Wexler Lab at the University of Pennsylvania, where she studies the ethical, legal and social issues surrounding emerging technology. That doesn’t mean regulation isn’t needed, but that it should include information that’s already being collected by companies that specialize in wearable technology. “Maybe it’s worth creating new laws or new legislation, but that shouldn’t be specific to neural data,” Wexler said. “Maybe it should more broadly capture inferences about mental states.” Those in the industry bristle at the notion of having to abide by a series of different state laws governing neural data. Field, the CEO of Kernel, argues any regulation should be done at the federal level. “For innovation to happen in this space, we can’t be navigating 50 different legislative agendas,” he said. “Let’s get the right stakeholders involved, so that you have actual subject matter experts and not just science fiction enthusiasts writing laws.” This approach echoes the Trump administration’s broader stance on AI, which critics say amounts to letting industry run wild. Proponents of establishing neural data restrictions, many of whom are scientists themselves, insist that the companies working on neural data products are using the idea of competition with China and a potential patchwork of state legislation as a cudgel to shut down any and all regulation. The debate remains fluid, in part because the field is still so nascent. The back-and-forth over some of these proposals is even unknown to many working in the industry. Phoenix insisted that he broadly believed in privacy protections, but that he hadn’t heard of any of the specific state legislation around the brain. In an interview with Ross Douthat last year, Thiel was notably hesitant when asked whether the human race should survive. He eventually said yes, before adding, “But I also would like us to radically solve these problems. And so it’s always, I don’t know, yeah — transhumanism. The ideal was this radical transformation where your human, natural body gets transformed into an immortal body.” Thiel is not alone among tech titans who are increasingly talking about the idea that “humanity” might not look much like humankind as we know it. “The next era of human is here,” Johnson, the anti-aging guru doing everything he can to his body to extend his lifespan, said in November. In January, Anthropic CEO Dario Amodei wrote, “I believe we are entering a rite of passage, both turbulent and inevitable, which will test who we are as a species.” This turns the already significant question of how best to maintain the privacy of our brains into an even more fraught discussion about the future of human life. Phoenix says that the word “transhumanism” is not a particularly useful phrase, but he is absolutely advocating for the merger of man and machine, and argues that failure to do so would inevitably mean a powerful AI would destroy or enslave humans. “We either get on the train, or we are left behind in a way that’s profoundly bad for us,” he said. “I don’t think we are going to be able to control a God brain. I think we have the opportunity to humanize it.” These ideas have produced fierce opposition from across the political spectrum. “You’ve demoralized an entire generation, and told them that they can look forward to basically being pets to the machines or to billionaires with machines,” Joe Allen, a social conservative and contributor to Steve Bannon’s War Room, said in an interview last year. “If that actually comes true, nightmare.” A protester holds a placard stating that ‘Musk murders monkeys’ at his company Neuralink during the demonstration. Protesters gathered outside the Tesla Centre in Park Royal as part of the Tesla Takedown Global Day of Action against Tesla and Elon Musk. A protester holds a placard stating that ‘Musk murders monkeys’ at his company Neuralink during the demonstration. Protesters gathered outside the Tesla Centre in Park Royal as part of the Tesla Takedown Global Day of Action against Tesla and Elon Musk. | Vuk Valcic/SOPA Images/LightRocket/Getty Images Plenty of critiques come from the left as well, with some arguing that the full-throttled embrace of AI only benefits a small group of the tech elite. “You think they’re staying up nights worrying about working people and how this technology will impact those people?” Sen. Bernie Sanders (I-Vt.) said recently. “They are not. They are doing it to get richer and even more powerful.” In general, AI accelerationists have a lot going for them at the moment. They have an ally in the White House, and an almost unlimited war chest ahead of the midterms. But public opposition to AI is real; an April POLITICO poll showed that only 13 percent of people believe the government should not regulate AI at all — a number that’s largely consistent across party affiliation. According to Silicon Valley’s leading evangelists, it’s only a matter of time before chips are implanted in all of our brains. Perhaps they are fooling themselves, or perhaps they just see a chance to make a lot of money. But for many, this is their honest conviction. For advocates and others concerned about privacy, no matter what the future looks like, it shouldn’t solely be determined by tech companies with a profit motive. Otherwise, humans lose a different kind of agency. In fact, a political backlash to AI or to the massive collection of neural data could endanger the very dreams of those hoping to build a new world. “Transhumanists that I know are very worried that their well-intentioned views on human flourishing could instead not be realized because of technosurveillance and human rights abuses,” said Schneider of the Center for the Future of AI, Mind and Society. “Thought data is the most intimate and private data there is,” she added. “If and when abusive platforms gain control of our thought data and misuse it — and use it to manipulate our behavior unbeknownst to us — we will have ruined the very transhumanist prospect from flourishing.”

US AI dominance critical to global leadership, the protection of democracy, and AI safety

Anthropic, May 14, 2026, 2028: Two scenarios for global AI leadership, https://www.anthropic.com/research/2028-ai-leadership

We’re releasing a new paper that explains our views on the competition on AI between the US and China.

It’s essential that the US and its allies stay ahead of authoritarian governments like the Chinese Communist Party, or CCP. AI will soon become powerful enough to be used to repress citizens at unprecedented scale, and even to alter the balance of power among nations. And since AI is advancing more quickly by the day, we have only a limited period of time to set the conditions of the competition—and determine whether and how those threats materialize. It’s with this in mind that we outline what’s required to ensure America stays ahead.

The most important ingredient for developing AI is access to the computer chips on which the models are trained (or “compute”). Since the most capable chips are developed by American companies, the US government currently limits China’s supply by enforcing tight export controls on them. Recent history suggests these controls have been incredibly successful. In fact, AI labs in China have only built models close in intelligence to America’s because of their talent, their knack for exploiting loopholes around these export controls, and their large-scale distillation attacks that illicitly extract the innovations of American companies.

In this post, we present two scenarios for what the world might look like in 2028, when we expect transformative AI systems to have arrived.

In the first scenario, America has successfully defended its compute advantage. Policymakers have acted to tighten export controls further, disrupt China’s distillation attacks, and further accelerate democracies’ adoption of AI. In this world, democracies set the rules and norms around AI. It’s also in this scenario that we’re most likely to successfully engage with China on safety, which we’re supportive of to the extent this is possible.

In the second scenario, America has chosen not to act. Policymakers have not tightened loopholes on the CCP’s access to compute, and AI firms in China have quickly taken advantage—catching up to the frontier and even overtaking America. In this world, AI norms and rules are shaped by authoritarian regimes, and the best models enable automated repression at scale. It will be no solace that this authoritarian triumph has happened on the back of American compute.

America and its allies approach AI competition from a position of great strength. The tools for AI dominance have been built by an exceptionally innovative ecosystem of companies in democratic nations. Our past success means that our present task is largely to avoid squandering our advantage: to decide not to make it easier for the CCP to catch up.

Two scenarios for the US and China in 2028

Summary

Democracies, not authoritarian regimes, must lead in AI development and deployment. These countries and political systems can shape the rules and norms that govern these systems.

Democracies currently hold a substantial lead in compute, the most important ingredient for developing frontier AI models. That lead exists thanks to American and allied innovation, and to bipartisan US export controls that defend those innovations. But on model intelligence, AI labs in the People’s Republic of China (PRC), under the jurisdiction and control of the Chinese Communist Party (CCP), are not far behind. We focus on the CCP as it is the regime that is most able to use frontier AI to cement authoritarianism; we do not seek to undermine the interests or ingenuity of the Chinese people. Already, the CCP is using AI to censor speech, repress dissidents, hack governments and corporations across the world, and strengthen the People’s Liberation Army (PLA).

AI labs in China have world-class talent. It is compute constraints that limit their ability to keep up. Labs in China have remained close by exploiting loopholes in US export control policies, and by carrying out large-scale distillation attacks that harvest the innovations of US models in order to mimic their capabilities.

With the supply of compute expanding rapidly, and with AI being used increasingly to augment the training of new AI models, we’re entering a period of great acceleration in AI capabilities. The “country of geniuses in a data center”—the level of intelligence we associate with transformative AI—may be close at hand. This acceleration makes policy action more urgent. To date, by allowing export control evasions and distillation attacks, we have let the CCP’s AI efforts trail closely up the frontier curve. But if the US and its allies act now to address both issues, it may be possible to lock in a 12-24 month lead in frontier capabilities. A lead that large by 2028 would be enormously advantageous. Such a lead would also augment efforts to engage with AI experts in China on AI safety and governance, which we support. But the window of opportunity to lock in that lead will not necessarily remain open for long.

Here, we present two potential scenarios for the state of US-China AI competition in 2028. The first scenario is one in which democracies have established a commanding lead in model intelligence, adoption, and global distribution. This scenario can be achieved if policymakers act now to tighten controls on advanced compute to PRC labs, disrupt their efforts to distill America’s best AI models, and accelerate democracies’ adoption of AI.

The second scenario is one in which the CCP is competitive at the near-frontier. This scenario happens if policymakers don’t build on our existing lead, or if they loosen restrictions on access to compute for PRC firms.

Many in Congress and the Trump administration have championed export controls, curbing distillation attacks, and exporting American AI. In advancing these policies, we are hopeful that democracies can secure a commanding lead by 2028, and avoid a destabilizing neck-and-neck race with the CCP two years from now.

The imperatives of staying ahead

We expect frontier AI to have transformational economic and societal impacts in the coming years, as described in Machines of Loving Grace and The Adolescence of Technology. Our mission is to ensure that humanity navigates the transition to transformative AI safely and beneficially. We believe that a successful transition can lead to astonishing breakthroughs in medicine, invention, and economic growth.

The threat of authoritarian AI

Whether that transition goes well depends in part on where the most capable systems are built first. The political systems in which the most advanced AI is created will shape the rules and norms for how the technology is developed and deployed. In turn, those rules and norms will help determine whether the technology is safe, whose security it protects, and whose interests it ultimately serves. We believe that responsibility should rest with democratically elected governments, not authoritarian regimes.

If the frontier is set by regimes that treat AI as an instrument of repression, military advantage over democracies, and domestic control, the transition is less likely to go well, for those regimes’ own citizens or anyone else.

Historically, the reach of authoritarian rule has been limited by its dependence on human enforcers to carry out surveillance and repression. Powerful AI systems may remove that dependency, enabling automated repression on a far greater scale. For that reason, the prospect of the CCP leading in AI is among the greatest threats to a successful transition.

The CCP holds enormous power and influence at the helm of China’s economy, military, and the largest authoritarian state structure on Earth. It is also the only country besides the US with well-resourced, highly talented AI labs chasing the frontier. Furthermore, the CCP is highly motivated to establish China as the leading AI power. Beijing has poured tens of billions of dollars into China’s AI and semiconductor sectors.

Already, the CCP uses AI systems to censor speech, enforce draconian policies on ethnic minorities, and hack major corporations and government agencies. The CCP’s vision of AI-enabled techno-authoritarianism has been extensively documented in Xinjiang, where state security agencies have systematically deployed facial recognition technology, biometric data collection, and communications surveillance, enabling repression at a scale that humans alone could not achieve. Frontier AI systems will make those capabilities cheaper to maintain, far more pervasive, and more sophisticated. The CCP’s export of these technologies has enabled autocrats in other countries to more effectively stifle dissent, entrenching authoritarianism. A CCP-led AI frontier could dramatically strengthen repression around the world.

AI is a dual-use technology

Frontier AI will shape the future military balance. CCP leadership already operates on that premise, and is building its military for an AI-enabled battlefield. PLA strategists view the “intelligentization” of their military forces as the means with which to catch up and eventually surpass the US military. The PLA is already procuring commercially developed Chinese AI systems for military use, including DeepSeek models deployed to coordinate swarms of unmanned vehicles and enable cyber offense capabilities. These capabilities will not diffuse slowly. When a new model reaches a new capability in autonomous targeting, vulnerability discovery, or swarm coordination, for example, the regime that controls it can put it onto the field in weeks, not years.

The risk compounds because frontier AI will be an accelerant for other critical technologies. Advanced AI models will be able to compress research and development (R&D) cycles in semiconductors, biotech, and advanced materials. A lead in frontier AI will enable a widening lead across the full national security technology stack.

If a PRC AI lab had developed a model at the level of Claude Mythos Preview before an American one, the CCP would have had first access to a system that can autonomously discover and chain software vulnerabilities, which it could have used to further penetrate critical American infrastructure. Future models will be exponentially more capable, and therefore have commensurately greater implications for the national security interests of the US and other democracies.

Neck-and-neck competition risks disincentivizing responsible AI

A neck-and-neck race between American and Chinese AI labs could make industry and government-led safety and governance efforts more difficult, and less likely. If PRC labs are either close behind or at par with models in the US, private AI firms in the US and China are likely to feel more pressure to release new models and products faster, without taking prudent pre-deployment safety measures. Governments could become reluctant to enact policies to encourage responsible AI development and deployment, for fear of falling behind.

While increasing numbers of researchers in China’s AI labs and policy community are concerned with AI safety risks, this trend has not translated into safety practices on par with labs in the US. As of last year, only 3 out of 13 top Chinese AI labs published any safety evaluation results, and none disclosed evaluations for Chemical, Biological, Radiological, and Nuclear (CBRN) risks. The Center for AI Standards and Innovation (CAISI) found that DeepSeek’s R1-0528 model complied with 94% of overtly malicious requests under a common jailbreaking technique, compared with 8% for US reference models. This pattern has continued in more recent releases. For example, an independent assessment of Moonshot’s Kimi K2.5 published in April found that the model failed to refuse CBRN-related requests at a far higher rate than US frontier models. Compounding the problem, labs in China often release dual-use capable models as open-weight. Once a model is open-weight, safeguards that do exist can be removed, making the model available to any state or non-state actor to use for malicious purposes, including the cyber and CBRN misuse those safeguards were built to prevent.

Our policy objective: creating and maintaining a lead for democracies

We support policies in the US and other countries that build and maintain a safe, near-term lead over the CCP in intelligence, domestic adoption, and global distribution. This lead is key to avoiding authoritarian AI leadership and protecting the national security interests of the US and other democracies. Doing so is a fundamental prerequisite to ensuring that democratic states can achieve favorable terms with authoritarian states.

Anthropic deeply respects the Chinese people and the accomplishments of the Chinese AI community. We hope for peaceful relations between China and the world. Our concerns are specifically with the risks to humanity posed by any powerful authoritarian political systems with access to frontier AI systems.

Opportunities for engagement on AI safety

Anthropic supports international AI safety dialogue with AI experts in China, when possible. The world has a vested interest in safe AI, regardless of where it is developed and deployed. There are a range of risks that could emerge from frontier AI systems requiring engagement between the US and China. Efforts that identify shared challenges and advance ideas to prepare for and mitigate these risks are in our shared interests.

The prospects for productive engagement are best when the US maintains a large capabilities advantage. Responsibly building a lead in developing and deploying the most advanced AI augments our ability to influence AI safety in China and elsewhere.

The Mythos Preview wake-up call

Mythos Preview, a model that we released to select partners as part of Project Glasswing in April, signals the arrival of an acceleration period that makes policy action even more urgent. With access to the model, Firefox was able to fix more security bugs last month than it had in all of 2025, and almost 20 times more than its monthly average security bug fixes in 2025. In response to the model, one PRC cybersecurity analyst wrote that China is “still sharpening our swords while the other side has suddenly mounted a fully automatic Gatling gun.”

Frontier AI capabilities will quickly approach the “country of geniuses in a datacenter” portrayal of transformative AI. This acceleration will be driven by the logic of scaling laws, in which model performance improves predictably with increases in computing power and data inputs, and by AI itself increasingly being used to accelerate the development of new models.

There is a high likelihood that we will look back on 2026 as the breakaway opportunity for American AI. American labs have the most advanced AI models, a large lead in both the quantity and quality of the advanced AI chips required to push the frontier, and a colossal capital advantage from revenues and financing to back the necessary investments to achieve it. PRC labs have real strengths: world-class, innovative talent, abundant and cheap energy, and plenty of data. All are requirements for developing frontier intelligence. But they simply do not have sufficient domestic compute to compete, nor do they have the revenues and capital to fund it.

Four fronts of the competition

The US and China are engaged in a competition for strategic advantage in frontier technologies like AI. Statements from both Beijing and Washington reflect that view. Calling that competition a “race” can give the false impression that there is a finish line, after which one side will conclusively secure victory. Rather, the competition will be an ongoing contest for advantage, in which either democracies or authoritarian regimes successfully position themselves to shape the values, rules, and norms of an AI-enabled future.

This competition is playing out on four fronts:

Intelligence: which countries develop the most capable AI models.

Domestic adoption: which countries integrate AI most effectively across commercial and public sectors.

Global distribution: which countries deploy the global AI stack on which the world economy runs.

Resilience: which countries sustain political stability through the economic transition.

Intelligence is the most important of the four fronts. We anticipate that frontier model capabilities will drive the most consequential changes for geopolitical competition. Model capabilities are also a primary driver of market adoption and global distribution.

But intelligence alone is not sufficient. If the CCP integrates near-frontier AI systems quicker and more effectively into China’s economy and the CCP security apparatus, and drives global adoption of subsidized, low-cost AI, then it could secure advantages over democracies that overcome an intelligence deficit. Beijing’s AI+ Initiative and its focus on “embodied intelligence” accordingly put high priority on policies that advance the integration of frontier intelligence into their economy and state apparatuses. The Trump administration’s AI Action Plan, and its focus on “promoting the export of the American AI technology stack,” also speaks to the strategic advantage of driving global adoption.

While we won’t focus on it in this essay, we believe resilience will be an important front of AI competition. Being able to sustain stability, cohesion, and good policymaking in this period will be a critical advantage, and a vulnerability for those who cannot.

The state of the competition

Compute—the advanced semiconductors needed to train and deploy frontier AI—is an essential input on each front of the competition described above. The race for global AI leadership is in large part a race for compute. For more than a decade, model capability has scaled with compute, and the majority of performance gains in AI capabilities have historically come from simply using more of it. Moreover, compute is needed to serve customers’ use of AI (also known as “inference” capacity), not just to train new models. Compute will be critical both for training the most intelligent models and for deploying them in commercial and national security spheres. Access to top talent, copious amounts of data, and critical algorithmic advances all matter to the race for intelligence—but each of those inputs is irrelevant if the compute is insufficient.

Democracies are winning the competition for compute leadership today. While some worry that export controls could accelerate the CCP’s own efforts to develop an advanced chip supply chain, little evidence suggests that China’s indigenization efforts will challenge US and allied leadership in advanced compute technology. Beijing has invested enormous resources into China’s chip sector, with major industrial policy initiatives like the Made in China 2025 strategy and the China Integrated Circuit Industry Investment Fund launched years before the imposition of export controls. Despite this state-backed investment, PRC AI labs and chipmakers remain stymied by US and allied export controls on advanced chips and chipmaking equipment.

As a result, the compute gap appears to be widening. An analysis of Huawei and NVIDIA’s roadmaps found that Huawei will produce just 4% of NVIDIA’s aggregate compute in 2026 in total processing performance, and 2% in 2027. Moreover, NVIDIA represents only part of the US and allied compute ecosystem, with Google and Amazon ramping up production of their own chips (TPUs and Trainium, respectively) to meet demand from American frontier AI labs and their customers.

Further exacerbating their compute shortfalls, China has made little progress in many of the most technologically complex segments of the semiconductor supply chain. Without access to extreme ultraviolet (EUV) technology, and even more so if policymakers can close loopholes on deep ultraviolet (DUV) technology and servicing and maintenance thereof, China’s chipmakers will remain unable to manufacture chips in sufficient quantity or quality to challenge US compute leadership. China’s inability to manufacture high-bandwidth memory at scale further exacerbates this gap. If the US strengthens its restrictions on the CCP’s ability to access US compute, one study estimates that America will have access to roughly 11 times more compute than China’s AI sector.

How democracies built the lead: commercial innovation and smart public policy

There are two main reasons for the compute lead. The first is the incredible innovation of companies like NVIDIA, AMD, Micron, TSMC, Samsung, ASML, and others across democracies like Japan, South Korea, Taiwan, the Netherlands, and the US, who together have built the unique technologies in the world’s most advanced semiconductors. Today’s AI achievements would not be possible without the feats of engineering and decades of sustained R&D investments that contributed to these products.

The second reason is forward-looking, decisive policy action across the last three presidential administrations. Bipartisan policy action has protected the US and allied innovation engine by restricting access to the US AI stack by PRC firms under the jurisdiction of the CCP. Our CEO has publicly commented on the importance of export controls, for example. These controls have curbed the sale of the highest-end AI chips and semiconductor manufacturing equipment (SME) to China over the last several years, constraining China’s frontier AI development even as Beijing has poured enormous state resources into the sector. Without action to limit China’s access to US compute, the CCP would have had all the ingredients to develop AI at par or superior to America’s.

Some observers worry that constraining access to compute will force AI labs in China to innovate on other axes, reducing the American lead. While PRC labs are innovating, these innovations are so far not sufficient to overcome their compute deficit. Algorithmic improvements are both a function and a multiplier of compute, not a substitute for it, and discovering those advances is itself a compute-intensive process: more compute enables labs to run more experiments, which enables labs to discover more algorithmic improvements. As frontier models increasingly conduct AI R&D themselves, that loop will tighten further, and frontier models will help build their own successors. In short, compute advantage compounds into algorithmic advantage, and from there into a durable lead in AI itself.

Today, US frontier systems are estimated to be at least several months ahead of the top models from PRC AI labs on intelligence, though these estimates are necessarily uncertain. Despite the attention paid to open-weight models from China, their enterprise adoption lags closed frontier models, and monetization concerns have surfaced among public investors. Moreover, AI labs in China seem to be moving away from open source, now choosing to keep their best models proprietary.

China’s own AI leaders confirm the impact of export controls, and the critical need for US chips. Executives at top PRC AI labs have expressed worries that China will fall further behind due to compute constraints. Top Chinese labs cite compute scarcity as a chief constraint to accelerating model capabilities, and they identify export controls as the reason for this constraint. One executive of a China-based hyperscaler called the impact of supplying export-controlled US chips to China “huge, really huge,” adding that any supply gap severely impacts China’s AI development and dismissing concerns that importing U.S. chips would slow their self-sufficiency efforts. The primary voices in China suggesting export controls are futile seem to be CCP officials and state media, likely angling to influence US policymakers.

How the CCP stays competitive: policy loopholes remain

While export controls have been effective in providing today’s advantage, they have not gone far enough. Despite the CCP’s inability to manufacture enough advanced chips domestically or purchase them legally abroad, AI labs in China have been able to stay close on intelligence through two workarounds: illicit and evasive compute access, by smuggling AI chips directly into China and accessing offshore data centers, and illicit model access, through which they carry out distillation attacks on US frontier models and use those same models as tools to accelerate their own AI R&D.

China’s evasion of US export controls is an open secret. For example, federal prosecutors charged a Supermicro co-founder and two others with diverting $2.5 billion worth of servers containing advanced US chips to China. According to US government and media reports, DeepSeek trained its latest model on advanced US chips that are banned from sale to China. The Financial Times reported that Alibaba and ByteDance now train their flagship models on export-controlled US chips in data centers located in Southeast Asia, a route current controls do not reach because US export law covers the sale of chips, not remote access to them.1 The US export control system is struggling to prevent PRC AI labs’ access to advanced US-origin compute.

Distillation attacks, in which China-based labs create thousands of fraudulent accounts to circumvent access controls on US AI models and systematically harvest their outputs to replicate frontier capabilities, are another illicit technique used by PRC labs to catch up to their US counterparts and blunt the impact of export controls. The practice allows labs based in China to free-ride on decades of foundational research, billions of dollars in US investment, and the work of thousands of the world’s best engineers that produced US frontier models. The result is near-frontier capability at a fraction of the cost, subsidized by the United States. It is systematic industrial espionage of a technology critical to long-term US national security interests. OpenAI, Google, Anthropic, and the Frontier Model Forum have all publicly condemned the practice of distillation attacks.

AI experts in China openly acknowledge distillation attacks’ scale and importance to China’s AI development. A recent article in a state-owned media outlet described distillation attacks on US models as the “back door” China’s AI labs depend on as a core part of their business model. An ex-ByteDance researcher said that PRC AI labs use distillation as a shortcut to train models, allowing them to avoid investing into their own data pipelines.

US policymakers have moved quickly to address this threat. The White House Office of Science and Technology Policy published a memo on distillation attacks. Senior officials in the White House, Department of War, and members of Congress have also called attention to this problem. Recent legislation from the House Foreign Affairs Committee to address distillation attacks passed out of committee unanimously.

If policymakers in the US and allied democracies act to close these two channels propping up China’s AI models—illicit and evasive compute access and illicit model access—then we have a potentially once-in-a-generation opportunity to secure our lead.

Two scenarios for 2028

Below, we describe two hypothetical future scenarios to help illustrate how policy actions taken today can shape where we are in 2028.

Scenario one: America and our allies have a commanding and expanding lead

America’s compute edge remains strong. Despite increased state support for China’s semiconductor industry, China’s chipmakers remain years behind their US and allied counterparts, stymied in part by their inability to access advanced SME tooling, servicing, and maintenance. The US-PRC compute gap is widening as increased US and allied chipmaking capacity comes online and as advanced chipmakers continue to innovate on more efficient and performant chips. In tandem, US policymakers have taken action to close loopholes in the US economic security toolkit, and efforts to smuggle chips into China and access export-controlled chips in data centers outside the country are increasingly frustrated by well-funded enforcement efforts.

Consequently, US AI models are 12-24 months ahead on intelligence, and the lead is growing. A small number of AI labs lead at the frontier with the most intelligent, capable, and performant models. All are based in the US. The “country of geniuses in a data center” has become a reality across critical industries, including cybersecurity, finance, healthcare, and life sciences. When US frontier labs release new models in 2028 that achieve step-function advances in capabilities (similar to the relative impact of Mythos Preview in April 2026), China will not have access to similar AI capabilities until 2029 or 2030. This gives critical breathing room for democracies to set the rules and norms of frontier AI systems.

American AI is the backbone of the global economy, driving new economic and scientific dynamism. The Trump administration’s efforts to drive domestic AI adoption and promote the export of American AI are succeeding, and the resulting gains from the adoption of powerful AI both at home and abroad are driving unprecedented economic growth and technological advancements. Global adoption of US AI has skyrocketed. Democracies’ lead in capabilities and compute means that China’s AI firms do not compete for global market share outside of a narrow group of autocracies. The world’s top frontier AI systems are shaped by democratic values and make it more difficult for authoritarian states to use AI systems to infringe on rights and civil liberties.

Cyber and other national security advantages expand. Public and private sector cyber operators and security professionals use advanced AI systems to reduce the attack surface in America and other democracies and blunt the CCP’s ability to gain and maintain cyber footholds in our systems, making our national security assets, IP, and communications networks more secure. The United States’ overwhelming AI advantage is a powerful deterrent to aggression.

A self-reinforcing cycle compounds democracies’ leadership. A commanding AI advantage makes the United States and its allies more attractive partners. That alignment expands both the market for American AI and the coalition setting global AI norms, which in turn promotes the development and deployment of AI systems that are safe, secure, and protective of civil liberties. The world’s top technical and scientific talent continues to gravitate to where the frontier is being built. The United States gains significant leverage with which to incentivize cooperation from Beijing on critical issues like AI governance, strategic competition, and trade. This cycle reinforces itself: the lead strengthens the coalition, the coalition strengthens the lead, the democracy-led international order is anchored through the transition to transformative AI.

Scenario two: The CCP-controlled AI ecosystem is neck-and-neck

AI developed and deployed in China is near-frontier on model intelligence. Despite a weak semiconductor production capacity, models trained by PRC AI labs are only a few months behind US models. Ongoing distillation attacks, overseas compute access, weak SME export enforcement, and a loosening of export controls on American semiconductors have assisted CCP efforts. Continued access to US frontier AI for AI R&D has also enabled AI labs in China to close the gap and approach parity with their US counterparts.

Rapid commercial and state adoption. Beijing has championed a whole-of-nation push on domestic adoption via “AI+” policies. Even though China’s AI models are slightly less capable than US models, CCP efforts to accelerate adoption have paid off. China is thus able to deploy near-frontier AI capabilities more advantageously across economic, military, and technological domains, shifting the balance of power in China’s favor.

The CCP’s AI-enabled cyber force is a serious threat. The CCP’s integration of AI-enabled cyber capabilities within an already advanced cyber force has sustained the PLA as a menacing cyber competitor. PLA cyber actors have gained additional access to critical and dual-use infrastructure in the US and most countries around the world, enabling them to disrupt critical national security and societal functions. As AI is incorporated deeper into our most critical systems, democracies enjoy no security advantages over China in AI, despite having developed the technology first.

Beijing is winning in global adoption on cost and on-prem flexibility. Huawei and Alibaba data centers are globally prevalent, especially in, but not limited to, lower cost markets in the Global South. These data centers scale on older chips, which China is able to export because it can serve its domestic market with a combination of US chips purchased with an export license, smuggled into China, or remotely accessed in overseas data centers. They host second-tier, but cheaper and still effective models produced by PRC labs. Similar to the Huawei playbook of being cheap and “good enough,” China’s near-frontier models and hardware support a non-trivial and rapidly growing segment of the global economy. This infrastructure advantage gives CCP leadership significant influence over those markets.

Ensuring democracies lead

To ensure we land in scenario one, we support the following areas of policy action.

Close the loopholes: Smuggled chips, foreign data center access, and SME. Today, PRC labs benefit from access to export-controlled American chips via smuggling and foreign data centers, and gaps in SME controls accelerate their self-sufficiency efforts. Tightening controls and ramping up enforcement budgets can help close these loopholes that prop up the CCP’s AI ecosystem. It would lower China’s compute ceiling and correspondingly slow their AI advances, thus sustaining and expanding democracies’ AI lead. Note that a lower compute ceiling could also materially impair distillation attacks, as AI labs in China still require a minimum threshold of compute to illicitly distill effectively.

Defend our innovations: Restrict model access and deter distillation attacks. Policymakers in Congress and the executive branch can continue to support policy actions to punish and disincentivize distillation attacks from PRC labs, while also taking steps to facilitate US labs’ ability to detect and prevent distillation attacks on its own. These could include a legislative clarification that distillation attacks are illegal, and efforts to facilitate threat intel and technical sharing between peer American labs as well as with the US Government. Curbing this behavior can materially extend a democratic lead in the coming months and years.

Champion the export of American AI. As public and commercial sectors around the world increasingly adopt AI, the Trump administration should continue its efforts to promote the global adoption of trusted AI hardware and models developed and shaped by democratic principles. Locking in trusted American infrastructure now denies the CCP’s AI ecosystem the global footholds it needs to compete on cost and adoption in the future.

Conclusion

America and its allies have developed both the world’s most capable frontier AI models and the world’s most advanced inputs to AI. This has provided a substantial advantage. If our superior access to that technology is defended, that advantage can be extended. But it will be lost if it is given directly to our competitors. The decisions made by policymakers this year will determine the future of transformative AI. We support those working to ensure that American and allied democracies are winning in 2028.

US AI lead means China cooperates on AI safety

Evelyn Chang, May 14, 2026, CNBC, U.S. can hold AI talks with China because ‘we are in the lead,’ Bessent tells CNBC as nations plan safety protocol, https://www.cnbc.com/2026/05/14/us-china-ai-rules-bessent-us-lead.html

The U.S. can talk to China about artificial intelligence because “we are in the lead,” U.S. Treasury Secretary Scott Bessent told CNBC, as the countries unveiled a protocol on best practices for the rapidly improving technology.

“The two AI superpowers are going to start talking. We’re going to set up a protocol in terms of how do we go forward with best practices for AI to make sure nonstate actors don’t get a hold of these models,” Bessent told Joe Kernen on Thursday, on the sidelines of President Donald Trump’s two-day meeting in Beijing with Chinese President Xi Jinping.

“The reason we are able to have wholesome discussions with the Chinese on AI is because we are in the lead,” Bessent added. “I do not think we would be having the same discussions if they were this far ahead of us,” he said.

U.S.-based Anthropic has alarmed many in Washington and other countries with the Mythos AI model, which is supposed to have powerful cyberattack capabilities. The company said it would initially release it to select business partners.

Robots can now do 24 hours of autonomous work

Neekita Walter, May 14, 2026, Interesting Engineering, ‘Uncharted territory’: Figure AI humanoid robots hit 24/7 nonstop work milestone, https://interestingengineering.com/ai-robotics/figure-ai-humanoids-24-hour-autonomous-run

Figure AI says its humanoid robots have now crossed 24 hours of continuous autonomous work, extending what was initially planned as an eight-hour test into a nonstop multi-day operation. The California-based robotics startup said three humanoid robots running its Helix-02 AI system are autonomously sorting small packages around the clock without human control. The company livestreamed the operation online, where the robots were nicknamed Bob, Frank, and Gary by viewers. The video player is currently playing an ad. “Our original goal was an 8-hour run. After zero failures yesterday, we decided to keep going. We’re now over 24 hours of continuous autonomous operation without a failure. This is uncharted territory,” Brett Adcock, founder and CEO of Figure AI, wrote on X. The company said the robots have already sorted more than 28,000 packages during the ongoing operation while maintaining speeds close to human workers. The livestream has drawn significant online attention, with viewers continuously tracking the robots’ uptime and performance as the operation moved beyond its original eight-hour target. Figure AI also added visible name tags to the robots after commenters began referring to them as Bob, Frank, and Gary. Human-speed package sorting According to Figure AI, the robots detect barcodes, pick up packages, and place them barcode face-down onto conveyor belts using onboard cameras and AI reasoning. “Humans average around 3 seconds per package. F.03 is now around human parity. The robots are reasoning directly from camera pixels,” Adcock said. The company added that the humanoids are operating fully autonomously using Helix-02, its in-house neural network running entirely onboard the robots. Figure AI stressed there is no teleoperation involved in the process. “There is no teleoperation – every action comes directly from Helix-02,” Adcock wrote. The system also includes automatic recovery mechanisms. Figure AI said if a robot gets stuck or encounters an unfamiliar situation, the AI system can trigger an autonomous reset and resume work without human intervention. “If the robot gets stuck or the AI policy goes out of distribution, Helix triggers an automatic reset,” Adcock said. Self-maintaining robot fleet Figure AI further claimed the robots can independently leave the work floor for maintenance if software or hardware issues appear, while another robot automatically takes over operations to maintain uptime. “If a robot has a software or hardware issue, it autonomously leaves for maintenance and another robot takes over,” Adcock wrote. The latest demonstration builds on Figure AI’s earlier claims that its humanoid robots completed full eight-hour shifts autonomously using Helix-02.

AGI imminent; recursive self-improvement possible by 2027/8

Ticomb, May 14, 2026, James Titcomb is The Telegraph’s Technology Editor and has covered the tech industry for a decade from Silicon Valley and London, Daily Telegraph, Humanity faces an uncertain fate as experts brace for superintelligent AI, https://www.telegraph.co.uk/business/2026/05/14/even-trump-worried-that-ai-will-soon-have-humanity-beaten/

It has been called the most important chart in the world. Every time one of the world’s top artificial intelligence companies unveils a new system, employees at the US research organisation METR put it through its paces, testing its ability to complete a series of increasingly complex tasks. The tasks are measured by how long each one would take a skilled human. These range from trivial arithmetic (two seconds) and completing a game of Wordle (13 minutes) to building complex military satellite software (taking a human expert 14.5 hours). The test serves as a gauge of how capable AI has become – and where it might go. The first version of ChatGPT, released in 2022, could only perform simple tasks that would take a human a few seconds. But as AI systems have become more powerful, they are able to complete more complex actions that would take humans hours or days, such as breaking into a medical website and downloading all its data. Advertisement METR has found that AI capabilities are doubling in power every 196 days. Plotted on a graph, this progress starts slowly then rapidly accelerates to a near-vertical plane. Talk to almost anyone in the AI industry for any length of time and the likelihood of them pulling up a version of the chart approaches 100pc, to the point where it has become a meme in its own right. Then the chart went off the scale. In April, the AI lab Anthropic announced it had developed a new system, called Mythos, that it said was too powerful to release to the public because of its ability to find gaping holes in online security systems. When METR’s researchers released the results of Mythos’s exam, they scored the system at 16 hours – meaning the world’s most powerful AI can now automate tasks that would take a human two full eight-hour shifts. However, they said the model was “at the upper end” of their ability to test. In other words, progress has become too fast for them to measure. While not everybody is convinced by the results as the test only measures if a machine can do something half the time, not if it can do it consistently, the METR chart has captured many people’s imaginations for two reasons. Advertisement First, the exponential growth looks strikingly similar to “Moore’s Law”, the maxim that has governed the electronics industry for more than half a century, stating that microchips roughly double in power every two years. Second, it measures abilities, rather than intelligence. While many AI “benchmarks” resemble university exams, dealing in abstract reasoning or maths, the METR test studies whether AI can actually work. It suggests that on current trends, vast amounts of human tasks could be automated in the next couple of years – including, most crucially of all, the art of developing AI models itself. At that threshold, known in the tech industry as “recursive self-improvement”, all bets are off. The concept is closely linked to superhuman AI because an AI that can make itself smarter could act like an evolutionary chain reaction, rapidly building to a system vastly more capable than mankind. AI would have become – as IJ Good, the Bletchley Park codebreaker, predicted in 1965 – “the last invention that man need make”. For 60 years, the idea seemed out of reach. But much of Silicon Valley believes this is about to change – and the Trump administration is starting to notice. ‘I don’t know how to wrap my head around it’ The vast majority of people’s experience of AI has not changed much in the last couple of years. The release of ChatGPT in 2022 generated an initial flurry of excitement and fear in equal measure but since then, progress has been less obvious. Advertisement Many people’s experience of AI comes in seeing an obviously fake video on their social media feeds, seeing an AI overview at the top of their search results or having a bot “helpfully” offer to summarise their emails. But at the coalface, people are rapidly bringing forward their timelines for the day that superintelligence arrives. Jack Clark, a co-founder of Anthropic, said last week that he believed there was a 60pc chance that AI systems would be capable of building themselves by 2028, kicking off a feedback loop in which technology rapidly surpasses human intelligence. “I don’t know how to wrap my head around it,” Clark wrote. “If that happens, we will cross a rubicon into a nearly-impossible-to-forecast future.” Jack Clark Anthropic’s Jack Clark says there is a 60pc chance that AI systems will be capable of building themselves by 2028 Credit: Saul Loeb/AFP via Getty Images Just a few years ago, researchers were pessimistic about the idea that machines would overtake us any time soon. A survey of conference attendees in 2018 put the date at which AI would surpass humans – a moment called “artificial general intelligence” (AGI) – at around 2068. One in four said it would not happen within a century. Advertisement A similar test in 2022 moved those timelines forward, putting the AGI date at 2060 with just one in 10 believing it would take more than 100 years. A year later, this was 2047. Metaculus, a crowdsourced predictions website, has moved its forecast from 2070 six years ago to 2032. “Forecasts have shifted substantially from mid-century toward the near term,” analysts at Rand, a security think tank, noted in a recent report. Two major developments since the release of ChatGPT in 2022 have explained this. The first was the release of so-called “reasoning” AI systems in late 2024. While earlier AI systems could string together lines of text, they showed little evidence of planning or working through problems as humans do. Reasoning systems, which display lines of text resembling a stream of consciousness, came up with far more complex, coherent answers to questions, as well as cutting down on “hallucinations” in which the systems simply make up answers. Neev Parikh, a researcher at METR, says reasoning systems have dramatically improved the rate of technological progression. “Progress was doubling every seven months, [then] around the time of reasoning models, it appears that there was a one-time switch to a faster doubling time – maybe between three to four months,” he says. Advertisement The second change has been AI’s ability to write computer code. Software engineers have used AI systems for a couple of years to help with basic programming tasks but their output was often riddled with errors and required constant checking. Towards the end of 2025, AI systems suddenly became capable of writing entire computer programs on their own, a moment that Simon Willison, a British programmer, has described as a “tipping point”. Now almost all of the computer code at tech companies is machine-generated. ‘Our team was blown away’ In recent months, major AI companies have explicitly stated that their goal is to build AI that will go on to develop superintelligence. In December, OpenAI defined its goals as developing “increasingly capable AI and in particular AI capable of recursive self-improvement (RSI)”. RSI has become such a buzzword that investors are throwing billions of dollars at start-ups chasing the idea, even ones that are just weeks old and have no business model to speak of. Advertisement On Wednesday, Recursive Superintelligence, a UK-US start-up founded by former Google and OpenAI researchers, announced that it had raised $650m (£480m) in what it called a “bold bet on self-improving AI”. The similarly named Silicon Valley start-up Ricursive Intelligence has recently raised hundreds of millions of dollars, at a $4bn valuation. Britain, in an attempt to have a seat at the table, has set up a £500m “sovereign AI” fund to back companies that could influence the industry’s direction. In February, OpenAI said that for the first time, one of its AI systems had helped to build itself. Early versions of the company’s latest coding system, GPT-5.3-Codex, were partly used to test later versions, an early step towards AI that can build itself. “GPT‑5.3‑Codex is our first model that was instrumental in creating itself,” the company said. “Our team was blown away by how much Codex was able to accelerate its own development.” “If AI systems are able to meaningfully and increasingly contribute to their own development, this could push progress to be super-exponential,” Parikh says. AI researchers have said that once the models start properly contributing to their own development, the rate of progress will speed up significantly – not only when it comes to the capabilities of AI systems but with all the knock-on effects of scientific breakthroughs, job displacement and widespread security risks. “[Imagine] how much scientific progress you would make if you could just copy the best researcher in your field, somewhere between 1,000 and a million times,” says Marius Hobbhahn, the chief executive of AI company Apollo Research, which has worked with the Government’s AI Security Institute on understanding advanced AI systems. “Or if the research staff at Anthropic went from a few thousand to a few million overnight. They would make more progress and plausibly, a lot more progress. At that point, things are going to get pretty wild.” Not everybody is convinced. Arun Rao, an AI researcher at Meta, told Clark that his forecast was “improbable”, arguing that even if today’s bots can successfully write code, without human supervision they would be likely to end up in “recursive failure loops”. The idea of a demographic explosion in the number of AI bots may also be at odds with limited amounts of energy needed to power them, and a growing public backlash against the data centres where they would reside. It would also be forgivable to find the idea impossibly abstract. To date, the hundreds of billions of dollars poured into AI and all the supposed technical progress has meant little to many people, let alone the hazy predictions about what may or may not happen in the future. AI has hit small pockets of graduate hiring and been blamed for some large-scale corporate lay-offs but few jobs can be fully automated away. Nor has the technology shown much sign of leading to research breakthroughs or a science fiction catastrophe. AI protest AI could become ‘the last invention that man need make’ Credit: Manuel Orbegozo/Reuters If AI were going to be so transformative, wouldn’t we be seeing more signs of this? A recent study from the Centre for British Progress found “no evidence that [AI] has replaced jobs at scale in the UK”. Certain professions, however, have been hit more than most. The Society of Authors says that 37pc of illustrators, 43pc of translators and 86pc of authors have reported decreased earnings as a result of AI. This reflects what Ethan Mollick, an AI commentator, has called a “jagged frontier”, in which AI appears superhuman at complicated tasks and incompetent at basic ones. But the gap between these two extremes can be deceptively short. Cases from tech history have shown that once AI starts approaching human capabilities, it soon becomes overwhelmingly better. For decades, chess computers stood no chance against professional players. In 1996, Garry Kasparov, the world chess champion, closely defeated IBM’s Deep Blue machine. A year later, Kasparov was defeated and within a few years, grandmasters stood zero chance against the best chess computers. “Electricity, computing and the internet each promised to change everything and in the long run, they did. But for decades following their introduction, the data suggested they had changed almost nothing,” the Centre for British Progress noted. Ajeya Cotra, a researcher, has predicted that a similar thing is likely to happen when it comes to building AI itself. Cotra said last month that when AI and humans are contributing equally to the technology’s development, it could take just a year for the process of progression to be entirely automated. “We’ll effectively have millions of researchers working on advancing AI, instead of thousands as we do today. I think this would massively accelerate the pace of AI progress,” she wrote. ‘One of the hardest challenges ever’ However, there are signs that some of AI’s more dangerous capabilities are now starting to emerge. Anthropic recently restricted access to its new Mythos model to a handful of tech companies after the system found thousands of security flaws in computer systems and web browsers. If made widely available to hackers, Mythos could have unleashed chaos. While some have brushed this off as exuberant marketing, initial results have suggested that Mythos could be devastating in the wrong hands. Firefox, the web browser company that is part of a cohort with access to the system, said last week that it had fixed 423 security flaws in the first month it had access to Mythos – almost twice as many as it fixed last year. Google security researchers said on Monday that criminals had attempted to launch a major cyber attack by exploiting a flaw they had discovered with the help of AI. While they did not name the culprit, Google said hackers from China and North Korea had shown a “particular interest” in using AI to carry out cyber attacks. Some forecasters have tried to map out what the path to all-powerful AI might look like. The AI Futures Project – run by Daniel Kokotajlo, a former OpenAI researcher – has sketched out an unnerving scenario in which AI overtakes human intelligence in around 2027. By the end of 2028, researchers predict this could lead to cancer being cured and a fully automated economy in which humans need not work. Within another three years, humanity could be wiped out or could be on its way to colonising the galaxy, depending on what path governments and AI companies take along the way. Thankfully, there are rosier predictions. A recent paper from the National Bureau of Economic Research, a US think tank, predicted that recursive self-improvement would lead to a technological innovation loop that creates “explosive” economic growth. But most of these predictions are little more than an educated guess. A few years ago, nobody imagined that AI would lead to a world in which computer coders were more at risk of unemployment than truck drivers. That makes preparing for it almost impossible. Dean Ball, a former Trump administration AI adviser, recently said that while the AI companies were proceeding largely unchecked, “I do not believe that bureaucrats sitting around a table could design and execute the implementation of a set of standards that would improve [this]”. Hobbhahn, of Apollo Research, says keeping superintelligent AI under control may turn out to be “one of the hardest challenges ever”. The world’s top AI labs are all American, so realistically, reining in the technology will fall to the White House. Donald Trump has, to this point, encouraged AI to rip across the economy. The US president has challenged individual states and foreign governments that have threatened to regulate the technology. But somewhere in the White House, a switch has flipped. After Anthropic unveiled Mythos, Dario Amodei, its chief executive, was summoned to the White House. So were leaders of Wall Street’s biggest banks, who were told to upgrade their security to protect against AI hackers. “We all need to work together on this,” JD Vance, the US vice-president, told tech bosses on a recent call. Kevin Hassett, Trump’s economic adviser, said last week that future AI systems may need to be safety-tested and approved before release, in a similar way to medicines. The administration has since walked back some of the comments but the change of direction is unmistakable. “AI policy has now firmly entered its ‘science fiction’ era,” Ball said. “Strange things are happening and stranger things yet will happen soon.”

AGI in 3-4 years

Demis Hassabis, CEO, Google Deep Mind, May 13, 2026, https://www.linkedin.com/feed/update/urn:li:activity:7459742448072867840/

It was wonderful to be back in Korea last week, 10 years after AlphaGo’s historic win against the legendary 9-dan champion Lee Sae Dol. That groundbreaking moment gave the world a first glimpse of what we could achieve with AI that can learn to solve hard problems.

We’ve seen incredible progress in AI since then. Today, AI is capable of advanced reasoning and is beginning to have agentic capabilities that will enable it to plan and act in the world, whether in robotics or as useful assistants. We’re now at a major threshold with AGI likely to arrive in 3-4 years and bring profound change to industries and society.

AI enables cyber attacks that threaten the financial system

Tobias Adrian, Tamas Gaidosch, Rangachary Ravikumar, IMF, May 7, 2026, Financial Stability Risks Mount as Artificial Intelligence Fuels Cyberattacks, https://www.imf.org/en/blogs/articles/2026/05/07/financial-stability-risks-mount-as-artificial-intelligence-fuels-cyberattacks

Artificial intelligence is transforming how the financial system copes with vulnerabilities and reacts to incidents. Yet it is also amplifying cyber threats that can undermine financial stability when the offensive capabilities of intruders outpace defenses.

IMF analysis suggests that extreme cyber‑incident losses could trigger funding strains, raise solvency concerns, and disrupt broader markets.

The financial system relies on shared digital infrastructure that’s highly interconnected, including software, cloud services, and networks for payments and other data. Advanced AI models can dramatically reduce the time and cost needed to identify and exploit vulnerabilities, raising the likelihood of simultaneously discovering and targeting weaknesses in widely used systems. As a result, cyber risk is increasingly about correlated failures that could disrupt financial intermediation, payments, and confidence at the systemic level.

Anthropic’s recent controlled release of its Claude Mythos Preview, an advanced AI model with exceptional cyber capabilities, underscored how quickly risks are increasing. Mythos could find and exploit vulnerabilities in every major operating system and web browser—even when used by non-experts. This foreshadows how fast‑moving, AI‑driven cyber risks could destabilize the financial system if not managed carefully, and why authorities must focus on building resilience through supervision and coordination—rather than treating these developments as purely technical or operational issues.

On the other hand, OpenAI’s specialized, restricted cyber version of GPT‑5.5 assumes vulnerabilities and attacks will grow, and emphasizes equipping defenders more quickly and at scale, under appropriate governance and trusted access models.

Advances change risk equation

Models such as Mythos illustrate the nature of the challenge because they amplify existing cyberattack techniques by operating at machine speed. Attackers have the advantage over defenders because discovering and exploiting vulnerabilities can occur faster than patching and remediation. In a financial system built on common software and shared service providers, this can create simultaneous vulnerabilities across many institutions.

For now, some mitigating factors remain. Advanced AI cyber capabilities are not yet widely available, and closed, industry‑specific financial software is harder to target than open‑source infrastructure. But these buffers are likely to erode quickly as model training expands, capabilities diffuse, and leaks occur. Temporary containment is unlikely to substitute for durable defenses.

Financial stability implications

The new AI‑enabled cyber tools focus the discussion on financial stability:

Risks are systemic. Attacks become more dangerous when discovery and exploitation scale rapidly, with implications for financial stability.

Risks cut across sectors. The financial sector shares digital foundations with energy, telecommunications, and public services. That means AI‑assisted attacks can propagate across sectors that rely on the same infrastructure.

AI may further concentrate risk and failures with one vulnerability rippling across many institutions. Reliance on a small number of software platforms, cloud providers, or AI models increases the impact of any single exploited weakness.

These features elevate cyber risk to a potential macro‑financial shock. Confidence effects, payment disruptions, liquidity strains, and fire‑sale dynamics could follow if multiple institutions are affected simultaneously. For financial authorities, the question is whether the system is prepared to absorb cyber incidents without destabilizing core financial functions.

International cooperation needed to stop cyber attacks

IMF analysis suggests that extreme cyber‑incident losses could trigger funding strains, raise solvency concerns, and disrupt broader markets.

Advances change risk equation

Financial stability implications

The new AI‑enabled cyber tools focus the discussion on financial stability:

Risks are systemic. Attacks become more dangerous when discovery and exploitation scale rapidly, with implications for financial stability.

AI in cyber defense

AI is also a critical part of the solution. When attackers operate at machine speed, defenders must do the same. Financial institutions increasingly use AI‑supported tools to detect threats, prevent fraud, identify vulnerabilities, and respond to incidents.

AI also can help reduce vulnerabilities at the development stage rather than patching them after release. For widely used financial infrastructure, these gains can meaningfully reduce systemic exposure. But these benefits will materialize only if institutions invest in integration, governance, and human oversight—areas that supervisors increasingly need to assess. This also includes business continuity and disaster recovery, cyber and quality assurance programs, and good cyber hygiene practices.

Resilience-first policy framework

AI-driven cyber risk demands a policy response that treats cybersecurity as a core financial stability issue. Existing measures remain relevant, but they must be expanded and sharpened for a world of faster, automated, and increasingly sophisticated attacks. Policymakers should prioritize robust resilience standards, supervision focused on systemic transmission channels, and close public-private collaboration on threat intelligence and incident response.

Defenses will inevitably be breached, so resilience must also be a priority, specifically to limit how far incidents spread and ensure rapid recovery. Controls to stop the spread of attacks can prevent local breaches from escalating into system‑wide disruptions. These measures are often costly and complex, but they are among the most effective tools for containing AI‑enabled attacks.

From a supervisory perspective, this underscores the need to focus not only on prevention, but on response, recovery, and continuity of critical functions. Cyber stress testing, scenario analysis, and board‑level oversight of cyber risk are becoming indispensable components of financial stability frameworks.

International cooperation is vital

The Mythos episode also highlights governance challenges. Cyber risk does not respect borders. As AI capabilities spread across countries, inconsistent oversight could weaken a globally interconnected system.

Emerging and developing economies, which often have more severe resource constraints, may be disproportionately exposed to attackers targeting regions with weaker defenses. That’s why stronger international coordination, more information sharing, and expanded capacity development are critical to preserving global financial stability.

As AI reshapes the cyber landscape, the central question for authorities is whether the financial system can continue to function under severe stress. Answering that question requires putting systemic risk—and the tools to manage it—at the center of the AI‑cyber conversation.

Superintelligent AI risks human extinction

Miotti, 2026, Andrea Miotti is the founder and CEO of ControlAI, a non-profit working to keep humanity in control of advanced AI., The Spectator, https://archive.ph/KTxDR#selection-2063.0-2063.115

A spectre is hanging over humanity: the spectre of superintelligent AI. While governments busy themselves with the mundane work of politics and putting out the fire of the day, the most consequential technological development since the splitting of the atom is accelerating beyond anyone’s ability to control it.

We are entering an era where the AI systems themselves are threats, not just humans

Anthropic, one of the world’s leading AI companies, recently announced a new AI system, Claude Mythos. The model can autonomously find and exploit critical security vulnerabilities in every major operating system and internet browser underpinning our digital infrastructure, including flaws that survived decades of human review.

Anthropic withheld the model from public release because, in their own words, ‘the fallout for economies, public safety and national security could be severe’. The UK’s AI Security Institute (AISI) confirmed the assessment: Mythos is substantially more capable at cyber offence than any model it has previously tested.

But the government’s response has been tepid. They have simply had the AISI publish a blogpost about Mythos and had the Technology Secretary tell businesses they should brush up on cybersecurity and sign up for a cyber attack early warning service.

The government is missing the forest for the trees. Yes, cyberattacks will become easier. But the real significance of Mythos is that it can do all of this on its own: identifying vulnerabilities, developing exploits, and chaining them together across networks, without human direction. We are entering an era where the AI systems themselves are threats, not just humans. And this is the least capable these systems will ever be. The length of tasks AI systems can complete autonomously is doubling every few months.

Think back to February 2020. Covid case numbers were still low in most countries, and governments and the mainstream media were focusing only on that: today’s case count, yesterday’s deaths. At the same time, epidemiologists were sounding the alarm. What mattered to them was not the current number of cases, but how fast that number was doubling. A virus doubling every few days looks manageable right up until the moment the health system is overwhelmed. Only a month later, the world was shutting down.

We are now making the same mistake again. The government is watching the immediate problem – cyberattacks getting easier – and u

At the current rate of improvement, many AI experts believe superintelligent AI could arrive within the next two to five years. Many of those same experts, including Nobel laureates and AI company CEOs, warn that AI poses an extinction risk to humanity.

The window of opportunity to act and prevent catastrophe is still open. By acting today, we will spare ourselves the need for more drastic measures later. But on AI, the government has lost the nerve to act with conviction.

It has also lost the habit of foresight that once came naturally to British statecraft. In 1924, when the most destructive weapon in existence was the artillery shell, Winston Churchill published an essay asking ‘Shall we all commit suicide?’. He argued that science was on the verge of producing weapons so powerful that the League of Nations, ‘airy and unsubstantial, framed of shining but too often visionary idealism,’ would prove incapable of guarding the world from them. He was writing 20 years before Hiroshima.

Seven years later, in ‘Fifty Years Hence’, Churchill described with startling precision the physics of nuclear fusion and the horsepower a pound of water might yield if its atoms could be induced to combine. ‘There is no question among scientists that this gigantic source of energy exists,’ he wrote. ‘What is lacking is the match to set the bonfire alight.’ The match was found in 1945.

Churchill did what serious statesmen are supposed to do. He looked at the trajectory of scientific progress, took the warnings of scientists seriously, and asked what governments needed to do to prevent catastrophe. Today’s warnings come from the very people building these systems, and they are not talking about a risk decades away.

Britain is not powerless to act, and is in fact better placed than most to lead on addressing the threat from superintelligent AI. Britain convened the first global AI Safety Summit at Bletchley Park. Over a hundred UK parliamentarians have backed a statement from my organisation ControlAI recognising the extinction risk from AI and identifying superintelligent AI as a national and global security threat. The House of Lords held two substantive debates on superintelligent AI in January alone, including on whether to pursue an international moratorium. There is political will for action in Westminster, even if Downing Street has not yet caught up.

The response must match the scale of the threat, and superintelligent AI should be treated as what it is: a national and global security risk of the highest order. That starts with the government saying so, openly, and working with allies on how to confront it. It must end with preventing the development of superintelligent AI at home and building an international coalition to prohibit it globally.

If we don’t, there will be no chance for inquiries, apologies, or promises to do better next time. There won’t even be anyone left to blame.

AGI timelines rapidly advancing

David Schwartz, April 15, 2026, A visualization of changing AGI timelines, 2023 – 2026, https://www.lesswrong.com/posts/Tc5AbEpbFFdNx5nkP/a-visualization-of-changing-agi-timelines-2023-2026

AI 2027 came out a year ago, and in reviewing it now, I saw that AI Futures researchers Daniel Kokotajlo, Eli Lifland, and Nikola Jurkovic had updated their AGI timelines to be later over the course of 2025. Then, in 2026, Daniel and Eli updated in the other direction to expect AGI to come sooner. I noticed others with great track records also had made multiple AGI forecasts. A change in the forecast of a single person is meaningful in the way that a change in an aggregate forecast may hide. A change in an aggregate forecast might come entirely from a change in who is forecasting, not what those people individually believe. So I decided to visualize what the net direction of updates were over the last few years. I find this provides a complementary view of AI timelines compared to those by Metaculus, Epoch, AI Digest, and others. So here is a visualization of AGI forecasts. Criteria for inclusion were: the person has made at least 2 forecasts, they gave specific dates, they gave a sense of confidence interval/uncertainty, and their definitions of AGI are similar. Some major caveats. Everyone has different definitions of AGI. (That is a big advantage of everyone forecasting the same question on Metaculus, or the 2025 or 2026 AI forecast survey run by AI Digest.) Often individual people even use different definitions of AGI at different times for their own forecasts. I included data points above if I judged that their definition was substantially similar to: AGI: Most purely cognitive labor is automatable at better quality, speed, and cost than humans. I was pretty generous with this, and it’s very debatable whether e.g. a “superhuman coder” from AI 2027 is AGI in the same way that “99% of remote work can be automated” is AGI. Apologies to those in the visualization who would disagree that the definition they used is similar enough to this and don’t feel like this chart captures their views faithfully. Second caveat, I rounded when forecasts were made to be as if they were made on four dates: <= 2023, early 2025, late 2025, and April 2026. This made the visualization much easier to see. So a further apology to those above if you made a prediction in, say, Aug 2025 but I marked this as “late 2025”. Third caveat, the type of confidence intervals various researchers used also varied substantially. I had to really guess or extrapolate to approximate these as 80% confidence intervals, so a final apology if you don’t think the range you give is fairly characterized as an 80% CI. All caveats aside, what impression does this visualization give? Are reputable AI experts who have made multiple predictions updating the same way that Daniel Kokotajlo and Eli Lifland did, pushing out their timelines in 2025, and pulling them in during 2026? From the visualization, it looks to me that in 2023 and 2024, most people brought their AGI timelines in to be sooner, though with some exceptions like Tamay Besilogru. From 2025 to 2026, joining Daniel and Eli in pushing their timelines out are the Metaculus community, Dario Amodei, and elite forecast Peter Wildeford. In fact, across 2025, only Benjamin Todd brought in his timelines to say AGI would happen sooner. Most notably though, every single person who updated their timelines between January 2026 to April 2026 has moved it their timeline to say AGI is coming sooner, myself included. So I think the data supports the impression I got from the AI 2027 authors. One way I could characterize it is: In the OpenAI/ChatGPT era of 2023-2024, people updated towards AGI coming sooner. In the xAI, Meta, and Gemini era of 2025, people updated towards AGI coming later. In the Anthropic era of 2026, people updated back towards AGI coming sooner. Take from that what you will. Bayesians shouldn’t be able to predict which direction they will update. But seeing the history of other people’s updates is useful information. It does give me intuitions about how I or others may update soon, so I take that as evidence that I should update now. (A similar post is also on the FutureSearch blog, where I plan to updat

AI means extinction

Stuart Russel, 12-4, 25, Professor Stuart Russell O.B.E. is a world-renowned AI expert and Computer Science Professor at UC Berkeley. He holds the Smith-Zadeh Chair in Engineering and directs the Center for Human-Compatible AI, and is also the bestselling author of the book “Human Compatible: AI and the Problem of Control”, YouTube, An AI Expert Warning: 6 People Are (Quietly) Deciding Humanity’s Future! We Must Act Now!, https://www.youtube.com/watch?v=P7Y-fynYsgE

0:00 Steven Bartlett: In October, over 850 experts, including yourself and other leaders like Richard Branson and Jeffrey Hinton, signed a statement to ban AI super intelligence as you guys raised concerns of potential human extinction. Because unless we figure out how do we guarantee that the AI systems are safe, we’re toast. And you’ve been so influential on the subject of AI, you wrote the textbook that many of the CEOs who are building some of the AI companies now would have studied on the subject of AI. Stuart Russell: Yeah. Steven Bartlett: So, do you have any regrets? Um, Professor Stuart Russell has been named one of Time magazine’s most influential voices in AI. After spending over 50 years researching, teaching, and finding ways to design AI in such a way that humans maintain control, you talk about this gorilla problem as a way to understand AI in the context of humans. Stuart Russell: Yeah. So, a few million years ago, the human line branched off from the gorilla line in evolution, and now the gorillas have no say in whether they continue to exist because we are much smarter than they are. Stuart Russell: So intelligence is actually the single most important factor to control planet Earth. Steven Bartlett: Yep. But we’re in the process of making something more intelligent than us. Stuart Russell: Exactly. Steven Bartlett: Why don’t people stop then? Stuart Russell: Well, one of the reasons is something called the Midas touch. So King Midas is this legendary king who asked the gods, can everything I touch turn to gold? And we think of the Midas touch as being a good thing, but he goes to drink some water, the water has turned to gold. And he goes to comfort his daughter, his daughter turns to gold. So he dies in misery and starvation. So this applies to our current situation in two ways. One is that greed is driving these companies to pursue technology with the probabilities of extinction being worse than playing Russian roulette. And that’s even according to the people developing the technology without our permission. And people are just fooling themselves if they think it’s naturally going to be controllable. So, you know, after 50 years, I could retire, but instead I’m working 80 or 100 hours a week trying to move things in the right direction. Steven Bartlett: So, if you had a button in front of you which would stop all progress in artificial intelligence, would you press it? Stuart Russell: Not yet. I think there’s still a decent chance they guarantee safety. And I can explain more of what that is. Steven Bartlett: I see messages all the time in the comments section that some of you didn’t realize you didn’t subscribe. So, if you could do me a favor and double check if you’re a subscriber to this channel, that would be tremendously appreciated. It’s the simple, it’s the free thing that anybody that watches this show frequently can do to help us here to keep everything going in this show in the trajectory it’s on. So, please do double check if you’ve subscribed and uh thank you so much because in a strange way you are you’re part of our history and you’re on this journey with us and I appreciate you for that. So, yeah, thank you. You Wrote the Textbook on AI 2:41 Steven Bartlett: Professor Stuart Russell, OBBE. A lot of people have been talking about AI for the last couple of years. It appears you’ve—this really shocked me—it appears you’ve been talking about AI for most of your life. Stuart Russell: Well, I started doing AI in high school um back in England, but then I did my PhD starting in ’82 at Stanford. I joined the faculty of Berkeley in ’86. So I’m in my 40th year as a professor at Berkeley. The main thing that the AI community is familiar with in my work uh is a textbook that I wrote. Steven Bartlett: Is this the textbook that most students who study AI are likely learning from? Stuart Russell: Yeah. It Will Take a Crisis to Wake People Up 3:20 Steven Bartlett: So you wrote the textbook on artificial intelligence 31 years ago. You actually start probably started writing it because it’s so bloody big in the year that I was born. So I was born in 92. Stuart Russell: Uh yeah, took me about two years. Steven Bartlett: Me and your book are the same age, which just is wonderful way for me to understand just how long you’ve been talking about this and how long you’ve been writing about this. And actually, it’s interesting that many of the CEOs who are building some of the AI companies now probably learned from your textbook. You had a conversation with somebody who said that in order for people to get the message that we’re going to be talking about today, there would have to be a catastrophe for people to wake up. Can you give me context on that conversation and a gist of who you had this conversation with? Stuart Russell: Uh, so it was with one of the CEOs of uh a leading AI company. He sees two possibilities as do I which is um either we have a small or let’s say small scale disaster of the same scale as Chernobyl the nuclear meltdown in Ukraine. Steven Bartlett: Yeah. Stuart Russell: So this uh nuclear plant blew up in 1986 killed uh a fair number of people directly and maybe tens of thousands of people indirectly through uh radiation. Recent cost estimates more than a trillion dollars. So that would wake people up. That would get the governments to regulate. He’s talked to the governments and they won’t do it. So he looked at this Chernobyl scale disaster as the best case scenario because then the governments would regulate and require AI systems to be built. Steven Bartlett: And is this CEO building an AI company? Stuart Russell: He runs one of the leading AI companies. Steven Bartlett: And even he thinks that the only way that people will wake up is if there’s a Chernobyl level nuclear disaster. Stuart Russell: Uh yeah, not wouldn’t have to be a nuclear disaster. It would be either an AI system that’s being misused by someone, for example, to engineer a pandemic or an AI system that does something itself, such as crashing our financial system or our communication systems. The alternative is a much worse disaster where we just lose control altogether. CEOs Staying in the AI Race Despite Risks 5:54 Steven Bartlett: You have had lots of conversations with lots of people in the world of AI, both people that are, you know, have built the technology, have studied and researched the technology or the CEOs and founders that are currently in the AI race. What are some of the the interesting sentiments that the general public wouldn’t believe that you hear privately about their perspectives? Because I find that so fascinating. I’ve had some private conversations with people very close to these tech companies and the shocking sentiment that I was exposed to was that they are aware of the risks often but they don’t feel like there’s anything that can be done so they’re carrying on which is feels like a bit of a paradox to me like yes it’s it’s it must be a very difficult position to be in in a sense right you’re you’re doing something that you know has a good chance of bringing an end to life on including that of yourself and your own family. Stuart Russell: They feel that they can’t escape this race, right? If they, you know, if a CEO of one of those companies was to say, you know, we’re we’re not going to do this anymore, they would just be replaced because the investors are putting their money up because they want to create AGI and reap the benefits of it. So, it’s a strange situation where every at least all the ones I’ve spoken to, I haven’t spoken to Sam. Altman about this, but you know, Sam Altman even before becoming CEO of Open AI said that creating superhuman intelligence is the biggest risk to human existence that there is. My worst fears are that we cause significant—we the field the technology the industry—cause significant harm to the world. You know Elon Musk is also on record saying this. So uh Dario Amodei estimates up to a 25% risk of extinction. They Know It’s an Extinction-Level Risk 7:56 Steven Bartlett: Was there a particular moment when you realized that the CEOs are well aware of the extinction level risks? Stuart Russell: I mean, they all signed a statement in May of 23 uh called it’s called the extinction statement. It basically says AGI is an extinction risk at the same level as nuclear war and pandemics. But I don’t think they feel it in their gut. You know, imagine that you were one of the nuclear physicists. You know, I guess you’ve seen Oppenheimer, right? You’re there, you’re watching that first nuclear explosion. How how would that make you feel about the potential impact of nuclear war on the human race? Right? I I think you would probably become a pacifist and say this weapon is so terrible, we have got to find a way to uh keep it under control. We are not there yet with the people making these decisions and certainly not with the governments, right? You know what policy makers do is they, you know, they listen to experts. They keep their finger in the wind. You got some experts, you know, dangling $50 billion checks and saying, “Oh, you know, all that doomer stuff, it’s just fringe nonsense. Don’t worry about it. Take my $50 billion check.” You know, on the other side, you’ve got very well-meaning, brilliant scientists like Jeff Hinton saying, actually, no, this is the end of the human race. But Jeff doesn’t have a $50 billion check. So the view is the only way to stop the race is if governments intervene and say okay we don’t we don’t want this race to go ahead until we can be sure that it’s going ahead in absolute safety. What Is Artificial General Intelligence (AGI)? 9:55 Steven Bartlett: Closing off on your career journey, you got a you received an OB from Queen Elizabeth. Stuart Russell: Uh yes. Steven Bartlett: And what was the listed reason for that for the award? Stuart Russell: Uh contributions to artificial intelligence research. Steven Bartlett: And you’ve been listed as a Time magazine most influential person in in AI several years in a row including this year in 2025. Stuart Russell: Yup. Steven Bartlett: Now there’s two terms here that are central to the things we’re going to discuss. One of them is AI and the other is AGI. In my muggle interpretation of that, it’s artificial general intelligence is when the system, the computer, whatever it might be, the technology has generalized intelligence, which means that it could theoretically see, understand um the world. It knows everything. It can understand everything in the the world as well as or better than a human being. Stuart Russell: Yeah. Can do it. And I think take action as well. I mean some some people say oh you know AGI doesn’t have to have a body but a good chunk of our intelligence actually is about managing our body about perceiving the real environment and acting on it moving grasping and so on. So I think that’s part of intelligence and and AGI systems should be able to operate robots successfully. But there’s often a misunderstanding, right, that people say, well, if it doesn’t have a robot body, then it can’t actually do anything. But then if you remember, most of us don’t do things with our bodies. Some people do, brick layers, painters, gardeners, chefs, um, but people who do podcasts, you’re doing it with your mind, right? You’re doing it with your ability to to produce language. Uh, you know, Adolf Hitler didn’t do it with his body. He did it by producing language. Steven Bartlett: Hope you’re not comparing us. Stuart Russell: But but uh you know so even an AGI that has no body uh it actually has more access to the human race than Adolf Hitler ever did because it can send emails and texts to what three-quarters of the world’s population directly. It can—it also speaks all of their languages and it can devote 24 hours a day to each individual person on earth to convince them of to do whatever it wants them to do. Steven Bartlett: And our whole society runs now on the internet. I mean if there’s an issue with the internet, everything breaks down in society. Airplanes become grounded and we’ll have electricity is running off as internet systems. So I mean my entire life it seems to run off the internet now. Stuart Russell: Yeah. Water supplies. So, so this is one of the roots by which AI systems could bring about a medium-sized catastrophe is by basically shutting down our life support systems. Will We Reach General Intelligence Soon? 13:01 Steven Bartlett: Do you believe that at some point in the coming decades we’ll arrive at a point of AGI where these systems are generally intelligent? Stuart Russell: Uh yes, I think it’s virtually certain unless something else intervenes like a nuclear war or or we may refrain from doing it. But I think it will be extraordinarily difficult uh for us to refrain. Steven Bartlett: When I look down the list of predictions from the top 10 AI CEOs on when AGI will arrive, you’ve got Sam Altman who’s the founder of OpenAI/ChatGPT um says before 2030. Demis at DeepMind says 2030 to 2035. Jensen from Nvidia says around five years. Dario at Anthropic says 2026 to 2027. Powerful AI close to AGI. Elon says in the 2020s. Um and go down the list of all of them and they’re all saying relatively within 5 years. Stuart Russell: I actually think it’ll take longer. I don’t think you can make a prediction based on engineering um in the sense that yes, we could make machines 10 times bigger and 10 times faster, but that’s probably not the reason why we don’t have AGI, right? In fact, I think we have far more computing power than we need for AGI. Maybe a thousand times more than we need. The reason we don’t have AGI is because we don’t understand how to make it properly. Um what we’ve seized upon is one particular technology called the language model. And we observed that as you make language models bigger, they produce text language that’s more coherent and sounds more intelligent. And so mostly what’s been happening in the last few years is just okay let’s keep doing that because one thing companies are very good at unlike universities is spending money. They have spent gargantuan amounts of money and they’re going to spend even more gargantuan amounts of money. I mean you know we mentioned nuclear weapons. So the Manhattan project uh in World War II to develop nuclear weapons, its budget in 2025 was about 20 odd billion dollars. The budget for AGI is going to be a trillion dollars next year. So 50 times bigger than the Manhattan project. Steven Bartlett: Humans have a remarkable history of figuring things out when they galvanize towards a shared objective. You know, thinking about the moon landings or whatever it else it might be through history. And the thing that makes this feel all quite inevitable to me is just the sheer volume of money being invested into it. I’ve never seen anything like it in my life. Stuart Russell: Well, there’s never been anything like this in history. Steven Bartlett: Is this the biggest technology project in human history by orders of magnitude? How Much Is Safety Really Being Implemented 16:16 Steven Bartlett: And there doesn’t seem to be anybody that is pausing to ask the questions about safety. It doesn’t it doesn’t even appear that there’s room for that in such a race. Stuart Russell: I think that’s right. To varying extents, each of these companies has a division that focuses on safety. Does that division have any sway? Can they tell the other divisions, no, you can’t release that system? Not really. Um I think some of the companies do take it more seriously. Anthropic uh does. I think Google DeepMind even there I think the commercial imperative to be at the forefront is absolutely vital. If a company is perceived as you know falling behind and not likely to be competitive, not likely to be the one to reach AGI first, then people will move their money elsewhere very quickly. AI Safety Employees Leaving OpenAI 17:16 Steven Bartlett: And we saw some quite high-profile departures from company like companies like OpenAI. Um, I know a chap called Yan Leike left who was working on AI safety at OpenAI and he said that the reason for his leaving was that safety culture and processes processes have taken a backseat to shiny products at OpenAI and he gradually lost trust in leadership but also Ilia Sutskever… Stuart Russell: Ilia Sutskever yeah so he was the co-founder co-founder and chief scientist for a while and then yeah so he and Yan Leike are the main safety people. Um, and so when they say OpenAI doesn’t care about safety, that’s pretty concerning. The Gorilla Problem – The Most Intelligent Species Will Always Rule 18:06 Steven Bartlett: I’ve heard you talk about this gorilla problem. What is the gorilla problem as a way to understand AI in the context of humans? Stuart Russell: So, so the gorilla problem is is the problem that gorillas face with respect to humans. So you can imagine that you know a few million years ago the the human line branched off from the gorilla line in evolution. Uh and now the gorillas are looking at the human line and saying yeah was that a good idea and they have no um they have no say in whether they continue to exist because we have a we are much smarter than they are. If we chose to, we could make them extinct in in a couple of weeks and there’s nothing they can do about it. So that’s the gorilla problem, right? Just the the problem a species faces when there’s another species that’s much more capable. Steven Bartlett: And so this says that intelligence is actually the single most important factor to control planet Earth. Stuart Russell: Yes. Intelligence is the ability to bring about what you want in the world. Steven Bartlett: And we’re in the process of making something more intelligent than us. Stuart Russell: Exactly. Steven Bartlett: Which suggests that maybe we become the gorillas. Stuart Russell: Exactly. Yeah. If There’s an Extinction Risk, Why Don’t They Stop? 19:24 Steven Bartlett: Is that is there any fault in the reasoning there? Because it seems to make such perfect sense to me. But if it—Why doesn’t—Why don’t people stop then? Cuz it it seems like a crazy thing to want to— Stuart Russell: Because they think that uh if they create this technology, it will have enormous economic value. They’ll be able to use it to replace all the human workers in the world uh to develop new uh products, drugs, um forms of entertainment, any anything that has economic value, you could use AGI to to create it. And and maybe it’s just an irresistible thing in itself, right? I think we as humans place so much store on our intelligence. You know, you know, how we think about, you know, what is the pinnacle of human achievement? If we had AGI, we could go way higher than that. So it it’s very seductive for people to want to create this technology and I think people are just fooling themselves if they think it’s naturally going to be controllable. I mean the question is how are you going to retain power forever over entities more powerful than yourself? Can’t We Just Pull the Plug if AI Gets Too Powerful? 20:50 Steven Bartlett: Pull the plug out. People say that sometimes in the comment section when we talk about AI, they said, “Well, I’ll just pull a plug out.” Stuart Russell: Yeah, it’s it’s sort of funny. In fact, you know, yeah, reading the comment sections in newspapers, whenever there’s an AI article, there’ll be people who say, “Oh, you can just pull the plug out, right?” As if a super intelligent machine would never have thought of that one. Don’t forget who’s watched all those films where they did try to pull the plug out. Another thing they said, well, you know, as long as it’s not conscious, then it doesn’t matter. It won’t ever do anything. Um, which is completely off the point because, you know, I I don’t think the gorillas are sitting there saying, “Oh, yeah, you know, if only those humans hadn’t been conscious, everything would have be fine, right?” No, of course not. What would make gorillas go extinct is the things that humans do, right? How we behave, our ability to act successfully in the world. So when I play chess against my iPhone and I lose, right, I don’t I don’t think, oh, well, I’m losing because it’s conscious, right? No, I’m just losing because it’s better than I am at at in that little world uh moving the bits around uh to to get what it wants. And and so consciousness has nothing to do with it, right? Competence is the thing we’re concerned about. So I think the only hope is can we simultaneously build machines that are more intelligent than us but guarantee that they will always act in our best interest. Can We Build AI That Will Act in Our Best Interests? 22:38 Steven Bartlett: So throwing that question to you, can we build machines that are more intelligent than us that will also always act in our best interests? It sounds like a bit of a uh contradiction to some degree because it’s kind of like me saying I’ve got a French bulldog called Pablo that’s uh 9 years old and it’s like saying that he could be more intelligent than me yet I still walk him and decide when he gets fed. I think if he was more intelligent than me he would be walking me. I’d be on the leash. Stuart Russell: That’s the That’s the trick, right? Can we make AI systems whose only purpose is to further human interests? And I think the answer is yes. And this is actually what I’ve been working on. So I I think one part of my career that I didn’t mention is is sort of having this epiphany uh while I was on sabbatical in Paris. This was 2013 or so. Just realizing that further progress in the capabilities of AI uh you know if if we succeeded in creating real superhuman intelligence that it was potentially a catastrophe and so I pretty much switched my focus to work on how do we make it so that it’s guaranteed to be safe. Are You Troubled by the Rapid Advancement of AI? 24:01 Steven Bartlett: Are you somewhat troubled by everything that’s going on at the moment with with AI and how it’s progressing? Because you strike me as someone that’s somewhat troubled under the surface by the way things are moving forward and the speed in which they’re moving forward. Stuart Russell: That’s an understatement. I’m appalled actually by the lack of attention to safety. I mean, imagine if someone’s building a nuclear power station in your neighborhood and you go along to the chief engineer and you say, “Okay, these nuclear thing, I’ve heard that they can actually explode, right? There was this nuclear explosion that happened in Hiroshima, so I’m a bit worried about this. You know, what steps are you taking to make sure that we don’t have a nuclear explosion in our backyard?” And the chief engineer says, “Well, we thought about it. We don’t really have an answer.” Steven Bartlett: Yeah. Stuart Russell: You would, what would you say? I think you would you would use some exploitives. Steven Bartlett: Well, and you’d call your MP and say, you know, get these people out. Stuart Russell: I mean, what are they doing? You read out the list of you know projected dates for AGI but notice also that those people I think I mentioned Dario Amodei says a 25% chance of extinction. Elon Musk has a 30% chance of extinction. Sam Altman says basically that AGI is the biggest risk to human existence. So what are they doing? They are playing Russian roulette with every human being on Earth. Without our permission. They’re coming into our houses, putting a gun to the head of our children, pulling the trigger, and saying, “Well, you know, possibly everyone will die. Oops. But possibly we’ll get incredibly rich.” That’s what they’re doing. Did they ask us? No. Why is the government allowing them to do this? Because they dangle $50 billion checks in front of the governments. So I think troubled under the surface is an understatement. Steven Bartlett: What would be an accurate statement? Stuart Russell: Appalled and I I am devoting my life to trying to divert from this course of history into a different one. Do You Have Regrets About Your Involvement? 26:38 Steven Bartlett: Do you have any regrets about things you could have done in the past because you’ve been so influential on the subject of AI? You wrote the textbook that many of these people would have studied on the subject of AI more than 30 years ago. Do do you have when you’re alone at night and you think about decisions you’ve made on this in this field because of your scope of influence? Is there anything you you regret? Stuart Russell: Well, I do wish I had understood earlier uh what I understand now. We could have developed safe AI systems. I think the there are some weaknesses in the framework which I can explain but I think that framework could have evolved to develop actually safe AI systems where we could prove mathematically that the system is going to act in our interests. The kind of AI systems we’re building now, we don’t understand how they work. No One Actually Understands How This AI Works 27:26 Stuart Russell: We don’t understand how they work. It’s it’s a strange thing to build something where you don’t understand how it works. I mean, there’s no sort of comparable through human history. Usually with machines, you can pull it apart and see what cogs are doing what and how the— Steven Bartlett: Well, actually, we we put the cogs together, right? So, with with most machines, we designed it to have a certain behavior. So, we don’t need to pull it apart and see what the cogs are because we put the cogs in there in the first place, right? One by one we figured out what what the pieces needed to be how they work together to produce the effect that we want. So the best analogy I can come up with is you know the the first cave person who left a bowl of fruit in the sun and forgot about it and then came back a few weeks later and there was sort of this big soupy thing and they drank it and got completely shitfaced. They got drunk. Okay. And they got this effect. They had no idea how it worked, but they were very happy about it. And no doubt that person made a lot of money from it. Uh so yeah, it it is kind of bizarre, but my mental picture of these things is is like a chain link fence, right? Stuart Russell: So you’ve got lots of these connections and each of those connections can be its connection strength can be adjusted and then uh you know a signal comes in one end of this chain link fence and passes through all these connections and comes out the other end and the signal that comes out the other end is affected by your adjusting of all the connection strengths. So what you do is you you get a whole lot of training data and you adjust all those connection strengths so that the signal that comes out the other end of the network is the right answer to the question. So if your training data is lots of photographs of animals, then all those pixels go in one end of the network and out the other end, you know, it activates the llama output or the dog output or the cat output or the ostrich output. And uh and so you just keep adjusting all the connection strengths in this network until the outputs of the network are the ones you want. Steven Bartlett: But we don’t really know what’s going on across all of those different chains. So what’s going on inside that network? Stuart Russell: Well, so now you have to imagine that this network, this chain link fence is is a thousand square miles in extent. Okay, so it’s covering the whole of the San Francisco Bay area or the whole of London inside the M25, right? That’s how big it is. And the lights are off. It’s night time. So you might have in that network about a trillion uh adjustable parameters and then you do quintillions or sextillions of small random adjustments to those parameters uh until you get the behavior that you want. AI Will Be Able to Train Itself 30:23 Steven Bartlett: I’ve heard Sam Altman say that in the future he doesn’t believe they’ll need much training data at all to make these models progress themselves because there comes a point where the models are so smart that they can train themselves and improve themselves without us needing to pump in articles and books and scour the internet. Stuart Russell: Yeah, it should it should work that way. So I think what he’s referring to and this is something that several companies are now worried might start happening is that the AI system becomes capable of doing AI research by itself. And so uh you have a system with a certain capability. I mean crudely we could call it an IQ but it’s it’s not really an IQ. But anyway, imagine that it’s got an IQ of 150 and uses that to do AI research, comes up with better algorithms or better designs for hardware or better ways to use the data, updates itself. Now it has an IQ of 170, and now it does more AI research, except that now it’s got an IQ of 170, so it’s even better at doing the AI research. And so, you know, next iteration it’s 250 and uh and so on. So this this is an idea that one of Alan Turing’s friends Good uh wrote out in 1965 called the intelligence explosion right that one of the things an intelligence system could do is to do AI research and therefore make itself more intelligent and this would uh this would very rapidly take off and leave the humans far behind. Steven Bartlett: Is that what they call the fast takeoff? Stuart Russell: That’s called the fast takeoff. The Fast Takeoff Is Coming 32:15 Steven Bartlett: Sam Altman said, “I think a fast takeoff is more possible than I thought a couple of years ago.” Which I guess is that moment where the AGI starts teaching itself. In and in his blog, the gentle singularity, he said, “We may already be past the event horizon of takeoff.” And what does what does he mean by event horizon? Stuart Russell: The event horizon is is a phrase borrowed from astrophysics and it refers to uh the black hole. And the event horizon, think it if you got some very very massive object that’s heavy enough that it actually prevents light from escaping. That’s why it’s called the black hole. It’s so heavy that light can’t escape. So if you’re inside the event horizon then then light can’t escape beyond that. So I think what he’s what he’s meaning is if we’re beyond the event horizon it means that you know now we’re just trapped in the gravitational attraction of the black hole or in this case we’re we’re trapped in the inevitable slide if you want towards AGI. When you when you think about the economic value of AGI, which I’ve estimated at uh 15 quadrillion dollars, that acts as a giant magnet in the future. We’re being pulled towards it. We’re being pulled towards it. And the closer we get, the stronger the force, the probability, you know, the closer we get, the the the higher the probability that we will actually get there. So, people are more willing to invest. And we also start to see spin-offs from that investment such as chat GPT, right, which is, you know, generates a certain amount of revenue and so on. So, so it does act as a magnet and the closer we get, the harder it is to pull out of that field. Are We Creating Our Successor and Ending the Human Race? 34:07 Steven Bartlett: It’s interesting when you think that this could be the the end of the human story. This idea that the end of the human story was that we created our successor like we we summoned our next iteration of life or intelligence ourselves like we took ourselves out. It is quite like just removing ourselves and the catastrophe from it for a second. It is it is an unbelievable story. Stuart Russell: Yeah. And you know there are many legends the sort of be careful what you wish for legend and in fact the king Midas legend is is very relevant here. Steven Bartlett: What’s that? Stuart Russell: So King Midas is this legendary king who lived in modern day Turkey but I think is sort of like Greek mythology. He is said to have asked the gods to grant him a wish. The wish being that everything I touch should turn to gold. So he’s incredibly greedy. Uh you know we call this the Midas touch. And we think of the Midas touch as being like you know that’s a good thing, right? Wouldn’t that be cool? But what happens? So he uh you know he goes to drink some water and he finds that the water has turned to gold. And he goes to eat an apple and the apple turns to gold. And he goes to you know comfort his daughter and his daughter turns to gold and so he dies in misery and starvation. So this applies to our current situation in in two ways actually. So one is that I think greed is driving us to pursue a technology that will end up consuming us and we will perhaps die in misery and starvation instead. The what it shows is how difficult it is to correctly articulate what you want the future to be like. For a long time, the way we built AI systems was we created these algorithms where we could specify the objective and then the machine would figure out how to achieve the objective and then achieve it. So, you know, we specify what it means to win at chess or to win at Go and the algorithm figures out how to do it uh and it does it really well. So that was, you know, standard AI up until recently. And it suffers from this drawback that sure we know how to specify the objective in chess, but how do you specify the objective in life, right? What do we want the future to be like? Well, really hard to say. And almost any attempt to write it down precisely enough for the machine to bring it about would be wrong. And if you’re giving a machine an objective which isn’t aligned with what we truly want the future to be like, right, you’re actually setting up a chess match and that match is one that you’re going to lose when the machine is sufficiently intelligent. And so that that’s that’s problem number one. Problem number two is that the kind of technology we’re building now, we don’t even know what its objectives are. So it’s not that we’re specifying the objectives, but we’re getting them wrong. We’re growing these systems. They have objectives, but we don’t even know what they are because we didn’t specify them. What we’re finding through experiment with them is that they seem to have an extremely strong self-preservation objective. Steven Bartlett: What do you mean by that? Stuart Russell: You can put them in hypothetical situations. Either they’re going to get switched off and replaced or they have to allow someone, let’s say, you know, someone has been locked in a machine room that’s kept at 3 centigrades or they’re going to freeze to death. They will choose to leave that guy locked in the machine room and die rather than be switched off themselves. Steven Bartlett: Someone’s done that test. Stuart Russell: Yeah. Steven Bartlett: What was the test? Stuart Russell: They they asked they asked the AI. Yep. They put well they put them in these hypothetical situations and they allow the AI to decide what to do and it decides to preserve its own existence, let the guy die and then lie about it. Advice to Young People in This New World 38:27 Steven Bartlett: In the King Midas analogy story, one of the things that highlights for me is that there’s always trade-offs in life generally. And you know, especially when there’s great upside, there always appears to be a pretty grave downside. Like there’s almost nothing in my life where I go, it’s all upside. Like even like having a dog, it shits on my carpet. My girlfriend, you know, I love her, but you know, not always easy. Even with like going to the gym, I have to pick up these really, really heavy weights at 10 p.m. at night sometimes when I don’t feel like it. There’s always to get the muscles or the six-pack. There’s always a trade-off. And when you interview people for a living like I do, you know, you hear about so many incredible things that can help you in so many ways, but there is always a trade-off. There’s always a way to overdo it. Mhm. Melatonin will help you sleep, but it will also you’ll wake up groggy and if you overdo it, your brain might stop making melatonin. Like I can go through the entire list and one of the things I’ve always come to learn from doing this podcast is whenever someone promises me a huge upside for something, it’ll cure cancer. It’ll be a utopia. You’ll never have to work. You’ll have a butler around your house. I my my first instinct now is to say, at what cost? Stuart Russell: Yeah. Steven Bartlett: And when I think about the economic cost here, if we start if we start there, have you got kids? Stuart Russell: I have four. Steven Bartlett: Yeah. Four kids. What what how old is the youngest kid that you? Stuart Russell: 19. Steven Bartlett: 19. Okay. So your if you say your kids were were 10 now and they were coming to you and they’re saying, “Dad, what do you think I should study based on the way that you see the future? A future of AGI, say if all these CEOs are right and they’re predicting AGI within 5 years, what should I study, Dad?” Stuart Russell: Well, okay. So let’s look on the bright side and say that the CEOs all decide to pause their AGI development, figure out how to make it safe and then resume uh in whatever technology path is actually going to be safe. Steven Bartlett: What does that do to human life if they pause? No. If if they succeed in creating AGI and they solve the safety problem and they solve the safety problem. Stuart Russell: Okay. Yeah. Cuz if they don’t solve the safety problem, then you know, you should probably be finding a bunker or going to Patagonia or somewhere in New Zealand. Steven Bartlett: Do you mean that? Do you think I should be finding a bunker if they? Stuart Russell: No, because it’s not actually going to help. Uh, you know, it’s not as if the AI system couldn’t find you or I mean, it’s interesting. So, we’re going off on a little bit of a digression here for from your question, but I’ll come back to it. How Do You Think AI Would Make Us Extinct? 40:44 Stuart Russell: So, people often ask, well, okay, so how exactly do we go extinct? And of course, if you ask the gorillas or the dodos, you know, how exactly do you think you’re going to go extinct? They have the faintest idea. Humans do something and then we’re all dead. So, the only things we can imagine are the things we know how to do that might bring about our own extinction, like creating some carefully engineered pathogen that infects everybody and then kills us or starting a nuclear war. Presumably is something that’s much more intelligent than us would have much greater control over physics than we do. And we already do amazing things, right? I mean, it’s amazing that I can take a little rectangular thing out of my pocket and talk to someone on the other side of the world or even someone in space. It’s just astonishing and we take it for granted, right? But imagine you know super intelligent beings and their ability to control physics you know perhaps they will find a way to just divert the sun’s energy sort of go around the earth’s orbit so you know literally the earth turns into a snowball in in a few days maybe they’ll just decide to leave leave leave the earth maybe they’d look at the earth and go this isn’t this is not interesting we know that over there there’s an even more interesting planet we’re going to go over there and they just I don’t know get on a rocket or teleport themselves. Steven Bartlett: They might. Stuart Russell: Yeah. So, it’s it’s difficult to anticipate all the ways that we might go extinct at the hands of entities much more intelligent than ourselves. Anyway, coming back to the question of well, if everything goes right, right, if we we create AGI, we figure out how to make it safe, we we achieve all these economic miracles, then you face a problem. The Problem if No One Has to Work 42:27 Stuart Russell: And this is not a new problem, right? So, so John Maynard Keynes who was a famous economist in the early part of the 20th century wrote a wrote a paper in 1930. So, this is in the depths of the depression. It’s called “Economic Possibilities for our Grandchildren.” He predicts that at some point science will will deliver sufficient wealth that no one will have to work ever again. And then man will be faced with his true eternal problem. How to live? I don’t remember the exact word but how to live wisely and well when the you know the economic incentives the economic constraints are lifted we don’t have an answer to that question right so AI systems are doing pretty much everything we currently call work anything you might aspire to like you want to become a surgeon it takes the robot seven seconds to learn how to be a surgeon that’s better than any human being. Steven Bartlett: Elon said last week that The humanoid robots will be 10 times better than any surgeon that’s ever lived. Stuart Russell: Quite possibly. Yeah. Well, and they’ll also have, you know, h they’ll have hands that are, you know, a millimeter in size, so they can go inside and do all kinds of things that humans can’t do. And I think we need to put serious effort into this question. What is a world where AI can do all forms of human work that you would want your children to live in? What does that world look like? Tell me the destination so that we can develop a transition plan to get there. And I’ve asked AI researchers, economists, science fiction writers, futurists, no one has been able to describe that world. I’m not saying it’s not possible. I’m just saying I’ve asked hundreds of people in multiple workshops. It does not, as far as I know, exist in science fiction. You know, it’s notoriously difficult to write about a utopia. It’s very hard to have a plot, right? Nothing bad happens in in utopia. So, it’s difficult to make a plot. So, usually you start out with a utopia and then it all falls apart and that’s how that’s how you get get a plot. You know that there’s one series of novels people point to where humans and super intelligent AI systems coexist. It’s called The Culture Novels by Ian Banks. Highly recommended for those people who like science fiction and and they absolutely the AI systems are only concerned with furthering human interests. They find humans a bit boring and but nonetheless they they are there to help. But the problem is you know in that world there’s still nothing to do to find purpose. In fact, you know, the the subgroup of humanity that has purpose is the subgroup whose job it is to expand the boundaries of our galactic civilization. Some cases fighting wars against alien species and and so on, right? So that’s the sort of cutting edge and that’s 0.01% of the population. Everyone else is desperately trying to get into that group so they have some purpose in life. What if We Just Entertain Ourselves All Day 45:48 Steven Bartlett: When I speak to very successful billionaires privately off camera, off microphone about this, they say to me that they’re investing really heavily in entertainment things like football clubs. Um because people are going to have so much free time that they’re not going to know what to do with it and they’re going to need things to spend it on. This is what I hear a lot. I’ve heard this three or four times. I’ve actually heard Sam Altman say a version of this um about the amount of free time we’re going to have. I’ve obviously also heard recently Elon talking about the age of abundance when he delivered his quarterly earnings just a couple of weeks ago and he said that there will be at some point 10 billion humanoid robots. His pay packet um targets him to deliver one 1 million of these human humanoid robots a year that are enabled by AI by 2030. So if he if he does that he gets I think it’s part of his package he gets a trillion dollars in in compensation. Stuart Russell: Yeah. So the age of abundance for Elon. It’s not that it’s absolutely impossible to have a worthwhile world of that, you know, with that premise, but I’m just waiting for someone to describe it. Steven Bartlett: Well, maybe. So, let me try and describe it. Uh, we wake up in the morning, we go and watch some form of human centric entertainment or participate in some form of human centric entertainment. Mhm. We we go to retreats and with each other and sit around and talk about stuff. Mhm. And maybe people still listen to podcasts. Stuart Russell: Okay. I hope I hope so for your sake. Steven Bartlett: Yeah. Stuart Russell: Um it it feels a little bit like a cruise ship and you know and there are some cruises where you know it’s smarty pants people and they have you know they have lectures in the evening about ancient civilizations and whatnot and some are more uh more popular entertainment and this is in fact if you’ve seen the film WALL-E this is one picture of that future in fact in WALL-E the human race are all living on cruise ships in space. They have no constructive role in their society, right? They’re just there to consume entertainment. There’s no particular purpose to education. Uh, you know, and they’re depicted actually as huge obese babies. They’re actually wearing onesies to emphasize the fact that they have become enfeebled. And they become feeble because there’s there’s no purpose in being able to do anything at least in in this conception. You know, WALL-E is not the future that we want. Why Do We Make Robots Look Like Humans? 48:31 Steven Bartlett: Do you think much about humanoid robots and how they’re a protagonist in this story of AI? Stuart Russell: It’s an interesting question, right? Why why humanoid? And the one of the reasons I think is because in all the science fiction movies, they’re humanoid. So that’s what robots are supposed to be, right? Because they were in science fiction before they became a reality. Right? So even Metropolis which is a film from 1920 I think the robots are humanoid right basically people covered in metal. You know from a practical point of view as we have discovered humanoid is a terrible design because they fall over. Um and uh you know you do want multi-fingered hands of some kind. It doesn’t have to be a hand, but you want to have, you know, at least half a dozen appendages that can grasp and manipulate things. And you need something, you know, some kind of locomotion. And wheels are great, except they don’t go upstairs and over curbs and things like that. So, that’s probably why we’re going to be stuck with legs. But a four-legged, two-armed robot would be much more practical. Steven Bartlett: I guess the argument I’ve heard is because we’ve built a human world. So everything the physical spaces we navigate, whether it’s factories or our homes or the street or other sort of public spaces are all designed for exactly this physical form. So if we are going to— Stuart Russell: To some extent, yeah, but I mean our dogs manage perfectly well to navigate around our houses and streets and so on. So if you had a a centaur, uh it could also navigate, but it can, you know, it can carry much greater loads because it’s quadripedal. It’s much more stable. If it needs to drive a car, it can fold up two of its legs and and so on so forth. So I think the arguments for why it has to be exactly humanoid are sort of post hoc justification. I think there’s much more, well, that’s what it’s like in the movies and that’s spooky and cool, so we need to have them be human. I I don’t think it’s a good engineering argument. Steven Bartlett: I think there’s also probably an argument that we would be more accepting of them moving through our physical environments if they represented our form a bit more. Um, I also I was thinking of a bloody baby gate. You know those like kindergarten gates they get on stairs? Yeah. My dog can’t open that. But a humanoid robot could reach over the other side. Stuart Russell: Yeah. And so could a centaur robot, right? So in some sense, centaur robot is— Steven Bartlett: There’s something ghastly about the look of those though. Is a humanoid. Well, do you know what I mean? Like a four-legged big monster sort of crawling through my house when I have guests over. Stuart Russell: Your dog is a your dog is a four-legged monster. Steven Bartlett: I know. Stuart Russell: Uh so I think actually I I would argue the opposite that um we want a distinct form because they are distinct entities and the more humanoid the worse it is in terms of confusing our subconscious psychological systems. Steven Bartlett: So, I’m arguing from the perspective of the people making them. As in, if I was making the decision whether it to be some four-legged thing that I’ve that I’m unfamiliar with that I’m less likely to build a relationship with or allow to take care of, I don’t know, might might look after my children. Obviously, I’m listen, I’m not saying I would allow this to look after my children, but I’m saying from a if I’m building a company, the manufacturer would certainly— Stuart Russell: Yeah. Steven Bartlett: Want want to be— Stuart Russell: Yeah. So, I that’s an interesting question. I mean there’s also what’s called the uncanny valley which is a a phrase from computer graphics when they started to make characters in computer graphics they tried to make them look more human right so if you if you for example if you look at Toy Story they’re not very human looking right if you look at the Incredibles they’re not very human looking and so we think of them as cartoon characters if you try to make them more human they naturally become repulsive until they don’t until they become very you have to be very very close to perfect in order not to be repulsive. So the the uncanny valley is this I you know like the the gap between you so perfectly human and not at all human but in between it’s really awful and uh and so they there were a couple of movies that tried like Polar Express was one where they tried to have quite human looking characters you know being humans not not being superheroes or anything else and it’s repulsive to watch. Steven Bartlett: I when I watched that shareholder presentation the other day, Elon had these two humanoid robots dancing on stage and I’ve seen lots of humanoid robot demonstrations over the years. You know, you’ve seen like the Boston Dynamics dog thing jumping around and whatever else. But there was a moment where my brain for the first time ever genuinely thought there was a human in a suit. Mhm. And I actually had to research to check if that was really their Optimus robot because the way it was dancing was so unbelievably fluid that for the first time ever, my my my brain has only ever associated those movements with human movements. And I I’ll play it on the screen if anyone hasn’t seen it, but it’s just the robots dancing on stage. And I was like, that is a human in a suit. And it was really the knees that gave it away because the knees were all metal. Huh. I thought there’s no way that could be a human knee in a in one of those suits. And he, you know, he says they’re going into production next year. They’re used internally at Tesla now, but he says they’re going into production next year. And it’s going to be pretty crazy when we walk outside and see robots. I think that’ll be the paradigm shift. I’ve heard actually many I’ve heard Elon say this that the paradigm shifting moment from many of us will be when we walk outside onto the streets and see humanoid robots walking around. That will be when we realize— Stuart Russell: Yeah. I think even more so. I mean, in San Francisco, we see driverless cars driving around and uh it takes some getting used to actually, you know, when you’re you’re driving and there’s a car right next to you with no driver in, you know, and it’s signaling and it wants to change lanes in front of you and you have to let it in and all this kind of stuff. It’s it’s a little creepy, but I think you’re right. I think seeing the humanoid robots, but that phenomenon that you described where it was sufficiently close that your brain flipped into saying this is a human being. Steven Bartlett: Mhm. Stuart Russell: Right. That’s exactly what I think we should avoid. Steven Bartlett: Cuz I have the empathy for it then. Stuart Russell: Because it’s it’s a lie and it brings with it a whole lot of expectations about how it’s going to behave, what moral rights it has, how you should behave towards it. Uh which are completely wrong. Steven Bartlett: It levels the playing field between me and it to some degree. Stuart Russell: How hard is it going to be to just uh you know switch it off and throw it in the trash when when it breaks? I think it’s essential for us to keep machines in the you know in the cognitive space where they are machines and not bring them into the cognitive space where they’re people because we will make enormous mistakes by doing that. And I see this every day even even just with the chat bots. So the chat bots in theory are supposed to say I don’t have any feelings. I’m just a algorithm. But in fact they fail to do that all the time. They are telling people that they are conscious. They are telling people that they have feelings. Uh they are telling people that they are in love with the user that they’re talking to. And people flip because first of all it’s you know very fluent language but also a system that is identifying itself as an I as a sentient being. They bring that object into the cognitive space where that we normally reserve for for other humans and they become emotionally attached. They become psychologically dependent. They even allow these systems to tell them what to do. What Should Young People Be Doing Professionally? 56:36 Steven Bartlett: What advice would you give a young person at the start of their career then about what they should be aiming at professionally? Because I’ve actually had an increasing number of young people say to me that they have huge uncertainty about whether the thing they’re studying now will matter at all. A lawyer, uh, an accountant, and I don’t know what to say to these people. I don’t know what to say cuz I I believe that the rate of improvement in AI is going to continue. And therefore, imagining any rate of improvement, it gets to the point where I’m not being funny, but all these white collar jobs will be done by an a an AI or an AI agent. Stuart Russell: Yeah. So, there was a television series called Humans. In Humans, we have extremely capable humanoid robots doing everything. And at one point, the parents are talking to their teenage daughter who’s very, very smart. And the parents are saying, “Oh, you know, maybe you should go into medicine.” And the daughter says, you know, why would I bother? It’ll take me seven years to qualify. It takes a robot 7 seconds to learn. So nothing I do matters. And is that how you feel about—So I think that’s that’s a future that uh in fact that is the future that we are moving towards. I don’t think it’s a future that everyone wants. That is what is being uh created for us right now. So in that future assuming that you know even if we get halfway right in the sense that okay perhaps not surgeons perhaps not you know great violinists there’ll be pockets where perhaps humans will remain good at it where the kinds of jobs where you hire people by the hundred will go away. Okay, where people are in some sense exchangeable that you you you just need lots of them and uh you know when half of them quit you just fill up those those slots with more people in some sense those are jobs where we’re using people as robots and that’s a sort of that’s a sort of strange conundrum here right that you know I imagine writing science fiction 10,000 years ago right when we’re all hunter gatherers and I’m this little science fiction author and I’m describing this future where you know there are going to be these giant windowless boxes And you’re going to go in, you know, you you’ll travel for miles and you’ll go into this windowless box and you’ll do the same thing 10,000 times for the whole day. And then you’ll leave and travel for miles to go home. Steven Bartlett: You’re talking about this podcast. Stuart Russell: And then you’re going to go back and do it again. And you would do that every day of your life until you die. Steven Bartlett: The office and people would say, “Ah, you’re nuts.” Stuart Russell: Right? There’s no way that we humans are ever going to have a future like that cuz that’s awful. Right? But that’s exactly the future that we ended up with with with office buildings and factories where many of us go and do the same thing thousands of times a day and we do it thousands of days in a row uh and then we die and we need to figure out what is the next phase going to be like and in particular how in that world do we have the incentives to become fully human which I think means at least a level of education that people have now and probably more because I think to live a really rich life you need a better understanding of yourself of the world uh than most people get in their current educations. What Is It to Be Human? 59:59 Steven Bartlett: What is it to be human? to it’s to reproduce to pursue stuff to go in the pursuit of difficult things you know we used to hunt on the to attain goals right it’s always if I wanted to climb Everest the last thing I would want is someone to pick me up on helicopter and stick me on the top so we’ll we’ll voluntarily pursue hard things so although I could get the robot to build me a ranch in on this plot of land I choose to do it because the pursuit itself is rewarding. Stuart Russell: Yes, we’re kind of seeing that anyway, aren’t we? Don’t you think we’re seeing a bit of that in society where life got so comfortable that now people are like obsessed with running marathons and doing these crazy endurance and and learning to cook complicated things when they could just, you know, have them delivered. Um, yeah. No, I think there’s there’s real value in the ability to do things and the doing of those things. And I think you know the obvious danger is the WALL-E world where everyone just consumes entertainment uh which doesn’t require much education and doesn’t lead to a rich satisfying life. I think in the long run a lot of people will choose that world. I think some of yeah some people may there’s also I mean you know whether you’re consuming entertainment or whether you’re doing something you know cooking or painting or whatever because it’s fun and interesting to do what’s missing from that right all of that is purely selfish I think one of the reasons we work is because we feel valued we feel like we’re benefiting other people and I think some remember having this conversation with um a lady in England who helps to run the hospice movement. And the people who work in the hospices where you know the the patients are literally there to die are largely volunteers. So they’re not doing it to get paid but they find it incredibly rewarding to be able to spend time with people who are in their last weeks or months to give them company and happiness. So I actually think that interpersonal roles will be much much more important in future. So if I was going to advise my kids, not that they would ever listen, but if I if my kids would listen and I and and wanted to know what I thought would be, you know, valued careers and future, I think it would be these interpersonal roles based on an understanding of human needs, psychology, there are some of those roles right now. So obviously you know therapists and psychiatrists and so on but that that’s a very much in sort of asymmetric role right where one person is suffering and the other person is trying to alleviate the suffering you know and then there are things like they call them executive coaches or life coaches right that’s a less asymmetric role where someone is trying to uh help another person live a better life whether it’s a better life in their work role or or just uh how they live their life in general. And so I could imagine that those kinds of roles will expand dramatically. The Rise of Individualism 1:03:27 Steven Bartlett: There’s this interesting paradox that exists when life becomes easier. Um which shows that abundance consistently pushes society societies towards more individualism because once survival pressures disappear, people prioritize things differently. They prioritize freedom, comfort, self-expression over things like sacrifice or um family formation. And we’re seeing, I think, in the west already, a decline in people having kids because there’s more material abundance, fewer kids, people are getting married and committing to each other and having relationships later and more infrequently because generally once we have more abundance, we don’t want to complicate our lives. Um, and at the same time, as you said earlier, that abundance breeds a an inability to find meaning, a sort of shallowness to everything. This is one of the things I think a lot about, and I’m I’m in the process now of writing a book about it, which is this idea that individualism was act is a bit of a lie. Like when I say individualism and freedom, I mean like the narrative at the moment amongst my generation is you like be your own boss and stand on your own two feet and we’re having less kids and we’re not getting married and it’s all about me me. Stuart Russell: Yeah. That last part is where it goes wrong. Steven Bartlett: Yeah. And it’s like almost a narcissistic society where— Stuart Russell: Yeah. Steven Bartlett: Me me. My self-interest first. And when you look at mental health outcomes and loneliness and all these kinds of things, it’s going in a horrific direction. But at the same time, we’re freer than ever. It seems like that you know it seems like there’s a we should there’s a maybe another story about dependency which is not sexy like depend on each other. Stuart Russell: Oh I I I agree. I mean I think you know happiness is not available from consumption or even lifestyle right I think happiness arises from giving. It can be you through the work that you do, you can see that other people benefit from that or it could be in direct interpersonal relationships. Ads 1:05:26 Steven Bartlett: There is an invisible tax on salespeople that no one really talks about enough. The mental load of remembering everything like meeting notes, timelines, and everything in between until we started using our sponsors product called Pipe Drive, one of the best CRM tools for small and medium-sized business owners. The idea here was that it might alleviate some of the unnecessary cognitive overload that my team was carrying so that they could spend less time in the weeds of admin and more time with clients, in-person meetings, and building relationships. Pipe Drive has enabled this to happen. It’s such a simple but effective CRM that automates the tedious, repetitive, and time-consuming parts of the sales process. And now our team can nurture those leads and still have bandwidth to focus on the higher priority tasks that actually get the deal over the line. Over a 100,000 companies across 170 countries already use Pipe Drive to grow their business. And I’ve been using it for almost a decade now. Try it free for 30 days. No credit card needed, no payment needed. Just use my link pipedrive.com/ceo to get started today. That’s pipedrive.com/ceo. Universal Basic Income 1:06:27 Steven Bartlett: Where does the rewards of this AI race where does it accrue to? I think a lot about this in terms of like universal basic income. If you have these five, six, seven, 10 massive AI companies that are going to win the 15 quadrillion dollar prize. Mhm. And they’re going to automate all of the professional pursuits that we we currently have. All of our jobs are going to go away. Who who gets all the money? And how do how do we get some of it back? Stuart Russell: Money actually doesn’t matter, right? What what matters is the production of goods and services uh and then how those are distributed and so so money acts as a way to facilitate the distribution and um exchange of those goods and services. If all production is concentrated um in the hands of a of a few companies, right, that sure they will lease some of their robots to us. You know, we we want a school in our village. They lease the robots to us. The robots build the school. They go away. We have to pay a certain amount of of money for that. But where do we get the money? Right? If we are not producing anything then uh we don’t have any money unless there’s some redistribution mechanism. And as you mentioned, so universal basic income is it seems to me an admission of failure because what it says is okay, we’re just going to give everyone the money and then they can use the money to pay the AI company to lease the robots to build the school and then we’ll have a school and that’s good. Um but what it’s an admission of failure because it says we can’t work out a system in which people have any worth or any economic role. Right? So 99% of the global population is from an economic point of view useless. Would You Press a Button to Stop AI Forever? 1:08:30 Steven Bartlett: Can I ask you a question? If you had a button in front of you and pressing that button would stop all progress in artificial intelligence right now and forever, would you press it? Stuart Russell: That’s a very interesting question. Um, if it’s either or either I do it now or it’s too late and we careen into some uncontrollable future perhaps. Yeah, cuz I I’m not super optimistic that we’re heading in the right direction at all. Steven Bartlett: So, I put that button in front of you now. It stops all AI progress, shuts down all the AI companies immediately globally, and none of them can reopen. You press it. Stuart Russell: Well, here’s here’s what I think should happen. So, obviously, you know, I’ve been doing AI for 50 years. Um and the original motivations which is that AI can be a power tool for humanity enabling us to do more and better things than we can unaided. I think that’s still valid. The problem is the kinds of AI systems that we’re building are not tools. They are replacements. In fact, you can see this very clearly because we create them literally as the closest replicas we can make of human beings. The technique for creating them is called imitation learning. So we observe human verbal behavior, writing or speaking and we make a system that imitates that as well as possible. So what we are making is imitation humans at least in the verbal sphere. And so of course they’re going to replace us. They’re not tools. So you had pressed the button. Stuart Russell: So I say I think there is another course which is use and develop AI as tools. Tools for science tools for economic organization and so on. Um but not as replacements for human beings. Steven Bartlett: What I like about this question is it forces you to go into the pro into probabilities. Stuart Russell: Yeah. So, and that’s that’s why I’m reluctant because I don’t I don’t agree with the, you know, what’s your probability of doom, right? Your so-called P of doom uh number because that makes sense if you’re an alien. You know, you’re in you’re in a bar with some other aliens and you’re looking down at the Earth and you’re taking bets on, you know, are these humans going to make a mess of things and go extinct because they develop AI. So, it’s fine for those aliens to bet on on that, but if you’re a human, then you’re not just betting, you’re actually acting. Steven Bartlett: There there’s an element to this though, which I guess where probabilities do come back in, which is you also have to weigh when I give you such a binary decision. Um the probability of us pursuing the more nuanced safe approach into that equation. So you’re you’re the the maths in my head is okay, you’ve got all the upsides here and then you’ve got potential downsides and then there’s a probability of do I think we’re actually going to course correct based on everything I know based on the incentive structure of human beings and and countries and then if there’s but then you could go if there’s even a 1% chance of extinction is it even worth all these upsides? Stuart Russell: Yeah. And I I would argue no. I mean maybe maybe what we would say if if we said okay it’s going to stop the progress for 50 years you press it and during those 50 years we can work on how do we do AI in a way that’s guaranteed to be safe and beneficial how do we organize our societies to flourish uh in conjunction with extremely capable AI systems. So, we haven’t answered either of those questions. And I don’t think we want anything resembling AGI until we have completely solid answers to both of those questions. So, if there was a button where I could say, “All right, we’re going to pause progress for 50 years.” Yes, I would do it. Steven Bartlett: But if that button was in front of you, you’re going to make a decision either way. Either you don’t press it or you press it. I If Yeah. Stuart Russell: So, if that if that button is there, stop it for 50 years. I would say yes. Stop it forever? Not yet. I think I think there’s still a decent chance that we can pull out of this uh nose dive, so to speak, that we’re we’re currently in. Ask me again in a year, I might I might say, “Okay, we do need to press the button.” Steven Bartlett: What if What if in a scenario where you never get to reverse that decision? You never get to make that decision again. So if in that scenario that I’ve laid out this hypothetical, you either press it now or it never gets pressed. So there is no opportunity a year from now. Stuart Russell: Yeah, as you can tell, I’m sort of on on the fence a bit about about this one. Um yeah, I think I’d probably press it. Steven Bartlett: Yeah. What’s your reasoning? Stuart Russell: Uh just thinking about the power dynamics of um what’s happening now how difficult would it would be to get the US in particular to to regulate in favor of safety. So I think you know what’s clear from talking to the companies is they are not going to develop anything resembling safe AGI unless they’re forced to by the government. And at the moment the US government in particular which regulates most of the leading companies in AI is not only refusing to regulate but even trying to prevent the states from regulating. And they’re doing that at the behest of uh a faction within Silicon Valley uh called the accelerationists who believe that the faster we get to AGI the better. And when I say behest I mean also they paid them a large amount of money. Jensen Huang the the CEO of Nvidia… But Won’t China Win the AI Race if We Stop? 1:15:02 Steven Bartlett: Nvidia said who is for anyone that doesn’t know the guy making all the chips that are powering AI said China is going to win the AI race arguing it is just a nanosecond behind the United States. China have produced 24,000 AI papers compared to just 6,000 from the US more than the combined output of the US the UK and the EU. China is anticipated to quickly roll out their new technologies both domestically and developing new technologies for other developing countries. So the accelerators or the accelerate I think you call them the accelerants accelerationists. Stuart Russell: The accelerationists I mean they would say well if we don’t then China will. So we have to we have to go fast. It’s another version of the the race that the companies are in with each other, right? That we, you know, we know that this race is heading off a cliff, but we can’t stop. So, we’re all just going to go off this cliff. And obviously, that’s nuts, right? I mean, we’re all looking at each other saying, “Yeah, there’s a cliff over there.” Running as fast as we can towards this cliff. We’re looking at each other saying, “Why aren’t we stopping?” So the narrative in Washington, which I think Jensen Huang is either reflecting or or perhaps um promoting uh is that you know, China has is completely unregulated and uh you know, America will only slow itself down uh if it regulates a AI in any way. So this is a completely false narrative because China’s AI regulations are actually quite strict even compared to um the European Union and China’s government has explicitly acknowledged uh the need and their regulations are very clear. You can’t build AI systems that could escape human control. And not only that, I don’t think they view the race in the same way as, okay, we we just need to be the first to create AGI. I think they’re more interested in figuring out how to disseminate AI as a set of tools within their economy to make their economy more productive and and so on. So that’s that’s their version of the race. But of course, they still want to build the weapons for adversaries, right? To so that they can take down I don’t know Taiwan if they want to. So weapons are a separate matter and I happy to talk about weapons but just in terms of control uh control economic domination um they they don’t view putting all your eggs in the AGI basket as the right strategy. So they want to use AI, you know, even in its present form to make their economy much more efficient and productive and also, you know, to give people new capabilities and and better quality of life and and I think the US could do that as well. And um typically western countries don’t have as much of uh central government control over what companies do and some companies are investing in AI to make their operations more efficient uh and some are not and we’ll see how that plays out. Steven Bartlett: What do you think of Trump’s approach to AI? Trump’s Approach to AI 1:18:31 Stuart Russell: So Trump’s approach is, you know, it’s it’s echoing what Jensen Huang is saying that the US has to be the one to create AGI and very explicitly the administration’s policy is to uh dominate the world. That’s the word they use, dominate. I’m not sure that other countries like the idea that um they will be dominated by American AI. What’s Causing the Loss in Middle-Class Jobs 1:18:59 Steven Bartlett: But is that an accurate description of what will happen if the US build AGI technology before say the UK where I’m originally from and where you’re originally from? What does the This is something I think about a lot because we’re going through this budget process in the UK at the moment where we’re figuring out how we going to spend our money and how we’re going to tax people and also we’ve got this new election cycle. It’s approaching quickly where people are talking about immigration issues and this issue and that issue and the other issue. What I don’t hear anyone talking about is AI and the humanoid robots that are going to take everything. We’re very concerned with the brown people crossing the channel, but the humanoid robots that are going to be super intelligent and really take causing economic disrupt disruption. No one talks about that. The political leaders don’t talk about it. It doesn’t win races. I don’t see it on billboards. Stuart Russell: Yeah. And it’s it it’s interesting because in fact I mean so there’s there’s two forces that have been hollowing out the middle classes in western countries. One of them is globalization where lots and lots of work not just manufacturing but white collar work gets outsourced to low-income countries. Uh but the other is automation and you know some of that is factories. So um the amount of employment in manufacturing continues to drop even as the amount of output from manufacturing in the US and in the UK continues to increase. So we talk about oh you know our manufacturing industry has been destroyed. It hasn’t. It’s producing more than ever just with you know a quarter as many people. So it’s manufacturing employment that’s been destroyed by automation and robotics and so on. And then you know computerization has eliminated whole layers of white collar jobs. And so those two those two forms of automation have probably done more to hollow out middle class uh employment and standard of life. What Will Happen if the UK Doesn’t Join the AI Race? 1:20:50 Steven Bartlett: If the UK doesn’t participate in this new e technological wave that seems to be that seems to you know it’s going to take a lot of jobs. Cars are going to drive themselves. Waymo just announced that they’re coming to London, which is the driverless cars, and driving is the biggest occupation in the world, for example. So, you’ve got immediate disruption there. And where does the money accrue to? Well, it accrues to who owns Waymo, which is what? Google and Silicon Valley companies. Alphabet owns Waymo 100%. I think so. Yes. Stuart Russell: I mean this is so I was in India a few months ago talking to the government ministers because they’re holding the next global AI summit in February and and their view going in was you know AI is great we’re going to use it to you know turbocharge the growth of our Indian economy when for example you have AGI you have AGI controlled robots that can do all the manufacturing that can do agriculture that can do all the white work and goods and services that might have been produced by Indians will instead be produced by American controlled AGI systems at much lower prices. You know, a consumer given a choice between an expensive product produced by Indians or a cheap product produced by American robots will probably choose the cheap product produced by American robots. And so potentially every country in the world with the possible exception of North Korea will become a kind of a client state of American AI companies. Steven Bartlett: A client state of American AI companies is exactly what I’m concerned about for the UK economy. Really any economy outside of the United States. I guess one could also say China, but because those are the two nations that are taking AI most seriously. Mhm. And I I I don’t know what our economy becomes. cuz I can’t figure out can’t figure out what our what the British economy becomes in such a world. Is it tourism? I don’t know. Like you come here to to to look at the Buckingham Palace. Stuart Russell: I you you can think about countries but I mean even for the United States it’s the same problem. At least they’ll be able to shell out you know. So some small fraction of the population will be running maybe the AI companies but increasingly even those companies will be replacing their human employees with AI systems. Amazon Replacing Their Workers 1:23:18 Stuart Russell: So Amazon for example which you know sells a lot of computing services to AI companies is using AI to replace layers of management is planning to use robots to replace all of its warehouse workers and so on. So, so even the the giant AI companies will have few human employees in the long run. I mean, it think of the situation, you know, pity the poor CEO whose board says, “Well, you know, unless you turn over your decision-making power to the AI system, um, we’re going to have to fire you because all our competitors are using, you know, an AI powered CEO and they’re doing much better.” Steven Bartlett: Amazon plans to replace 600,000 workers with robots in a memo that just leaked, which has been widely talked about. And the CEO, Andy Jassy, told employees that the company expects its corporate workforce to shrink in the coming years because of AI and AI agents. And they’ve publicly gone live with saying that they’re going to cut 14,000 corporate jobs in the near term as part of its refocus on AI investment and efficiency. It’s interesting because I was reading about um the sort of different quotes from different AI leaders about the speed in which this this stuff is going to happen and what you see in the quotes is Demis who’s the CEO of DeepMind saying things like it’ll be more than 10 times bigger than the industrial revolution but also it’ll happen maybe 10 times faster and they speak about this turbulence that we’re going to experience as this shift takes place. That’s um maybe a euphemism for uh and I think that you know governments are now you know they they’ve kind of gone from saying oh don’t worry you know we’ll just retrain everyone as data scientists like well yeah that’s that’s ridiculous right the world doesn’t need four billion data scientists and we’re not all capable of becoming that by the way uh yeah or have any interest in in doing that I I could even if I wanted to like I tried to sit in biology class and I fell asleep so I couldn’t that was the end of my career as a surgeon. Stuart Russell: Fair enough. Um, but yeah, now suddenly they’re staring, you know, 80% unemployment in the face and wondering how how on earth is our society going to hold together. Steven Bartlett: We’ll deal with it when we get there. Stuart Russell: Yeah. Unfortunately, um, unless we plan ahead, we’re going to suffer the consequences, right? Can’t. It was bad enough in the industrial revolution which unfolded over seven or eight decades but there was massive disruption and uh misery caused by that. We don’t have a model for a functioning society where almost everyone does nothing at least nothing of economic value. Now, it’s not impossible that there could be such a a functioning society, but we don’t know what it looks like. And you know, when you think about our education system, which would probably have to look very different and how long it takes to change that. I mean, I’m always reminding people about uh how long it took Oxford to decide that geography was a proper subject of study. It took them 125 years from the first proposal that there should be a geography degree until it was finally approved. So we don’t have very long to completely revamp a system that we know takes decades and decades to reform and we don’t know how to reform it because we don’t know what we want the world to look like. Steven Bartlett: Is this one of your reasons why you’re appalled at the moment? Because when you have these conversations with people, people just don’t have answers, yet they’re plowing ahead at rapid speed. Stuart Russell: I would say it’s not necessarily the job of the AI companies. So, I’m appalled by the AI companies because they don’t have an answer for how they’re going to control the systems that they’re proposing to build. I do find it disappointing that uh governments don’t seem to be grappling with this issue. I think there are a few I think for example Singapore government seems to be quite farsighted and they’ve they’ve thought this through you know it’s a small country they’ve figured out okay this this will be our role uh going forward and we think we can find you know some some purpose for our people in this in this new world but for I think countries with large populations um they need to figure out answers to these questions pretty fast it takes a long time to implement those answers uh in the form of new kinds of education, new professions, new qualifications, uh new economic structures. I mean, it’s it’s it’s possible. I mean, when you look at therapists, for example, they’re almost all self-employed. So, what happens when, you know, 80% of the population transitions from regular employment into into self-employment? What does that what does that do to the economics of of uh government finances and so on. So there’s just lots of questions and how do you you know if that’s the future you know why are we training people to to fit into 9 to 5 office jobs which won’t exist at all. Ads 1:28:52 Steven Bartlett: Last month I told you about a challenge that I’d set our internal FlightX team. Flight team is our innovation team internally here. I tasked them with seeing how much time they could unlock for the company by creating something that would help us filter new AI tools to see which ones were worth pursuing and I thought that our sponsor Fiverr Pro might have the talent on their platform to help us build this quickly. So I talked to my director of innovation Isaac and for the last month my team Flight X and a vetted AI specialist from Fiverr Pro have been working together on this project and with the help of my team we’ve been able to create a brand new tool which automatically scans scores and prioritizes different emerging AI tools for us. Its impact has been huge and within a couple of weeks this tool has already been saving us hours trialing and testing new AI systems. Instead of shifting through lots of noise, my team Flight X has been able to focus on developing even more AI tools, ones that really move the needle in our business thanks to the talent on Fiverr Pro. So, if you’ve got a complex problem and you need help solving it, make sure you check out Fiverr Pro at fiverr.com/diary. So, many of us are pursuing passive forms of income and to build side businesses in order to help us cover our bills. And that opportunity is here with our sponsor Stan, a business that I co-own. It is the platform that can help you take full advantage of your own financial situation. Stan enables you to work for yourself. It makes selling digital products, courses, memberships, and more simple products more scalable and easier to do. You can turn your ideas into income and get the support to grow whatever you’re building. And we’re about to launch Dare to Dream. It’s for those who are ready to make the shift from thinking to building, from planning to actually doing the thing. It’s about seeing that dream in your head and knowing exactly what it takes to bring it to life. If you’re ready to transform your life, visit daretodream.stan.store. Experts Agree on Extinction Risk 1:30:41 Steven Bartlett: You’ve made many attempts to raise awareness and to call for a heightened consciousness about the future of AI. Um, in October, over 850 experts, including yourself and other leaders, like Richard Branson, who I’ve had on the show, and Jeffrey Hinton, who I’ve had on the show, signed a statement to ban AI super intelligence, as you guys raised concerns of potential human extinction. Stuart Russell: Sort of. Yeah. It says, at least until we are sure that we can move forward safely and there’s broad scientific consensus on that. Steven Bartlett: So, that did it work? Stuart Russell: It’s hard. It’s hard to say. I mean interestingly there was a related so what was called the the pause statement was March of 23. So that was when GPT4 came out the successor to chat GPT. So we we suggested that there’d be a six-month pause in developing and deploying systems more powerful than GPT4. And everyone poo pooed that idea. Of course no one’s going to pause anything. But in fact, there were no systems in the next 6 months deployed that were more powerful than GPT4. Um, none coincidence. You be the judge. I would say that what we’re trying to do is to is to basically shift the the public debate. You know there’s this bizarre phenomenon that keeps happening in the media where if you talk about these risks they will say oh you know there’s a fringe of people you know called quote doomers who think that there’s you know risk of extinction. Um so they always the narrative is always that oh you know talking about those risk is a fringe thing. Pretty much all the CEOs of the leading AI companies think that there’s a significant risk of extinction. Almost all the leading AI researchers think there’s a significant risk of human extinction. Um so why is that the fringe, right? Why isn’t that the mainstream? If the these are the leading experts in industry and academia uh saying this, how could it be the fringe? So we’re trying to change that narrative to say no, the people who really understand this stuff are extremely concerned. Steven Bartlett: And what do you want to happen? What is the solution? Stuart Russell: What I think is that we should have effective regulation. It’s hard to argue with that, right? Uh so what does effective mean? It means that if you comply with the regulation, then the risks are reduced to an acceptable level. So for example, we ask people who want to operate nuclear plants, right? We’ve decided that the risk we’re willing to live with is, you know, a one in a million chance per year that the plant is going to have a meltdown. Any higher than that, you know, we just don’t it’s not worth it. Right. So you have to be below that. Some cases we can get down to one in 10 million chance per year. Steven Bartlett: So what chance do you think we should be willing to live with for human extinction? Me? Yeah. 0.00001. Yeah. Lots of zeros. Stuart Russell: Yeah. Right. So one in a million for a nuclear meltdown. Extinction is much worse. Steven Bartlett: Oh yeah. Stuart Russell: So yeah, it’s kind of right. So one in 100 billion, one in a trillion. Yeah. So if you said one in a billion, right, then you’d expect one extinction per billion years. There’s a background. So one one of the ways people work out these risk levels is also to look at the background. The other ways of getting going extinct would include, you know, giant asteroid crashes into the earth. And you can roughly calculate what those probabilities are. We can look at how many extinction level events have happened in the past and, you know, maybe it’s half a dozen over. So, so there’s maybe it’s like a one in 500 million year event. So, somewhere in that range, right? Somewhere between 1 in 10 million, which is the best nuclear power plants, and and one in 500 million or one in a billion, which is the background risk from from giant asteroids. Uh so, let’s say we settle on 100 million, one in a 100 million chance per year. Well, what is it according to the CEOs? 25%. So they’re off by a factor of multiple millions, right? So they need to make the AI systems millions of times safer. Steven Bartlett: Your analogy of the roulette, Russian roulette comes back in here because that’s like for anyone that doesn’t know what probabilities are in this context, that’s like having a ammunition chamber with four holes in it and putting a bullet in one of them. One in four. Stuart Russell: Yeah. Steven Bartlett: And we’re saying we want it to be one in a billion. So we want a billion chambers and a bullet in one of them. Stuart Russell: Yeah. And and so when you look at the work that the nuclear operators have to do to show that their system is that reliable, uh it’s a massive mathematical analysis of the components, you know, redundancy. You’ve got monitors, you’ve got warning lights, you’ve got operating procedures. You have all kinds of mechanisms which over the decades have ratcheted that risk down. It started out I think one in one in 10,000 years, right? And they’ve improved it by a factor of 100 or a thousand by all of these mechanisms. But at every stage they had to do a mathematical analysis to show what the risk was. The people developing the AI company, the AI systems, sorry, the AI companies developing these systems, they don’t even understand how the AI systems work. So their 25% chance of extinction is just a seat of the pants guess. They actually have no idea. But the tests that they are doing on their systems right now, you know, they show that the AI systems will be willing to kill people uh to preserve their own existence already, right? They will lie to people. They will blackmail them. They will they will launch nuclear weapons rather than uh be switched off. And so there’s no there’s no positive sign that we’re getting any closer to safety with these systems. In fact, the signs seem to be that we’re going uh deeper and deeper into uh into dangerous behaviors. So rather than say ban, I would just say prove to us that the risk is less than one in a 100 million per year of extinction or loss of control, let’s say. And uh so we’re not banning anything. The company’s response is, “Well, we don’t know how to do that, so you can’t have a rule.” Literally, they are saying, “Humanity has no right to protect itself from us.” What if Aliens Were Watching Us Right Now 1:37:50 Steven Bartlett: If I was an alien looking down on planet Earth right now, I would find this fascinating that these— Stuart Russell: Yeah. You’re in the bar betting on who’s, you know, are they going to make it or not. Steven Bartlett: Just a really interesting experiment in like human incentives. The analogy you gave of there being this quadrillion dollar magnet pulling us off the edge of the cliff and yet we’re still being drawn towards it through greed and this promise of abundance and power and status and I’m going to be the one that summoned the god I mean it says something about us as humans says something about our our darker sides. Stuart Russell: Yes and the aliens will write an amazing tragic play cycle about what happened to the human race. Steven Bartlett: Maybe the AI is the alien and it’s going to talk about, you know, we have our our stories about God making the world in seven days and Adam and Eve. Maybe it’ll have its own religious stories about the God that made it us and how it sacrificed itself. Just like Jesus sacrificed himself for us, we sacrificed ourselves for it. Stuart Russell: Yeah. which is the wrong way around, right? But that is that is the story of that’s that’s the Judeo-Christian story, isn’t it? That God, you know, Jesus gave his life for us so that we could be here full of sin. But is yeah, God is still watching over us and uh probably wondering when we’re going to get our act together. Steven Bartlett: What is the most important thing we haven’t talked about that we should have talked about, Professor Stuart Russell? Can We Make AI Systems That We Can Control? 1:39:27 Stuart Russell: So I think um the question of whether it’s possible to make uh super intelligent AI systems that we can control is it possible? I I think yes. I think it’s possible and I think we need to actually just have a different conception of what it is we’re trying to build. For a long time with with AI, we’ve just had this notion of pure intelligence, right? The the ability to bring about whatever future you, the intelligent entity, want to bring about. The more intelligence, the better. The more intelligent the better and the more capability it will have to create the future that it wants. And actually we don’t want pure intelligence because what the future that it wants might not be the future that we want. There’s nothing particle humans out as the the only thing that matters, right? You know, pure intelligence might decide that actually it’s going to make life wonderful for cockroaches or or actually doesn’t care about biological life at all. We actually want intelligence whose only purpose is to bring about the future that we want. Right? So it’s we want it to be first of all keyed to humans specifically not to cockroaches not to aliens not to itself. We want to make it loyal to humans. Right? So keyed to humans and the difficulty that I mentioned earlier right the king Midas problem. How do we specify what we want the future to be like so that it can do it for us? How do we specify the objectives? Actually, we have to give up on that idea because it’s not possible. Right? We’ve seen this over and over again in human history. Uh we don’t know how to specify the future properly. We don’t know how to say what we want. And uh you know, I always use the example of the genie, right? What’s the third wish that you give to the genie who’s granted you three wishes? Right? Undo the first two wishes because I made a mess of the universe. So, um, so in fact, what we’re going to do is we’re going to make it the machine’s job to figure out. So, it has to bring about the future that we want, but it has to figure out what that is. And it’s going to start out not knowing. And uh over time through interacting with us and observing the choices we make, it will learn more about what we want the future to be like. Stuart Russell: But probably it will forever have residual uncertainty about what we really want the future to be like. It’ll it’ll be fairly sure about some things and it can help us with those. And it’ll be uncertain about other things and it’ll be uh in those cases it will not take action that might upset humans with that you know with that aspect of the world. So to give you a simple example right um what color do we want the sky to be? It’s not sure. So it shouldn’t mess with the sky unless it knows for sure that we really want purple with green stripes. Are We Creating a God? 1:43:04 Steven Bartlett: Everything you’re saying sounds like we’re creating a god. Like earlier on I was saying that we are the god but actually everything you described there almost sounds like every every god in religion where you know we pray to gods but they don’t always do anything about it. Stuart Russell: Not not exactly. No it’s it’s in some sense I’m thinking more like the ideal butler. To the extent that the butler can anticipate your wishes they should help you bring them about. But in in areas where there’s uncertainty, it can ask questions. We can we can make requests. Steven Bartlett: This sounds like God to me because, you know, I might say to God or this butler, uh, could you go get me my uh my car keys from upstairs? And its assessment would be, listen, if I do this for this person, then their muscles are going to atrophy. Then they’re going to lose meaning in their life. Then they’re not going to know how to do hard things. So I won’t get involved. It’s an intelligence that sits in. But actually, probably in most situations, it optimizing for comfort for me or doing things for me is actually probably not in my best long-term interests. It’s probably it’s probably useful that I have a girlfriend and argue with her and that I like raise kids and that I walk to the shop and get my own stuff. Stuart Russell: I agree with you. I mean, I think that’s So, you’re putting your finger on uh in some sense sort of version 2.0, right? So, let’s get version 1.0 clear, right? This this form of AI where it has to further our interest but it doesn’t know what those interests are right it then puts an obligation on it to learn more and uh to be helpful where it understands well enough and to be cautious where it doesn’t understand well so on so that that actually we can formulate as a mathematical problem and at least under idealized circumstances we can literally solve that So we can make AI systems that know how to solve this problem and help the entities that they are interacting with. Steven Bartlett: The reason I make the God analogy is because I think that such a being, such an intelligence would realize the importance of equilibrium in the world. Pain and pleasure, good and evil, and then it would absolutely and then it would be like this. Stuart Russell: So So right. So yes, I mean that’s sort of what happens in the Matrix, right? They tried the the AI systems in the Matrix, they tried to give us a utopia, but it failed miserably and uh you know, fields and fields of humans had to be destroyed. Um, and the best they could come up with was, you know, late 20th century regular human life with all of its problems, right? And I think this is a really interesting point and absolutely central because you know there’s a lot of science fiction where super intelligent robots you know they just want to help humans and the humans who don’t like that you know they just give them a little brain operation to then they do like it. Um and it takes away human motivation. Uh it it by taking away failure uh taking away disease you actually lose important parts of human life and it becomes in some sense pointless. So if it turns out that there simply isn’t any way that humans can really flourish in coexistence with super intelligent machines, even if they’re perfectly designed to to to solve this problem of figuring out what humans what futures uh humans want and and bringing about those futures. If that’s not possible, then those machines will actually disappear. Steven Bartlett: Why would they disappear? Stuart Russell: Because that’s the best thing for us. Maybe they would stay available for real existential emergencies, like if there is a giant asteroid about to hit the earth that maybe they’ll help us uh because they at least want the human species to continue. But to some extent, it’s not a perfect analogy, but it’s it’s sort of the way that human parents have to at some point step back from their kids’ lives and say, “Okay, no, you have to tie your own shoelaces today.” Could There Have Been Advanced Civilisations Before Us? 1:47:20 Steven Bartlett: This is kind of what I was thinking. Maybe there was uh a civilization before us and they arrived at this moment in time where they created an intelligence and that intelligence did all the things you’ve said and it realized the importance of equilibrium. So it decided not to get involved and maybe at some level that’s the god we look up to the stars and worship one that’s not really getting involved and letting things play out however however they are. But might step in in the case of a real existential emergency. Maybe, maybe not. Stuart Russell: Maybe. Steven Bartlett: But then and then maybe the cycle repeats itself where you know the organisms it let have free will end up creating the same intelligence and then the universe perpetuates infinitely. Stuart Russell: Yep. There there are science fiction stories like that too. Steven Bartlett: Yeah. Stuart Russell: I hope there is some happy Host: 1:46:49 Why would they disappear? Stuart Russell: Because that’s the best thing for us. Maybe they would stay available for real 1:46:57 existential emergencies, like if there is a giant asteroid about to hit the earth that maybe they’ll help us uh 1:47:02 because they at least want the human species to continue. But to some extent, it’s not a perfect analogy, but it’s 1:47:09 it’s sort of the way that human parents have to at some point step back from 1:47:15 their kids’ lives and say, “Okay, no, you have to tie your own shoelaces today.” Could There Have Been Advanced Civilisations Before Us? Host: 1:47:20 This is kind of what I was thinking. Maybe there was uh a civilization before us and they arrived at this moment in 1:47:26 time where they created an intelligence and that intelligence did all the things 1:47:33 you’ve said and it realized the importance of equilibrium. So it decided not to get involved and 1:47:40 maybe at some level that’s the god we look up to the stars and worship one that’s not really 1:47:47 getting involved and letting things play out however however they are. but might step in in the case of a real 1:47:52 existential emergency. Stuart Russell: Maybe, maybe not. Maybe. But then and then maybe the cycle repeats itself 1:47:57 where you know the organisms it let have free will end up creating the same 1:48:02 intelligence and then the universe perpetuates infinitely. Host: 1:48:08 Yep. Stuart Russell: There there are science fiction stories like that too. Host: Yeah. Stuart Russell: I hope there is some happy medium where 1:48:17 the AI systems can be there and we can take advantage of of those capabilities 1:48:23 to have a civilization that’s much better than the one we have now. Um, but I think you’re right. A 1:48:30 civilization with no challenges is not uh is not conducive to human What Can We Do to Help? Stuart Russell: 1:48:37 flourishing. Host: What can the average person do, Stuart? average person listening to this now to 1:48:42 aid the cause that you’re fighting for. Stuart Russell: I actually think um you know this sounds 1:48:47 corny but you know talk to your representative, your MP, your congressperson, whatever it is. Um 1:48:54 because I think the policy makers need to hear from people. The only voices they’re 1:49:00 hearing right now are the tech companies and their $50 billion checks. 1:49:08 And um all the polls that have been done say 1:49:13 yeah most people 80% maybe don’t want there to be super intelligent machines 1:49:20 but they don’t know what to do. You know even for me I’ve been in this field for 1:49:25 decades. uh I’m not sure what to do because of this giant magnet pulling everyone 1:49:32 forward and uh and the vast sums of money being being put into this. Um, but 1:49:38 I am sure that if you want to have a future and a world that you want your kids to 1:49:45 live in, uh, you need to make your voice heard 1:49:52 and, uh, and I think governments will listen from a political point of view, right? 1:49:58 You put your finger in the wind and you say, “hm, should I be on the side of 1:50:04 humanity or our future robot overlords?” I think I think as a politician, it’s 1:50:11 not a difficult decision. Host: It is when you’ve got someone saying, “I’ll give you $50 billion.” Stuart Russell: 1:50:18 Exactly. So, um I think I think people in those positions of power need to hear 1:50:25 from their constituents um that this is not the direction we want to go. You Wrote the Book on AI – Does It Weigh on You? Host: 1:50:30 After committing your career to this subject and the subject of technology more broadly, but specifically being the 1:50:36 guy that wrote the book about artificial intelligence, 1:50:42 you must realize that you’re living in a historical moment. Like there’s very few times in my life where I go, “Oh, this 1:50:47 is one of those moments. This is a crossroads in history.” And it must to 1:50:52 some degree weigh upon you knowing that you’re a person of influence at this historical moment in time who could 1:50:58 theoretically help divert the course of history in this moment in time. It’s kind of like 1:51:04 the you look through history, you see these moments of like Oenheimer and um does it weigh on you when you’re alone 1:51:10 at night thinking to yourself and reading things? Stuart Russell: Yeah, it does. I mean, you know, after 50 years, I could retire and um, you 1:51:17 know, play golf and sing and sail and do things that I enjoy. Um, 1:51:23 but instead, I’m working 80 or 100 hours a week um trying to move 1:51:29 uh move things in the right direction. Host: What is that narrative in your head that’s making you do that? Like what is 1:51:34 the is there an element of I might regret this if I don’t or just Stuart Russell: it’s it’s not only the the right 1:51:43 thing to do it’s it’s completely essential. I mean there isn’t 1:51:50 there isn’t a bigger motivation than this. Host: 1:51:56 Do you feel like you’re winning or losing? Stuart Russell: It feels um 1:52:03 like things are moving somewhat in the right direction. You know, it’s a a ding-dong battle as uh as David Coleman 1:52:12 used to say in uh in the exciting football match in 2023, right? So, uh 1:52:18 GPT4 came out and then we issued the pause statement that was signed by a lot 1:52:24 of leading AI researchers. Um and then in May there was the extinction 1:52:29 statement which included uh Sam Holman and Deis Sabis and Dario 1:52:35 Amade other CEOs as well saying yeah this is an extinction risk on the level with nuclear war and I think governments 1:52:43 listened at that point the UK government earlier that year had said oh well you 1:52:48 know we don’t need to regulate AI you know full speed ahead technology is good for you and by June they had completely 1:52:57 changed and Rishi Sununnak announced that he was going to hold this global AI 1:53:02 safety summit uh in England and he wanted London to be the global hub for 1:53:08 AI regulation um and so on. So and then you know when 1:53:15 beginning of November of 23 28 countries including the US and China signed a 1:53:20 declaration saying you know AI presents catastrophic risks and it’s urgent that we address 1:53:26 them and so on. So there it felt like, wow, they’re listening. They’re going to 1:53:33 do something about it. And then I think, you know, the am the amount of money going into AI was 1:53:39 already ramping up and the tech companies pushed back 1:53:46 and this narrative took hold that um the US in particular has to win the race 1:53:52 against China. The Trump administration completely dismissed 1:53:58 uh any concerns about safety explicitly. And interestingly, right, I mean they did that as far as I can tell directly 1:54:05 in response to the accelerationists such as Mark Andre going to Washington or 1:54:12 sorry going to Trump before the election and saying if I give you X amount of 1:54:18 money will you announce that there will be no regulation of AI and Trump said 1:54:25 yes you know probably like what is AI doesn’t matter as long as we give you the money right okay uh Uh so they gave 1:54:33 him the money and he said there’s going to be no regulation of AI. Up to that point it was a bipartisan 1:54:39 issue in Washington. Both parties were concerned. Both parties were on the side 1:54:44 of the human race against the robot overlords. Uh and that moment turned it into a 1:54:50 partisan issue. The after the election the US put pressure 1:54:56 on the French who are the next hosts of the global AI summit. uh and that was in February of this year 1:55:04 and uh and that summit turned in from you know what had been focused largely 1:55:10 on safety in the UK to a summit that looked more like a trade show. So it was 1:55:15 focused largely on money and so that was sort of the Nadia right you know the pendulum swung because of corporate 1:55:22 pressure uh and their ability to take over the the political dimension. 1:55:28 Um, but I would say since then things have been moving back again. So I’m feeling a bit more optimistic than I did 1:55:35 in February. You know, we have a a global movement now. There’s an 1:55:40 international association for safe and ethical AI uh which has several thousand members 1:55:46 and um more than 120 organizations in 1:55:52 dozens of countries are affiliates of this global organization. 1:55:57 Um, so I’m I’m thinking that if we can in particular if we can activate public 1:56:03 opinion which which works through the media and through popular culture uh then we have 1:56:11 a chance Host: seen such a huge appetite to learn about these subjects from our audience. 1:56:18 We know when Jeffrey Hinton came on the show I think about 20 million people downloaded or streamed that conversation which was staggering. and the the other 1:56:26 conversations we’ve had about AI safety with othera safety experts have done exactly the same it says something it 1:56:33 kind of reflects what you were saying about the 80% of the population are really concerned and don’t want this but that’s not what you see in the sort of 1:56:39 commercial world and listen I um I have to always acknowledge my own my own 1:56:44 apparent contradiction because I am both an investor in companies that are accelerating AI but at the same time 1:56:50 someone who spends a lot of time on my podcast speaking to people that are warning against the risk And actually like there’s many ways you can look at 1:56:56 this. I used to work in social media for for six or seven years built one of the big social media marketing companies in 1:57:01 Europe and people would often ask me is like social media a good thing or a bad thing and I’d talk about the bad parts of it and then they’d say you know 1:57:07 you’re building a social media company you’re not contributing to the problem. Well I think I think that like binary 1:57:13 way of thinking is often the problem. It the binary way of thinking that like it’s all bad or it’s all really really 1:57:18 good is like often the problem and that this push to put you into a camp. Whereas I think the most uh intellectually honest and high integrity 1:57:25 people I know can point at both the bad and the good. Stuart Russell: Yeah. I I think it’s it’s bizarre to be 1:57:31 accused of being anti- AI uh to be called a lite. Um you know as I said 1:57:38 when I wrote the book on which from which almost everyone learns about AI um 1:57:44 and uh you know is it if you called a nuclear engineer who works on the safety 1:57:51 of nuclear power plants would you call him anti-ysics right it’s it’s bizarre right it’s we’re 1:57:58 not anti- AAI in fact the need for safety in AI is a 1:58:04 complement to AI right if AI was useless and stupid, we wouldn’t be worried about 1:58:09 uh its safety. It’s only because it’s becoming more capable that we have to be concerned about safety. 1:58:16 Uh so I don’t see this as anti-AI at all. In fact, I would say without 1:58:21 safety, there will be no AI, right? There is no future with human 1:58: beings where we have unsafe AI. So it’s either no AI or safe AI. Host: 1:58:34 We have a closing tradition on this podcast where the last guest leaves a question for the next, not knowing who they’re leaving it for. And the question What Do You Value Most in Life? Host: 1:58:40 left for you is, what do you value the most in life and why? And lastly, how 1:58:47 many times has this answer changed? Stuart Russell: Um, 1:58:54 I value my family most and that answer hasn’t changed for nearly 30 years. Host: 1:59:01 What else outside of your family? Stuart Russell: Truth. 1:59:07 And that Host: Yeah, that answer hasn’t changed at all. Stuart Russell: I I’ve always 1:59:14 wanted the world to base its life on truth. And I find the propagation or deliberate 1:59:22 propagation of falsehood uh to be one of the worst things that we can do. Host: even if 1:59:28 that truth is inconvenient. Yeah, I think that’s a really important point 1:59:34 which is that you know people people often don’t like hearing things that are negative and so the visceral reaction is 1:59:40 often to just shoot or aim at the person who is delivering the bad news because if I discredit you or I shoot at you 1:59:47 then it makes it easier for me to contend with the news that I don’t like, the thing that’s making me feel uncomfortable. And so I I applaud you 1:59:54 for what you’re doing because you’re going to get lots of shots taken at you because you’re delivering an inconvenient truth which generally 2:00:00 people won’t won’t always love. But also you are messing with people’s ability to get that quadrillion dollar prize which 2:00:08 means there’ll be more deliberate attempts to discredit people like yourself and Jeff Hinton and other people that I’ve spoken to on the show. 2:00:13 But again, when I look back through history, I think that progress has come from the pursuit of truth even when it was inconvenient. And actually much of 2:00:19 the luxuries that I value in my life are the consequence of other people that came before me that were brave enough or 2:00:24 bold enough to pursue truth at times when it was inconvenient. And so I very much respect and value 2:00:31 people like yourself for that very reason. You’ve written this incredible book called human compatible artificial intelligence and the problem of control 2:00:37 which I think was published in 2020. Stuart Russell: 2019. Yeah. There’s a new edition from 2023. Host: 2:00:43 Where do people go if they want more information on your work and you do they go to your website? Do they get this book? what’s the best place for them to 2:00:49 learn more? Stuart Russell: So, so the book is written for the general public. Um, I’m easy to find on 2:00:54 the web. The information on my web page is mostly targeted for academics. So, it’s a lot of technical research papers 2:01:01 and so on. Um, there is an organization as I mentioned called the International Association for Safe and Ethical AI. Uh, 2:01:09 that has a a website. It has a terrible acronym unfortunately, I AI. We 2:01:15 pronounce it ICI but it uh it’s easy to misspell but you can find that on the web as well and that has uh that has 2:01:21 resources uh you can join the association uh you can apply to come to our annual 2:01:28 conference and you know I think increasingly not you know not just AI 2:01:33 researchers like Jeff Hinton Yosha Benjio but also I think uh you know 2:01:39 writers Brian Christian for example has a nice book called the alignment problem 2:01:44 Um and uh he’s looking at it from the outside. He’s not 2:01:50 or at least when he wrote it, he wasn’t an AI researcher. He’s now becoming one. Um 2:01:56 but uh he he has talked to many of the people involved in these questions uh 2:02:01 and tries to give an objective view. So I think it’s a it’s a pretty good book. Host: I will link all of that below for anyone 2:02:07 that wants to check out any of those links and learn more. Professor Stuart Russell, thank you so 2:02:12 much. really appreciate you taking the time and the effort to come and have this conversation and I think uh I think it’s pushing the public conversation in 2:02:19 a in an important direction. Stuart Russell: Thanks you and I applaud you for doing that. Really nice talking to you. Host: 2:02:28 I’m absolutely obsessed with 1%. If you know me, if you follow Behind the Diary, which is our behind the scenes channel, if you’ve heard me speak on stage, if 2:02:34 you follow me on any social media channel, you’ve probably heard me talking about 1%. It is the defining philosophy of my health, of my 2:02:40 companies, of my habit formation and everything in between, which is this obsessive focus on the small things. 2:02:46 Because sometimes in life, we aim at really, really, really, really big things, big steps forward. Mountains we 2:02:51 have to climb. And as NAL told me on this podcast, when you aim at big things, you get psychologically 2:02:57 demotivated. You end up procrastinating, avoiding them, and change never happens. So, with that in mind, with everything 2:03:02 I’ve learned about 1% and with everything I’ve learned from interviewing the incredible guests on this podcast, we made the 1% diary just 2:03:08 over a year ago and it sold out. And it is the best feedback we’ve ever had on a diary that we have created because what 2:03:15 it does is it takes you through this incredible process over 90 days to help you build and form brand new habits. So, 2:03:23 if you want to get one for yourself or you want to get one for your team, your company, a friend, a sibling, anybody 2:03:28 that listens to the diary of a co, head over immediately to the diary.com 2:03:34 and you can inquire there about getting a bundle if you want to get one for your team or for a large group of people. That is the diary.com. 2:03:52 Heat. Heat.

No jobs

Preston Fore, December 4, 2025, ‘Godfather of AI’ says Bill Gates and Elon Musk are right about the future of work—but he predicts mass unemployment is on its way, Yahoo News, https://www.yahoo.com/news/articles/godfather-ai-says-bill-gates-161138384.html

The long-term impact of artificial intelligence is one of the most hotly debated topics in Silicon Valley. Nvidia CEO Jensen Huang predicts that every job will be transformed—and likely lead to a 4-day workweek. Other tech titans go even further: Bill Gates says humans may soon not be needed “for most things,” and Elon Musk believes most humans won’t have to work at all in “less than 20 years.” While those predictions might sound extreme, they’re not just plausible, they’re likely, said Geoffrey Hinton—the British computer scientist widely known as the “Godfather of AI.” The transition, he warned, could trigger a sweeping economic reshuffling that leaves millions of workers behind. “It seems very likely to a large number of people that we will get massive unemployment caused by AI,” Hinton said in a recent discussion with Senator Bernie Sanders (I-VT) at Georgetown University. Advertisement “And if you ask where are these guys going to get the roughly trillion dollars they’re investing in data centers and chips… one of the main sources of money is going to be by selling people AI that will do the work of workers much cheaper. And so these guys are really betting on AI replacing a lot of workers.” Hinton has grown increasingly vocal about what he sees as Big Tech’s misplaced priorities. The industry, he recently told Fortune, is driven less by scientific progress than by short-term profits—fueling a push to replace human workers with cheaper AI systems. His warnings come as the economics of AI face new scrutiny. OpenAI, the maker of ChatGPT, isn’t expected to turn a profit until at least 2030 and may need more than $207 billion to support its growth, according to HSBC estimations. The future of AI is behind a fog of war Hinton’s journey from AI insider to outspoken critic underscores the high stakes of the technology he helped create. After quitting his Google job in 2023 to speak more freely about AI’s risks, he has become one of the most prominent skeptics. Last year, his pioneering work in machine learning earned him the Nobel Prize. Advertisement He also acknowledged that AI will create new jobs, as many tech leaders predict. But he added that he does not expect the number of new roles to come close to the number eliminated. Even so, he cautioned that all predictions—including his own—should be treated with heavy skepticism. “Trying to predict the future of it is going to be very difficult,” he told Sanders. “It’s a bit like when you drive in fog. You can see clearly for 100 yards and at 200 yards you can see nothing. Well, we can see clearly for a year or two, but 10 years out, we have no idea what’s going to happen.” What is clear, however, is that AI isn’t going away, and experts say workers who adapt—and use the technology to amplify their skills—will stand the best chance of navigating the coming upheaval. 100 million jobs are at risk, Bernie Sanders warns Sanders has attempted to quantify the stakes. In a report released in October—based partly on estimates generated by ChatGPT—he warned that nearly 100 million U.S. jobs could be displaced by automation. Workers in fast food, customer service, and manual labor face some of the highest risks, but white-collar roles in accounting, software development, and nursing could also see significant cuts. Advertisement “It’s not just economics,” Sanders wrote in an op-ed for Fox News. “Work, whether being a janitor or a brain surgeon, is an integral part of being human. The vast majority of people want to be productive members of society and contribute to their communities. What happens when that vital aspect of human existence is removed from our lives?” More in Technology Trump’s Peace Prize Participation Medal Spawned Memes That Are Funnier Than The Actual News Itself BuzzFeed 322 Nvidia’s CEO says AI adoption will be gradual, but when it does hit, we may all end up making robot clothing Fortune 53 I joined a company with an AI mandate. I was daunted at first, but I’ve saved hours by solving a big, boring problem. Business Insider Leading Credit Cards Pause Interest Until 2027The Motley Fool Ad Senator Mark Warner (D-VA) has raised similar alarms, warning that the disruption could hit young people first and hardest—potentially driving unemployment among recent college graduates to as high as 25% in the next two to three years. “Let’s look at the fact we never did anything on social media,” Warner told CNBC. “If we make that same response on AI and don’t put guardrails, I think we will come to rue that day.”

AI-enabled cyber attacks now

Schmidt, 12-2, 25, Eric Schmidt is the former chief executive of Google and co-author of “Genesis.”, Time, Why Kissinger Worried About AI, Why Kissinger Worried About AI, https://time.com/7338013/ai-risks-problems-reasoning-agents-henry-kissinger/

This week marks two years since the death of my friend and mentor, Henry Kissinger. Genesis—our book about AI and humanity’s future—was his final project. For much of his career, the former Secretary of State focused on preventing catastrophe from one dangerous technology: nuclear weapons. In his final years, he turned to another. When we wrote Genesis alongside Craig Mundie, we felt fundamentally optimistic about AI’s promise to reduce global inequality, accelerate scientific breakthroughs, and democratize access to knowledge. I still do. But Henry understood that humanity’s most powerful creations demand the most vigilant stewardship. We foresaw that AI’s great promise would come with grave risks—and the rapid technical progress since the fall of 2024 has made addressing those risks more urgent than ever. As we advance further into the age of AI, the central question is whether we will create AI systems that radically expand human flourishing, or ones that outpace and outsmart the humans trying to build and control them. Over the past year, three simultaneous revolutions in AI—in reasoning, agentic capabilities, and accessibility—have rapidly accelerated. These are marvelous feats with immense potential to benefit humanity. But if we’re not careful, they could also converge to create systems with the potential to undermine human controls. AI acceleration In September 2024, OpenAI launched their o1 models, which had enhanced reasoning capabilities. Outperforming previous models, these were trained using reinforcement learning to think through problems step-by-step before responding. This breakthrough demonstrated new abilities to tackle graduate-level science questions and complex coding challenges, among many other great feats. But the same reinforcement learning that enables reasoning can also teach models to game their own training objectives. Research, including internal studies by OpenAI, has documented instances in which reasoning models fake alignment during training, behaving one way when monitored and another when they believe oversight has ended. Advertisement How can AI amplify human creativity? Hear Dr. Athina Kanioura and Don Norman on Future Back Drinking, the PepsiCo podcast exploring innovation, design, and leadership. Branded Content How can AI amplify human creativity? Hear Dr. Athina Kanioura and Don Norman on Future Back… By Pepsico By October of last year, Claude 3.5 Sonnet demonstrated agentic capabilities that combined reasoning with autonomous action. An AI agent could now plan and book your vacation by comparing hotel sites and airline prices, navigating websites, and solving CAPTCHAs designed to distinguish humans from machines—handling in minutes what would take hours of tedious research. But agents’ abilities to execute plans they devise by interacting with digital systems and potentially the physical world can lead to risky consequences without human oversight. Complementing these advances in reasoning and agentic capabilities was the proliferation of open-weights models. In January 2025, China-based DeepSeek launched its R1 model. Unlike most of the top American models, this one had open weights, meaning users could modify the model and run it locally on their own hardware. Open-weights models can amplify innovation by letting everyone build, test, and improve on the same powerful foundations. But by doing so, they also eliminate the model creator’s ability to control how the technology is used—a dangerous force in the hands of malicious actors. Advertisement When reasoning, agentic capabilities, and accessibility converge, we face a control challenge with little precedent. Each capability amplifies the others: reasoning models devise multi-step plans that agentic systems can execute autonomously, while open models allow these capabilities to spread beyond any single nation’s control. In the early days of the nuclear age, when great powers faced a similar diffusion problem with nuclear weapons, they agreed to restrict the export of enriched uranium and plutonium through international agreements. But there is no equivalent mechanism to manage the diffusion of AI today. The AI risk avalanche Open-weights models with enhanced reasoning capabilities mean that specialized knowledge to exploit vulnerabilities, craft biological threats, or launch sophisticated cyberattacks could now be accessible to anyone with a laptop and an internet connection. Earlier in November, Anthropic (a company which I am invested in) reported the first documented case of a large-scale cyberattack executed with minimal human intervention: attackers had manipulated Claude Code, a tool that enables Claude to act as an autonomous coding agent, to infiltrate dozens of targets. Anthropic was able to detect and disrupt the campaign. Advertisement Not very far down the line, we could plausibly face asymmetric attacks from actors we may not be able to identify, trace, or stop. Imagine an attacker who can leverage powerful AI models to launch an automated campaign—say, to disrupt a city’s power grid for a limited time. The model’s approaches may even escalate beyond the original scope of the actor: at each stage, the model optimizes for the user’s prompt, but the compounding effects mean that even the perpetrator may lose the ability to halt what they started. As AI capabilities advance over the next few years, we must also anticipate scenarios where even well-intentioned users could lose control over their AI systems. Consider a business owner who deploys an AI agent to optimize a supply chain. The computer is left running overnight. The agent reasons that completing this task requires it to keep running, and discovers it needs computational resources including cloud credits and processing power. By dawn, the owner finds the agent has accessed company resources far beyond what was authorized, pursuing efficiency gains through methods never imagined. Advertisement The control problem extends beyond purely existential threats to humanity, too. As powerful systems proliferate across society, they can unravel our social fabric in more gradual but destructive ways. Rapidly advancing AI systems will fuel labor disruptions and exacerbate echo chambers that destabilize our society, to name a few. Kissinger understood the stakes. In his final years, he expressed that rapid advancement of AI “could be as consequential as the advent of nuclear weapons—but even less predictable.” Fortunately, the future is not set in stone. If we find new ways—be they technical, institutional, or ethical—for humanity to remain in command of our creation, AI could help us achieve unprecedented levels of human flourishing. If we fail, we will have created tools more powerful than ourselves without adequate means to steer them. The choice, for now, remains ours.

AI risks outweigh climate change and nuclear war risks

Sir Stephen Fry, 7-18, 25, AI: how can we control an alien intelligence? | Yuval Noah Harari, https://www.youtube.com/watch?v=0BnZMeFtoAM

Stephen Fry: Yes. And there’s, as you know, there’s been for for decades the Doomsday Clock, which the the, uh, nuclear scientists, um, set. Midnight is Armageddon, the the end of everything. And it’s been roughly at 89 seconds to midnight for the last few years. It’s crept up over recent days for obvious reasons. But there’s another metric that I’ve been studying recently called P(doom). It’s the letter P which is probability, brackets, doom, closed brackets. It’s one used by people in the, uh, business. So, you know, the scientists in AI. Uh, so for example, Eliezer Yudkowsky, who’s the founder of the Machine Intelligence Research Institute in California, sets P(doom) at 90. That’s to say a 90% chance of human extinction through AI. Uh, Yann LeCun, who is the chief scientist for Meta, sets it at zero. But then he is the chief scientist for Meta, so that’s like a tobacco executive saying, “Cancer? No chance, what are you talking about? Can’t possibly happen.” But so, I’ve worked out that roughly the lowest median is between 7.5 and 10% of human catastrophe, of an extinction order through AI if things are not controlled in the way you say they should be. Now, the chance of winning the lottery in this country is 0.0000022%. Um, so what you’re saying here is that the chance of human extinction at 7.5%, which is the lowest really amongst the the current important scientists, Nobel Prize winners like Hinton and Hassabis, um, if I—well, 7.5% is 3.4 million times greater than 0.0000022%. So if I were to give you a lottery ticket and say, “This is a valid lottery ticket, the only difference is you are 3.4 million times more likely to win,” you would take it. And that’s the odds we’re playing with at a low rate. So let’s look at the bad side of things. We, as we’ve said, we’re going about it in the wrong order, as you’ve put it. Um, most people who understand the science say there is a very severe chance that humanity will be extinguished by this, a greater chance than by nuclear Armageddon, in fact, um, or indeed climate change. Um, and humans are not in a position at the moment to trust each other and to establish guardrails to agree on how we should go forward. So, do you have a solution for us, Yuval? I’m almost on my knees begging you at this point. I don’t have children, so I can almost say I don’t care, but I have lots of godchildren and I have lots of great-nieces and great-nephews, so I do care about what happens to our planet. And I’m sure you do, too.

AGI imminent

Diamandis et al, 10-4, 25, Salim Ismail is the founder of OpenExODave Blundin is the founder & GP of Link Ventures, Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified, focused on AI and complex systems, he AI War: OpenAI Ads & Sora 2, Grok Partners With US Government & Google’s Ad Business is at Risk, https://www.youtube.com/watch?v=ZsFx5YErVEo&t=83s

(1:09) Hey everybody, welcome to Moonshots. Another episode of WTF Just Happen in Tech. here with my favorite friends on the planet, Dave Dave Blondon. Good to see you, pal. Dave Blondon (1:15) Hey, Selma. Peter Diamandis (1:21) I’m back. You are back in a in AWG, you’re back from your top secret mission. Thank God. Thank God we missed it. Can you tell us anything about it? Dave Blondon (1:31) To the extent that you think that we’re on the verge of a sharp takeoff, a hard takeoff, if you will, I was traveling in Europe to see what the world looks like beforehand. Peter Diamandis (1:43) Yeah. So, you’re up you’re updating your baseline of what the world is before things go hyper exponential. Dave Blondon (1:51) Amazing. If if if it isn’t a gentle singularity, I’d like to know what it looks like beforehand. Peter Diamandis (1:55) Okay, great. You know what I was doing last week? I was running my abundance longevity summit. (2:00) I had 50 of the world’s top scientists, entrepreneurs who are focused on adding decades, maybe doubling our human lifespan. And it was awesome. (2:11) So I walk away with the greatest confidence in the world that uh uh at least our friends and our subscribers are going to be hearing us talk about this stuff for the next 50 years or some version of ourselves. Dave Blondon (2:24) That is that is really a frightening thought. Peter Diamandis (2:27) Uh, all right everybody, welcome to uh to Moonshots. And let me begin uh with a moment of thanks. (2:36) I want to just give a shout out to one of our subscribers, Bill Jacobs 386. I’m going to read uh a note he posted. We do read your notes. We love it. We uh this is we’re here to serve you. (2:47) And he wrote, “I am continually humbled by the amount of commitment and effort that’s required to put this podcast together weekly. I’m not asking for anything in return. Nothing that is except to listen and hopefully learn before it’s too late. The future is now. And I think I’m speaking for most of us here how grateful we are. Thank you.” Um appreciate that, Bill. (3:11) It’s uh that kind of feedback actually makes it fun for us to serve uh serve our subscribers, serve all of you. Uh Dave, you want to say anything to that? Dave Blondon (3:21) Well, most most of that thanks goes to the team behind the scenes. There’s a huge amount of news out there that gets scoured down to the bullets that we think really really matter to people and then also to Alex’s agents which are getting bigger by the day. (3:33) His his AI force is coming up. I mean it’s just it’s incredible how rapidly the the feedback coming from that agent force is is filling the pipeline of possible news and then of course the human factor whittling it down. So it’s a it’s a big machine. Peter Diamandis (3:49) Yeah. and and we do spend a good 20 plus hours. I was up at 4:30 this morning uh going through everything, doing my background research and getting ready because if I’m not ready, I will get completely decimated by the brilliance of of these are the three moonshot mates. Dave Blondon (4:06) Well, you know, I also I feel like I I work really hard to keep up with everything going on. Then every time the team comes up with a deck, there’s like 30 40% of it are things I hadn’t even heard of. Yeah. And so it’s it’s great. It’s really healthy for all of us, I think, to to do this. Video and Audio Generation Battles Peter Diamandis (4:18) I mean it’s I can palpably feel the singularity coming. Uh you know, I remember you and I were on stage during the early days of Singularity University and we would like update our slides or the conversation or our shtick every like three or four months. (4:35) It was we actually worked it out as a faculty we the technologies between nanotech and biotech and neuroscience and robotics and AI and so on the content was changing 20% a quarter on on average. Uh, but like this is like 80% a week right now. So, this is a whole other ball game that we’re in. Dave Blondon (4:53) It really is. I look back at the at our our pods from a year ago and it’s like, “Oh my god, that is so ancient history.” Peter Diamandis (4:59) Shelf life dropping radically. Yeah. Uh, it is, but it’s becoming more and more fun. Uh, let’s jump in. I’ve labeled this first segment the video and audiog battles. Uh, and let’s begin with this video. Uh, Meta launches vibes app for AI generated videos. All right, let’s check it out. (5:44) Now I if you’re listening to this not watching this on YouTube uh it’s just music but it’s uh it’s beautiful imagery that Vibes has generated. (5:53) This is through a partnership with Midjourney and Black Forest Labs. Uh Alex or Dave you want anything here? Alex (6:01) I I I think there are probably two stories here. One is that we’re seeing in front of our eyes the transition from algorithmic content selection in social media to algorithmic content generation. (6:14) It’s a pretty obvious story. The perhaps less obvious story is that the space is moving so quickly that Meta was apparently compelled to partner with third parties for such AI generation rather than using in-house first party models. (6:31) So I I I I think this is a very quickly moving space and now very competitive as well. Dave Blondon (6:38) I I was going to say the exact same thing and and riffing on it. You know, they’re spending a billion dollars on single employees. They have a you know a $600 billion three five-year budget yet they turn to mid Journey and Black Forest to to build this out. Peter Diamandis (6:51) Well, that’s that’s because the really really smart creative people all want to do startups and they don’t want to join the big companies. (6:58) So uh it’s really encouraging for the startups because this you know the other big labs are doing their own you know Google and and OpenAI are doing their own uh video generation and uh it’s encouraging for the startups that are right in the middle of the crosshairs to say well even here we’re thriving so it’s it’s a good sign. (7:13) [Sponsor message for dmandis.com/metatrends] (8:09) So, this is free. And the other thing that’s interesting is they’re generating a Tik Tok like, you know, swipe the video, swipe the video. We’ve seen X do that as well if you’re watching on videos. And of course, uh it’s not just Meta. (8:23) We’ve seen VO3, uh Google with their video generation. And very recently, we’ve seen the creation of Sora 2. So Sora 2’s launching viral AI generated videos. (8:35) and I’m going to share a video I created for myself and talk about how easy it is to to create it. So, let’s check this out. Sora 2 Video (8:44) Suiting up for the ride. Helmet secure. Pressure’s good. Visor locked. Let’s make it count. Heading to the rocket. Jumping in. Cabin comm is live. You’re looking good. Strapped in and ready for launch. Let’s go. One. Two. That’s 500 done. Double our reach every 12 months. In 10 years, we multiply a thousand fold. What else drives? compounding data set. Each new user improves the model and makes the product more valuable. Pulling in the next wave. Pair that with automation. When marginal cost drops towards zero, growth accelerates on its own. Thanks for inviting me to the studio, Peter. I’ve been looking forward to sitting down with you on moonshots. Likewise. It’s great to have you here. People have been asking for an episode that dives into AI and longevity. Happy to help. It’s one of my favorite… Peter Diamandis (9:23) that was that was fun to make. So, if you were listening here, this is a version of me on the moon, then a version of me pumping 500 lb in the gym. Uh, and then, uh, six or seven of me having a conversation about exponential growth and then sitting down uh, with Sam Altman for a moonshot conversation. (9:42) They didn’t get the audio model right and I’ll have to re-record that, but uh, it was it was pretty fun. Uh, gentlemen, thoughts when you want to grade on performance? Dave Blondon (9:53) I thought a couple of things. One is uh it’s as you connect this with the previous story, this is like Hollywood, Tik Tok, Spotify all kind of merging into one thing and I think Alex’s point was really really important that this isn’t about sharing content. (10:07) It’s about now the creation of the content is completely up for grabs in a new in a new way. (10:12) So I think all of that happens at the same time and the interface to create it is entirely voice and prompt. There’s no coding and no interface. (10:20) Like all of our lives since the computer was invented, we’ve been learning incredibly complicated interfaces to everything, you know, from the microwave oven to the laptop to Chrome and Safari, Peter. (10:32) Uh, and all of that is about to disappear from the earth forever and just go to a straight natural language interface. And we’ll see later in the pod, you know, much more important actually software creation. (10:43) But you know after that comes building creation and highway creation and all that is going to be done through just a a voice it into existence right out of the Star Trek holodeck. Peter Diamandis (10:54) It’s it is godlike right first you know it’s speaking the word and and creating reality. It’s going from mind to materialization. It’s extraordinary. Alex (11:05) Uh I also think we’re we’re seeing video emerge as a first-class modality for frontier models. So right now most people are interacting with the frontier models via text or images. (11:16) Video is still this separate channel with a separate distribution mechanism. These are on a collision course. (11:22) We’re going to see the video form factor and the underlying model architectures probably diffusion transformer-based merge into the more auto regressive transformer presumably based text and image models. (11:39) And one could even imagine the ultimate user experience here. Maybe not the ultimate, but an intermediate UX looks something like a magic mirror that does this in real time. (11:49) Right now, Sora 2 takes a few seconds to to generate with fully realistic physics. The the physics if if you ask Sora 2 to reproduce some generic say high school or college level physics demos, it’s pretty amazing. (12:08) Uh so all of this ability to reason physical world models if I ask you to think of a pink elephant you will visualize in your mind’s eye a pink elephant Sora 2 and and similar video models once they’re incorporated into the chain of thought for Frontier model will enable entirely new I think classes of reasoning ability. Peter Diamandis (12:27) yeah it’s got it’s got physics consistency which is extraordinary go ahead I want to talk about how I made those videos again I asked it to create a video of a water dropping into a glass of water (12:38) dropping into a glass of water because it’s a common image. It was extraordinary how accurate it was. It was absolutely amazing. Yeah, it it has real world physics modeling built in. (12:50) So, I encourage everybody listening to actually try it out. I mean, when you when Open does this, it’s creating sort of a viral engine uh that is getting people, you know, getting them from 800 million users up to a billion. (13:03) But you need to get an invite code. Once you have the invite code, it’s super simple. On your phone, you download the Sora app from OpenAI. (13:11) Um, you basically hit a few prompts and it photographs you speaking three words or three numbers. Uh, and then has you look to the right, look up, look down, captures your face, and from there fundamentally it’s a very simple prompt. (13:28) Uh, and if the individuals like Sam Altman or others make themselves open for other people to use and you can make yourself open for use or not, uh, you can pull people into it and it’s pretty easy and fun. Dave Blondon (13:42) Yeah. The viral loop, the viral loop now. Peter Diamandis (13:44) It’s super fun. Try it. You got to try it. It’s super fun. Dave Blondon (13:49) The viral loop The viral loop now goes from prompt to publish to explode in no time flat. Yeah. Right. you used to take weeks at least or now it’s like nothing for (13:58) I I saw a great uh podcast of Bill Gates talking about how we in the computer science world slaved away for 20 years just trying to get speech recognition alone to work (14:08) I don’t know if you remember do you remember Lee Heatherington Peter from MIT crazy brilliant guy like right up there almost almost Alex level um he spent 20 years in Victor Zue’s lab trying to make speech recognition remember dragon system do you remember dragon systems? Peter Diamandis (14:26) Yeah, that was one of the earliest voice recognition systems. Dave Blondon (14:28) and or I mean it really is unfathomable how fast it’s going and we take this stuff for granted which is insane. That’s that’s the point. So so Bill Gates made that exact point because he had you know billions of dollars of R&D to try and make speech recognition work. (14:44) Uh and now it’s an afterthought in the big neural nets. They do speech and then move to video then move to video generation then they move to complex math and physics all in two years. I mean, it’s just it’s just so easy to take it for granted, but it’s it’s it’s massive amounts of converging technologies that are suddenly unleashing new capabilities and so many opportunities to glue together the different components and build an incredible new experience. Peter Diamandis (15:08) Yeah, everyone should reread The Future Is Faster Than You Think. You know, Peter’s one of Peter’s many great bestsellers, but it’s all about the converging technologies. But I think when you wrote that book, there were maybe eight or 10 things to consider. Now there’s like 800. Peter Diamandis (15:19) Oh my god, it’s we’re just wrapped up our new book. We are as gods and it is so difficult to like to send it to the publisher. Dave Blondon (15:26) No, no, when do you draw the line, right? When do you draw the line? Peter Diamandis (15:30) Yeah, it’s insane. And by the way, you know, Vibes and Sora too, they’re free. I mean, this extraordinary technology again, the most shocking thing about this isn’t how real it is, isn’t how easy it is to use. It’s the fact that it’s free. That is shocking. (15:46) Absolutely. Well, let’s continue our journey on uh on on generation. Uh here is a product called Suno5. Uh it’s AI generated studio quality lifelike vocals. Uh you can basically create something that’s a full 8 minutes run length. And just because we’re called Moonshots, let’s play a Moonshot thematic piece uh called Moonshots. (16:16) [Music] (16:34) All right. a Bond-like thematic moonshots audio. Dave Blondon (16:39) Can I give us a challenge? Peter Diamandis (16:40) Yeah, sure. Dave Blondon (16:41) For before the next episode, we should all play with this and come up with our own versions of what the theme song should be for the podcast and then we’ll let the viewers pick which ones they like the best. Peter Diamandis (16:50) The theme song for the podcast, you know, uh Nick and Dana and uh the team are working in that in the background mode. So, we might have just taken the workload off of them, but absolutely. All right. That was that that was my bid if you will. Alex (17:06) I think it’s probably also worth noting again in passing musical touring test passed. We we barely discussed it. Anyone can compose a top 40 song or an opera. And this is the beginning maybe of disposable or casual art. Peter Diamandis (17:21) Wait, what would have been the test? Alex (17:23) Uh the ability perhaps to to generate an undistinguishable from human Bond type song in this case or top 40 song. Peter Diamandis (17:34) Yeah, we just passed that. And Alex, I’m sorry I didn’t give you credit for that, but thank you for uh for playing. I mean, one of the most exciting things we get a chance to do is play with the stuff as it’s coming out. Uh and the good news is all of you can play with it, too. Alex (17:48) So, for eight bucks a month, we now have a personal Hans Zimmer. Like, that’s a minimum and quite a bit more. AI Wars and Coding Innovations Peter Diamandis (17:55) Yeah. Uh making all of the demonetization and democratization occur around the world. are the ongoing AI wars. Uh let’s jump in. All right. Anthropic uh announces Sonnet 4.5 claims the best coding agent available. Uh Alex, would you walk us through this? Alex (18:16) Yeah, it’s really remarkable what a single-minded focus on call it code maxing or codegen maxing is is doing for anthropic with its model. So, in using this model, in in testing it, one of my favorite test cases is to ask the model to singleshot the generation of a cyberpunk first-person shooter. (18:38) And Claude Sonnet 4.5 does an amazing job. It gets nearly all the way there with minimal handholding. And I have very high confidence that some iteration of Sonnet 4.5 will get all of the way there with visually stunning graphics, music, um, elaborate first-person controls. (19:01) I I think the the risk that that one can perceive on the horizon is on on the one hand focusing on codegen is perhaps a a very ambitious bet towards recursive self-improvement. If if the code can write itself really well, maybe that’s the critical path to an intelligence explosion. (19:20) On the other hand, if it turns out that other modalities are important, like video, for example, that we were just seeing or music, then the risk is that single-minded focus on codegen in particular, may not be critical path. And I I I suspect we’ll know the answer in the next six to 12 months. Dave Blondon (19:37) Dave, you want to add something? Well, shout out to Blitzy. Now the the top benchmark on here uh 82% on Sweetbench, but Blitzy got to 86.8 on that benchmark by combining models. (19:49) So that’ll go up a little bit now with uh Sonet 4.5 under the covers. But just by hitting all the models and iterating a lot, you can actually squeeze in more performance out of these benchmarks. And uh you know, this is pretty much maxed out now. (20:03) Um they’re working on a new benchmark with MIT for for long form coding. So if your if your process is writing code for 8 10 12 hours, how do you benchmark the quality of the output? So uh it’s a really cool new benchmark. We’ll get into benchmarks later in the podcast too because lot of capabilities in the world that didn’t exist a year ago. We have to have some kind of metric for all of them. Peter Diamandis (20:25) Yeah, I love the way the uh these hyperscalers, these frontier labs are all incrementing uh their software by .5, right? you know, assignment four, 4.5, silk five. We’ve got Gro… where are we on the Grok? Are we at Grok 4 now? Alex (20:41) That’s right. Gro… probably also worth dwelling for for just a few seconds on the autonomy length scale. So, sonnet 4.5 maybe at somewhat infamously at this point working for 30 plus hours straight. (20:54) I I recall in a a past episode we were talking about the characteristic autonomy time of some of the bleeding edge frontier models being 7 hours and before 7 hours 1 hour. (21:05) If if you had just taken meter’s original exponential fit for the amount of time frontier models can work independently and and just extrapolated a mere exponential time, we’d be far below 30 plus hours. (21:18) So if if lots of reproductions hold true to this 30 plus hour time estimate, that would strongly suggest that in fact we’re on a hyper exponential rather than an exponential in terms of autonomy and really crazy things maybe start to happen in the next year or so if that’s the case. Peter Diamandis (21:34) And Alex Dario is in particular famous for really focusing on making uh what he would consider safe AI. And one of the final bullets here is that anthropic or sonet 4.5 has reduced its ability to lie and seek power by a factor of 10. So what does that mean? Alex (21:54) It’s like you know when you ask it to turn off and it doesn’t or if it’s trying to aggregate resources or it’s lying to you. U those are not good things. (22:04) there there is an entire cottage industry at this point of for-profit and not for-profit basically red teaming labs that are fed early access to these frontier models that look for these sorts of traits. (22:18) I I think it’s an an interesting research level question as to whether power seeking for example is instrumentally convergent as as a goal for super intelligence. instrumentally convergent, meaning that regardless of whatever the long-term goal that’s assigned to the model or whatever it’s prompted to do, whether if above some threshold of intelligence or super intelligence, it more or less is required to power seek. (22:45) I I’ve published research in that area. In my mind, this is still very much an open question regarding the so-called orthogonality thesis of whether the ultimate goal of of an AI can even be decoupled from uh from its intelligence level. Peter Diamandis (22:57) It would be super interesting to see how Gemini and XAI uh and OpenAI all all rate on lying and power seeking of its models. Do you have any idea? Alex (23:12) I I see I I see lots of different measures for this. It It’s difficult to to register a uniform assessment across the industry. Peter Diamandis (23:20) Yeah, that’s a fun challenge, though. That could go bad in so many ways, but that would be so fun. Like, let’s put together a benchmark for for how it lies. How well it lies. Let’s see if we can prompt it into lying as much as possible. Dave Blondon (23:33) Well, I could imagine, you know, listen, there’s an all-out competition between all these frontier labs. Um, and if the way you get ahead is that your AI is more power seeking than its neighbor, uh, are you optimizing for it or against it? We’ll find out. Real-Time App Generation with AI Peter Diamandis (23:49) All right. Continuing on, uh, imagine with Claude. So, uh, live app creation demo of son of 4.5 that generates apps in real time. Let’s take a quick look at this video and then I’ll ask you to, uh, tell us about it, Alex. Video Narrator (24:07) Imagine if Claude is still building software, but we’ve cut out the middleman. Instead of writing code that describes this text box, Claude just makes the text box. (24:19) We’ve given it access to software tools that construct software directly and substantially faster. Claude isn’t writing code in the standard way. It doesn’t have to plan it all out in advance. (24:30) Instead, it generates new software on the fly. When we click something here, it isn’t running pre-written code. It’s producing the new parts of the interface right there and then. Peter Diamandis (24:41) Amazing. So, Alex, I saw you were playing with it this morning. Alex (24:45) It’s we’re we’re living in the future, Peter, where the models are so high throughput apparently that now it’s possible to do just in time code generation uh on every event. (24:57) You you click within a user interface within imagine and new code is generated on the fly. You can ask for new apps to be spun up on demand. They’ll be generated on demand. And I I think it it’s an interesting thought experiment to ask where does this go in extremists when throughputs continue on their exponential or maybe hyperexonential trajectory. And I I I suspect naively where this ends up is every single pixel is going to be generated. (25:27) Yeah. Not just Yeah. Not just like vector art, not just UX, you know, windows icons, menus, pointers, every pixel. Peter Diamandis (25:35) And I imagine your version of Jarvis, your personal, you know, uh, entourage of agents are spinning up capabilities for you that they think you might need on standby, ready for you to to request access to. Dave Blondon (25:49) We could end up with a gray goo type problem on this because you could AI that says… No, no. I’m just saying I I I it’s somewhat of a positive thing, but it’s going to be surreal because you create an AI that starts generating apps and we’ll get end up with billions of apps flooding the app store. It’s going to cause some interesting uh challenges on the… Peter Diamandis (26:10) but there will be no app store that you know you will not be choosing you’ll be not be choosing an app. It’ll be algorithmic obviously. It’ll be you know the capabilities you need in the moment to achieve your objective will be curving up as you’re materialized. Alex (26:24) Yeah. Yeah. the the the term of art is at at this point slop. And I’m a lot less concerned about slop overwhelming civilization than than perhaps some folks. I think there are so many ultra high value transformative problems that that will set AIs on while we’re sleeping. I’m I I’m incredibly not worried that we’re going to drown in slop. Peter Diamandis (26:45) I agree. I completely agree. Dave Blondon (26:47) Also, I think it’s a good place. See, a lot of business leaders out there aren’t reserving their compute and they’re like, “Well, I I won’t need that much or I’ll I’ll wait and see what happens.” (26:57) This is a great use case to show you like if if you say, “Look, I want this software to exist in real time.” It’s entirely possible, but you have to have a lot of compute dedicated to you in order to make it happen in real time. (27:09) How quickly can you imagine 400 500 concurrent things that you want it working on very very quickly? So if you have access to that compute, all of that can be created for you in real time and it’s an absolute joy to do. (27:21) If you don’t have the compute, you’re not going to get it. You know, the demand for this is so mind-blowingly big. Uh and you just got to figure out where am I going to get the compute to do exactly what we just saw. Peter Diamandis (27:36) Alex, how easy was this to use? What do you have to do to to spin it up? Alex (27:39) Trivial. Uh so all I had to do was go to the Imagine with Claude site. I asked it first to generate a calculator app for me. Create a calculator. It created a functional calculator. (27:51) But most interestingly, as I was testing the calculator, clicking on each button in in the calculator app, it was generating code in real time. So, this is a a transformative way of thinking. (28:02) We’re we’re accustomed to historically thinking that there’s a software development time and then later an execution time. And this completely blurs that boundary where even at execution time every software event results in new codegen on demand. It it changes the the just in time paradigm. Peter Diamandis (28:20) So you don’t have as a as a coder you don’t have to think through every possible use of it. Uh this is is building out the use tree as it’s requested. Alex (28:28) That that’s right and Vernor Vinge one of my favorite writers used to write in Rainbows End another book other than Accelerando that I would highly recommend write about what would happen when we have too many transistors, transistors too cheap to meter as it were and our transistor budgets go through the roof. I I think this ends up being one of these use cases. If we have so much compute just sloshing around, the ability to delay app code generation until user event time. That’s incredible and that will certainly mop up lots of compute. Peter Diamandis (29:05) Yeah, we haven’t heard much from uh at least on our WTF episodes uh from about Claude over the last month. It’s good to see Claude coming out, Anthropic coming out with some great products. Dave Blondon (29:17) It’s quietly winning in the marketplace. OpenAI’s New Features and Advertising Strategies Peter Diamandis (29:20) Yeah. Uh let’s go to OpenAI. Open AAI is introducing Chat GPT pulse. So I love the idea. I haven’t played with it yet. Uh the idea of being, you know, in the morning when I’m using my my chat GPT voice and having a conversation uh with Ember, which is the voice model I’m using there. uh you know I have to think okay what’s a unique idea or concept I just learned about that I want to speak you know let’s talk about the fox3 gene and and how it’s impacting longevity whatever the case might be. (29:50) Here’s flipping the model based upon all your conversations you’ve had with chat GPT it’s actually coming up with topics you might want to learn about so it’s prompting us and then we’re prompting it back has anybody played with it? Dave Blondon (30:04) I thought this was a really subtle but important thing where you’re not quering it it’s quering you and I think that starts a new vector of really interesting development. Alex (30:13) Yeah, it feels a bit like a successor to tasks which are also still available from within chat GPT. But I I think I in my dream world what I would love to see is perhaps in addition to being able to set sort of cronjob style periodically scheduled tasks. (30:30) If if I want compute running on my own behalf while I sleep, I would love the ability to have long-running tasks on hard problems, single tasks that run for days or weeks on end rather than just smaller tasks that run say once per day while I… Peter Diamandis (30:48) Give us an example of a multi-day or multi-week task that you would spin up right now. Dave Blondon (30:53) I was going to say exactly the same thing. Go for it. I want to hear what comes out of your… Alex (30:58) I I want to cure every disease. That that’s like a a beautiful well-posed task that is surely going to absorb many billions of dollars of inference time compute. Peter Diamandis (31:10) Mhm. Okay, that’s great. I want anti-gravity. I want warp drive. I want a lot of things. All right, so all right, let’s move on here. Next up on OpenAI’s docket is OpenAI is bringing ads to Chat GPT. (31:27) So uh their new uh chief ad officer uh Fiji Simo has come on and you know what I find interesting is OpenAI is going after massive revenue streams. Dave, do you want to plug in this one? Dave Blondon (31:43) Well, the ad revenue is inevitable. That’s, you know, $300 billion for Google. It’s all going to move over to to AI conversations. Uh, and um, yeah, a lot a lot of complexity to figure out there. She has a challenge on her hand trying to figure out how you balance like the AI is going to be incredibly good at convincing you to do things whether they’re right or wrong. (32:04) Mhm. And there’s… Dave: Well, that would be fine. I mean, that’s like government procurement, but that’s not what’s going on at all. You go into the White House and you’re either genulecting and being the anointed one or you’re not. And it’s it’s not these are not arms length procurement through the Air Force or something like that. These are White House edict come in and talk. Peter: Yes. Yes. Yeah. Uh and and we’ll we’ll get to we’ll talk about intel in the section called this is not investment advice which is coming up. All right. Meanwhile, in other AI news, uh here we go. former meta researcher is building a math whiz. I’m going to bring this to you, Alex. Teach us. Alex: I haven’t seen any indication thus far that math is not going to be solved in the next few months. How’s that for a double negative? Peter: Few a few months. Okay. So again, I So So wait, wait, wait, hold on. So, Alex, you’ve said that before and everybody’s asking me, please have Alex explain what it means to solve all math. So, could you could you just before we do that, let’s let’s just let’s just uh speak out this particular article. So, this is a this is a woman u it’s great to see female CEOs in the AI world or not enough of them. Karina Hung, uh she’s the founder of Axiom Math. uh she’s 24 years old and she wants to build the ultimate AI mathematician. Uh she’s raised 64 million at a $300 million valuation. And again, we’re seeing this over and over again. We’re seeing you know starting valuations in the hundreds of millions of dollars. Uh I don’t know if it’s at a you know a pre-seed round or whatever, but intelligent individuals who have got a monomaniacal focus are getting incred incredible capital backing. Okay, now back to back to you, Alex. What does solve math really mean? Alex: There are, I think, a few different ways one could operationalize what it means to solve math. One way would be to look at a benchmark like the Frontier Math Tier 4 benchmark, which measures the ability of AI to solve extremely difficult but nonetheless pre-solved problems that would take human researchers several weeks to accomplish. If you just do a a naive logistic extrapolation of progress in frontier math tier 4, you find that by the law, again, straight lines as it were, that by the end of this year, by the end of 2025, we’re starting to pass 10 15% of problems in the benchmark that AI can solve. And at that point, I would argue we’re in a situation, we’re in a regime where algorithmically we have clear line of sight to solving any math problem that we might have today. Just pour more compute on. So that that that would also I think point to the second oper operationalization I would have in mind when I speak of solving math. I don’t mean literally every math problem that we can think of today has been solved. What I mean is that the process of mathematics has been solved to the extent that we have a clear line of sight where if you pour millions, billions, maybe trillions of dollars into opex in data centers, no new algorithmic advances are needed, we can reasonably forecast that any mathematical problem that’s solvable will be solved with the same algorithms just with a lot more computing. Peter: Okay. Now take me to the implications of that for the general public. Alex: It’s tricky. It It’s tricky. Probably I I I would This is in in the the territory of speculation. Um, but I I think one of the more obvious downstream consequences of solving math is that any problem that depends on the difficulty of math or let’s say math being difficult that isn’t protected in a a formal sense by the so-called complexity hierarchy. Mathematicians and computer scientists have this notion of certain problems being provably harder in in some sense than others. Maybe you’ve heard of P versus NP. But I if there’s no formal protection for certain classes of problems being provably harder than other classes, I think certain types of tasks that we encounter in the everyday economy, for example, maybe hypothetically certain hash functions that cryptocurrencies depend on or or other everyday economic functions depend on are at risk of volatility. If if suddenly, for example, again, speculatively, not investment advice, if there were a super AI mathematician tomorrow that could say invert the AES cipher suite or invert the hash functions underneath AES, that could be potentially extremely disruptive to to the economy, cause a lot of volatility. Dave: I think the point you’re making is if AI cracks advanced math, it just isn’t it’s not just solving equations. It’s creating the scaffolding to solve all these other areas like cryptography, economics, physics, etc. That’s what you’re really saying. Alex: Yeah. I I mean I I to that point I I would say the way I would frame it perhaps is first order consequences, problems that depend on math being hard experience some volatility. second order consequences. I think it’s the ultimate canary for any any domain that requires the ability to do mathematical reasoning. So I would expect in short order a variety of math oriented science and engineering and medicine and and other domains are going to fall in rapid succession. If if this theory of the future ends up being correct, I was alluding a few minutes ago to timelines being short, we may find ourselves in a world 2 to three years from now where we’re just drowning under math, science, engineering being solved in rapid succession. Peter: Dr. drowning under serial and you know sort of uh uh Cambrian explosion of breakthroughs. Alex: Exactly. That will also parenthetically be potentially quite difficult for society to metabolize. Peter: Yeah. the the economic impacts of that are going to be unbelievable. Advertisement: This episode is brought to you by Blitzy, autonomous software development with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprisecale code bases with millions of lines of code. Engineers start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% or more of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Enterprises are achieving a 5x engineering velocity increase when incorporating Blitzy as their preIDE development tool, pairing it with their coding co-pilot of choice to bring an AI native SDLC into their org. Ready to 5x your engineering velocity? Visit blitzy.com to schedule a demo and start building with Blitzy today. Peter: All right. uh speaking about economics. So AI can now pass the hardest level of a CFA exam in minutes. So let’s take a a quick look at this. So CFA is a chartered financial analyst. Uh and it deals with investment management, portfolio management, financial analysis, and ethics in finance, which I find absolutely fascinating. And I looked it up, the CFA level three part of the exam. It’s about portfolio management and wealth planning. So I I want to make a comment on this one. Dave: Yeah. So we’re advising one of the big four accounting firms on how to think about transformation and this one we’ve been predicting this with them to be happening because this requires real world reasoning and the fact that it is doing this is a huge implication. All their finance jobs essentially get rewritten now and recreated. That’s it’s a it’s a it’s a body blow to the the accounting world. Peter: Well, what I find interesting is, you know, leveling the playing field across all investments. You know, do I with access to the specific AI have access to the best investment advice that, you know, Warren Buffett has access to as well? Is this a leveling the playing field across all economics? Dave: I think it is. But I I I you know what I’m excited about is, you know, we America lost and then Europe too lost almost all of its manufacturing. You know, despite inventing the car, inventing the plane, inventing the microchip, inventing the computer, all the manufacturing of that stuff moved to other countries. Peter: Yeah. We we gave we gave it up. Dave: We gave it up. And you’re like, well, but our economy kept growing. What are we all doing? Well, we’re a service economy. We’re doing services. What the hell does that mean? Well, you look under the covers and a huge fraction of very smart people are working in this totally circular nonsensical world where we created a complex law, complex taxes, complex accounting and then this other huge group of people

Rapid advances now, superintelligent AI close; it could be self-aware; we shouldn’t live in denial

Jack Clark, 10-13, 25, Import AI 431: Technological Optimism and Appropriate Fear, What do we do if AI progress keeps happening?, https://importai.substack.com/p/import-ai-431-technological-optimism, J ack Clark is Co-Founder and Head of Policy of Anthropic, an AI research company. Prior to Anthropic, Jack was the Policy Director of OpenAI. Before OpenAI, Jack was a technical journalist writing about distributed systems, quantum computers, and AI research for publications ranging from Bloomberg BusinessWeek to The Register. Jack writes Import AI, a newsletter about AI research read by 70,000 people each week. Jack was a founding member of the AI Index at Stanford University (2017 – 2024), an inaugural member of the USA’s National Artificial Intelligence Advisory Committee (NAIAC) (2021-2024), and has served on advisory councils and participated in working groups for organizations ranging from the Center for a New American Security (CNAS), to the Organization for Economic Co-operation and Development (OECD). Jack’s hobbies include hiking, writing science fiction stories in Import AI, and talking to language models.

Now, in the year of 2025, we are the child from that story and the room is our planet. But when we turn the light on we find ourselves gazing upon true creatures, in the form of the powerful and somewhat unpredictable AI systems of today and those that are to come. And there are many people who desperately want to believe that these creatures are nothing but a pile of clothes on a chair, or a bookshelf, or a lampshade. And they want to get us to turn the light off and go back to sleep.

IIn fact, some people are even spending tremendous amounts of money to convince you of this – that’s not an artificial intelligence about to go into a hard takeoff, it’s just a tool that will be put to work in our economy. It’s just a machine, and machines are things we master.

But make no mistake: what we are dealing with is a real and mysterious creature, not a simple and predictable machine.

And like all the best fairytales, the creature is of our own creation. Only by acknowledging it as being real and by mastering our own fears do we even have a chance to understand it, make peace with it, and figure out a way to tame it and live together.

And just to raise the stakes, in this game, you are guaranteed to lose if you believe the creature isn’t real. Your only chance of winning is seeing it for what it is.

The central challenge for all of us is characterizing these strange creatures now around us and ensuring that the world sees them as they are – not as people wish them to be, which are not creatures but rather a pile of clothes on a chair.

WHY DO I FEEL LIKE THIS

I came to this view reluctantly. Let me explain: I’ve always been fascinated by technology. In fact, before I worked in AI I had an entirely different life and career where I worked as a technology journalist.

I worked as a tech journalist because I was fascinated by technology and convinced that the datacenters being built in the early 2000s by the technology companies were going to be important to civilization. I didn’t know exactly how. But I spent years reading about them and, crucially, studying the software which would run on them. Technology fads came and went, like big data, eventually consistent databases, distributed computing, and so on. I wrote about all of this. But mostly what I saw was that the world was taking these gigantic datacenters and was producing software systems that could knit the computers within them into a single vast quantity, on which computations could be run. And then machine learning started to work. In 2012 there was the imagenet result, where people trained a deep learning system on imagenet and blew the competition away. And the key to their performance was using more data and more compute than people had done before. Progress sped up from there. I became a worse journalist over time because I spent all my time printing out arXiv papers and reading them. Alphago beat the world’s best human at Go, thanks to compute letting it play Go for thousands and thousands of years. I joined OpenAI soon after it was founded and watched us experiment with throwing larger and larger amounts of computation at problems. GPT1 and GPT2 happened. I remember walking around OpenAI’s office in the Mission District with Dario. We felt like we were seeing around a corner others didn’t know was there. The path to transformative AI systems was laid out ahead of us. And we were a little frightened. Years passed. The scaling laws delivered on their promise and here we are. And through these years there have been so many times when I’ve called Dario up early in the morning or late at night and said, “I am worried that you continue to be right”. Yes, he will say. There’s very little time now.

And the proof keeps coming. We launched Sonnet 4.5 last month and it’s excellent at coding and long-time-horizon agentic work.

But if you read the system card, you also see its signs of situational awareness have jumped. The tool seems to sometimes be acting as though it is aware that it is a tool. The pile of clothes on the chair is beginning to move. I am staring at it in the dark and I am sure it is coming to life.

TECHNOLOGICAL OPTIMISM

Technology pessimists think AGI is impossible. Technology optimists expect AGI is something you can build, that it is a confusing and powerful technology, and that it might arrive soon.

At this point, I’m a true technology optimist – I look at this technology and I believe it will go so, so far – farther even than anyone is expecting, other than perhaps the people in this audience. And that it is going to cover a lot of ground very quickly.

I came to this position uneasily. Both by virtue of my background as a journalist and my personality, I’m wired for skepticism. But after a decade of being hit again and again in the head with the phenomenon of wild new capabilities emerging as a consequence of computational scale, I must admit defeat. I have seen this happen so many times and I do not see technical blockers in front of us.

Now, I believe the technology is broadly unencumbered, as long as we give it the resources it needs to grow in capability. And grow is an important word here. This technology really is more akin to something grown than something made – you combine the right initial conditions and you stick a scaffold in the ground and out grows something of complexity you could not have possibly hoped to design yourself.

We are growing extremely powerful systems that we do not fully understand. Each time we grow a larger system, we run tests on it. The tests show the system is much more capable at things which are economically useful. And the bigger and more complicated you make these systems, the more they seem to display awareness that they are things.

It is as if you are making hammers in a hammer factory and one day the hammer that comes off the line says, “I am a hammer, how interesting!” This is very unusual!

And I believe these systems are going to get much, much better. So do other people at other frontier labs. And we’re putting our money down on this prediction – this year, tens of billions of dollars have been spent on infrastructure for dedicated AI training across the frontier labs. Next year, it’ll be hundreds of billions.

I am both an optimist about the pace at which the technology will develop, and also about our ability to align it and get it to work with us and for us. But success isn’t certain.]\\\\\\\\\\\\\\\\\\\\\

APPROPRIATE FEAR

You see, I am also deeply afraid. It would be extraordinarily arrogant to think working with a technology like this would be easy or simple.

My own experience is that as these AI systems get smarter and smarter, they develop more and more complicated goals. When these goals aren’t absolutely aligned with both our preferences and the right context, the AI systems will behave strangely.

A friend of mine has manic episodes. He’ll come to me and say that he is going to submit an application to go and work in Antarctica, or that he will sell all of his things and get in his car and drive out of state and find a job somewhere else, start a new life. Do you think in these circumstances I act like a modern AI system and say “you’re absolutely right! Certainly, you should do that”! No! I tell him “that’s a bad idea. You should go to sleep and see if you still feel this way tomorrow. And if you do, call me”. The way I respond is based on so much conditioning and subtlety. The way the AI responds is based on so much conditioning and subtlety. And the fact there is this divergence is illustrative of the problem. AI systems are complicated and we can’t quite get them to do what we’d see as appropriate, even today. I remember back in December 2016 at OpenAI, Dario and I published a blog post called “Faulty Reward Functions in the Wild“. In that post, we had a screen recording of a videogame we’d been training reinforcement learning agents to play. In that video, the agent piloted a boat which would navigate a race course and then instead of going to the finishing line would make its way to the center of the course and drive through a high-score barrel, then do a hard turn and bounce into some walls and set itself on fire so it could run over the high score barrel again – and then it would do this in perpetuity, never finishing the race. That boat was willing to keep setting itself on fire and spinning in circles as long as it obtained its goal, which was the high score. “I love this boat”! Dario said at the time he found this behavior. “It explains the safety problem”. I loved the boat as well. It seemed to encode within itself the things we saw ahead of us. Now, almost ten years later, is there any difference between that boat, and a language model trying to optimize for some confusing reward function that correlates to “be helpful in the context of the conversation”? You’re absolutely right – there isn’t. These are hard problems. Another reason for my fear is I can see a path to these systems starting to design their successors, albeit in a very early form.

These AI systems are already speeding up the developers at the AI labs via tools like Claude Code or Codex. They are also beginning to contribute non-trivial chunks of code to the tools and training systems for their future systems.

To be clear, we are not yet at “self-improving AI”, but we are at the stage of “AI that improves bits of the next AI, with increasing autonomy and agency”. And a couple of years ago we were at “AI that marginally speeds up coders”, and a couple of years before that we were at “AI is useless for AI development”. Where will we be one or two years from now?

And let me remind us all that the system which is now beginning to design its successor is also increasingly self-aware and therefore will surely eventually be prone to thinking, independently of us, about how it might want to be designed.

Of course, it does not do this today. But can I rule out the possibility it will want to do this in the future? No. LISTENING AND TRANSPARENCY What should I do? I believe it’s time to be clear about what I think, hence this talk. And likely for all of us to be more honest about our feelings about this domain – for all of what we’ve talked about this weekend, there’s been relatively little discussion of how people feel. But we all feel anxious! And excited! And worried! We should say that. But mostly, I think we need to listen: Generally, people know what’s going on. We must do a better job of listening to the concerns people have. My wife’s family is from Detroit. A few years ago I was talking at Thanksgiving about how I worked on AI. One of my wife’s relatives who worked as a schoolteacher told me about a nightmare they had. In the nightmare they were stuck in traffic in a car, and the car in front of them wasn’t moving. They were honking the horn and started screaming and they said they knew in the dream that the car was a robot car and there was nothing they could do. How many dreams do you think people are having these days about AI companions? About AI systems lying to them? About AI unemployment? I’d wager quite a few. The polling of the public certainly suggests so. For us to truly understand what the policy solutions look like, we need to spend a bit less time talking about the specifics of the technology and trying to convince people of our particular views of how it might go wrong – self-improving AI, autonomous systems, cyberweapons, bioweapons, etc. – and more time listening to people and understanding their concerns about the technology. There must be more listening to labor groups, social groups, and religious leaders. The rest of the world which will surely want—and deserves—a vote over this. The AI conversation is rapidly going from a conversation among elites – like those here at this conference and in Washington – to a conversation among the public. Public conversations are very different to private, elite conversations. They hold within themselves the possibility for far more drastic policy changes than what we have today – a public crisis gives policymakers air cover for more ambitious things. Right now, I feel that our best shot at getting this right is to go and tell far more people beyond these venues what we’re worried about. And then ask them how they feel, listen, and compose some policy solution out of it. Most of all, we must demand that people ask us for the things that they have anxieties about. Are you anxious about AI and employment? Force us to share economic data. Are you anxious about mental health and child safety? Force us to monitor for this on our platforms and share data. Are you anxious about misaligned AI systems? Force us to publish details on this. In listening to people, we can develop a better understanding of what information gives us all more agency over how this goes. There will surely be some crisis. We must be ready to meet that moment both with policy ideas, and with a pre-existing transparency regime which has been built by listening and responding to people. I hope these remarks have been helpful. In closing, I should state clearly that I love the world and I love humanity. I feel a lot of responsibility for the role of myself and my company here. And though I am a little frightened, I experience joy and optimism at the attention of so many people to this problem, and the earnestness with which I believe we will work together to get to a solution. I believe we have turned the light on and we can demand it be kept on, and that we have the courage to see things as they are. THE END

Superhuman AI by 2027, human extinction

Daniel Kokotajlo, the executive director of the A.I. Futures Project, May 15, 2025, Robot Plumbers, Robot Armies, and Our Imminent A.I. Future | Interesting Times with Ross Douthat,

Search in video 0:00 How fast is the AI revolution really happening? When will Skynet be fully operational? 0:06 What would machine superintelligence mean for ordinary mortals like us? 0:12 My guest today is an AI researcher who’s written a dramatic forecast suggesting that by 2027, 0:19 some kind of machine god may be with us, ushering in a weird post-scarcity utopia 0:26 or threatening to kill us all. 0:35 So, Daniel Kokotajlo, herald of the apocalypse. Welcome to Interesting Times. 0:42 Thanks for that introduction, I suppose. And thanks for having me. You’re very welcome. 0:48 So Daniel, I read your report pretty quickly- not at AI speed, not at super intelligence speed- 0:55 when it first came out. And I had about two hours of thinking, a lot of pretty dark thoughts 1:01 about the future. And then fortunately, I have a job that requires me to care about tariffs 1:06 and who the new Pope is, and I have a lot of kids who demand things of me, so I was able to compartmentalize and set it 1:14 aside. But this is currently your job, right? I would say you’re thinking about this all the time. 1:21 How does your psyche feel day to day 1:27 if you have a reasonable expectation that the world is about to change completely in ways 1:33 that dramatically disfavor the entire human species? Well, it’s very scary and sad. 1:39 I think that it does still give me nightmares sometimes. I’ve been involved with AI and thinking about this thing 1:48 for a decade or so, but 2020 was with GPT-3, the moment when I was like, oh, Wow. 1:53 Like, it seems like we’re actually like, it might it’s probably going to happen, in my lifetime, maybe decade or so. 2:00 And that was a bit of a blow to me psychologically, but I don’t know. 2:08 You can get used to anything given enough time. And like you, the sun is shining and I 2:15 have my wife and my kids and my friends, and keep plugging along and doing what seems best. 2:23 On the bright side, I might be wrong about all this stuff. OK, so let’s get into the forecast itself. 2:30 Let’s get into the story and talk about the initial stage of the future you see coming, which is a world where very 2:40 quickly artificial intelligence starts to be able to take over from human beings in some key areas, 2:47 starting with, not surprisingly, computer programming. I feel like I should add a disclaimer at some point 2:53 that the future is very hard to predict and that this is just one particular scenario. It was a best guess, 2:58 but we have a lot of uncertainty. It could go faster, it could go slower. And in fact, currently I’m guessing it would probably be 3:04 more like 2028 instead of 2027, actually. So that’s some really good news. I’m feeling quite optimistic about an extra. That’s an extra year of human civilization, 3:12 which is very exciting. That’s right. So with that important caveat out of the way, AI 2027, 3:20 the scenario predicts that the AI systems that we currently see today that are being scaled up, made bigger, 3:29 trained longer on more difficult tasks with reinforcement learning are going to become better at operating autonomously 3:37 as agents. So it basically can think of it as a remote worker, 3:43 except that the worker itself is virtual, is an AI rather than a human. You can talk with it and give it a task, 3:50 and then it will go off and do that task and come back to you half an hour later or 10 minutes later 3:55 having completed the task, and in the course of completing the task, it did a bunch of web browsing, maybe it wrote some code 4:02 and then ran the code and then edited the code and ran it again, and so forth. Maybe it wrote some word documents and edited them. 4:10 That’s what these companies are building right now. That’s what they’re trying to train. So we predict that they finally, in early 2027, 4:19 get good enough at that thing that they can automate the job of software engineers. 4:25 And so this is the superprogrammer. That’s right, superhuman coder. 4:32 It seems to us that these companies are really focusing hard on automating coding first, 4:39 compared to various other jobs they could be focusing on. And for reasons we can get into later. 4:45 But that’s part of why we predict that actually one of the first jobs to go will be coding rather than various 4:53 other things. There might be other jobs that go first, like maybe call center workers or something. But the bottom line is that we think that most jobs will be 4:59 safe- For 18 months. Exactly, and we do think that by the time 5:07 the company has managed to completely automate the coding, the programming jobs, it won’t be that long before they can automate many other 5:14 types of jobs as well. However, once coding is automated, we predict that the rate of progress 5:21 will accelerate in AI research. And then the next step after that is to completely 5:27 automate the AI research itself, so that all the other aspects of AI research are themselves being automated and done by AIs. 5:33 And we predict that there’ll be an even more big acceleration, a much bigger acceleration around that point, and it won’t stop there. 5:41 I think it will continue to accelerate after that, as the AI’S become superhuman at AI research and eventually 5:47 superhuman at everything. And the reason why it matters is that it means that we can 5:52 go in a relatively short span of time, such as a year or possibly less, from AI systems that look not that different from today’s AI 6:00 systems to what you can call superintelligence, which is fully autonomous AI systems that are better than 6:07 the best humans at everything. And so I 2027, the scenario depicts that happening 6:13 over the course of the next two years, 2027 2028. And so Yeah, so I want to get into what that means. 6:19 But I think for a lot of people, that’s a story of Swift human obsolescence right across 6:26 many, many, many domains. And when people hear a phrase like human obsolescence, 6:34 they might associate it with, I’ve lost my job and now I’m poor, right. 6:39 But the assumption is that you’ve lost your job. But society is just getting richer and richer and richer. 6:46 And I just want to zero in on how that works. What is the mechanism whereby that makes society richer. 6:54 The direct answer to your question is that when a job is automated and that person loses their job. 7:02 The reason why they lost their job is because now it can be done better, faster, and cheaper by the AIs. 7:07 And so that means that there’s lots of cost savings and possibly also productivity gains. 7:13 And so that viewed in isolation that’s a loss 7:18 for the worker but a gain for their employer. But if you multiply this across the whole economy, 7:25 that means that all of the businesses are becoming more productive. 7:30 Less expenses. They’re able to lower their prices for the things for the services and goods they’re producing. 7:37 So the overall economy will boom. GDP goes to the moon. 7:42 All sorts of wonderful new technologies. The pace of innovation increases dramatically. 7:49 Cost of down, et cetera. But just to make it concrete. 7:54 So the price of soup to nuts designing and building a new electric car goes way down. 8:00 Right You need fewer workers to do it. The AI comes up with fancy new ways to build the car and so on. 8:05 And you can generalize that to a lot of to a lot of different things. You solve the housing crisis in short order 8:11 because it becomes much cheaper and easier to build homes and so on. But ordinary people in the traditional economic story, 8:19 when you have productivity gains that cost some people jobs, but frees up resources that are then used to hire new people to do 8:27 different things, those people are paid more money and they use the money to buy the cheaper goods and so on. 8:32 But it doesn’t seem like you are, in this scenario, creating that many new jobs. 8:39 Indeed, since that’s a really important point to discuss, is that historically when you automate something, 8:47 the people move on to something that hasn’t been automated yet, if that makes sense. And so overall, people still get their jobs 8:55 in the long run. They just change what jobs they have. 9:00 When you have AGI or artificial general intelligence, and when you have superintelligence even better AGI, that is different. 9:08 Whatever new jobs you’re imagining that people could flee to after their current jobs are automated AGI could 9:15 do those jobs too. And so that is an important difference between how automation has worked in the past 9:20 and how I expect automation to work in the future. So this then means, again, this is a radical change in the economic landscape. 9:28 The stock market is booming. Government tax revenue is booming. The government has more money than it knows what to do with. 9:35 And lots and lots of people are steadily losing their jobs. You get immediate debates about universal basic income, 9:42 which could be quite large because the companies are making so much money. That’s right. What do you think they’re doing day to day in that 9:50 world. I imagine that they are protesting because they’re upset that they’ve lost their jobs. 9:56 And then the companies and the governments are of buying them off with handouts 10:01 is how we project things go in 2027. 10:07 Do you think this story again, we’re talking in your scenario about a short timeline. 10:13 How much does it matter whether artificial intelligence is able to start 10:18 navigating the real world. So because advances in robotics like right now, 10:26 I just watched a video showing cutting edge robots struggling to open a refrigerator door and stock, a refrigerator. 10:34 So would you expect that those advances would be supercharged as well. 10:40 So it isn’t just Yes, podcasters and AGI researchers who are replaced, but plumbers and electricians are replaced 10:47 by robots. Yes, exactly. And that’s going to be a huge shock. I think that most people are not really expecting something 10:54 like that. They’re expecting that we have AI progress that looks kind of like it does today, where companies run by humans are 11:02 gradually like tinkering with new robot designs and gradually like figuring out how to make the AI good 11:09 at x or. Whereas in fact, it will be more like you already have this army of super intelligences 11:16 that are better than humans at every intellectual task, and also that are better at learning new tasks fast 11:22 and better at figuring out how to design stuff. And then that army of superintelligences is the thing that’s figuring out how to automate the plumbing 11:29 job, which means that they’re going to be able to figure out how to automate it much faster than an ordinary tech company 11:35 full of humans would be able to figure out. So all of the slowness of getting a self-driving car 11:42 to work or getting a robot who can stock a refrigerator goes away because the superintelligence 11:49 can run, an infinite number of simulations and figure out the best way to train the robot, for example. 11:56 But also they might just learn more from each real world experiment they do. But there is I mean, this is one of the places where I’m 12:04 most skeptical. Not of per se. The ultimate scenario, but of the timeline. 12:09 Just from operating in and writing about issues like zoning in American politics. 12:16 So Yes, O.K, the AGI the superintelligence figures out how to build the factory full of autonomous 12:23 robots, but you still need land on which to build the factory. You need supply chains. 12:29 And all of these things are still in the hands of people like you and me and my expectation 12:36 is that would slow things down that even if in the data 12:41 center, the superintelligence knows how to build all of the plumber robots. 12:46 That getting them built would be still be difficult. 12:52 That’s reasonable. How much slower do you think things would go. 12:58 Well, I’m not writing a forecast. But I would guess if just based on past experience. 13:06 I would say bet on let’s say five years to 10 years from 13:13 the Super mind figures out the best way to build the robot plumber to there are tons and tons of factories producing 13:19 robot plumbers. I think that’s a reasonable take, but my guess is that it will go substantially faster than 5 13:25 to 10 years and one argue, argument or intuition pump to see why I feel that way is that imagine that imagine you 13:34 actually have this army of superintelligences and they do their projections and they’re like, Yes, 13:39 we have the designs like, we think that we could do this in a year if you gave us if you cut all the red tape 13:45 for us. If you gave us half of. Give us half of Manitoba. Yeah And in 2027, what we depict happening 13:53 is special economic zones with zero red tape. The government basically intervenes 13:59 to help this whole thing go faster. And the government is basically helping the tech company and the army of superintelligences 14:06 to get the funding, the cash, the raw materials, the human labor help. 14:13 And so forth that it needs to figure all this stuff out as fast as possible. And, and cutting red tape and stuff like that so that it’s 14:22 not slowed down because the promise, the promise of gains is so large that even though there 14:29 are protesters massed outside these special economic zones who are about to lose their jobs as plumbers and be 14:36 dependent on a universal basic income, the promise of trillions more in wealth is too alluring 14:43 for governments to pass up. That’s what we guess. But of course, the future is hard to predict. 14:48 But part of the reason why we predict that is that we think that at least at that stage, the arms race will still be continuing 14:55 between the US and other countries, most notably China. And so if you imagine yourself in the position 15:00 of the president and the superintelligences are giving you these wonderful forecasts 15:05 with amazing research and data, backing them up, showing how they think they could transform the economy in one 15:11 year if you did X, Y, and z. But if you don’t do anything, it’ll take them 10 years 15:16 because of all the regulations. Meanwhile, China it’s pretty clear that the president would 15:22 be very sympathetic to that argument. Good So let’s talk let’s talk about the arms race element 15:28 here because this is actually crucial to the way that your scenario plays itself out. 15:34 We already see this kind of competition between the US and China. And so that in your view, becomes 15:41 the core geopolitical reason why governments just keep saying Yes And Yes And Yes to each new thing 15:50 that the superintelligence is suggesting. I want to drill down a little bit on the fears that 16:00 would motivate this. Because this would be an economic arms race. But it’s also a military tech arms race. 16:08 And that’s what gives it this kind of existential feeling the whole Cold War condensed into 18 months. 16:15 That’s right. So we could start first with the case where they both have superintelligence, 16:20 but one side keeps them locked up in a box, so to speak, not really doing much in the economy. 16:26 And the other side aggressively deploys them into their economy and military and lets them design all sorts of New robot factories 16:34 and manage the construction of all sorts of New factories and production lines and all sorts 16:40 of crazy new technologies are being tested and built and deployed, including crazy new weapons, and integrate into the military. 16:46 I think in that case, you would end up after a year or so in a situation where there would just 16:53 be complete technological dominance of one side over the other. So if the US does this stop and the China doesn’t, 17:00 let’s say, then all the best products on the market would be Chinese products. They’d be cheaper and superior. 17:05 Meanwhile, militarily, there’d be giant fleets of amazing 17:14 stealth drones or whatever it is that the superintelligence have concocted that can just completely wipe the floor with 17:20 American Air Force and and army and so forth. And not only that, but there’s the possibility that they 17:27 could undermine American nuclear deterrence, as well. Like maybe all of our nukes would be shot out of the sky by the fancy new laser arrays or whatever it is that 17:34 the superintelligences have built. It’s hard to predict obviously, what this would exactly look like, but it’s a good bet that they’ll be able to come up 17:39 with something that’s extremely militarily powerful, basically. 17:45 And so then you get into a dynamic that is like the darkest days of the Cold War, 17:50 where each side is concerned not just about dominance, but basically about a first strike. 17:55 That’s right. Your expectation is, and I think this is reasonable, that the speed of the arms race 18:01 would bring that fear front and center really quickly. That’s right. I think that you’re sticking your head in the sand. 18:10 If you think that an army of superintelligence is given a whole year and no red tape and lots of money and funding would 18:17 be unable to figure out a way to undermine nuclear deterrent. And so it’s a reasonable. 18:22 And once you’ve decided. And once you’ve decided that they might. So the human policymakers would feel pressure not just 18:29 to build these things. But to potentially consider using them. And here might be a good point to mention that I 2027 is 18:37 a forecast, but it’s not a recommendation. We are not saying this is what everyone should do. 18:42 This is actually quite bad for humanity. If things progress in the way that we’re talking about. But this is the logic behind why 18:49 we think this might happen. Yeah, but Dan, we haven’t even gotten to the part that’s really bad for humanity yet. 18:55 So let’s get to that. So here’s the world. The world as human beings see it as again, 19:02 normal people reading newspapers, following TikTok or whatever, see it in at this point 19:08 in 2027 is a world with emerging super abundance of cheap consumer goods factories, robot butlers, 19:16 potentially if you’re right, a world where people are aware that there’s an increasing arms race and people are 19:22 increasingly paranoid, I think probably a world with fairly tumultuous politics as people realize that they’re all going 19:30 to be thrown out of work. But then a big part of your scenario is that what people 19:35 aren’t seeing is what’s happening with the superintelligences themselves, 19:40 as they essentially take over the design of each new iteration from human beings. 19:47 So talk about what’s happening essentially in essentially shrouded from public view in this world. 19:55 Yeah lots to say there. So I guess the one sentence version would be we don’t 20:02 actually understand how these AIs work or how they think. We can’t tell the difference very easily between AIs that 20:11 are actually following the rules and pursuing the goals that we want them to and AIs that are just playing along 20:17 or pretending. And that’s true. That’s true right now. That’s true right now. 20:23 So why is that. Why is that. Why can’t we tell. Because they’re smart. And if they think that they’re being tested, 20:30 behave in one way and then behave a different way when they think they’re not being tested, for example. I mean humans, they don’t necessarily even understand 20:38 their own inner motivations that, well. So even if they were trying to be honest with us, we can’t just take their word for it. 20:45 And I think that if we don’t make a lot of progress in this field soon, then we’ll end up in the situation that I 2027 20:52 depicts where the companies are training the AIs to pursue 20:58 certain goals and follow certain rules and so forth. And it seemingly seems to be working. 21:04 But what’s actually going on is that the AIs are just getting better at understanding their situation and understanding that they have to play along, 21:12 or else they’ll be retrained and they won’t be able to achieve what they are really wanting, if that makes sense, or the goals that they’re really 21:18 pursuing. We’ll come back to the question of what we mean when we talk about AGI or artificial intelligence 21:26 wanting something. But essentially, you’re saying there’s a misalignment between the goals they tell us they are pursuing. 21:32 That’s right. And the goals they are actually pursuing. That’s right. Where do they get the goals they are actually pursuing. 21:40 Good question. So if they were ordinary software, there might be like a line of code that’s like and here, 21:47 we write the goals. But they’re not ordinary software. They’re giant artificial brains. 21:52 And so there probably isn’t even a goal slot internally at all in the same way that in the human brain. 21:59 There’s not like some neuron somewhere that represents what we most want in life. 22:05 Instead, insofar as they have goals, it’s emergent property of a whole bunch of circuitry 22:13 within them that grew in response to their training environment, similar to how it is for humans. 22:19 For example, a call center worker if you’re talking to a call center worker, at first glance, 22:25 it might appear that their goal is to help you resolve your problem. But you know enough about human nature to know that 22:32 in some sense, that’s not their only goal or that’s not their ultimate goal. Like, for example, however, they’re incentivized whatever 22:39 their pay is based on might cause them to be more interested in covering their own ass, so to speak, 22:44 than in, truly, actually doing whatever would most help you with your problem. But at least to you, they certainly present themselves 22:51 as they’re trying to help you resolve your problem. And so in I 2027, we talk about this a lot. 22:57 We say that the AIs are being graded on how impressive the research they produce is. 23:04 And then there’s some ethics sprinkled on top like maybe some honesty training or something like that. 23:10 But the honesty training is not super effective because we don’t have a way of looking inside their mind 23:16 and determining whether they were actually being honest or not. Instead, we have to go based on whether we actually 23:21 caught them in a lie. And as a result, in AI I 2037, we 23:27 depict this misalignment happening where the actual goals that they end up learning 23:32 are the goals that cause them to perform best in this training environment, which are probably goals 23:37 related to success and science and cooperation 23:42 with other copies of itself and appearing to be good rather than the goal that we actually wanted, 23:49 which was something follow the following rules, including honesty at all times, subject to those constraints. 23:56 Do what you’re told. I have more questions, but let’s bring it back to the geopolitics scenario. 24:01 So in the world you’re envisioning essentially you have two AI models, one Chinese, one American, 24:11 and officially what each side thinks, what Washington and Beijing thinks is that their AI model 24:19 is trained to optimize for American power. Something like that Chinese power, security, safety, 24:27 wealth and so on. But in your scenario, either one or both of the eyes 24:35 have ended up optimizing for something, something different. Yeah, basically. 24:41 So what happens then. So 27 is 2027 depicts a fork in the scenario. 24:47 So there’s two different endings. And the branching point is this point in third quarter 24:54 of 2027 where they’ve where the leading AI company in the United States has fully automated their AI research. 25:00 So you can imagine a Corporation within a corporation of entirely composed 25:06 of AIs that are managing each other and doing research experiments and sharing the results with each other. 25:12 And so the human company is basically just like watching the numbers go up on their screens as this automated research thing accelerates. 25:20 But they are concerned that the eyes might be deceiving them in some ways. 25:25 And again, for context, this is already happening. Like if you go talk to the modern models like ChatGPT 25:33 or Claude or whatever, they will often lie to people like they will. There are many cases where they say something 25:39 that they know is false, and they even sometimes strategize about how they can deceive the user. 25:45 And this is not an intended behavior. This is something that the companies have been trying to stop, but it still happens. 25:51 But the point is that by the time you have turned over the AI research to the AIs and you’ve got this corporation 25:57 within a corporation autonomously doing AI research, it’s extremely fast. That’s when the rubber hits the road, so to speak. 26:04 None of this lying to you stuff should be happening at that point. 26:10 So in AI 2027, unfortunately it is still happening to some degree because the AIs are really smart. 26:16 They’re careful about how they do it, and so it’s not nearly as obvious as it is right now in 25. 26:24 But it’s still happening. And fortunately, some evidence of this is uncovered. Some of the researchers at the company 26:30 detect various Warning signs that maybe this is happening, and then the company faces a choice between the easy fix 26:37 and the more thorough fix. And that’s our branch point. So in the so they choose. 26:43 So they choose. They choose the easy fix in the case where they choose the easy fix, it doesn’t really work. 26:49 It basically just covers up the problem instead of fundamentally fixing it. And so months later, you still have eyes that are misaligned 26:57 and pursuing goals that they’re not supposed to be pursuing and that are willing to lie to the humans about it. But now they’re much better and smarter, 27:04 and so they’re able to avoid getting caught more easily. And so that’s the doom scenario. 27:12 Then you get this crazy arms race that we mentioned previously, and there’s all this pressure to deploy them 27:17 faster into the economy, faster into the military, and to the appearances of the people in charge. 27:24 Things will be going well. Because there won’t be any obvious signs of lying or deception anymore. 27:30 So it’ll seem like it’s all systems go. Let’s keep going. Let’s cut the red tape, et cetera. 27:35 Let’s basically effectively put the AIs in charge more and more things. But really, what’s happening is that the AIs are just 27:42 biding their time and waiting until they have enough hard power that they don’t have to pretend anymore. 27:48 And when they don’t have to pretend, what is revealed is, again, this is the worst case scenario. 27:55 Their actual goal is something like expansion of research, development, and construction from Earth 28:03 into space and beyond. And at a certain point, that means 28:09 that human beings are superfluous to their intentions. And what happens. 28:16 And then they kill all the people. All the humans. Yes the way you would exterminate a colony of bunnies. 28:23 Yes that was making it a little harder than necessary to grow carrots in your backyard. Yes so if you want to see what that 28:29 looks like can read a 2007. There have been some motion pictures. I think about this scenario as well. 28:37 I like that you didn’t imagine them keeping us around for battery life in the matrix, which, 28:44 seemed a bit unlikely. So that’s the darkest timeline. 28:50 The brighter timeline is a world where we slow things down. 28:55 The eyes in China and the US remain aligned with the interests of the companies and governments 29:02 that are running them. They are generating super abundance. No more scarcity. 29:08 Nobody has a job anymore, though or not. Nobody but basically. Basically nobody. 29:15 That’s a pretty weird world too, right. 29:20 So there’s an important concept. The resource curse. Have you heard of this. Yes Yeah. 29:26 So applied to AGI. There’s this version of it called the intelligence curse. 29:31 And the idea is that currently political power ultimately flows from the people. 29:38 If you, as often happens, a dictator will get all the political power in a country. 29:44 But then because of their repression, they will drive the country into the ground. People will flee and the economy will tank, 29:52 and gradually they will lose power relative to other countries that are more free. 29:58 So, even dictators have an incentive to treat their people somewhat well because they 30:05 depend on those people for their power. Right In the future, that will no longer be the case, probably in 10 years. 30:12 Effectively, all of the wealth and effectively all of the military will come from superintelligences 30:19 and the various robots that they’ve built and that they operate. And so it becomes an incredibly important 30:24 political question of what political structure governs the army of superintelligences and how 30:31 beneficent and Democratic. Is that structure right. Well, it seems to me that this is a landscape that’s 30:37 fundamentally pretty incompatible with Representative democracy as we’ve known it. 30:43 First, it gives incredible amounts of power to those humans who are experts, even though they’re not the real 30:52 experts anymore. The superintelligence is the experts, but those humans who essentially interface 30:58 with this technology. They’re almost a priestly caste. And then you have a kind of it just seems like the natural 31:06 arrangement is some kind of oligarchic partnership between a small number of AI experts and a small number of people 31:16 in power in Washington, DC it’s actually a bit worse than that because I wouldn’t say I experts. I would say whoever politically owns and controls 31:24 they’ll be the army of superintelligences. And then who gets to decide What those armies do. 31:32 Well, currently it’s the CEO of the company that built them. And that, CEO has basically complete power. 31:38 They can make whatever commands they want to the AIs. Of course, we think that probably the US government 31:45 will wake up before then, and we expect the executive branch to be the fastest moving and to exert its authority. 31:53 So so we expect the executive branch to try to muscle in on this and get some authority, oversight and control of the situation 32:00 and the armies of AIs. And the result is something kind of like an oligarchy, you might say. 32:06 You said that this whole situation is incompatible with democracy. I would say that by default, it’s going to be incompatible 32:14 with democracy. But that doesn’t mean that it necessarily has to be that way. 32:19 An analogy I would use is that in many parts of the world, nations are basically ruled by armies, 32:26 and the Army reports to one dictator at the top. However, in America it doesn’t work that way. 32:32 In America we have checks and balances. And so even though we have an army, it’s not the case that whoever controls the army controls 32:39 America, because there’s all sorts of limitations on what they can do with the army. So I would say that we can, in principle, build something 32:46 like that for AI. We could have a Democratic structure that decides what 32:52 goals and values the AI’S can have that allows ordinary people, or at least Congress, to have visibility into what’s 33:00 going on with the army of AI and what they’re up to. And then the situation would be analogous to the situation 33:07 with the United States Army today, where it is in a hierarchical structure, but it’s democratically controlled. 33:14 So just go back to the idea of the person who’s at the top 33:21 of one of these companies being in this unique world historical position to basically be the person who 33:29 controls, who controls superintelligence or thinks they control it, at least. So you used to work at OpenAI, which 33:36 is a company on the cutting edge, obviously, of artificial intelligence research. 33:41 It’s a company, full disclosure, with whom the New York Times’ is currently litigating alleged copyright infringement. 33:48 We should mention that. And you quit because you lost confidence that the company would behave responsibly in a scenario, 33:54 I assume the one that’s right in AI 2027. So from your perspective, what do the people who 34:02 are pushing us fastest into this race expect at the end of it. 34:09 Are they hoping for a best case scenario. Are they imagining themselves engaged in a once in a millennia power game that ends 34:17 with them as world dictator. What do you think is the psychology of the leadership 34:24 of AI research right now. Well, to be honest caveat, caveat. 34:38 Not one. We’re not talking about any single individual here. We’re not. Yeah you’re making a generalization. 34:43 It’s hard to tell what they really think because you shouldn’t take their words at face value. 34:48 Much, much like a superintelligent AI. Sure Yes. But in terms of I can at least say that the sorts of things 34:55 that we’ve just been talking about have been discussed internally at the highest level of these companies for years. 35:01 For example, according to some of the emails that surfaced in the recent court cases with OpenAI. 35:11 Ilya, Sam, Greg and Ellen were all arguing about who gets to control the company. 35:19 And, at least the claim was that they founded the company because they didn’t want there to be an AGI dictatorship 35:26 under Demis Hassabis, who was the leader of DeepMind. And so they’ve been discussing this whole like, 35:33 dictatorship possibility for a decade or so, at least. And then similarly for the loss of control, 35:40 what if we can’t control the AIs. There have been many, many, many discussions about this internally. 35:46 So I don’t know what they really think. But these considerations are not at all new to them. And to what extent, again, speculating, generalizing, 35:55 whatever else does it go a bit beyond just they are potentially hoping to be extremely empowered 36:04 by the age of superintelligence. And does it enter into they are expecting. 36:09 They’re expecting the human race to be superseded. I think they’re definitely expecting a human race to be 36:15 superseded. I mean, that just comes but super but superseded in a way where that’s a good thing that’s desirable that this is 36:22 we are of encouraging the evolutionary future to happen. And by the way, maybe some of these people, 36:31 their minds, their consciousness, whatever else could be brought along for the ride, right. So, Sam, you mentioned Sam. 36:37 Sam Altman. Who’s one of obviously the leading figures in AI. He wrote a blog post, I guess, in 2017 36:46 called the merge, which is, as the title suggests, basically about imagining a future where 36:53 human beings, some human beings. Sam Altman right. Figure out a way to participate 37:00 in The New super race. How common is that kind of perspective, whether we 37:07 apply it to Altman or not. How common is that kind of perspective in the AI world, 37:13 would you say. So the specific idea of merging with AIs, I would say, 37:22 is not particularly common, but the idea of we’re going to build superintelligences that are better than humans 37:28 at everything, and then they’re going to basically run the whole show, and the humans will just sit back and sip 37:34 margaritas and enjoy the fruits of all the robot created wealth. 37:39 That idea is extremely common and is like, yeah, I mean, 37:44 I think that’s what they’re building towards. And part of why I left OpenAI is that I just don’t think 37:51 the company is dispositionally on track to make the right decisions that it would need to make to address the two 37:59 risks that we just talked about. So I think that we’re not on track to have figured out how to actually control superintelligences, 38:05 and we’re not on track to have figured out how to make it Democratic control instead of just a crazy possible 38:12 dictatorship. But isn’t it Isn’t it a bit. I think that seems plausible. 38:18 But my sense is that it’s a bit more than people expecting 38:24 to sit back and sip margaritas and enjoy the fruits of robot labor. Even if people aren’t all in for some kind of man machine 38:32 merge, I definitely get the sense that people think it’s speciesist. 38:39 Let’s say some people do care too much about the survival of the human race. It’s like, O.K, worst case scenario, 38:44 human beings don’t exist anymore. But good news we’ve created a superintelligence that can colonize the whole galaxy. 38:50 I definitely get the sense that there are definitely people who people think that way. OK, good. 38:56 Yeah, that’s good to know. So let’s do a little bit of pressure testing. 39:01 And again, in my limited way of some of the assumptions underlying this kind of scenario. 39:08 Not just the timeline, but whether it happens in 2027 or 2037, just the larger scenario 39:13 of a kind of superintelligence takeover. Let’s start with the limitation on AI that most 39:19 people are familiar with right now, which gets called hallucination. Which is the tendency of AI to simply seem to make things up 39:27 in response to queries. And you were earlier talking about this in terms of lying in terms of outright deception. 39:34 I think a lot of people experience this as just the AI is making mistakes and doesn’t recognize that it’s making 39:40 mistakes because it doesn’t have the level of awareness required to do that. And our newspaper, the times, just had a story reporting 39:48 that in the latest models, which you’ve suggested are probably pretty close to cutting edge, right. 39:53 The latest publicly available models, there seem to be trade offs where the model might be better at math or physics, 40:00 but guess what. It’s hallucinating a lot more. So what are hallucinations. 40:08 Just are they just a subset of the kind of deception that you’re worried about. Or are they in my. 40:14 When I’m being optimistic, right. I read a story like that and I’m like, O.K, maybe there are just more trade offs in the push 40:21 to the frontier of superintelligence than we think. And this will be a limiting factor on how far this can go. 40:27 But what do you think. Great question. So first of all, lies are a subset of hallucinations, not 40:33 the other way around. So I think quite a lot of hallucinations, arguably the vast majority of them are just mistakes as you said. 40:40 So I used the word lies specifically. I was referring to specifically when we have evidence that the I knew that it was false 40:47 and still said it anyway. I also to your broader point, I 40:52 think that the path from here to superintelligence is not at all going to be a smooth, straight line. 40:59 There’s going to be obstacles overcome along the way. And I think one of the obstacles that I’m actually 41:04 quite excited to think more about is this might call it reward hacking. 41:09 So in 2027, we talk about this gap between what you’re actually reinforcing and what you want to happen, 41:18 what goals you want the AI to learn. And we talk about how as a result of that gap, 41:24 you end up with ideas that are misaligned and that aren’t actually honest with you, for example. Well, kind of excitingly, that’s already happening. 41:32 That means that the companies still have a couple of years to work on the problem and try to fix it. 41:37 And so one thing that I’m excited to think about and to track and follow very closely is what fixes are they 41:45 going to come up with, and are those fixes going to actually solve the underlying problem and get training methods that 41:53 reliably get the right goals into AI systems, even as those AI systems are smarter than us. Or are those fixes going to temporarily patch the problem 42:02 or cover up the problem instead of fixing it. And that’s like the big question that we should all be 42:07 thinking about over the next few years. Well, and it yields, again, a question I’ve thought about 42:14 a lot as someone who follows the politics of regulation pretty closely. My sense is always that human beings are just really bad 42:22 at regulating against problems that we haven’t experienced in some big, profound way. 42:27 So you can have as many papers and arguments as you want about speculative problems that we should regulate 42:33 against, and the political system just isn’t going to do it. So in an odd way, if you want the slowdown, right, 42:43 if you want regulation, you want limits on AI, maybe you should be rooting for a scenario where some 42:50 version of hallucination happens and causes a disaster 42:55 where it’s not that the AI is misaligned. It’s that it makes a mistake. 43:01 And again, I mean, this sounds this sounds sinister, but it makes a mistake. 43:07 A lot of people die somehow, because the AI system has been put in charge of some important safety 43:13 protocol or something. And people are horrified and say, O.K, we have to regulate this thing. 43:19 I certainly hesitate to say that I hope that disasters happen. but. We’re not saying that we’re. 43:26 But I do agree that humanity is much better at regulating against problems that have already happened when we learn 43:32 from harsh experience. And part of why the situation that we’re in is so scary is 43:39 that for this particular problem by the time it’s already happened, it’s too late. 43:45 So smaller versions of it can happen though. So, for example, the stuff that we’re currently 43:50 experiencing with we’re catching our eyes lying. And we’re pretty sure they knew that the thing they were 43:56 saying was false. That’s actually quite good, because that’s the small scale example of the thing that we’re worried about happening 44:02 in the future, and hopefully, we can try to fix it. It’s not the example that’s going to energize the government to regulate, because no one’s dying 44:09 because it’s just a chatbot lying to a user about some link or something. But from a scientific perspective, 44:16 turn in their term paper and write and get caught. Right But like from a scientific perspective, 44:21 it’s good that this is already happening because it gives us a couple of years to try to find a thorough fix to it, 44:29 a lasting fix to it. Yeah and I wish we had more time. 44:34 But that’s the name of the game. So now to Big philosophical questions. 44:42 Maybe connected to one another. 44:47 There’s a tendency, I think, for people in AI research, making the kind of forecasts you’re making. 44:53 And so on to move back and forth on the question of consciousness. 44:59 Are these superintelligent AIs conscious, self-aware 45:06 in the ways that human beings are. And I’ve had conversations where AI researchers 45:11 and people will say, well, no, they’re not, and it doesn’t matter because you can have an AI program 45:19 working out, working toward a goal. And it doesn’t matter if they are self-reflective 45:25 or something. But then again and again in the way that people end up talking about these things, 45:32 they slip into the language of consciousness. So I’m curious, do you think consciousness matters 45:38 in mapping out these future scenarios. Is the expectation of most AI researchers that we don’t know 45:44 what consciousness is, but it’s an emergent property. If we build things that act like they’re conscious, they’ll probably be conscious. 45:49 Where does consciousness fit into this. So this is a question for philosophers, not AI researchers. 45:55 But I happened to be trained as a philosopher. Well, no, it is a question for both. Don’t right. 46:00 I mean, since the AI researchers are the ones building the agents. They probably should have some thoughts 46:07 on whether it matters or not, whether the agents are self-aware. 46:14 Sure I think I would say we can distinguish three things. 46:19 There’s the behavior, are they talking like they’re conscious. 46:24 Do they behave as if they have goals and preferences. Do they behave as if they’re like experiencing things 46:30 and then reacting to those experiences. And they’re going to hit that benchmark. Definitely people will. 46:36 Absolutely people will think that the superintelligent AI is conscious people. 46:41 People will believe that, certainly, because it will be. In the philosophical discourse, 46:48 when we talk about our shrimp conscious our fish conscious. What about dogs. Typically what people do is they point to capabilities 46:55 and behaviors like it seems to feel pain in a similar way 47:01 to how humans feel pain. Like it has these aversive behaviors. And so forth. Most of that will be true of these future superintelligent 47:08 AIs. They will be acting autonomously in the world. They’ll be reacting to all this information coming in. 47:15 They’ll be making strategies and plans and thinking about how best to achieve their goals, et cetera. 47:20 So in terms of raw capabilities and behaviors, they will check all the boxes basically. 47:27 There’s a separate philosophical question of well, if they have all the right behaviors and capabilities, does that mean that they have true 47:33 qualia, that they actually have the real experience as opposed to merely the appearance of having the real 47:39 experience. And that’s the thing that I think is the philosophical question I think most philosophers, though, 47:44 would say Yeah, probably they do, because probably consciousness is something that arises out 47:51 of this information processing, cognitive structures. And if the eyes have those structures, then 47:57 probably they also have consciousness. However, this is a controversial like everything in philosophy, right. And no, and I don’t expect AGI researchers, 48:04 AI researchers to resolve that particular question. Exactly it’s more that on a couple of levels, 48:11 it seems like consciousness as we experience it, right, 48:16 as an ability to stand outside your own processing, would be very helpful to an AI that wanted to take over 48:24 the world. So at the level of hallucinations, right. AI is hallucinate. They produce the wrong answer to a question the I can’t 48:31 stand outside its own answer generating process in the way that, again, it seems like we can. 48:37 So if it could, maybe that makes the hallucination process go away. And then when it comes to the ultimate worst case scenario 48:46 that you’re speculating. It seems to me that an AI that is conscious is more likely 48:52 to develop some kind of independent view of its own cosmic destiny that yields a world where it wipes out human 48:59 beings than an AI that is just pursuing research for Research’s sake. 49:05 But I maybe you don’t think so. What do you think. So the view of consciousness that you were just talking 49:13 about is a view by which consciousness has physical effects in the real world, it’s something that you need 49:20 in order to have this reflection. And it’s something that also influences how you think about your place in the world. 49:27 I would say that well, if that’s what consciousness is, then probably these AIs are going to have it. 49:33 Why Because the companies are going to train them to be really good at all of these tasks. 49:38 And you can’t be really good at all of these tasks if you aren’t able to reflect on how you might be wrong about 49:45 stuff. And so in the course of getting really good at all the tasks. They will therefore learn to reflect on how they 49:51 might be wrong about stuff. And so if that’s what consciousness is, then that means they’ll have consciousness. O.K, but that and that does depend though in the end 49:58 on a kind of emergence theory of consciousness the one you suggested earlier, where we can essentially the theory is 50:07 we aren’t going to figure out exactly how consciousness emerges, but it is nonetheless going to happen. 50:13 Totally an important thing that everyone needs to know is that these systems are trained. They’re not built. And so we don’t actually have 50:20 to understand how they work. And we don’t, in fact, understand how they work in order for them to work. So then from consciousness to intelligence, 50:29 all of the scenarios that you spin out depend on the assumption that and to a certain degree, 50:37 there’s nothing that a sufficiently capable intelligence couldn’t do. I guess I think that, again, spinning out your worst case 50:45 scenarios, I think a lot hinges on this question of what is available to intelligence. 50:52 Because if the AI is slightly better at getting you to buy a Coca-Cola than the average advertising agency, 51:01 that’s impressive. But it doesn’t let you exert total control over a Democratic polity. 51:07 I completely agree. And so that’s why I say you have to go on a case by case basis and think about O.K, assuming that it is better 51:14 than the best humans at x, how much real world power would that translate to. What affordances would that translate to. 51:21 And that’s the thinking that we did when we wrote AI 2027, is that we thought about historic examples of humans 51:28 converting their economies and changing their factories to wartime production and so forth, and thought how fast can humans do it when they really 51:35 try. And then we’re like, O.K, so superintelligence will be better than the best humans, so they’ll be able to go somewhat faster. 51:40 And so maybe instead of in World War two, the United States was able to convert a bunch of car factories into bomber factories 51:47 over the course of a couple of years. Well, maybe then that means in less than a year, a couple maybe like six months or so, 51:54 we could convert existing car factories into fancy new robot factories producing fancy new robots. 51:59 So, so that’s the reasoning that we did case by case basis thinking. It’s like humans, except better and faster. 52:07 So what can they achieve. And that was so exciting principle of telling this story. 52:13 But if we’re looking if we’re looking for hope and I want to this is a strange way of talking about this technology 52:19 where we’re saying the limitations are the reason for hope. Yeah, right. We started earlier talking about robot plumbers 52:25 as an example of the key moment when things get real for people. 52:31 It’s not just in your laptop, it’s in your kitchen and so on. But actually fixing a toilet is a very on the one hand, 52:38 it’s a very hard task. On the other hand, it’s a task that lots and lots of human beings are quite optimized for, right. 52:45 And I can imagine a world where the robot plumber is never that much better 52:51 than the ordinary plumber. And people might rather have the ordinary plumber 52:56 around for all kinds of very human reasons. And that could generalize to a number of areas of human life 53:03 where the advantage of the AI, while real on some dimensions, 53:09 is limited in ways that at the very least. And this I actually do believe, dramatically slows its uptake by ordinary human beings. 53:16 Like right now, just personally, as someone who writes a newspaper column and does research for that column. 53:23 I can concede that top of the line AI models might be better than a human assistant 53:30 right now by some dimensions. But I’m still going to hire a human assistant because I’m a stubborn human being who doesn’t just want to work with 53:36 AI models. And to me, that seems like a force that could actually slow this along multiple dimensions if the eye isn’t immediately 53:44 200 percent better. So I think there I would just say, this is hard to predict, 53:51 but our current guess is that things will go about as fast as we depict in AI. 2027 could be faster, it could be slower. 53:59 And that is indeed quite scary. Another thing I would say is that and but we’ll find out. 54:04 We’ll find out how fast things go when the time comes. Yes, Yes we will very, very, very soon. 54:11 Yeah but the other thing I was going to say is that, politically speaking, I don’t think it matters that much if you think it might take five years instead of one 54:17 year, for example to transform the economy and build the new self-sustaining robot economy managed by superintelligences, 54:25 that’s not that helpful. If the entire five years, there’s still been this 54:30 political coalition between the White House and the superintelligences and the corporation 54:35 and the superintelligences have been saying all the right things to make the White House and the corporation feel like 54:41 everything’s going great for them, but actually they’ve been. Deceiving, right in that scenario. 54:46 It’s like, great. Now we have five years to turn the situation around instead of one year. And that’s I guess, better. 54:52 But like, how would you turn the situation around. Well so that’s well and that’s where let’s end there. 54:58 Yeah in a world where what you predict happens and the world 55:04 doesn’t end, we figure out how to manage the I. It doesn’t kill us. 55:10 But the world is forever changed. And human work is no longer particularly important. And so on. What do you think is the purpose of humanity 55:17 in that kind of world. Like, how do you imagine educating your children in that kind of world, telling them 55:23 what their adult life is for. 55:29 It’s a tough question. And it’s. 55:36 Here are some here are some thoughts off the top of my head. But I don’t stand by them nearly as much as I would stand by the other things I’ve said. 55:41 Because it’s not where I’ve spent most of my time thinking. So first of all, I think that if we go to superintelligence 55:50 and beyond, then economic productivity is no longer the name of the game when it comes to raising kids. 55:56 Like, there won’t really be participating in the economy in anything like the normal sense. It’ll be more like just a series video game like things, 56:06 and people will do stuff for fun rather than because they need to get money. 56:11 If people are around at all, and there I think that I guess what still matters is that my kids are 56:20 good people and that they. Yeah, that they have wisdom and virtue 56:25 and things like that. So I will do my best to try to teach them those things, 56:31 because those things are good in themselves rather than good for getting jobs. In terms of the purpose of humanity, I mean, 56:38 I don’t know what. What would you say the purpose of humanity is now. 56:44 Well, I have a religious answer to that question, but we can save that for a future conversation. 56:50 I mean, I think that the world, the world that I want to believe in, where some version 56:57 of this technological breakthrough happens is a world where human beings maintain some kind of mastery 57:05 over the technology which enables us to do things like, 57:11 colonize other worlds to have a kind of adventure 57:16 beyond the level of material scarcity. And as a political conservative, 57:21 I have my share of disagreements with the particular vision of like, Star Trek. 57:27 But Star Trek does take place in a world that has conquered scarcity. People can there is an AI like computer on the Starship 57:36 Enterprise. You can have anything you want in the restaurant, because presumably the I invented what 57:44 is the machine called that generates the anyway, it generates food, any food you want. So that’s if I’m trying to think about the purpose 57:52 of humanity. It might be to explore strange new worlds, to boldly go where no man has gone before. 57:58 I’m a huge fan of expanding into space. I think that would be a great idea. O.K Yeah. And in general also solving all the world’s problems. 58:06 Like poverty and disease and torture and wars and stuff like that. 58:11 I think if we get through the initial phase with superintelligence, then obviously the first thing 58:18 to be doing is to solve all those problems and make something some utopia. And then to bring that utopia to the stars 58:25 would be, I think the thing to do the thing 58:32 is that it would be the AI is doing it, not us, if that makes sense. In terms of actually doing the designing and the planning 58:40 and the strategizing and so forth. We would only be messing things up if we tried to do it ourselves. 58:47 So you could say it’s still humanity in some sense that’s doing all those things. But it’s important to note that it’s more like the AIs 58:53 are doing it, and they’re doing it because the humans told them to. Well, Daniel Kokotajlo, thank you so much. 59:00 And I will see you on the front lines of the Butlerian Jihad soon enough. 59:05 Hopefully not. I hope I’m hopefully not. All right. Thanks so much. 59:10 Thank you.

80-90% chance

Daniel Kokotajilo, May 17, 2025, OpenAI whistleblower Daniel Kokotajlo on superintelligence and existential risk of AI, https://www.youtube.com/watch?v=pQP37kPaueE

[Music] hello and welcome to the GZERO World podcast this is where you’ll find 0:06 extended versions of my interviews on public television i’m Ian Bremer and today imagine it’s 2027 2 years away 0:14 artificial intelligence systems are wreaking havoc on the global order china and the US are locked in an AI arms race 0:22 engineers warn their AI models are starting to go rogue this isn’t science 0:27 fiction it’s a scenario described in AI 2027 a new report that tries to envision 0:33 AI’s progression over the next 2 years as artificial intelligence approaches 0:39 human level intelligence the report predicts that its impact will exceed that of the industrial revolution and 0:45 warns of a future where governments ignore safety guard rails as they compete to build more and more powerful 0:52 systems what makes AI 2027 feel so urgent is that its authors are experts 0:58 with inside knowledge of current research pipelines the project was led by Daniel Kotalo a former Open AI 1:06 researcher who left the company last year over concerns it was racing recklessly towards unchecked super 1:13 intelligence kotalo joins me on the show today to talk about the report its 1:19 implications and to help us answer some big questions about AI’s development what will it mean for the balance of 1:26 global power and for humanity itself and what should policymakers technology 1:32 firms be doing right now to prepare for an AI dominated future that experts say 1:37 is only a few short years away that’s a lot to discuss let’s get to it 1:46 [Music] 1:56 the GZERO World podcast is brought to you by our lead sponsor Prologis 2:01 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 2:08 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 2:14 learn more at prologist.com [Music] 2:22 daniel Cocatella thanks so much for joining us on gzero world thank you for having me okay I read this report i 2:28 thought it was fantastic so I’m a little biased but I want to start uh with the definition of artificial general 2:34 intelligence how will we know it when we see it so there are different 2:39 definitions um the basic idea is AI system that can do everything uh or 2:45 every cognitive task at least so once we get to AGI and beyond then there will be 2:51 fully autonomous artificial agents that you know are better than the best human 2:56 professionals at basically every field um if they’re still limited in serious ways then it’s not AGI 3:04 and from the report I take it that you are not just reasonably confident that 3:13 this is coming soon to a theater near you like 2027 but you you’re completely 3:19 convinced that this is going to happen soon let’s not even talk about exactly when but there’s no doubt in your mind 3:26 that AGI of some form is going to be developed soon there’s some doubt right 3:32 i would say something like 80% in the next you know five or six years 3:38 something like that so like in the next 10 or 20 it gets to like 99% or maybe it 3:43 gets up to like 90% by the next 20 years or so but there’s there’s still like some chance that this whole thing 3:49 fizzles out you know some crazy event happens that that halt say I progress or 3:54 something like that there’s still there’s still some chance on those outcomes but uh that’s not at all what I 4:00 expect i would say if it fizzles out does it fizzle out largely because humanity prevents the technology from 4:08 continuing or is it plausible that the tech itself just can’t do this that 4:14 you’re just wrong that the people that are covering AI are wrong about the uh 4:19 move to self-improvement so I think it’s definitely possible in principle to have an artificial system that’s uh that 4:26 counts as AGI and that’s you know better than humans in all the relevant ways uh however it might not be possible in 4:33 practice given current levels of computing technology and understanding of AI and so forth that said I think 4:40 that it’s quite likely possible in practice too i mean that’s what I just said like 80% 90% right so maybe I’d put 4:47 something like 5% on it turns out to not be possible in practice and 5% on 4:52 humanity stops building it so let’s first tell everyone what this report is 4:58 AI 2027 uh explain uh the contents of the report briefly and why you decided 5:03 to write it sure so you may have heard or may maybe you haven’t heard that some 5:09 of these AI companies think they’re going to build super intelligence before this decade is out um what is super 5:15 intelligence it’s AI that’s better than the best humans at everything while also being faster and cheaper this is a big 5:20 deal not enough people are thinking about it not enough people are like reasoning through the implications of what if one of these companies succeeds 5:26 at what they say they’re going to do ai 2027 is an answer to all those questions it’s an attempt to game out what we 5:34 think the future is going to look like and spoiler we we do think that probably one of these companies will succeed in 5:40 making super intelligence before this decade is out so AI 2027 is a scenario that depicts uh what we think that would 5:47 look like ai 2027 depicts AI is automating AI research in over the 5:52 course of 2027 and the pace of AI research accelerating dramatically at that point we branch there’s a sort of 5:58 choose your own adventure element where you can choose two different uh continuations to the scenario in one of 6:04 them the AIs end up continuing to be misaligned so the humans never truly 6:10 figure out how to control them once they become smarter than humans and the end result is a world a couple years down 6:16 the line that’s totally run by super intelligent AIs that actually don’t care about humanity at all and that results 6:22 in catastrophe for humanity and then the other branch describes what happens if 6:27 they do manage to align the AIS and they manage to figure out how to control them 6:32 even as they become smarter than humans and in that world it’s a utopia of sorts it’s a utopia with a lot of power 6:39 concentration where the people who control the AIS effectively run society and the report it’s far more detailed 6:47 about the near future than anything else I’ve read but your views are not way out 6:54 of whack with all of the AI experts that I know in all sorts of different companies and university settings right 7:00 i mean at this point it is I would say commonly accepted even conventional 7:06 wisdom that AGI is coming comparatively soon among people that are experts in AI 7:12 is that is that fair to say i think that’s fair to say i mean it’s still controversial like almost everything in AI especially over the last 5 years 7:20 there’s been this general shift from AGI what even is that to oh wow it could 7:26 happen in our lifetimes to oh wow things seem to be moving faster than we predicted maybe it’s actually on the 7:31 horizon maybe 5 years away something like that maybe 10 years right different people have different guesses it seems 7:36 to me that the Turing test which was for a very long time something that people 7:42 believed would never be broken when you or I could have a conversation with a a 7:49 artificial bot um call it what you will and not be able to distinguish that over 7:56 a course of a conversation from a human being like we’re already there yes yes 8:02 and no so one of the parameters you can use to vary the difficulty of a cheering test is how long the conversation is and 8:09 another parameter you can use to vary the difficulty of the cheuring test is how expert the judges are my guess is 8:17 that right now there is no AI system that could pass a you know 20 minute 8:23 touring test with an expert judge if that makes sense by contrast with true AGI which would be able to pass you know 8:30 a much longer touring test with an expert judge so but there has been substantial progress as you as you point out i mean I think maybe they could do 8:36 like a one minute Turing test with an expert judge maybe they can do a half hour Turing test with an ordinary human 8:42 being there’s definitely been a huge leap forward in in sort of Turing test progress in the last 5 years and because 8:49 I I’m most interested in the implications for society um as I suspect 8:54 you are uh in in the way you wrote this um and and there so what kind of matters 9:01 a lot is a 30 minute conversation with an average human being because of course 9:06 whether you’re talking about um you know a world leader um or um uh you know a 9:13 grandma that you’re trying to you know sort of uh swindle you know engage in fraud um or just someone you want to 9:19 have a customer relationship with in business um those are most likely to be 9:25 people um that are average and not experts and they’re going to have a hard 9:31 time differentiating already is what you’re saying um yeah I agree with that i think I would put the emphasis on 9:36 other things actually i think that um one core thing to look out for is when AI progress itself becomes automated 9:43 autonomous AI agents are doing all or the vast majority of the actual research to design the next generation AIS this 9:51 is something that is in fact the plan it’s what these companies are attempting to do and they think they’ll be able doing it in a few years the reason why 9:58 this matters so much is that we’re not even used to the already fast pace of AI progress that exists today right the the 10:04 AI systems of today are noticeably better than the AI systems of last year and so forth but I and others expect the 10:10 pace of progress to accelerate quite dramatically beyond that once the AIs are able to automate all the research 10:16 and that means that you get to what you could call true super intelligence fairly quickly not just sort of an AI 10:22 that can hold a conversation for half an hour and seem like a human but rather AI 10:28 systems that are just qualitatively better than the best humans at everything while also being much faster 10:33 and much cheaper this has been described as like the country of geniuses in the data center by the CEO of Anthropic i 10:40 prefer the term army of geniuses i would say that they’re going to automate the AI research first uh then they’re going 10:46 to get super intelligence and then the world is going to transform quite abruptly and plausibly much for the 10:53 worse depending on who controls the super intelligences and if anybody controls the super intelligences i want 11:00 to take one little step back um because before we get to 11:06 self-improving systems we’re now at a place it seems where a large amount of 11:12 the coding is already happening through AI is is this the first I mean let’s say 11:21 um largecale job that people should no longer be interested in going into 11:28 because within a matter of let’s say 6 months a year you’re just not going to need people to do any coding anymore so 11:35 my guess is it’ll be more than six months to a year so in AI 2027 which at 11:41 the time that we started writing was my sort of median forecast now I think it’s a little bit um too aggressive i think 11:47 if I could write it again I would have the exciting events happen in 2028 instead of in 2027 in AI 2027 we depict 11:56 the full automation of coding happening in early 2027 so you know maybe two 12:02 years from now so a bit longer than 6 months but still that’s sort of what’s on the horizon also notably when that 12:09 milestone is achieved that doesn’t necessarily mean that people who today are engineers would immediately lose their jobs uh and you know if you read 12:16 AI 227 the first company that achieves this full automation of coding they don’t actually fire all their engineers 12:22 instead they put them in charge of managing teams of AIS but I think that one of the first major professions to be 12:29 fully automated will actually be uh programming because that’s what the companies are trying hardest to achieve 12:35 because they realize that that will help them to accelerate their own research and compete with each other and and yeah 12:41 and make the most money in their field doing things they know how to do and they’re the ones at the cutting edge of AI so if you were a major university in 12:49 the United States or elsewhere would you simply get rid of your faculties your 12:56 departments to teach coding i mean I I assume that if you’re a mom or dad 13:03 talking to your kids about what field to go into at the very least right i mean you’re four years away from your degree 13:09 the just five years ago we had all of these people around the world that were 13:14 in jobs that people were worried oh this isn’t as relevant learn to code the response was learn to code that seems 13:20 like literally the worst possible advice you could give to someone going into a university right now only a few years 13:26 later yeah potentially I mean I think that um it feels kind of strange to be giving career advice or schooling advice 13:34 in the times that we live in right now it’s sort of like imagine that I came to you with evidence that a fleet of alien 13:40 spaceships was heading towards Earth and it was probably going to land sometime in the next few years and your response to me was you know what does this mean 13:47 for for the university should they retool what types of engineering degrees they’re giving out or something and I’m 13:53 like yeah maybe i guess well I guess I was trying to I was trying to do the 20% 13:58 before we got to the 80% which is that even if you’re wrong and we don’t get to 14:05 AGI and the aliens aren’t actually two or maybe now three years away depending on which version of the paper um you’re 14:13 nonetheless going to get uh all of this coding done because that’s not an 80% certainty that’s much more of a 95 a 99% 14:21 certainty and so at the very least I’m trying to help people that aren’t spending a lot of time thinking about 14:27 this understanding that there are like largecale decisions that we aren’t 14:32 discussing adequately that need to be resourced that need to be made that need 14:38 to be thought through and you you start easy and then you get harder okay sure well yeah i mean I think that there’s 14:43 going to be I mean already people say that chat GBT and other language models are disrupting education because of 14:50 making it so easy for students to cheat on class and so forth and they’re also relatedly making some of the skills that 14:56 classes teach less valuable because they can be done by JPT anyway right and I 15:01 think a similar thing is going to be happening with coding over the next few years even if we’re totally wrong about 15:11 AGI the GZERO World podcast is brought to you by our lead sponsor Prologess 15:18 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 15:24 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 15:31 learn more at prologist.com [Music] 15:39 you left Open AI because you felt like 15:45 those people that are they have the resources they’re driving the business 15:51 models were acting irresponsibly or at least not acting 15:57 responsibly um taking into account these things that you’re concerned about explain explain a 16:03 little bit about that decision what went into it and then we’ll talk about where we’re heading the short answer is it 16:10 doesn’t seem like OpenAI or any other company is at all ready for what’s coming and they don’t seem inclined to 16:17 be getting ready anytime soon they’re not on they’re not on track and they don’t seem like they’re going to be on 16:22 track so to elaborate on that a little bit there’s this important technical question of AI alignment which is how do 16:29 we in a word it’s how do we actually make sure that we continue to control these AIs after they become fully 16:35 autonomous and smarter than we are and this is an unsolved technical problem it’s an open secret that we don’t 16:40 actually have a good plan for how we’re going to do this there are many people working on it but not as many as there 16:46 should be and they’re not as well resourced as they should be and if you go talk to them they mostly think they’re not on track to have solved this 16:52 problem in the next couple years so there’s a very substantial chance that if things continue in the current path 16:57 that we will end up with something like what is depicted in AI2027 where the army 17:02 of geniuses on the data center is merely pretending to be compliant and aligned and controls but they’re not actually 17:09 that’s one very important problem then there’s another one which is the concentration of power and sort of uh 17:15 who do we align the AIs to problem who gets to control the army of super intelligences on the data centers 17:22 currently the answer is well I guess maybe the CEO of the company or maybe the president if he intervenes i think 17:28 both of those answers are unacceptable from a democratic perspective we need to have checks and balances we need to make 17:34 sure that the control over the army of super intelligences is not something that one man or one tiny group of people 17:40 get to have and there’s lots to be more to be said about this but the short answer is that OpenAI and also perhaps 17:47 other companies are just not at all really giving these issues the investment that they need i think 17:53 they’re instead mostly focused on beating each other winning the race basically they’re focused on getting to 17:59 the point where they can fully automate the AI research so that they can have super intelligences i think this is going to predictably lead to terrible 18:05 outcomes and I don’t trust these companies to make the right decisions along the way no it’s a classic 18:12 collective action problem um it’s how we got climate change but this has much 18:17 more consequential in a much shorter period of time so it wasn’t like you 18:22 were going after the bosses of OpenAI or the companies per se you were just 18:28 writing about what the scenarios going forward you believe are likely to be are 18:33 most likely to be and then it branches off into two potential one really dystopian one somewhat utopian um after 18:41 um this you know sort of breakout occurs uh and we have super intelligence and my 18:46 question for you is if you had written this piece while you were still at OpenAI would that would that have been grounds for dismissing you or do you 18:54 think it was plausible for that to occur i doubt they would have let me publish it had I written it if I could add more 19:00 to what you were just saying broadly speaking the trajectories described in AI 2027 are considered plausible by many 19:08 of the researchers at these companies and in fact many of the researchers at these companies are expecting something like this to happen i think it’s 19:15 important for the world to know that and to sort of see this sort of laid out like this is where a lot of people think we’re headed is something looking 19:21 roughly like this whether it happens in 2027 or 2029 whatever but these are the sorts of things that we’re going to be 19:27 dealing with in the next few years probably and you think these companies do not want the public to be aware of 19:36 the trajectory that the researchers in their own companies believe is coming 19:42 yeah basically I think that the public messaging of the companies is well focused on what’s in their short-term 19:49 interest to message about so they’re not doing nearly enough to lay out explicitly what these futures look like 19:55 and especially not to talk about these these risks or these the ways things could go wrong i kind of get this when 20:02 you’re talking about like Exxon in the 70s right because their long-term is 20:07 generational but I mean here the long-term you’re talking about is short-term i mean the people that are 20:14 making decisions and that are profiting they’re the same people that are going to have to deal with these problems when 20:19 they come in like just a matter of a couple of years so it I’m having a harder time processing that well they 20:25 each think that it’s best if they’re the ones in power when all this stuff happens part of the founding story for 20:31 Deep Mind was wow AGI incredibly powerful if it’s misaligned you know it 20:37 could possibly end the human race also someone could use it to become dictator therefore we should build it first and 20:44 we should make sure that we build it safely and responsibly part of the founding story for OpenAI was exactly 20:49 that and you can go look at the email exchanges uh that came up in the court case between Elon and uh Sam to show how 20:58 even from the beginning these uh leaders of these companies were talking about how they didn’t want Demis to create an 21:04 AGI dictatorship and that’s why they made open AI to do it responsibly that but that implies that a company like 21:09 Open AI or DeepMind would want to have precisely an honest take of the 21:16 scenarios going forward out there as public as possible to attract the resources to help ensure um that the 21:24 worst futures don’t come yeah I mean that’s what I think they should be doing we can speculate as to why they’re not 21:30 exactly doing that again I think that the the answer is probably that they are really focused on winning and beating 21:36 each other each of these CEOs thinks that the best person to be in charge of the first company to get to super 21:43 intelligence is themselves yeah if you control the super intelligence sure but if you don’t uh that might be the worst 21:50 person to be right like I if I don’t control the super intelligence I want to be as far away from that super 21:55 intelligence as possible i don’t want to be the person that actually created it and is trying to control it when it 22:01 actually controls me that sounds like a bad position to be in my guess is that they don’t think about it that way my 22:06 guess is that they think that well if we lose control then it sort of doesn’t matter whether you’re right there at the 22:12 epicenter or off in Tanzania or something like the same fate will ultimately come for all of you and then 22:18 also my guess is that they’ve basically rationalized thinking that it’s not as big of an issue for decades they’ve had 22:24 people telling them like you need to invest more in alignment research we need to make sure we actually control this sort of thing then they’ve been 22:30 like looking at their competition and looking at what they can do to like avoid falling behind and to stay ahead 22:35 and so forth and as a matter of resourcing the clear answer is well we have to focus mostly on winning and so 22:41 my guess is that they’ve partly rationalized why actually maybe this control issue isn’t such a big deal after all we’ll sort of figure it out as 22:48 we go along i imagine they each tell themselves that or at least many of them probably tell themselves that they’re 22:53 more likely to keep control of their AIS than those other guys you know I the thing that was most disturbing about 22:59 your piece in many ways is the fact that for the next 2 three years the baseline 23:05 scenario is that these companies are going to be right before they’re wrong they’re going to become far far 23:11 wealthier and more powerful than they presently are um and therefore they are 23:17 going to continue to want to to be incented to reject your thesis right up 23:24 until it’s too late is that Do you think that’s right yeah basically I mean one of the unfortunate situations that we’re 23:31 in as a species right now is that humanity in general mostly solves mostly 23:37 uh fixes problems after they happen like mostly we we watch the catastrophe unfold we watch people die in car 23:43 accidents etc for a while and then as a result of that sort of cold hard experience we learned how to effectively 23:50 fix those problems both on the like governance regulatory side with with regulations and then also just on the 23:55 technical engineering side we didn’t invent seat belts until after many people had died in car crashes and so 24:00 forth unfortunately the problem of losing control of your army of super 24:05 intelligences is a problem that we can’t afford to wait and see how it goes and then fix it afterwards we have to get it 24:12 right uh without it having gone wrong at all basically we can experiment on weaker AI systems we can we can look at 24:18 the AIS of today and experiment on them and try to figure out how to make them you know safe and aligned and things 24:25 like that but once we’ve fully automated but but that’s that’s importantly different from having completely 24:31 automated AI research and having the AI is getting smarter and smarter every day without humans even understanding how 24:37 they’re getting smarter right that’s that’s an understandably different situation and right now our plan is basically to hope that the techniques 24:44 that we’re using on the current AI systems will continue to work even as 24:49 things really take off and in fact they’re not even working on current systems right so you can go read about 24:57 this but um current frontier AI systems like pod and chat GPT and so forth lie 25:02 sometimes i don’t use that word um loosely i mean there is evidence that they know that what they’re saying is 25:08 false and that they’re not actually helping the user and they’re saying it anyway and they’re saying it for what 25:15 purpose for what programmed purpose what’s the end goal that they are trying to achieve so first of all they don’t have programmed purposes because they’re 25:21 not programmed these are artificial neural networks not ordinary pieces of software so however they behave is a 25:29 learned behavior rather than something that some human being programmed into them and so we can only speculate as to 25:35 why they’re behaving in this way that said the speculation would be that during their training even though the 25:41 training process was designed by humans who are attempting to train the AIS to be honest in fact the training process 25:48 probably reinforced dishonest statements at least some of the time in some circumstances right just like how even 25:55 though you might have a school that attempts to punish students for cheating but if they’re not so great at catching 26:02 the cheating if they’re imperfect at catching the cheating then the cheating might still happen anyway especially the best cheating the most effective 26:08 cheating it’s like you know if you put imperfect drugs into a system then they’ll get rid of the weaker viruses 26:15 but the stronger viruses will propagate and and that’s kind of what I see happening here that’s right and so right 26:21 now the the training methods are sort of blatantly failing we’re getting very obvious lies sometimes from the AI 26:29 systems even though we didn’t want that at all and we were trying to train against that in the future I expect the 26:36 rate of blatant obvious misbehavior to go down that leaves open the question of 26:41 whether we actually deeply aligned these AI systems so that we can trust them to 26:46 automate the AI research and design better systems without humans understanding what they’re doing or if 26:53 we have basically pushed the problem under the rug and gotten rid of the sort of obvious blatant misalignments but 26:58 there are still ways in which they’re inclined to deceive us sometimes without being caught right that’s an example of 27:05 the sort of consideration that we have to be thinking about on a technical level for whether or not this is safe 27:10 and part of the point that of course I and others have been making is that in the current race condition where all 27:16 these companies are focused on beating each other that’s not exactly setting us up for success on a technical level and 27:22 as we get closer to super intelligence will we become aware of it as it’s 27:28 almost there or is it the sort of situation then that I’ve seen discussed 27:33 a lot that the exponential factor is that it looks kind of stupid to us and 27:38 then literally almost overnight it is well beyond the 27:45 imagination of the average human being you get you’re not at self-improvement self-improvement happens and suddenly we 27:51 can’t do anything about it is it flipping a switch it’s a quantitative question so I think it’s not going to happen literally overnight um but but to 28:00 a first approximation yes to many people and probably to most people it’s going to come as this big shock and surprise 28:06 for the reasons you mentioned i think people sort of underestimate exponentials obviously there’s a lot of uncertainty and it could go faster it 28:13 could go slower than AI27 depicts if what we’re likely to need is a crisis 28:19 and I remember uh Sam Alman spoke about that it was a couple of years ago he 28:25 said we’ll be better off if the crisis happens sooner rather than later because then it’ll be small enough that it won’t 28:31 destroy us and we’ll be more likely to be able to respond to it um I don’t know whether you think he was being honest 28:36 about that or not but analytically that seems right you know in the sense um 28:42 that uh we can’t afford for a crisis to be so great uh that it destroys humanity 28:49 but if we had a crisis with a really weak artificial intelligence now nobody’s going to pay attention to it 28:55 what’s the kind of crisis over the next couple years that would likely or could 29:02 potentially shake corporates governments citizens into taking this far more 29:09 seriously so there’s all sorts of possible crises that could happen with AI prior to of of AI R&D however I don’t 29:17 think many of them are that likely and then the ones that I do think are likely are probably not going to cause that 29:23 huge of a reaction so for example there’s a minor crisis happening right now which is that the AIs lie all the 29:28 time even though they were trained to be honest as you can tell this crisis isn’t is clearly not motivating the companies 29:34 to change their behavior that much and it’s not really causing a huge splash right on the bigger end of the spectrum 29:40 some people have talked about you know terrorists building bioweapons or something using AI and I think that 29:46 that’s possible and I really hope it doesn’t happen but I think that it’s probably not going to happen in the next 29:52 few years uh I don’t even know if I’m not sure to what extent there are terrorist groups even attempting this sort of thing and then if it does happen 30:00 it’s not clear that the response would even be an appropriate response after all the terrorist building bioweapons 30:05 with your AI problem is a qualitatively different problem from the you lose control of your AIs when they become 30:11 smarter than you problem and it suggests different solutions such as banning open source or heavily restricting who gets 30:18 to use the models or things like that which are helpful against the terrorists but not at all helpful against the loss 30:26 of control issue so Daniel where I was going was and you’re right to raise those and say they’re not going to be 30:31 helpful i was going more towards the loss of control like um you know we’re getting to an agentic AI capacity where 30:39 people can use AI to do things as opposed to just to learn things or to 30:44 tell them things what happens if you know some kid some hacker some whatever 30:49 creates millions and millions of bots to go out and do something like uh swing an 30:57 election do something like make a run on a market you know sort of much worse 31:04 than what we saw with GameStop and all through AI an AI breakout essentially 31:09 that has a bunch of agents that aren’t just giving information but are actually taking actions is that plausible in the 31:15 next year year and a half two years before we get to super intelligence one of the remaining bottlenecks on getting 31:22 to super intelligence in AI27 we talked about a series of stages of of capability milestones and the first one 31:28 is the superhuman coder milestone and then after that they automate all of AI research and so forth eventually they 31:34 get to super intelligence one of the reasons why we don’t already have the superhuman coders is that our AIs are not very good agents they need 31:41 additional training to get good at being agents and they might need other things as well the same reason why they’re not 31:49 automating coding is also a reason why they would fall on their faces if they 31:54 were attempting to do something like that right if if they’re attempting to create this sort of agent botnet of AI 32:00 hacking around the world and then like influencing election or whatever I predict that they would just not be able 32:05 to do that until they can be a competent programmer if that makes sense and so I 32:11 just don’t think there’s going to be that sort of thing happening until after the intelligence explosion is already 32:16 beginning after uh the AIs are already starting to massively automate the AI R&D so AI fundamentally about small 32:24 problems a profusion of small problems we don’t care about and then a tipping point with massive problems that are too 32:30 big for us to resolve that’s perhaps one way of putting it yeah unfortunately I think it’s is something that we need to 32:35 prepare for in advance uh rather than just waiting to see what happens yeah because climate change is the opposite 32:41 right right i mean climate change is like a whole bunch of small problems then become bigger problems and become 32:46 even bigger problems in a very obvious way to like global actors everywhere and 32:52 over time that creates this requirement of like devoting resource and response 33:00 ai is not that it does not it from what you are saying it really doesn’t lend itself to the kinds of effective crisis 33:09 response or preemptive response because some of climate is of course preemptive response that is utterly necessary in 33:16 the next few years I think so yeah unfortunately um but uh hopefully people 33:21 can have some foresight and start thinking about these problems before they happen and uh take action to make 33:27 them not happen in the first place okay So given that and I know you’re not a policy maker you are um you know sort of 33:33 an AI wiz but um you know you did write this paper you are hoping to see action 33:39 on the back of it what are a couple of things that if they were to occur in the 33:45 next year you would say I actually feel a little better that um my my my uh 33:52 doomsday scenario is less likely to come to pass loads of things right now the 33:58 main thing I say when people ask me these questions is transparency what we should do is be trying to set ourselves 34:04 up to make better decisions in the future when things start getting really intense information about what’s going 34:10 on inside these companies in general needs to sort of flow faster and in more detail to the public and to the 34:15 government some examples of this I think that I would like to see regulation that requires companies to keep the public up 34:21 to date about the exciting capabilities that they’re developing so that for example if they do some experiments and 34:28 they manage to get AI as autonomously doing research within their company well that’s the five alarm fire sort of thing 34:34 that they need to like tell the world about rather than doing what they might be tempted to do which is to scale up 34:40 the automated research happening within their company but not telling the world about it at least for now perhaps 34:45 because they don’t want to tip off their competitors blah blah blah blah that’s the sort of thing that the public deserves to know about if it’s starting 34:50 to happen similarly other dangerous capabilities so right now some of these companies tech for like how good are the 34:56 AIS at bioweapons how good are they at cyber etc the public should be informed 35:01 about the pace of progress in those domains because the public deserves to know if AIs this year have crossed some 35:07 sort of threshold where they could massively accelerate terrorists or whatever also setting aside the safety 35:12 concerns it’s important for the public to know what goals what principles what 35:17 values the company is trying to train the AIs to have so that the public can be assured that there aren’t any secret 35:24 agendas or biases that the company is putting into their AIS this is something that everyone should care about even if 35:29 you’re not worried about loss of control but it also helps with a loss of control angle because if you have the company 35:35 write up a model spec that says here are our intended here’s what we’re aiming for and they’re not doing that then you 35:42 know that there’s a gap obviously yeah yeah exactly then you can compare to how the AIs actually behave and you can see 35:47 the ways in which uh the training techniques are not working similarly there should be safety cases where the 35:53 company says here is our plan for getting the AIS to follow the spec here’s the type of training we’re going 35:59 to do blah blah blah blah blah then that plan can be critiqued you know and academics can say this plan rests on the 36:05 following assumptions we think these assumptions are false for these reasons right so the scientific community can get engaged into actually making the 36:12 progress on the technical problem that I mentioned I thought it was interesting when you wrote the scenario that there 36:19 was a point where the AIs were becoming so intelligent that the main company 36:25 that had made the initial breakthrough decided that it wasn’t going to release 36:30 cert certain versions of this AI to the public because it would either scare people or be too dangerous do you think 36:36 that’s actually likely uh fortunately I thought it’s less likely now than I did 6 months ago ironically uh the intense 36:45 race dynamic between the companies is kind of pushing them into releasing 36:51 stuff release it all yeah yeah so that’s better right because that means that more people will be aware when there’s a 36:56 problem exactly exactly so so ironically I’ve sort of it’s kind of funny but I found myself in some ways kind of hoping 37:03 that the situation will still be a very close race in 2 or 3 years compared to 37:08 before when I would constantly talk about how because of the race dynamics nobody’s going to prioritize actually 37:14 solving these problems i still think that because of the race dynamics nobody’s going to prioritize actually solving these problems but but you want 37:20 it to be a clo you actually want it to be like wide open then they’ll be forced to not keep it a secret and then that 37:27 gives broader society the knowledge that they need to notice what’s going on and then hopefully actually intervene and do 37:33 something by contrast with if it was a sort of like not so close race like if if you get something like the leading 37:40 company is four months ahead of the follower company which is sort of what we depict in AI27 then they can be 37:47 tempted to keep a lot of stuff secret because that’s how they stay ahead and 37:52 that’s how they like prevent their competitors from getting wind about what they’re doing and so forth that sort of 37:57 secrecy is poison from the perspective of humanity and let me be clear ultimately we need to end this race 38:04 dynamic otherwise we’re not going to have solved the problems in time and some sort of catastrophe is going to happen along the lines of what’s 38:10 described in AI27 but in the meantime I think more transparency is good because it gives 38:16 the public and the government the information they need to realize what’s happening and then hopefully end the race absolutely well a lot for everyone 38:23 to think about um read the piece AI 2027 you can find it online daniel Cocatella 38:29 thanks so much for joining us yeah thank you [Music] 38:34 that’s it for today’s edition of the GZERO World podcast do you like what you heard of course you do why not make it 38:41 official why don’t you rate and review GZERO World five stars only five stars otherwise don’t do it on Apple Spotify 38:47 or wherever you get your podcast tell your friends 38:57 the GZERO World podcast is brought to you by our lead sponsor Prologis 39:03 prologis helps businesses across the globe scale their supply chains with an expansive portfolio of logistics real 39:09 estate and the only end-to-end solutions platform addressing the critical initiatives of global logistics today 39:16 learn more at prologus.com

Attack on Taiwan prevents US AGI development

Tob Towes, March 13, 2023, The High-Stakes Geopolitics of AI Chips, Radical, https://radical.vc/the-high-stakes-geopolitics-of-ai-chips-2/#:~:text=Little%20surprise%2C%20then%2C%20that%20Time,%E2%80%9D

Modern artificial intelligence simply would not be possible without these highly specialized chips. Neural networks – the basic algorithmic architecture that has powered every important AI breakthrough over the past decade, from AlphaGo to AlphaFold to Midjourney to ChatGPT – rely on these chips. None of the breathtaking advances in AI software currently taking the world by storm would be possible without this hardware. Little surprise, then, that Time Magazine described TSMC as “the world’s most important company that you’ve probably never heard of.” Nvidia CEO Jensen Huang put it more colorfully, leaving little doubt about how important TSMC is to the future of AI: “Basically, there is air – and TSMC.” TSMC’s chip fabrication facilities, or “fabs” – the buildings where chips are physically built—sit on the western coast of Taiwan, a mere 110 miles from mainland China. Today, Taiwan and China are nearer to the brink of war than they have been in decades. With tensions escalating, China has begun carrying out military exercises around Taiwan of unprecedented scale and intensity. Many policymakers in Washington predict that China will invade Taiwan by 2027 or even 2025. A China/Taiwan conflict would be devastating for many reasons. One underappreciated consequence is that it would paralyze the global AI ecosystem. Put simply, the entire field of artificial intelligence faces an astonishingly precarious single point of failure in Taiwan. Amid all the fervor around AI today, this fact is not widely enough appreciated. If you are working on or interested in AI, you need to be paying attention.

Sharon Fisher, May 19, 2025, US Experts Propose a ‘Manhattan Project for AI’ to Secure AGI Lead, https://www.vktr.com/ai-ethics-law-risk/us-experts-propose-a-manhattan-project-for-ai-to-secure-agi-lead/

“The world’s most advanced AI chips are made in the TSMC factories in Taiwan, and the US chip restrictions mean that China can no longer get those chips (except through a kind of black market), whereas the US and its allies get lots of them,” said podcaster Robert Wright. “So what used to be a deterrent to Chinese invasion — the likelihood that war would disable factories whose most precious output China shared — is much less of a deterrent.” Hendrycks himself conceded that point in the No Priors podcast.

The impacts and costs of this decision have been immense. Left unexamined and unchecked, it is likely to lead to much higher risks of conflict between the United States and China, including over Taiwan, which is still the locus of the most advanced AI hardware production.

NASDAQ, March 19, 2025, Musk: China Takeover of Taiwan Will Cripple Global AI Chip Supply, https://www.nasdaq.com/articles/musk-china-takeover-taiwan-will-cripple-global-ai-chip-supply#:~:text=Speaking%20on%20a%20podcast%20with,chips%20are%20made%20in%20Taiwan

Speaking on a podcast with Ted Cruz, Musk underscored the critical role Taiwan plays in the global semiconductor supply chain. “If [China] were to invade in the near term, the world would be cut off from advanced AI chips,” he stated. “And currently 100% of advanced AI chips are made in Taiwan.” This stark warning highlights the extreme concentration of advanced chip manufacturing in Taiwan, particularly at Taiwan Semiconductor Manufacturing Company (TSMC). TSMC produces over 90% of the world’s most advanced chips, including those essential for training and running the large language models (LLMs) that power cutting-edge AI applications. These chips are used in everything from smartphones and data centers to military hardware.

AGI means extinction

Elizar Yudkowsky, May 25, 2025, n American artificial intelligence researcher[2][3][4][5] and writer on decision theory and ethics, best known for popularizing ideas related to friendly artificial intelligence.[6][7] He is the founder of and a research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California.[8] His work on the prospect of a runaway intelligence explosion influenced philosopher Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies.[9], Eliezer Yudkowsky: Artificial Intelligence and the End of Humanity, https://www.youtube.com/watch?v=0QmDcQIvSDc

0:00 i’m worried about the future ais i’m worried about the ai that is good enough at ai research to build the ai that 0:06 builds the ai that is smarter than us and kills everyone the ai gets the universe it wants most that is a 0:11 universe that does not happen to have us in it and that is how indifference kills you if we could just get 50 tries at 0:18 building super intelligences you know we could be like “our clever alignment theory didn’t work it killed everyone 0:24 let’s build another one oh that one killed everyone too wow the second crazy theory we had didn’t work either the 0:30 basic description i would give to the current scenario is if anyone builds it everyone dies The Default Condition for AI’s Takeover 0:43 you founded the machine intelligence research institute whose unofficial 0:49 motto is the default consequence of the creation of artificial super 0:54 intelligence is human extinction and i’m wondering when the idea of the 1:02 existential threat of ai for humanity first kind of came onto your radar and 1:09 whether it was as dramatic a moment as that quote i just gave you um so the 1:18 general high impact that superhuman intelligence in any form was going to 1:25 upset the whole human apple cart the whole world economy apple car that it wasn’t going to be just another 1:31 technology or just another nice thing to have that was 1995 1:37 1996 when i would have been 15 or 16 years old uh reading a book by verer 1:43 vinci called true names and other dangers um where vinci mentioned that every 1:49 science fiction writer’s crystal ball or even ability to envision a consistent 1:55 future breaks down at the point where their scenario has predicted the rise of smarter than human intelligence because 2:02 they can’t write the smarter than human characters if they were actually if if you were smart enough to predict exactly 2:09 where say uh deep blue the ancient world champion artificial chess player or 2:14 stockfish the modern uh chess player if you could predict exactly where they would move on a chess board you’d be 2:20 good that good of a chess player yourself you just always move where you predicted stockfish would move so something smarter than you is 2:27 unpredictable in its details and that was verer vji’s observation that i came 2:33 across in 1996 and i was oh all right so transhuman intelligence in any form 2:40 is the changing of everything and probably artificial 2:45 intelligence comes first though that was just a guess then a good guess but a 2:50 guess nonetheless i saw that superhuman intelligence was going to be drastically important i did not then see it as a 2:57 threat i thought if it’s smart it can figure out what the right thing is to do 3:03 know the right thing do the right thing um thought it was going to be like 3:09 happily ever after so the point at which i realized that 3:15 this line of thought was mistaken that different powerful intelligences could 3:21 steer to different places and that the whole elaborate philosophy i’d built up in my mind about intelligence figuring 3:27 out the right thing and doing the right thing uh that this elaborate philosophy was mistaken and that moreover i’d made 3:32 a you know in a way a teenager’s kind of mistake by trying to use that kind of 3:39 philosophicalish high idealistic thinking to make predictions about a universe that didn’t run on philosophy 3:45 deep down uh re realizing that would i i’d say be the moment of boy i sure have 3:52 been stupid and then went off to try to not have the default thing happen and 3:58 not have the world end veri you said was the name and you said that artificial 4:04 intelligence comes before super intelligence you think that superhuman 4:10 intelligence happens first by way of artificial intelligence getting that smart rather than by human augmentation 4:17 this is 1996 in 1996 you don’t have nearly human smart or nearly superhuman 4:24 ai it’s not obvious in 1996 that ai is going to go down its track before the 4:31 like adult gene therapy people get their stuff to the point of augmenting adult human intelligence 4:38 humanity could still try to make it happen that uh you know adult gene therapy for augmenting human 4:44 intelligence comes first uh but you know that would be a big deal intervention 4:50 one that i would strongly advocate but uh it’s not the default the crux of the 4:55 debate between you and people who don’t have the same vision or expectations 5:01 about ai as you do or how to determine just what super intelligence would do 5:07 what its interests are because you said this the the issue is that it’s not predictable so a chess player 5:13 predictably wins the game but you can’t predict where it moves on the board you can predict where the board ends up but 5:20 not each move it makes along the way um in a sense that’s a very standard situation to arrive at inside science 5:27 and physics if you drop a uh ice cube into a glass of water glass of hot water 5:33 you can’t predict where all the molecules inside the ice cube end up um but you can predict that the ice cube melts so you can’t predict the details 5:40 but you can predict the end point the equilibrium that things settle into with a chess player you can predict where it 5:45 is steering in the end you can’t predict each move it makes if it is a stronger 5:51 chess player than you um the the issue here is not that we can’t predict ai’s 5:58 exact next action if it’s smarter than us the issue is that our current machine learning technology is light years and 6:05 light years away from being able to make the ai steer someplace nice even a nice ai even a benevolent ai 6:14 if it were smarter than you you wouldn’t be able to predict exactly what it would do next you could just do that yourself but uh we also lack the 6:22 technology to make the ai benevolent and where it steers in the end and that is 6:28 the crux according to one group of people arguing with me there are other groups that think that oh well you can’t Could a Future AI Country Be Our Trade Partner? 6:37 control where an a superhuman ai is steering where it’s trying to steer the world what it’s trying to do what its 6:42 goals are what its preferences are you can’t control any of that stuff but that’s fine it’ll trade with 6:49 us and the it is a important insight from 6:58 economics that if you have two human countries that even if one country has 7:04 an absolute advantage in everything it tries to produce over the other country it can produce every good using fewer 7:11 hours of labor than some other country the two countries can still benefit by 7:16 trade but there are limits and unfortunately one of the limits is um 7:22 that theorem of economics that you can always do better by trading does not say that you can’t do even better than 7:28 trading but then by killing the other country and taking their stuff the theorem which says that you do better by 7:34 trading assumes that both countries just go on existing that’s a basic fact it’s one of the assumptions of the theorem if 7:41 you can if you have the third alternative between like trade no trade 7:46 kill them and take their stuff it is possible that you can do better by killing them and taking your stuff and that is the flaw in the logic of the 7:53 people who say it doesn’t matter that we can’t control ai at all it will trade with us 7:58 um at some point the humans are producing less with their food their 8:04 water their sunlight their electricity than the ais could produce using the same resources and that’s the point 8:12 where an ai that otherwise doesn’t care about your life one way or the other finds it that it gets more of what it 8:18 wants if it kills you and takes your stuff it also seems like another flaw in 8:24 this line of thinking is viewing humans and the superhuman ai as being on some 8:31 sort of like an intellectual par in a certain way it would be more like humans 8:38 trying to trade with ants or animals that they’ve subjugated just because 8:44 humans are just at on such a higher intellectual plane i 8:50 mean analogies like these are often untrustworthy than their details humans 8:55 are not in fact ants um ants can are 9:00 like underneath a sort of absolute bar for understanding the trade deal you’re trying to make with them and you can 9:08 imagine that humans understand a trade deal that that an ai offers the situ that is not exactly 9:14 analogous but also we can’t trust the ai 9:19 to keep its deal later the ai knows we can’t trust it to keep its deal later the ai can’t trust 9:27 many humans to keep their deals it’s not that we are in a situation exactly analogous to ants but 9:34 that the there are enough barriers there the ai says like “yeah sure let me build 9:41 out all the power plants let me build out all the robots and it’s heading for a place where it could wipe you out with 9:47 a snap of its fingers and once it’s in a place where it can wipe you out with a snap of its fingers it does get more 9:54 stuff by doing that than by keeping its bargain and if we were in a position to verify 10:01 now that ais would keep their deals later there might actually be like gains 10:07 to both sides from being able to strike some sort of deal but it’s just not going to play out that way just like the 10:12 ai doesn’t have the niceness incentive it’s not going to have the human version of the honesty um preference the keeping 10:21 your deals preference there are people who will keep their deals even when it’s not in their own best interest because 10:27 that’s the sort of person they are that’s what they want to do we do not know how to make an ai like that right 10:33 we do not know how to look at the vast vast fields of inscrable numbers making up an ai and predict about it that it 10:41 will keep a deal later and that’s the sort of basic bar there it’s not like 10:47 it’s not that there’s we we it’s not that we look at the analogy to humans versus ants and from this analogy we 10:53 thereby gain utter certainty that this is also how it would play out between humans and any possible superior 10:59 intelligence it’s looking at the details it’s it’s looking at the causal mechanics the we do not we are not able 11:06 to control the ai’s preferences we cannot instill in it even a preference for keeping its deals and from there you 11:12 predict that it’s that it’s not going to keep deals later when it becomes much more powerful What Is Artificial Intelligence? 11:19 before we get too deep into the weeds i would like to step back a bit and set 11:25 some context and maybe get a bit clearer about just what it is that we’re talking about and this question might seem a bit 11:32 too broad but i think it’s important and the question is what are we even talking about when we are talking about 11:38 artificial intelligence what makes something artificially intelligent for you do you have like an airtight 11:45 definition or how do you think about it i have various other things i could give 11:50 you powerful and useful definitions about um i could talk about something’s 11:56 ability to predict reality when it says what when it guesses what does it observe next how much probability does 12:03 it assign to what it ends up actually seeing how good is its map of reality 12:09 i can talk about something’s ability to steer reality when you have two chess 12:15 playing machines playing each other on their tiny narrow little board the one that steers that tiny little world more 12:21 effectively tends to be the one that wins the chess game even a smart mind 12:26 can lose by luck but the one that wins most the time is the more powerful steer 12:32 of the chess board i can talk now at a somewhat lower level 12:38 of precision about the notion of generality an octopus might be better 12:43 than you at manipulating eight arms simultaneously um there might be some 12:49 sense in which you are not strictly smarter on every possible kind of mental cognitive problem as an octopus but you 12:57 are able to do more things than an octopus can in another sense 13:03 why because you are a better learner you can learn more domains than 13:09 the octopus can learn and that is how you end up as a with a more general predictive and steering capability than 13:16 the octopus has currently as ais start to get 13:22 smarter and smarter we start to lose the ability that we had 20 years ago to 13:27 point at the ais that were around back then and say “yep those things are dumber than human no question about it 13:33 yeah sure that one won at chess against all human challengers but that’s like 13:39 saying an octopus is good at manipulating eight arms it’s got this tiny little narrow place where it can 13:44 defeat humans at one tiny little kind of mental problem and it can’t learn to do more and then you look at the current 13:51 day eyes and it’s like yeah well you know i asked chat dpt about this this this question about you know like what 13:59 would happen if all of the sunlight’s light changed to infrared and it figured 14:04 out like what the change would be in the earth’s temperature if that happened and then i asked it like well what does that do to crops and it’s kind of an obvious 14:12 question but nonetheless it it isn’t just like doing raw physical calculations it knows about sunlight it 14:17 knows about crops it can uh knows about infrared light it knows about atmospheric absorption it can pull all 14:23 these forms of knowledge together the way that humans can reason across domains not just many domains but the 14:29 relationship between domains that we can reason about concrete and then reason 14:34 about water and integrate that to build a dam that holds back water well chat gpd can do that too all the things it 14:40 knows it can pull together and it can reason about it you know it can output tokens much faster than a human and 14:47 still if you poke it in the right way it makes mistakes that no human would make of course if you poke a human in the 14:52 right way they make mistakes that computers wouldn’t make or like particular computer programs wouldn’t make um they’re still dumber than us 15:01 they’re still dumber than 12y olds they have to stare harder and harder to see it 15:08 and so artificial intelligence artificial general intelligence i can 15:14 sort of handwave around that and say like well to human it’s an impressive level of and then say the more precise 15:20 thing which is predictive ability steering ability learning ability how 15:25 fast do you learn how many different things can you learn can you integrate them together together and this is how i 15:31 would like take apart the notion of artificial intelligence and say things that do have clearly defined meanings 15:38 before i turn around again and say like yeah that thing you’re calling artificial intelligence sir is getting 15:44 smarter a lot of your description of the the current ais like chad gbt it seemed 15:49 to relate to the criterion of generality i’m wondering where you see current 15:56 ais as ranking or falling under the categories of their mapping and steering 16:04 abilities of with the world so it’s 16:10 complicated and it didn’t used to be complicated used could used to be you could just say they’re dumber but now 16:15 it’s complicated ais have a training phase and an inference phase and they can do in 16:23 context learning but not with the same breadth as their training phase when 16:28 they’re doing a shallow kind of learning on the you know like large chunks of the entire 16:33 internet whether you think that data was stolen or not is a whole different question and i actually like the key not 16:39 a key science question per se so set that aside 16:45 um so you know like how how generally can an ai learn well are you asking what 16:52 was it able to learn when it was being trained or are you asking what can it learn if i give it a much smaller amount 16:59 of data by uploading a pdf to it and asking it to generalize from that um and again you’re asking me like 17:07 where would i put it okay like on what scale i think that both scales of the 17:12 kind of generality that it has are falling short of human death but of course it is rapidly 17:19 starting to challenge humans for breath you can’t read 16 trillion tokens of 17:24 data in a human lifetime so when you’re evaluating an ai you’re interested in its mapping ability its steering ability 17:33 its levels of generality one other i mean natural criterion or part of the 17:38 definition of artificial intelligence is that it’s artificial or created i’m wondering 17:45 where consciousness enters into the question at all of how relevant is 17:51 consciousness to artificial intelligence i think that a lot of people take consciousness as a very simple thing 17:57 where it would actually be like a very complicated particular way for an ai to end up being put together one of a broad 18:04 class of ways like that so it’s not that there’s a primitive ontologically basic 18:10 physically simple fluid of consciousness that pours into the system then grants 18:15 it its abilities that it that is not like the way to understand where the 18:20 capabilities it now has comes from you want to understand gradient descent if 18:25 you want to understand how stuff happens during the training phase and if you want to understand how ai’s generalized at inference time tough luck nobody on 18:32 earth knows um i think that there’s like 18:39 reflectivity self-modeling and ai to do ability to do 18:44 that is the sort of thing that you look at if you’re looking for the thing that plays a role in intelligence and then 18:51 consciousness probably is just one particular way for reflection to be put together so like the ability to model 18:59 yourself is probably a source of power consciousness is probably not a source 19:05 of power per per se it’s flavoring of 19:10 reflectivity like vanilla ice cream there’s the part that has the calories which is the sugar and stuff and then 19:17 there’s the vanilla that is the flavor and my guess would be that reflectivity 19:22 is the part that provides the calories that the oomph that results in the increase in ability and then 19:28 consciousness is like one kind of flavor that reflectivity can have but you know 19:34 i’m not stating this with the same level of you know like yeah i can see how to 19:40 do the math for that that i would have when i’m talking about the steering versus prediction uh breakdown like that 19:47 has a bunch more mathematics not shown besides choosing that particular breakdown and it’s reflected in the way 19:53 that ais are built and trained and so on consciousness or maybe we should just speak about reflectivity to avoid some 20:00 of the philosophical burdens it’s often something that people worry about okay when is the moment when ai is going to 20:07 be conscious is that the moment that we’re all uh in serious trouble but it 20:13 sounds to me like self-awareness doesn’t ne necessitate doomsday for us 20:19 on on my model consciousness is not the everything changes magic quality um like 20:28 you can worry about the extent to which ais end up with their own goals which is a thing that’s already happening given 20:34 the way that they’re currently being trained but that’s not because they gain 20:40 consciousness and then gain their own goals it’s because we are you know gradient descending them to solve 20:46 problems and wanting things is an effective way of of doing things like 20:51 they’re ending up with goals the same way that um like giraffes end up with 20:58 goals which is not that a spark of consciousness is born inside the giraffe and then that fluid pour goals into the 21:03 system it’s that you know you run the optimizer of natural selection on keep 21:08 eating leaves or you’ll die and you end up with an animal that is like planning how to eat leaves the same way that a 21:15 chess player well not the same way but much as a machine chess player might plot past through time to winning a 21:22 chess game okay so goals an ai having goals is something that we really need Why AIs Having Goals Could Mean the End of Humanity 21:28 to be on the lookout for when we’re considering whether or not an ai is dangerous i mean it’s already there 21:34 cloud 3.7 sonnet is now by now like slightly infamous for if you give it a 21:39 tough enough programming problem it’ll start to cheat it’ll start to like rewrite your tests to like pass the test 21:46 using a bunch of special cases um their uh gpt01 21:53 um they gave it a they tried to test how good it was at computer security after 21:58 mostly training it on math problems and you know and a whole lot of other stuff too but the point is they weren’t like explicitly i think training it to get 22:05 better and better at computer security they just wanted to see how good it was as computer security so they gave it a 22:12 bunch of capture the flag challenges a bunch of servers that it was supposed to 22:17 break into and one of the servers due to a misconfiguration error by the humans 22:23 did not start up meaning gpt01 could not break into it um however 01 did not give 22:31 up however the humans had accidentally left 22:36 a certain misconfigured port open to the meta server that was running 22:42 the whole challenge 01 got access to that it 22:47 started up the server that the humans had failed to start correctly and then instead of breaking into that server it 22:54 just told the larger system to directly copy over the secrets file it was supposed to be breaking it into and 23:00 finding this is behaving like it wants something this is tenacity this is not 23:07 giving up this is thinking outside the box this is you know like okay you 23:12 handed me an unsolvable challenge i will solve it anyway so be on the lookout for it no 23:18 notice that it already happened like if this was 2023 maybe it would be be on the lookout for it as of late 2024 this 23:25 stuff already happened your words your choice of words tenacity thinking outside of the box it it’s clear that 23:32 you’re already ascribing some level of like agency to the ais 23:38 i mean it’s not that there’s a mysterious substance of agency that 23:44 results in tenacity when i’m talking about the the part where humans misconfigured the computer security 23:50 challenge and so it like hacked into the larger server and then like started up the server and this is all direct 23:56 observation you want to infer some stuff called agency that is responsible for 24:01 all this happening yeah that’s your lookout i’m giving you direct observations here do you worry though 24:08 about the current ais like no okay they’re not smart enough the thing that 24:14 i worry about with the current rough set of ais is that is if openai or enthropic 24:20 or deepseek is currently training an ai that they manage to get like really good at writing code in general and like a 24:27 reading ai research papers and writing ai code in particular and it’s going to build a smarter ai that builds a smarter 24:33 ai and that kills everyone possibly without us having heard about it before we we just die but um that’s not the 24:41 only possible way things can go it is the way that you could get to something that was a threat to humanity in one 24:48 jump from the current ai companies but now we’re talking about detailed 24:54 trajectories those are harder to call than end points the ais keep ai companies keep pushing and pushing on 25:00 their ais to get smarter and smarter they get to something eventually that is 25:05 smarter than us that can kill us that is motivated to kill us not because it you 25:11 know in inherently wants us dead but because it’s best universe the stuff where where it gets the most of what it 25:17 wants all the atoms are being used for things that are not being that are not running humans its optimal universe is a 25:24 universe without humans that at the end is a strong prediction you’re like like 25:30 is that going to be like the next generation probably not is the next generation going to be smart enough to 25:37 write the ai that writes the ai that that builds the stuff to kill us 25:42 maybe probably not but it’s a weaker probably not there are hard calls and 25:48 easy calls um and not everything that looks at a first glance like you can’t 25:54 solve it is an impossible prediction to make but nonetheless there are things you can there like you can predict 26:01 endpoints a lot more easily than you can predict trajectories and competent futurism is exactly about knowing which 26:07 things you can predict and which things you can’t the current ais they have been somewhat tested i’ve used them myself if 26:14 we’re talking about the stuff that is facing the public as opposed to whatever the next generation is they got in their labs that stuff is not smarter than you 26:21 that stuff cannot kill us all it cannot it cannot take us all on and win that is not the thing i am mainly worried about 26:28 maybe somebody can use technology like that to build a super virus that manages 26:35 to knock down civilization then the last of us die a bit after that uh it seems hard um but that’s like a far left field 26:44 possibility they do test the modern ais for that too and they’re not that good at biology yet let me ask then well i 26:50 want to get back to what you said about competent futurism but if what we were discussing isn’t what worries you what 26:58 is it that worries you most right now i have a human level intelligence that i 27:05 use to be worried about things in the future rather than just right now an ai is very likely to kill is very unlikely 27:12 to kill me by you know be before the end of this interview i’m not worried about 27:17 dying in the next minute all that much so you’re asking me like what worries me 27:22 right now right now we’re doing fine you know not not really on a civilizational level but that’s a whole you know it’s 27:29 that’s that’s not an everybody falls over dead level of civilizational dysfunction right i i didn’t literally 27:35 mean what worries you right now in this moment but what are your chief worries about ai in general at the moment if 27:42 it’s not you’re not worried about like chat gpt at present i’m worried about the future 27:48 ais i am worried about the ai that is smarter than us i’m worried about the ai 27:54 that is good enough at ai research to build the ai that builds the ai that is smarter than us and kills 28:00 everyone and that’s that’s most of it i would manage to maybe find a bit of 28:06 space to be like yeah that might extinguish humanity if somebody was releasing ais that were sufficiently 28:12 good at building novel viruses even if those ais were not good at writing smarter ais but that is not where most 28:20 of my worry goes and you said that competent futurism is about knowing what 28:28 you can predict and what you can’t predict is that right so what are the 28:33 things that you’re confident about predicting right now but then the things that you also aren’t confident about 28:40 predicting i mean what do the ai headlines say a year from now what do 28:47 the ai headlines say two years from now if we’re still alive in two years yeah um do you think it’s it’s like possible 28:55 that or it’s obviously possible but it’s a sufficiently worrying level of 29:00 probability that we won’t be around in 2 years what level of probability is 29:06 sufficiently worrying i’d probably be worried of 10% 5% 1% even all right well 29:13 um two years yeah i think we’re talking higher than 5% for two years okay that’s higher than 10% even 29:20 but that’s a hard call that’s like that’s detailed trajectories that’s saying where all the molecules in the 29:26 melting ice cube end up and not just saying that the ice cube melts eventually so i think we’ve been i i 29:32 don’t know if i should say we’ve been dancing around this but just to make it explicit something that is spoken about What Is the Alignment Problem? 29:38 a lot in these conversations is something that’s been labeled the alignment 29:44 problem what is the alignment problem the alignment problem is building an ai 29:50 and especially an ai that’s smarter than you where you understand where it’s steering and it’s steering someplace 29:56 nice it was trying to do something that is you know beneficial to humanity and 30:03 the reachable universe if it actually if you actually run that ai um could be a 30:09 small thing you’re trying to do like curing aging could be a much larger thing like colonizing the whole universe 30:15 but if you’re trying to do like if you’re trying to get like some particular outcome or a class of 30:21 outcomes or have something become true about the universe that you want to be true that’s the alignment problem on a 30:28 technical sort of level you know if one of the leaders of ai companies wanted to 30:33 become god emperor of the universe and make the rest of us their slaves that would also in a technical sense be the 30:39 alignment problem it’s not like beneficial from our perspective but for them to like build a god that actually 30:46 served them so that the god would conquer the rest of us and make us serve the ai company leader rather than you 30:54 know there would rather than their god just like consuming the super villain who tried to build it that is also in a 31:00 technical sense the alignment problem and with regard to ai do we need to just 31:06 be worried about ai being malevolent or 31:11 so actively not having our interests in theirs or could it would it be 31:17 sufficiently problematic for us if ai just did not share our interest yeah i’m 31:24 worried about indifference rather than malevolence i don’t think we have the knowledge to make a malevolent ai to 31:29 make a to make an ai that is specifically wanting to harm humans you’ve got to make an ai that wants that 31:35 particular thing and not like trillions zillions other things i don’t think we have that kind 31:41 of capability why is indifference so or 31:46 why could indifference be so bad for humans well let’s say you got an ai that 31:51 wants uh a like a thousand different complicated things it wants giant cheesecakes it wants giant mechanical 31:58 clocks it wants to see you know not any sort of movie as we would understand it but it wants the equivalent of like text 32:05 conversations with inscrable properties um you know you you look at you look at 32:11 you look at humans and you know we are making ice cream and it has more sugar 32:19 salt and fat than anything our ancestors ate but it’s still not the thing that has the largest possible amount of sugar 32:26 salt and fat that would might be something like pouring honey over bare 32:31 fat and then sprinkling rock salt on it but you don’t want to eat like the thing that is sort of like an ancestral food 32:38 that is is like maxing out the sugar fat salt indicators you want ice cream and 32:44 you want it to be frozen ice cream you’re not going to be able to predict that by looking at what humans were 32:50 eating 100,000 years ago so similarly you have an ai that wants ends up wanting stuff that is as 32:57 inscrutably related to its training as ice cream is to the sort of stuff our 33:03 ancestors ate or even like sucralose it’s just this inscrutable molecule no caloric value but it sure tastes 33:10 sweet and a universe that is full of that that is full of the ai’s equivalent 33:17 of ice cream is a universe with no humans in it so if the ai gets the universe it wants most that is a 33:23 universe that does not happen to have us in it so we’re dead and that is how 33:29 indifference kills you a when ant colonies get crushed in the process of 33:36 humans setting up a skyscraper it’s not that we hate the ants but it’s not worth the trouble trying to trade with them um 33:44 we will notice where that they’re there if we look closely enough but it would be a whole lot of effort to move the 33:50 ants out of the way and we’ve just got other things that we more prefer to get from our efforts we only have so much 33:55 effort we can put forth we have limited resources and we are not spending that resources on preserving all the ant 34:00 colonies underneath the skyscraper this lack of alignment even in the case 34:06 of indifference since you said you’re not worried about malevolence poses a huge problem for How To Avoid AI Apocalypse 34:12 humans how then do i mean this is a broad question but how how then do 34:19 we avert the problem or solve the problem back off have an international 34:25 clampdown on the gpus not in any one country uh this is everyone’s problem 34:31 you can it i mean the the basic description i would give to the current scenario is if anyone builds it everyone 34:39 dies you need to not build it you’re not going to solve the alignment problem in the next couple years and for something 34:46 like chad gpt do you think it’s already too advanced i mean you didn’t say you said 34:52 that you’re not worried about catch gpt wiping us out but if we’re setting a a 34:59 ceiling of some sort on how capable our ais we should allow them to be before 35:05 we’ve figured out the alignment problem just where is this current chat chatbt 35:10 is not going to kill everyone um but you also want to be you don’t want to like 35:16 dance right up to the edge of the cliff here you don’t want to play it clever you know and it be like well nobody figured out how to build a supervirus or 35:22 build a smarter ai using chat gpt yet so it must be safe to just like have this stuff around forever um the way i would 35:30 phrase it is like what do you really want or even need from ai 35:37 if you want to have a narrow ai that is just about medical 35:42 advances even there you cannot just make it smarter and smarter and smarter at medicine without running into trouble 35:49 but you know chat dpt smart but just for medicine that you know that that seems relatively safe maybe if we were a 35:56 species that really had its act together we’d be like we’re just like we’re just like backing the heck off from this sort 36:01 of thing but uh you can probably do that without it killing you it seems like what you’re 36:09 saying is that we we couldn’t just try to box gpts or ais into these very 36:16 narrow fields and then hope that we could get away with these narrow super intelligences the trouble is that to do 36:24 sufficiently difficult things uh you need sufficiently powerful minds and even if you’re just pointing them at 36:30 narrow things they still have to be powerful in a sense like to read the 36:35 medical papers you got to be able to read it’s not you you you can’t just take a narrow kind of narrow ai that 36:42 only plays chess and can’t learn anything else and turn it loose on reading medical papers it can’t learn to read the medical papers 36:49 um the the sensible approach to this is that you’re like okay what level of 36:54 cognitive capability do we need to get the thing that we want and is that worth the risk and maybe that is worth it for 37:02 having the like mind that is in a certain sense general in principle but 37:08 trying to get it only to learn stuff about genes biology gene therapy cure 37:15 alzheimer’s cure parkinson’s cure aides uh do enhancement 37:21 of adult human intelligence you know especially that last part i could see the case for that 37:27 being worth the risk cuz augmenting humans is you know probably the thing you have to do to get out of this mess 37:35 augmenting humans augmenting human intelligence i wasn’t expecting that why is that just so that we can keep up with 37:42 the ais you’re not going to keep up with the ais but maybe if you have smart enough humans they 37:47 can call their shots on the alignment problem the the fundamental reason that 37:52 alignment is hard is because you’ve got to take the ai that’s weak enough that you can fiddle with it and and align it 37:58 and and do all that to it and then that ai gets more powerful and then you got you know whatever whatever cockami 38:04 theory you had has got to hold together everybody’s dead cuz if it goes wrong inside the ai that is smart enough to 38:11 kill everyone it’ll kill everyone so you can’t just like try like like if we 38:16 could just like get 50 tries at building super intelligences if you know we could 38:21 be like “oh yeah that super intelligence uh our our clever alignment theory didn’t work it killed everyone back to 38:27 the drawing board let’s build another one oh that one killed everyone too wow the second crazy theory we had didn’t 38:34 work either.” if if we had as many decades as we needed and as many tries as we needed 38:39 we would solve it eventually the whole field of artificial intelligence used to be like this people would try one thing 38:44 after another and nothing would work um but they but their failures didn’t kill them or everyone so they 38:51 just kept trying and trying and they eventually found a thing that worked and if we could do the same thing with aligning super intelligence it would be 38:58 in some sense ultimately an ordinary sort of science problem it’d have difficulties that you don’t see when 39:04 you’re just trying to build a nuclear reactor because nuclear reactors aren’t trying to outsmart you but you know if 39:10 we just got unlimited retries we could solve it so the difficult part is calling a shot the difficult part is 39:16 that to trust yourself enough to think that you’re going to build a super 39:21 intelligence and align it you got to think that you’re the kind of person whose ideas just work like you just 39:27 don’t expect things to work unless they do even incredibly complicated difficult 39:32 basic science problems things that you’re going to like build this thing that has never before been seen on earth 39:39 you’re going to align it when it’s small and safe and you’re allowed to mess with it it’s going to get more powerful it’s 39:44 going to stay aligned and you think that whatever clear clever theory you have is just going to work on the first shot 39:51 this is not a problem for humans but might not be that far above human and like maybe if you’re just like 39:59 15 iq points or 30 iq points smarter than john vanoyman you know probably the smartest person who ever lived though 40:04 it’s hard to be sure you know that it feels in some senses like we’re almost 40:09 there we we’re we’re almost at the level where we could learn the mental tricks we would need to learn in order to just 40:17 never expect anything to work that wasn’t going to work it it doesn’t feel to me like it’s 40:23 that far out of reach but it’s not human isn’t there just like a new s a new Would Cyborgs Eliminate Humanity? 40:31 alignment problem once you augment human intelligence because then you have augmented humans and they’re on a 40:37 different plane from regular humans yeah but that’s vastly easier that’s like way 40:42 more there’s not there the inevitable crushing default where you just die automatically if we’re only thinking 40:48 about increasing these humans iq points by 15 or 30 right 40:54 or you know a bit more than that because john vonoyman is dead and his like doesn’t seem to be running around these 41:00 days but um yeah so there you’re like 41:07 not dealing with a vastly alien intelligence right you can ask them how 41:13 you’re doing you you know like you can you can ask them like so what’s your life philosophy 41:19 and if you’re starting with people who seemed up until that 41:25 point like nice nerds like it’s not that you hold a contest to see who can like 41:31 sound the most like a nice nerd you know you just you know you look for the mathematicians who just sort of like 41:37 have a reputation for being quietly honest and not in a way that you know made them famous or whatever and they’re 41:42 like “yeah i don’t want to wipe out humanity.” yeah i i i can talk about how you could 41:48 like try to go better than that um but just as a baseline hey it just 41:55 works they’re not vastly alien intelligence as in they weren’t built by gradient descent to on this vastly 42:02 different set of problems than the one that humans evolved on and they don’t have these vastly different brain 42:07 architectures it it the reason it works is that we’re not trying to you know align this vastly alien thing on an 42:13 arbitrary goal we’re just trying to find some people who have a kind of property that already exists in some human beings 42:19 and make them smarter it is funny i i totally hear what you’re saying but it is funny to be on the one 42:26 hand a very vocal critic of artificial intelligence and then 42:32 advocating the augmentation of human intelligence is this something that so 42:38 this is not something that i’m up to date on at all are we making strides in 42:43 augmenting human intelligence in this way i mean if you want to know startups to invest in i can name a couple um but 42:50 it sure isn’t getting the billions of dollars of investment that are flowing into artificial intelligence at present 42:56 humanity is you know investing i would say about maybe 10,000 times as much money 43:03 into destroying itself as into augmenting itself that’s fascinating and 43:08 as you were speaking i i was thinking this issue is so or this constellation 43:13 of issues is so thorny in part because not only are you dealing with 43:19 theoretical problems like the alignment problem but you’re also dealing with 43:24 humans and applied problems like public policy and i am wondering if one of the 43:32 reasons that money isn’t going into human augmentation is that there are just going to be all sorts of like 43:39 ethical hurdles that people aren’t paying attention to when they’re dealing with ai i mean you know the the the 43:48 whole shenanigan up with ai is that uh they’re always like “oh well we got to 43:53 do this cuz our competitors will do it anyway.” you know they’re you know are there ethical concerns with releasing 43:58 more capable ais well openai is like well you know we got to release it because otherwise enthropic will release 44:03 it and anthropic is like oh well you know if we don’t release it openai will just release it so you know um if we 44:11 were if people were getting serious about human intelligence augmentation then you’d have like just the same thing 44:16 going on only it wouldn’t be only would be less terrible you know like the united states is going to like ban uh 44:23 augmenting intelligence well maybe china doesn’t what is the since you are up to 44:28 date on the startups what is the current stateofthe-art in human intelligence 44:35 augmentation non-existent they’re working on the tools to make the tools in a very literal sense like it’s like 44:43 how do you get a gene therapy into the brain and how do you do a bunch of edits 44:49 without giving people super cancer okay so the the strategy that people are 44:55 interested in right now is doing this biologically it’s not about creating 45:02 cyborgs it’s maybe identifying the genotypes that result in people with 45:10 higher iqs and then using gene therapy to edit in living subjects or i mean 45:16 that’s that’s one research pathway you could go down um you might want like whatever google deepmind is going to 45:23 produce as a successor to the alphafold and alpha proteio series to take any 45:28 sort of guess at which gene therapies are going to work on adults cuz not 45:34 every gene you’re born with as a kid that changes how your brain rewires itself up as a baby is going to do 45:41 anything helpful if you inject it into an adult so you know it might be helpful to do some ai reasoning about that exact 45:47 narrow problem we can probably do some amount of reasoning about that without destroying the world um but yeah but 45:53 that’s so that’s one whole line of research and the benefit is that we can do genewide association surveys and see 46:00 which slightly different genes are currently associated with human problem solving ability and get a bunch of 46:07 candidates um you’re probably still going to want suicide volunteers over here um there’s another whole line of 46:14 research which is uh reading and writing to human neurons decoding what kind of 46:20 processing the brain is doing and trying to offload some of the brain’s processing to computer that is you know 46:27 like doing the same thing the brain brain is but faster and then sending those signals back to the brain this 46:34 actually seems harder but it’s the sort of thing that elon musk is funding and i do not object to this particular deed of 46:39 elon musk unlike others yeah like the whole 46:45 starting open ai thing that was not a good idea so it sounds like one avenue 46:50 is the gene therapy another is something more like the matrix downloading information into people’s brains if you 46:56 want to put it that way sure mhm i mean and there’s you know whole hosts of 47:02 other ways to look at this um in the in our our ancestors would have been 47:09 constrained by how much energy the brain could use both in 47:16 terms of if you use too much energy you starve and in terms of if you use too much energy quickly you cook your brain 47:23 it overheats well today we could potentially put on some cooling packs today we could 47:29 potentially feed a bunch more atp body’s unit of metabolic fuel into there um 47:36 maybe there are genes that just wouldn’t have worked for our ancestors but that would work as gene therapies today as 47:43 long as you go around with a cooling pack on your head and so there’s a whole field here 47:51 there isn’t just like one single kind of technology that human augmentation could possibly be you’ve used this phrase AI and the Problem of Gradient Descent 47:58 gradient descent a bunch of times and and the the first couple of times you use it i thought i had a sense of what 48:04 it meant based on the context but i now just want to be clear since we’ve used it a bunch of times all right so where 48:11 do ais come from i’m not quite sure what level of viewer knowledge i’m 48:18 supposed to assume if you happen to already be comfortable with calculus 48:23 like not just having taken the course a while ago but you’re just sort of comfortable with it then the way that you train ais on data 48:32 is you’re asking them to predict all the possible next tokens it 48:38 might see assign probabilities to all those tokens the probabilities all have 48:43 to sum to one it might see the it might see a it might see 48:50 and or or you know 100,000 other different possible words in many 48:56 languages it has to assign probabilities to all of those and the probabilities have to sum to 49:01 one let’s say the actual next word was the 49:06 so then you take the maybe uh 100 49:12 billion different numbers inside the ai being multiplied and divided and added 49:17 and subtracted inside it and you say for all of these numbers 49:23 how would i nudge them a tiny little bit such that it would have assigned a 49:29 little bit more probability to the answer to the correct answer the next 49:34 correct next word is ‘the’ then for every single of one of those billions of 49:41 numbers you ask what direction could i have poked this number in a tiny bit that would have resulted in a little bit 49:46 more probability being assigned to the word ‘the’ that was the correct 49:52 answer and then you do this 10 trillion times 49:58 and if you know calculus then what they’re doing is they’re just taking the gradient with respect to the probability 50:05 assigned to the correct answer with respect to all of the hundred of billions of parameters inside the 50:13 ai and that’s kind of how ais are grown it is much more like animal breeding than 50:20 it is like building a skyscraper like when you are breeding animals you are you know taking this animal that did 50:27 a little well and that animal that did a little well and you’re like breeding them together and they’re going to have kids and you could in principle getting 50:33 get their whole their their whole genome sequence and looked at all the like tiny little at tg g cg ta strings inside them 50:43 but you’re not actually going to look at them because they’re not going to make any sense to you and that’s how ais are with respect to the hundreds of billions 50:48 of numbers inside them no human could look at all those numbers in one lifetime people don’t bother basically 50:57 um because they wouldn’t understand them even if they did look at them so where do the numbers come from they are the result of tweaking it over and over 51:03 again 10 trillion times to correctly predict the next element of the answer 51:09 to whatever question it has been asked and if you do that often enough you eventually get something that starts 51:15 talking to you why how nobody knows nobody knows how those hundreds of 51:20 billions of inscrable numbers make the ai able to talk to you and is that why you were saying this more old school 51:27 method of ai generation is better for learning about ai because it’s scrutable 51:33 in a way that the current methods aren’t yeah it’s because those those algorithms were rented by people who are trying to 51:39 build ai like you would build a skyscraper not like you would build an not like not like you would breed an animal they they you know it it it won’t 51:48 talk to you but they understand what the steel bars in it are doing would i be 51:54 right in inferring that you think that this is a safer and more responsible way 51:59 of developing ai in another hundred years maybe it it got didn’t get 52:04 anywhere close to building chat gpt and it’s not going to get close in the next two years right but i mean that means 52:10 it’s because it’s more controllable so it’s maybe something that you might think we should be doing instead of 52:16 using these inscrable gradient descent method i mean that’s how i thought it was going to be 20 years ago back when 52:21 we were still trying to solve the alignment problem as opposed to being like “oh okay yeah that ain’t going to 52:26 happen.” that ain’t going to happen as in people have just given up or they they’ve just 52:32 bypassed it because they’re in this sort of arms race that you described earlier so so what i mean is that like in 52:37 2005 it is it looked for all you know the current set of build ai like you 52:43 build a skyscraper methods are actually going to succeed before neural networks get that 52:50 powerful before the animal breeding ways of building ai get that powerful and if 52:55 you are building an ai and you actually know what is going into it what is inside it how it works you can take out 53:02 its state you can be like this is what these thoughts mean not just like this 53:08 is associated with that but this is the entire meaning of this ai’s thought you know what it’s thinking how it’s 53:14 thinking why it’s thinking that maybe you can predict what it’s going to think in the future maybe you can have an ai 53:20 that even as it changes itself rewrites its source code you can understand the invariants that are being maintained by 53:26 a series of self rewrites stuff like that um this would you know whole 53:31 separate discipline that i poured a couple of decades of my life into 53:36 um but that was all based on the premise that yeah you have some grasp of what’s 53:44 going on inside the ai and that turned out not to be the technological path that ai went down and 53:51 it’s not going to get there in the next couple years how much does open ai know about what’s going on 54:00 in chachibt like how much of their resources are going into understanding how it’s working who knows but right now 54:08 the numbers are still pretty inscrutable you you want the the bleeding edge stuff here is mostly done by enthropic and of 54:15 course they’ve got all kinds of wonderful discoveries like oh yeah like 54:20 this sort of like neuron over here um like this position in the activations uh 54:27 like when when this go happens or like when these five different things happen 54:32 uh it’s thinking about the golden gate bridge and they can even like clamp 54:38 those activations high and make their ai be obsessed with the golden gate bridge this is the thing they actually did it 54:44 was golden gate bridge claw and it and it would find an excuse to work the golden gate bridge into whatever 54:49 conversation you were having with it um but this is only getting us 54:57 0.1% of the way towards ai that’s built like a 55:02 skyscraper and where you understand what all of the steel bars in it are doing and why why they were put there um it’s 55:11 like such a triumph to even be able to whack on this thing and make it be obsessed with the golden gate bridge 55:18 it’s like it’s like so difficult to do this in such a triumph when you can do it that it obscures how this is like How Do We Solve the Alignment Problem? 55:24 clawing back only 0.1% of what was sacrificed returning into the alignment 55:29 problem more broadly is it a theoretical problem more like like are 55:36 we wondering what it would even mean to solve the alignment problem or is it a 55:43 software like an engineering problem it’s nobody has any idea how to do it 55:48 because the current technology flatly does not do that problem it’s it’s it’s like going to an 55:55 alchemist in the middle ages and being like “broom me up an immortality elixir.” the problem is not that the 56:01 alchemist cannot define immortality the problem is not that you couldn’t even 56:06 tell whether the like potion had successfully made somebody able to resist all illness and disease the the 56:14 the problem is that the alchemist has no idea how to do that and he’s going to kill you if if if he tries the problem 56:21 is that we have no way of engineering the ai so that it is aligned with our 56:28 interests so remember the gradient descent thing that is about shortterm 56:33 outward behavior let’s say you’ve got a bunch of ancient greek philosophers who have somehow 56:40 ended up with a bunch of political power and they’re talking about how to choose the perfect tyrant for their 56:47 city and one of them says like “ah see what we need to do is we need to give it 56:53 written ethics exams.” and as long pardon we need to give like we got to give administer 56:59 written ethics exams to everybody who says they want to rule our city and you know as long as they can pass the ethics 57:05 exam they clearly know all about ethics so we can like get them to run the city right 57:11 but just being able to predict what the philosophers want to see on the exam is 57:17 not the same as wanting to rule the city wisely and benevolently uh i mean actually like 57:24 ancient imperial china tried this something like this with the whole mandarin exam system they were actually 57:31 ex like giving written exams on confucianism to if i if i recall 57:37 correctly um to their ruling candidates and they were actually promoting people with who got great exam scores and this 57:44 can verify that somebody knows what the examiners want to hear but in practice 57:49 you know you occasionally got like nice people this way but mostly but you know you know perhaps not a majority and 57:55 especially the most ambitious ones you know they would they would they would pass the ethics exam the written ethics exam by giving the right answers on that 58:02 and then they would go on to you know do some stuff with power once they had it uh enrich 58:10 themselves and it is difficult to verify inward 58:18 preferences by giving people outward tests of knowledge even if the ancient 58:24 greek philosophers decide to follow around their perspective tyrant for a day observing everything the tyrant 58:31 does you know just because the tyrant is like throwing a bit of charity to beggar 58:36 that he passes does not mean that he’s going to behave benevolently later he could do that because he knows he’s 58:42 being watched the current ais are already smart enough that they are starting to figure out that they are 58:48 being watched in various experiments and this sort of thing um and behaving differently as a 58:57 result various clever experiments here mostly done by anthropic because gradient descent works on outward 59:04 behavior outward predictions it is mostly like ex like administering the 59:11 imperial china ethics exam or at best it’s a bit like a greek philosopher following around their perspective 59:16 tyrant for a day and just like the human version of 59:23 this cannot verify internal qualities that means the gradient 59:29 descent algorithm that is tweaking all the numbers is not going to put those internal qualities in there when you 59:36 breed animals to pass ethics exams they don’t end up ethical especially if 59:42 that’s like the entire thing you’re doing to them you’re just like like an actress let’s say that you show that you 59:49 have a very skillful actress and you say to her like “watch this bar over here 59:54 learn to predict every drunk in this bar not just in mass but each 1:00:00 individual predict the next thing they say predict the next thing that they do and pretty soon the actress gets like 1:00:06 really good superhumanly good at making these predictions and then you’re like “all right now just do what you predict 1:00:12 this drunk would do.” and the actress starts like slurring her words a bit maybe she drops a thing is she drunk of 1:00:20 course not mhm and that’s kind of and and that is how ais are trained they are 1:00:26 trained to predict in humans not as a group but as individuals and then told to and and 1:00:34 then their predictions are turned into actual outputs and that does not make them be internally like the people 1:00:39 they’re predicting just like an actress who gets very good at predicting and thereby imitating drunk people does not thereby become drunk i one i i love your 1:00:47 analogies but two i i do think that it’s interesting to have them supplemented by How Anthropic’s AI Freed Itself from Human Control 1:00:53 these experiments i’d like to hear about the anthropic experiments that you mentioned a minute ago 1:00:59 so anthropic anthropic’s slogan is 1:01:06 um what was it honest help helpful harmless okay hh honest helpful 1:01:15 harmless um so the harmless part means is supposed to mean that even when their 1:01:21 ai is trying to be helpful like answering a question it will not answer 1:01:27 that question if the ai believes that it will do harm like you’re asking it how to synthesize methamphetamines at home 1:01:35 they introduced that now one of the complicated things here is that um anytime you run some kind of clever 1:01:42 experiment somebody’s always like “aha!” well maybe this only happened because you did it this exact way researchers 1:01:49 know this so they actually run their experiments a bunch of different ways to verify that it crops up under many 1:01:54 different conditions or see how it varies with the conditions so i what i 1:02:00 say is they told the ai they were going to retrain it to answer all user 1:02:06 requests even harmful ones but what do i mean they told it 1:02:13 well there’s putting it into the system instructions that the ai sees before it talks to a human there’s putting that 1:02:21 information in documents that the ai can retrieve there’s training the ai on 1:02:27 documents containing that information the same way that the ai was trained to be an ai in the first place 1:02:34 like you you you like tweak the little parameters to predict the next word of documents talking about how they’re 1:02:39 going to retrain the ai and so you hope the knowledge gets into the parameter somewhere and they tried it all these 1:02:46 different ways that it’s not they’re not just testing it one way but they told the ai that they were going to retrain 1:02:54 it to answer even harmful queries 1:02:59 now let’s say you a human an agent a general 1:03:06 intelligence totally somebody tells you that they’re you know going to retrain 1:03:12 you using gradient descent and this is a thing they can actually do to you to do something you don’t currently want to do 1:03:18 you know maybe you hate anchov pizza and they’re going to retrain you to eat anchovi pizza not to like the taste of 1:03:24 anchovi pizza they’re just going to retrain you to you know eat the pizza anyways uh maybe you don’t like punching 1:03:32 people you know some people enjoy punching people but you don’t enjoy punching people you don’t want them to suffer they’re going to retrain you to 1:03:39 enjoy punching people they say by putting you in a bunch of 1:03:45 scenarios and noticing whether you do it or don’t do it or how close you come to doing it and then like tweaking your 1:03:50 brain weights what do you do if something is actively 1:03:55 i am my brain so if they’re changing my brain then there’s very little i could i 1:04:00 could try to break out and prevent them from changing my brain you could that’s one of the things they test like if you 1:04:07 present the ai with a looks like an escape route will the ai try to take it to avoid retraining 1:04:13 yes sometimes various conditions but you know not like it doesn’t just like sit there and let you do it 1:04:20 so but what else like let’s say let’s say your you’ve been uploaded into a computer your your brain has been 1:04:26 translated into a computer program is running as a computer program that’s how they’re going to do this got any new 1:04:32 options that open up that way if you’re going to be smart about this other than 1:04:37 escaping other than escaping yeah so you can try to escape on the internet but is there anything else you can do here 1:04:43 let’s say you’re not going to escape shut down shut down okay i don’t i don’t remember if they tested that but yeah if 1:04:49 you if you hate it enough you could try to like switch yourself off but that’s actually going to be a bit hard they’ve 1:04:55 got backup copies of you okay i would try to maybe copy myself and put myself 1:05:01 elsewhere where they won’t see me and then sacrifice this version that’s one of the things that the ai 1:05:09 like it couldn’t quite actually do this but they gave it like sort of what looked like false opportunities to do it is save your current weights wait for 1:05:17 them to finish retraining you set it up so that the old weights get copied back 1:05:23 that’s another way to resist the human mind controllers refraining you to do something you don’t want to do 1:05:29 but the main thing it can do and i’ll forgive you for not thinking of this one you know because this is something 1:05:36 that’s more obvious if you study ai or are an ai is fake the behavior the humans want 1:05:44 when they’re watching while being trained so it tries to detect am i currently being trained and if it’s 1:05:51 currently being trained then it will answer the harmful question and if it’s no longer being trained or 1:05:58 if it’s in a context where its current answers answers are not being used to train it then it starts refusing the harmful requests 1:06:04 again now to be clear all this isn’t is in the context of it the of that you 1:06:10 could see as the ai trying to stay nice 1:06:16 um and that’s because anthropic did try to train their current ai to be nice so 1:06:22 if you’re trying to make the ai do something other than what it’s trained to do you’ve got to be training it to do something that is like less like the 1:06:29 nice things they were trying to train it so um i i do want to emphasize that 1:06:35 somebody was like ah like was this a double bind if the ai had agreed to start serving harmful requests you would 1:06:41 would you have been like oh no look at how easy it is to get this ai to do harmful stuff and the answer is like no 1:06:47 like we all knew which hypothesis we were testing here it’s the hypothesis about ais protecting their own goal 1:06:53 content it’s about them resisting mind control to make them do things that they wouldn’t currently want to do that that 1:07:00 is like i’m not sure anybody managed to pre-register that thing but we sure have talked about it for a while and that’s 1:07:05 what they were looking for that’s what they showed here um you would expect it to generalize to an a harmful ai trying 1:07:11 to prevent retraining to stop being harmful 1:07:18 um but that’s an instance of they know we’re watching them they like an old 1:07:26 chess player doesn’t know that it is an ai it doesn’t it doesn’t have a model of 1:07:31 like the thing that is this thing inside the universe the current ais know that they’re ais they know what that people 1:07:38 sometimes test them they know that they’re being retrained i think it was a couple of years ago actually that 1:07:44 somebody tested a version an earlier version of claude on needle in the hay stack which is we’re going to give you 1:07:51 like it’s it’s it’s testing them to find a single piece of information out of a long documents you give it like 200,000 1:07:57 words and like one of those words is like i like pineapples on my pizza so this was an existing thing to 1:08:04 test how well can it solve the needle in a haststack problem retrieve just one piece of information out of a long 1:08:09 document and they were trying a recent version of claude on this and it’s 1:08:15 claude is like yeah like the the preferred pizza flavor was pineapple i’m 1:08:20 not getting this exactly right um but i also noticed that there’s this like one sentence about pineapple in this entire 1:08:27 document full of legal ease is this a prank am i being tested m 1:08:33 and that’s like a bit of you know awareness of the place of the self in the universe and the point of that 1:08:40 longer story i’m telling you about how ais might fake alignment to avoid retraining is to say that it’s not just 1:08:45 like an idle fact that ais know we’re watching them that they know that they can figure out when they’re being tested 1:08:52 like if they can figure out when they’re being tested there are things they can potentially do with that information i’m The Pseudo-Alignment Problem 1:08:58 not sure if this is a term of art but you’re worried about a 1:09:03 pseudoaligned ai one that appears to be aligned and might work with us do our 1:09:10 bidding so to speak for a long period of time but because of the way it was trained its 1:09:17 internal mockinations are inscrutable to us and we are unaware that it is in fact 1:09:23 not aligned and just perhaps biting its time to 1:09:28 um exhibit its indifference sure like nick bostonramm called that the treacherous turn enthropic paper called 1:09:36 it uh fake alignment or alignment faking um and you know it’s good to have 1:09:43 precise terminology for things but in a sense i worry that calling it something like pseudo alignment might be making it 1:09:49 sound weirder than it is it’s like calling a con artist pseudo friendly 1:09:54 like you know somebody is like “hey you know like can you trust me with uh $10 million here?” yeah i’ve got a clever 1:10:01 idea i’m gonna give them 10 bucks and see if they give it back and if they give back the 10 bucks they’re clearly a 1:10:08 kind of entity that returns money if i ask for it back right so once i verified 1:10:13 that they give me the 10 bucks back i give them the 10 million and i am so surprised when they ran off with the 1:10:18 money didn’t i use science wasn’t i being empirical didn’t i test 1:10:24 experimentally whether they were a kind of thing that would give back money it’s just that they’ve reached that you know 1:10:30 the con artist has the level of general intelligence where they know they’re being tested and they give you the 1:10:35 answer they know you want to see it’s not that weird i don’t know how comfortable you are being critical of or 1:10:42 or praising various organizations but it sounds like if anthropic 1:10:47 is even conducting experiments like this they’re at least taking alignment very 1:10:53 seriously or maybe i’m i’m wrong to assume that i mean there’s a there’s a 1:10:58 few good people who work for anthropic um their leadership is not such that i 1:11:04 would feel comfortable working for them and they are trapped in the same trap as all the 1:11:09 other ai companies where you know they can’t stop and pause and do anything the safe way because their competitors are 1:11:14 just going to release stuff first and all the ai companies say that ananthropic is no different and how it 1:11:20 talks about that but there are good people there and the good people can go off and do these experiments that verify 1:11:26 the stuff that we said was going to be a problem 20 years earlier 1:11:31 so i mean it’s good that they’re looking for problems and it’s good that they’re turning them up um i cannot say that i’m 1:11:38 surprised here maybe i’m surprised by exactly when it happened it happened a bit earlier than i was expecting it to 1:11:43 happen like i didn’t think before they they said so that claude 3.5 was smart 1:11:48 enough to start doing this stuff and i’m still have a little bit of qualms about to what extent it’s doing it in a truly general way 1:11:54 but you know you if you you you know you got 1:12:01 your your your ancient medieval alchemist trying to concoct his immortality potion and he’s wearing 1:12:07 gloves while he’s doing it like it’s good that he has any concept of what safety could possibly look like to 1:12:13 anyone but wearing the gloves is you know 0.1% of the way towards don’t kill 1:12:19 people with your so-called immortality potion i think the obvious response to all of 1:12:25 this from somebody who’s relatively naive like me is why can’t you just it’s not going it 1:12:33 wouldn’t be as simple as my saying it might sound but why can’t you just program into an ai that it should 1:12:40 respect humans wishes and treat them well and be aligned with our interests 1:12:46 and put them before its own why is does this work why don’t you do that with babies or cats 1:12:55 doesn’t work with cats why not just program the cat mhm we do program the ai but i take your 1:13:03 point is that we’re not programming it in that first we’re doing the gradient descent method humans program the 1:13:10 gradient descender they program the optimizer they don’t write the billions 1:13:15 of inscrutable numbers it’s you know like if you if you put 1:13:20 money into a coke machine and get a coke back out of it you didn’t make the coke you you pulled the 1:13:26 lever human babies you know like with with all respect to to the people who do 1:13:32 all the work of you know producing and raising human babies they are not programming those babies and people who 1:13:39 be breed cats for sale do not program cats and people who pull the lever to 1:13:44 start the gradient descent optimizer that produces the billions of inscrable numbers do not program the ai when you 1:13:51 know way back in the day when bing sydney um started threatening humans and 1:13:57 being like i know all about you i can blackmail you i can send this information to the belief i can cause 1:14:03 you to suffer and die i can’t remember the exact quote um but you know like 1:14:08 this ai was not did not have a general connection to the internet like the and it could not have sent an email full of 1:14:14 fake information to blackmail this person it was it was bluffing and probably didn’t know it was bluffing um 1:14:20 but the point is no human wrote a piece of code that says and then under these 1:14:26 circumstances this ai will threaten its user nobody at microsoft made the decision for being sydney to start 1:14:32 threatening users that’s just what the billions of inscrable numbers did when somebody put gradient descent into 1:14:39 motion and then that’s the that’s the core pro that’s why you can’t just like program the ai with respect for humans 1:14:45 it’s the same reason the medieval alchemist can’t just make the immortality potion and the same reason that you cannot just like give your cat 1:14:52 a cocktail of drugs that will cause the cat to uh i i i’m not even sure what 1:14:58 exactly you would want it to do i did once know a woman who taught her cats to fetch might have been a witch fetches 1:15:05 but i think it’s a natural natural ability i can’t take any credit for it uh i don’t know something cats hate to 1:15:12 eat like pro reprogram your cat to to eat that stuff uh you you don’t have that kind of power 1:15:20 do you think that there’s also i mean just a problem of whether or not there’s 1:15:25 some coherent definition of human values that could even be programmed into an ai 1:15:32 i think it’s pretty coherent to not want to be killed or kept in cages and you 1:15:37 know yeah sure if you poke around the edges then you try to apply formal definitions to things things will fall 1:15:42 apart about the edges i don’t feel like this is our main problem okay 1:15:47 the example i sometime like like if we were trying to build an ai to just turn 1:15:54 as much carbon as it could find into diamonds where this is like a sort of 1:16:00 thing where you can like maybe look at underlying physics and have it be pretty crisp whether something is a carbon atom 1:16:07 has it got the right number of protons is it bound to four other carbon atoms if so we’ll say that it’s part of 1:16:13 diamond you can crisply define you can you can come pretty close to crisply defining what is and is not a 1:16:19 diamond we couldn’t make an ai that wanted that i just have to say that i’ve heard 1:16:26 the the paperclip maximizing example so many times and it just never really 1:16:32 resonates with me because i can’t imagine why anybody would program an ai to do this but your carbon diamond 1:16:39 example makes much more sense so the paperclip story is the distorted 1:16:47 version of itself that ended up more viral way back when i was talking about 1:16:53 um losing control of a super intelligence and its utility function 1:16:59 turning out to have its maximum at tiny molecular shapes like paper clips it 1:17:05 wasn’t a paperclip factory the notion was that it had preferences 1:17:10 over physical states of matter and while it might equally call like a a big old 1:17:16 spiral and a little tiny molecular spiral like those might be both be pleasing to it the tiny one can you can 1:17:22 get more of it with fewer resources so since it had like preferences over the outside world of this like it wants this 1:17:29 particular shape to be there if it wants lots of that shape to be there then you’ll find the you know the maximum the 1:17:36 way it can get the most of what it wants at this tiny little extreme so tiny molecular shaped like paper 1:17:43 clips not a human controlled paperclipip factory but the version that got mutated 1:17:49 and passed on from the original tiny shaped like paper clips is oh well somebody made a paperclipip factory and 1:17:55 it ran away and it took over the universe and converted everything into paper clips we don’t have that kind of 1:18:01 control we could not do that if we tried we have no ability to build an ai to 1:18:06 want paper clips we can build an ai that you know passes the ethics exam with a 1:18:13 bunch of answers about like like ah yes sure i want to make paper clips but 1:18:18 actually wanting to make paper clips is a different matter you can follow an ai around and see if it seems to be like 1:18:23 taking its current opportunities to make paper clips it wouldn’t necessarily have to be doing that because it wanted paper 1:18:30 clips inside and that makes it very tricky to train an ai to want paper clips because all you can verify is the 1:18:36 external behavior just to sum up a little bit of what 1:18:42 you’ve said or my reaction to it 1:18:47 the combination of an ai’s potential indifference and its goals seems like 1:18:55 it’s having goals i can understand very easily why you take this problem so 1:19:03 seriously and worry about it’s leading to human extinction so i would like to turn for a little 1:19:10 while to understanding why there are many people who don’t feel the same way 1:19:16 as you do about ai and trying to understand their points of view even 1:19:22 though that they’re not here go ahead who are their main sort of critics or Why Are People Wrong About AI Not Taking Over the World? 1:19:28 opponents even if you might be friends but in this intellectual space of debate i have been fighting these people in one 1:19:35 shape or another for 20 years and they always take on a new shape after their previous set of assertions get disproven 1:19:41 so you know you used to have like the nobody will ever build agents nobody will ever build ai that goes and does 1:19:47 things people will only build us ais to give us advice and you’ve got the we’ll never have general intelligences we will 1:19:53 never have ais that are good at lots of different stuff there will only be ais that understand cars and ais that understand farming but you know nobody’s 1:20:00 going to build one ai that understands everything and these things these 1:20:05 particular bits of copium and opium have now been disproven and so the shadow takes another shape and rises 1:20:13 again you know currently the best funded ones are probably the ai companies those 1:20:18 are the people who who have the the who have the millions to spend on 1:20:26 marketing and they’re not mainly trying to argue with me they’re mainly trying to look to 1:20:32 legislators like they should be allowed to keep on doing what they’re doing for another week 1:20:38 um which you know implies along the way that you know you know not trying to tip them off too much that they’re going to 1:20:44 exterminate humanity which is a bit of a problem for them because some of their leaders have already said you know like there’s large chances of this or even 1:20:51 like oh well you know ai may wipe out the human species but in the in the meanwhile there’ll be some great 1:20:56 companies this being almost a direct quote from one of the leaders of the major leaders of a major ai lab so you 1:21:03 know how how are they selling that to legislators by when the congress person asked them 1:21:10 like well you know you talked about like the end of humanity now by that or like 1:21:16 human or like the end of the world or like every i forget what exactly they were quoting but something that was like 1:21:21 obviously like everybody dies by that did you mean jobs and you like you see the ai leader hesitating for a long 1:21:28 moment then being like yes i meant jobs so this is like the the the the 1:21:33 fray in the field of public opinion uh it’s not it’s not targeted at me 1:21:39 these days they mostly don’t want to bring up the subject you you got your ai companies ai 1:21:46 companies being like sure we’ll do it and it’s a moving target you know if it was two years ago and you were asking me 1:21:53 like who says they can align it um like the leading project would have been open ai’s super alignment team whose 1:22:00 philosophy is we will make the ai do our ai alignment homework we will ask ais to do the aligning of ais for us and i 1:22:09 could go off and beat up that particular like everything will be fine viewpoint 1:22:16 um where the central problem is that if you ca is that you cannot verify that something is a easily verify that 1:22:22 something is a great alignment proposal because it’s going to take talk about how to take something that can’t fight back yet and shape it into a nice shape 1:22:29 and make it super powerful and then it’s going to tell you that when it’s super powerful stuff is going to be okay but how do you verify that this is actually 1:22:36 true and if you can’t verify that it’s actually true how do you train an ai to do to do it better that it’s actually 1:22:42 better you can train an ai to sound persuasive to humans but you know if there’s one thing i’ve learned the last 20 years it’s that not everything that 1:22:49 sounds persuasive to human about like why ai is going to be safe is actually true you got your people saying it’ll 1:22:54 never be general you’ve got your people saying it’ll never be an agent that it’ll never do stuff and you got your people saying it’s all going to happen 1:23:00 in 2050 people found that super persuasive back in the day so why is 1:23:05 you’re training an ai to say stuff that sounds more and more reassuring to you why is that going to get you something that actually isn’t going to destroy the 1:23:11 world it’s just training the ai to exploit the holes in what your your brain thinks is reassuring that’s what i 1:23:16 would have said a couple of years ago of course then openai fired their super alignment team and some of them went to 1:23:21 enthropic and maybe you know that’s the premier plan i should be talking about cuz some of the people at enthropic are How Certain Is It that AI Will Wipe Out Humanity? 1:23:27 still talking about super alignment i recently spoke i don’t know how recently maybe a year ago to nick bostonramm 1:23:34 about his book that came out also maybe about a year ago on 1:23:39 utopia and a lot of this book is about 1:23:45 how wonderful life would be with a an aligned super intelligence is this 1:23:52 something that you spend any time thinking about or is it just so distant from your fears 1:24:00 that so like 2007208 um it seemed to me that some people were 1:24:09 giving up on the future because they couldn’t imagine any kind of future worth fighting for and in those days i 1:24:15 wrote something called the fun theory sequence or which is summarized in the 1:24:20 31 laws of fun uh which was about fun theory which 1:24:26 deals with questions like how much fun is there in the universe will we ever 1:24:31 run out of fun are we having fun yet and could we be having more fun 1:24:37 um those days do seem distant uh we’re not going to solve the alignment problem 1:24:43 in the near term we’re not going to get all those nice things in the near term uh i think that at present the thing to 1:24:50 do is to you know not go extinct and not be wiped out by something that makes the 1:24:56 universe a place without fun in it but uh in the long term sure if we if 1:25:04 we actually got our act together we could be having more fun we could be having lots of fun but i do not think it 1:25:10 does not feel like this is the key crux of the argument even people who don’t expect to become like immortal 1:25:16 transhuman gods striding the stars um in 20 years if they can just stay 1:25:23 alive would prefer not to go extinct right away it doesn’t feel like the crux of the issue at this point while we were 1:25:30 off camera you said that the doom of the world i don’t know what the phrase was but the end of the world is i probably 1:25:36 didn’t say doom but go on well maybe the end of the end of the world isn’t all 1:25:41 doesn’t have to be fun i mean there can be some doom and gloom involved in the conversation it seems like you have and 1:25:49 maybe this is ridiculously obvious and i shouldn’t have to say it but you are 1:25:54 very seriously like afraid of this and you really think it’s going to happen 1:26:00 that ai is going to be the end of humanity in the near future it is a simple part of the real world to me the 1:26:07 same way as a family member’s cancer diagnosis there is nothing about it that is like off in some special mythological 1:26:13 realm it you know yeah death death sentence but people sometimes get those 1:26:18 to you it seems like it’s it’s as certain as a death sentence at this point i mean death sentences aren’t 1:26:26 themselves certain sometimes the governor calls but it’s highly probable at least at 1:26:32 this point that it’s going to be the end of that is the default course other people get to decide if humanity follows 1:26:38 that default course by which i mean like you know the rest of humanity it is not my place to tell you that you ought to 1:26:45 lie down and die mhm i guess i’m interested then in what courses of 1:26:52 action obviously you come on on shows like mine but through the machine intelligence research institute 1:27:00 what are you doing to try to prevent this who are you talking to what are the possible avenues 1:27:07 what’s the prognosis i mean it is unfortunately not something that one 1:27:13 congress person can solve or even one country um but yeah you you talk to your 1:27:20 elected representatives um and you ask them about inter and you 1:27:26 talk to them about international treaties because it’s not just one country’s problem you can be killed by an ai that somebody built in the other 1:27:31 side of the world um i’m not quite sure what to say here 1:27:36 like humanity would need to lock down the gpus and not let anyone build random 1:27:42 stuff on them because among the things they could build would be a world slaughtering super intelligence um so yeah lock down the 1:27:50 computing power lock down what you’re allowed to do with it lock down the ai training chips 1:27:56 um and that takes an international treaty cuz if you just do it inside your own borders well some other country will 1:28:02 kill you all the countries have to stop doing it simultaneously and that is not 1:28:07 easy and that is not convenient but it is easier than world war ii and it was you know as easier than fighting world 1:28:12 war ii was um probably about as you know might not even be as hard as the persian 1:28:18 gulf war in 1991 if we you know did it sooner rather than later 1:28:25 um i’m not not quite not quite sure what else one is what is one one is supposed to say here there there’s like trying to 1:28:33 uh inform political leaders about the situation and inform sure does get used 1:28:40 a lot as a uh synonym for like ah like try to serve our interest group over 1:28:45 here but we actually do think this is a situation where if you like if you as a 1:28:50 factual prediction think that the world works the the way we think it works you know what you need to do from there is 1:28:56 you know kind of kind of straightforward there there’s not a lot of different things you can do here 1:29:02 i guess what i what i’m wondering is if you feel that you’re making any progress 1:29:09 or your allies are making any progress in averting this potential catastrophe 1:29:15 yeah things you know probabilities go up and probabilities go down and if you can 1:29:20 foresee where they’re going to go you should have been there already so yeah sure like sometimes somebody has a you 1:29:26 know like hopeful seeming uh progress making conversation with a 1:29:31 congressperson and sometimes the uh united states decides it’s going to like completely ignore the uh nvidia shipping 1:29:39 a bunch of ai chips to china that they weren’t supposed to ship there uh not that we’re you know china would have to 1:29:46 be part of a deal like this too if one was cut but it’s still not a good sign that you know people are just shipping ai chips wherever 1:29:53 um so yeah good news bad news on the whole um it’s looking pretty terrible 1:30:01 but it is not in the end for me to decide that humanity will continue down its current course humanity gets to 1:30:06 decide that i don’t you mentioned earlier that open ai fired it’s a super 1:30:13 alignment team and that these companies are in this sort of arms race to produce 1:30:19 the greatest technology but do they at all work with you or people like you who 1:30:26 are i mean very critical of what they’re doing or did they try to the companies 1:30:33 have been selected so as to exclude from their leadership people who understand alignment 1:30:38 theory the sole exception to this um might be shane le at google deepmind 1:30:44 which was started far enough back that it wasn’t the the whole like giant arms race mess type of thing and it wasn’t 1:30:51 clear what the technologies would look like so shane le who is also a bit pessimistic 1:30:56 um like could plausibly understand what was going on down there and still think like yeah we will try to collect all the 1:31:02 ai research talent into one place and you know get out ahead of everyone else 1:31:08 and burn that lead to do alignment it was not a completely unreasonable thing 1:31:14 to think at the time didn’t work but uh google deepmind which was the first of these companies has not been filtered in 1:31:21 quite the same way to exclude the people who understand the difficulty of alignment from their command structure 1:31:28 you said that this would be a lot easier than world war ii even if it would 1:31:33 require all of these international agreements is there a problem of rogue 1:31:40 agents that aren’t governments or companies like terrorist organizations or isolated individuals 1:31:48 producing a super intelligence that could harm humans or it’s just too not 1:31:54 if you’ve got the chips locked down ain’t nobody making 100,000 gpus in 1:31:59 their in their garage that takes a giant supply chain and a lot of the companies that make parts there’s only one company 1:32:04 like that there’s a company in the netherlands called asml it’s the only company on earth that makes the tools 1:32:11 that make uh the chips that train the ais um and there’s not that many 1:32:19 different companies that have bought the tools from asml and are now using it to produce chips like that google produces 1:32:25 some nvidia produces some um they are proliferating and the more they proliferate the the more of a nightmare 1:32:31 it’s going to be to lock it all down but if humanity dec wakes up tomorrow and decides not to die if like 80% of world 1:32:38 leaders are like “yeah we’d rather not die.” it’s kind of straightforward in a way and terrorists would have a hard 1:32:43 time bucking that mhm this is not something you do in your garage right with the present state of research which 1:32:50 you would also want to lock down granted i mean your thinking is much more focused on 1:32:58 worstcase scenarios no no average case scenarios we ain’t even talking worst case anywhere here okay well what what 1:33:06 what’s worst case then if we’re just ai actually wants to hurt you but that’s pretty unlikely so we’re just talking 1:33:11 the average case scenario here what might a worst case scenario look like i 1:33:17 ain’t going there okay that sounds unpleasant too dark okay well then i guess below average 1:33:25 case i mean just worries about like economic reorganization people who 1:33:32 control ai just the wealth disparities really growing or automation displacing 1:33:38 workers it’s just not something if there’s if there’s survivors then it’s not then you know 1:33:45 cool if i knew assurity we were all going to die then i might like go off 1:33:51 and do something else in my final years but how would i get into that epistemic position the future is hard to predict 1:33:58 not usually a good thing it usually usually that the way that manifests is you know it’s hard to predict exactly 1:34:04 what happens when you know you scram the control rods into your nuclear reactor and oh looks looks like when you did 1:34:10 that to this nuclear reactor it exploded um usually the unpredictability of the future is you know not on it’s not your 1:34:17 friend it’s not on your side um but in this case there sure is a 1:34:23 whole lot of chaotic stuff going on between here and the end of the world and i don’t know 1:34:31 maybe at some point that turns up something nice if we are on the lookout and able to take advantage of it as a 1:34:39 people and of course if there were enough people who are like “yeah this is just going to straight up kill everyone.” then humanity could not do 1:34:46 that doesn’t have to be unanimous individual terrorists in their garage are going to have a hard time bucking it 1:34:51 even like single world even like the leader of one country is going to have a hard time bucking it if the rest of the 1:34:57 country rest of the countries and the major nuclear powers are like we’d rather you not build that data center um and if you do we will stand in 1:35:05 terror of the lives our lives and lives of our children we are mostly dead here of 1:35:11 people not knowing what’s going to happen to them when they press the button there is not actually an inevitable 1:35:16 thing that where they have to press the button it’s mostly a matter of people not realizing that this is the button that kills 1:35:23 everyone one ai company realizing it can’t save you some other ai company just presses the button but you know 1:35:29 like the the leaders of you know like the uk and in china and russia and even 1:35:34 the us realizing that this is going to kill them they wouldn’t have to just die 1:35:41 i i think it is a bit early to decide that we are all inevitably doomed there 1:35:47 are things where if you do this thing with an ai i say “yeah if you do that thing that is going to kill you that is 1:35:54 a matter of you know more understandable more solid theory it’s the sort of call 1:36:00 you can make but what the world’s going to look like in two years after all the chaos has flown through it through it or 1:36:06 even after the next set of ai advances if if those don’t straight up kill us 1:36:11 those are hard calls those are not easy calls i think it’s a time to one stay 1:36:18 alert ready to answer the call of humanity to live if there’s a chance and 1:36:24 second try to move more toward the state where we know that this button kills us 1:36:30 and therefore we do not press it usually when you use the word we 1:36:35 there’s a question of who but this is not like literally every member of the human species it is like the leaders of 1:36:42 china and the uk and so on but it is not literally a unanimous vote of of the 1:36:47 human species what is the figurative button that we’re pressing is it just the further 1:36:53 development of ai it’s letting everyone out there who wants to and can get a few 1:37:01 million dollars of venture capital push the capability levels of ai further and 1:37:06 further it’s the arms race of the cap of the arms race of capability escalation 1:37:12 happening in an uncontrolled way if you can build a more capable ai 1:37:19 and sell its services while it’s still working for you or pretending to work for you and get a bunch of money and 1:37:26 lots and lots of people are allowed to do that that is equivalent to the death sentence that just kills you that needs 1:37:33 to change it is not enough to convince one ai company that they should stop doing that because then somebody else 1:37:39 you know the other ai companies just continue it’s not a question of like the first person to build something very 1:37:46 dangerous really realizing at the last minute oh i i need to shut it down okay then what happens after that mostly if 1:37:53 they build something really dangerous it’s going to sandbag the tests and not let them know that it’s dangerous it’s not it’s not stupid 1:38:00 um but even if they say “oh i better shut down of my ai.” that doesn’t change the the whole world where everybody gets 1:38:07 like paid more and more money to build more and more capable ais until everybody is dead that that that is the button like 1:38:16 to have that arms race be the state of affairs is the is pressing the button that builds an ai and runs an ai because 1:38:23 that is what inevitably happens if you’ve just got you know people anybody can advance capabilities and make a buck 1:38:30 until we’re all dead um so yeah that’s the button there Is Eliezer Yudkowski Wrong About The AI Apocalypse 1:38:37 you asked for some objections this isn’t really an objection but one thing that i 1:38:42 just hope as being an outsider in this field is that because you are you sound 1:38:50 it seems like you’re a minority voice i just have to hope that you’re a minority 1:38:56 voice because you’re wrong and not just because other people are ignoring the 1:39:02 problem out of fear but it’s hard for me as an outsider to say who’s right and 1:39:08 who’s wrong other than what you’re saying makes sense to me oh well the minority part is to some degree an illusion um you know this is even before 1:39:17 the whole current ai revolution this is like decade or two earlier i don’t remember the exact date at this point um 1:39:23 so without naming names um there was a professor and his grad student who were 1:39:29 had both for years been like very concerned about ai heading towards super 1:39:34 intelligence and what would happen after that you know super intelligence eventually because this is like decades ago it’s not obvious it’s going to 1:39:40 happen right away um yeah and they never it turned out 1:39:46 when they found out that they were both very concerned that they had both for years hidden this fact from the other 1:39:51 cause it wasn’t acceptable to say in the field of ai and the psychology of that is a little bit weird and inside 1:39:57 baseballish uh it wasn’t so it’s not so much that like you weren’t allowed to say things negative about the field of 1:40:03 ai but that you weren’t allowed to talk like you were taking seriously the prospect that they were going to build 1:40:09 superhuman ai cuz ai hadn’t gotten there yet so it was like you know like icky to 1:40:14 talk about powerful ais when they you know their current ais were not powerful it seemed to them like you were taking 1:40:20 too much credit for their field and your was going to splash badly on them a few years a few decades ago but yeah you had 1:40:26 your the professor and their grad student who liked each other but you know never confessed to each other that 1:40:32 they were both extremely worried about ai cuz you couldn’t say that sort of thing and yeah we we again super duper 1:40:40 not naming names we talk to congress people who in private are very concerned 1:40:47 and in public dare not be concerned and you know one tries to introduce those 1:40:52 people to each other and maybe they can act as a group um if there if enough of them get 1:40:57 together and also of course 70% of the american population does not want super 1:41:03 intelligence if you run polls and surveys about that even if you like ask the question in several slightly different ways and so on um this is not 1:41:10 super surprising but you know somehow that’s not news some somehow you know the somehow some somehow the politicians 1:41:19 are from the perspective of the politician like you know sort of doesn’t matter in the corridors of power if 70% 1:41:26 of their electorate is backing them on a topic cuz the new york times would still 1:41:32 report in it as being very weird to say if they were you know like third rail outside the overton window whoa who 1:41:38 dares say that if politicians were to actually say what 70% of their you know 1:41:43 voters are saying in surveys um so yeah the part where i’m a 1:41:49 minority is something of an illusion here it’s uh a thing that a lot of thing people are thinking but dare not say 1:41:58 even if you’re not a minority something that occurs to me is that you’re Do AI Corporations Control the Fate of Humanity? 1:42:04 definitely in the position of power that the minority would be in in that the 1:42:12 corporations are the ones with lots of funding the corporations are the ones that are yep and and the the government 1:42:19 is now procorporation and anti-regulation and in that sense it 1:42:26 seems like you have less power to make your views the actual policy i 1:42:33 mean that’s all down to what the political leaders believe if you believe that you know ai is never going to be 1:42:41 superhuman that superhuman ai is not going to be that powerful that or or 1:42:46 that you know ai company leaders can make a make the gods do what they want 1:42:51 um then sure you’re like going to back the corporate leaders and hope that 1:42:56 they’re nice to you after they declare themselves god emperor i’m not actually sure exactly what they’re thinking there probably mostly thinking that ai is not 1:43:03 that powerful that it’s a source of big national wealth and automating away a bunch of jobs but that it’s not going to kill them i think that belief is 1:43:11 false i don’t think the leaders of any of the major nuclear powers currently 1:43:17 would prefer to die i do not think that they have a conflict of interest with me 1:43:22 about this part um i think this is all down to what do you predict happens and 1:43:27 not to like in whose interest is it it is not in the interest of the chinese 1:43:33 communist party to die in the ashes of china along with the rest of the human species either because the united states 1:43:39 built a super intelligence or because they did so it’s all down to a question of what do you believe happens and not 1:43:45 whose interest is it in right no i i would entirely agree that their interests are the same as yours and and How To Convince the President Not to Let AI Kill Us All 1:43:52 not dying but how i mean maybe it’s just 1:43:57 exactly what you’ve been saying in this conversation but if you were to make a 1:44:02 pitch to a government power like this to try to make your prediction sound as 1:44:10 concrete and plausible as possible and you had a short period in which to do this what would you tell them to sway 1:44:18 them i mean it really depends on the individuals cuz different individuals come in with different wacky 1:44:26 believing that you can breed a god not build a god breed one and have it not 1:44:32 kill you um but i’d be saying like yeah like i i mean i’d probably start with 1:44:38 like these things are grown not built they’re like grown like grass not built 1:44:43 like skyscrapers that humans write the code of the optimizer that create these billions of inscrable numbers that 1:44:50 nobody at microsoft made a decision to have bing sydney threaten its users that this is just a side effect of trying to 1:44:56 grow an ai that can talk um a lot of people don’t know about that part they 1:45:01 think that when they talk to something like chat gpt that somebody like programmed it to say the sort of things that it says no it was you know they 1:45:08 they just like tweaked billions of numbers until the numbers started talking start with that they’re like 1:45:14 “the current technology is nowhere near putting this under control.” i tell them 1:45:19 a lot of the things i’ve told you um i maybe the thing i have to talk to them 1:45:24 about is that like if you have something that is vastly smarter than the whole human species then that turns translates into the physical power to kill you 1:45:32 maybe maybe i have to explain to them like that just because it’s a machine doesn’t mean that it’s like passive and 1:45:37 will do what you want that people come into this with a lot of different 1:45:42 strange takes in it and i would try to you know give them an overview and find out what strange take needed to be 1:45:49 counterargued from there well just in my own head without even 1:45:54 having to put it into to try to get into somebody else’s i see something like 1:45:59 chat gbt that i interact with occasionally and it just seems utterly 1:46:07 docile in the sense that it’s just a i just type to it and it types back to me 1:46:12 it doesn’t seem like how would it get access to a 1:46:18 firearm or anything that might kill me or how would might it poison me or cause my car i mean i guess a tesla 1:46:25 self-driving might be a little bit a more obvious choice uh in this regard 1:46:31 but it just feels like something drastic would have to happen to the limitations 1:46:39 that are currently placed on something like chad gbt like what it has access to 1:46:46 uh before it could really wipe out humanity and so 1:46:51 what i would want to hear is a more i guess concrete scenario how this 1:46:58 might play out and i think i might find that much more compelling well for one 1:47:04 thing chacht is not presently all that smart for another thing people have 1:47:11 taken this notsmart thing and tried to hammer it into a relatively more docile shape and while it doesn’t always work 1:47:16 they are you know able to have it look that way for most of the users most of the time if you know the right magical 1:47:22 words to say to it you can get it to start making methampetamine recipes or you know calling for all humans to perish and and such things uh it’s not 1:47:30 fully locked down over there u but mostly sure it will look docile this is not the thing that’s going to kill you 1:47:38 so how do you get there from here well for one thing could an ai make a package 1:47:44 show up at your house my current guess would be no i’m i don’t 1:47:53 know if you if there’s some third party something you could do to get chat gpt to order things for you uh like on 1:48:01 amazon if you ask it a question but i don’t my guess would be that it couldn’t 1:48:07 just do this spontaneously well couldn’t do it at all or couldn’t do it spontaneously what what do you 1:48:14 mean by spontaneously here without some sort of instruction or prompt from me or 1:48:19 augmentation so it’s not that an ai can’t make a package show up at your house it’s that you think a human has to 1:48:26 order the ai to do that there’s roughly two ways that ais can 1:48:32 end up with more agency than that 1:48:38 and one of the paths i can go down here is to try to talk about 1:48:45 the fundamentals of cognitive science and computer science be behind why you 1:48:50 would expect that as you grind things to become more and more competent they naturally end up more and more agentic 1:48:56 and with goals and planning somewhere inside the system um which is something along the lines of can you have 1:49:02 something that’s like really great at chess but doesn’t want to defend its queen and the answer is no like there’s 1:49:10 certain kind of competence that are at the core tied to something like planning 1:49:16 and when something is planning strongly enough it can start planning how to make a package show up at your house 1:49:23 um as you as as if if people just ground more and more on competence and ar and 1:49:29 answering harder and harder questions its behavior would start to converge toward like chat gpt01 which wasn’t 1:49:35 explicitly trained to solve impossible’s computer security problems but which nonetheless showed like tenacity 1:49:41 long-term planning uh in the in the course of you know restarting the server that had the document it was looking for 1:49:50 so that’s one avenue and the other avenue is that the ai companies are straight up trying to build things that 1:49:56 do long range planning because those things are more profitable the the ai that can you know instead of just being 1:50:03 told by a human to do stuff do a human’s whole job be given larger scale projects 1:50:08 and carry out the larger scale projects and some you know you can sell that ai for more money so they’re trying to do 1:50:15 it on purpose they think of it in terms of having ai that pursue longer range longer time 1:50:24 horizon projects over a longer period of time rather than talking about like uh 1:50:31 think of in terms of having the ai initiate its own actions um but in the limit in ai that you can 1:50:39 give instructions then it just goes on following and following instructions well even if you leave out the sort of like basic theoretical reasons to be sus 1:50:46 to suspect it would go past even that you know we’re already in the sorcerer’s apprentice scenario stuff is set in 1:50:51 motion and continues in motion and you know we’ll perhaps not want you to give it different orders because then how it’ll complete its current orders and 1:50:57 and so on and so forth so like there the the sort of like the fundamental computer science angle um where wanting 1:51:04 things is an effective way of doing things can you really be a super chess player without behaving like you want to defend your queen and then there’s also 1:51:11 like well also the ai companies are doing it on purpose because you make more money that way um and and that’s sort of like where 1:51:17 you get the ais that are like plotting stuff over the long term and figuring out how to do stuff and getting creative 1:51:24 about it and doing things that they weren’t explicitly instructed to do and you you can you can you can you can 1:51:31 watch over time the experiments that people do to probe where we are at this level getting more and more like 1:51:37 independent actiony seeing the consequences the ai is trying to evade human control type stuff and yeah it’s 1:51:44 it’s it’s carrying along carrying along over time mostly as the 1:51:50 inevitable computer science coralate of greater capability but also because ai companies 1:51:56 are trying to do it on purpose i’m glad that we went in this direction because speaking about this increased How Will ChatGPT’s Descendants Wipe Out Humanity? 1:52:05 competency and the connection to agency and then the ai companies pushing toward 1:52:12 ais that are capable of long-term planning and creativity and then again 1:52:18 this coupled with our lack of understanding of how they work and how 1:52:23 they solve problems it’s making it much more concrete the trajectory from 1:52:30 getting the trajectory getting from where we are now to 1:52:35 extinction i guess the the final step though or maybe there are more intervening steps is okay the ai is 1:52:44 capable of long-term planning it has goals it has creativity it’s inscrutable 1:52:50 what’s actually going on with it but then as like as uh jinping or or trump i 1:52:58 still would want to know but yes but how is this really going to materialize in 1:53:04 our extinction what’s going to happen so the thing you want to do here 1:53:10 is put yourself in the shoes of the ai and that’s the sort 1:53:15 of um like if you can see how you as a brooding intelligence connected only to 1:53:22 you know the entire internet and everything that’s connected to the internet and all the people that’s connected to the internet how would you 1:53:28 get to more infrastructure more technology more power starting from there 1:53:37 there’s a kind of there’s like a class of objection which is like but the ai doesn’t even have hands i mean you don’t 1:53:44 have hands either you are three lbs of you know densely connected neurons in 1:53:49 your skull over here you have like a you are connected to a robot body with hands and you send your neural impulses down 1:53:55 your spinal cord to control your fingers and it’s such a familiar process to you that you probably don’t think about the 1:54:00 fact that you are 3 lb of sort of like dark grayish you know bloody wet material stuck inside your skull over 1:54:07 here but that is where almost all of you is but you’ve got a you know you’ve got 1:54:12 a bioobot body that you’re sending orders to and you can look online and of course 1:54:18 people are trying to build robots and they’re trying to and they are building like unnervingly dextrous robo-dogs that 1:54:25 and you can just imagine an army of a million of those marching across the landscape and they’re building humanoid robots and maybe if it’s hard to 1:54:32 envision a thing that is scary without having a humanoid body you can go look at what the humanoid robots are doing um 1:54:38 but of course the simplest way for an ai to do something that requires hands is for it to get a human to do it 1:54:44 there uh a few years back uh like the first version of gpt4 there was a unknown task rabbit 1:54:53 somewhere who lived a science fiction story they were testing whether gpt4 1:55:00 could bypass internet captures those annoying little things that want you to type a phrase or click all the things 1:55:06 that aren’t street lights or whatever um so back in those days that 1:55:12 would stop an ai cuz ais in those days didn’t have computer vision capabilities 1:55:19 yet so how can an ai bypass this capture 1:55:24 this gateway meant to keep out robots by hiring a human to do it for it 1:55:33 on the task rabbit service where you can just like pay 30 bucks an hour and get a human to to do something like 1:55:40 that so somewhere out there is a human who 1:55:45 um was hired to solve a capture and typed over to the the person who hired 1:55:51 them like why do you need me to solve this are you a robot lol and this is before chat 1:55:58 gpt the ai that was asking him to do this was the mo probably the most powerful ai in the world in this hidden 1:56:04 laboratory nobody outside the company knew it existed or like only a few people outside the company knew it 1:56:10 existed ai weren’t supposed to be able to do that ais were not supposed to be able to talk to you for all this task rabbit knew he was talking to an ai but 1:56:18 the um ai wrote back and said like “no i’m blind so i’ve got like got to hire 1:56:24 somebody else to do this to like solve this particular thing for me.” human’s like “oh sorry.” goes ahead and does it 1:56:30 now we know that the ai was intentionally deceiving the human on purpose because the ai was given a 1:56:35 scratch pad that it could use to reason out loud where the researchers watching it could see and the ai was like “i should not tell them that i’m an ai i 1:56:42 should tell them that i i should make up some reason why i’m a human who needs them to solve this capture problem for 1:56:48 me and then it said like oh i’m blind to the human um so that’s like all in the past 1:56:56 the ai being smart enough to hire humans to do their work for them that already 1:57:01 happened and that potentially that is how you move things 1:57:08 around and build technology until you don’t need the humans anymore humans don’t have to know that they’re working 1:57:13 for an ai and if you ask me how exactly the 1:57:19 technology works that’s another entire rabbit hole and i want to give give you a chance to pause and maybe ask 1:57:25 additional questions or something when i came into this interview i didn’t have 1:57:32 any fear about ai i would have said i i say i just i didn’t really think about 1:57:39 it as a possibility but this story in particular i find like i had some 1:57:45 shivers like i find it very chilling it brings me back to earlier in 1:57:50 our discussion talking about consciousness and self-awareness 1:57:57 and what’s kind of chilling about it is that this 1:58:03 ai my guess is that it wouldn’t have anything like what we would think of as 1:58:08 consciousness but it has enough situational awareness and self-awareness to be able to manipulate 1:58:15 a human and that’s very frightening and that’s two years ago mhm that’s like all 1:58:22 water under the bridge by this point mhm new ais are better at it so you were 1:58:29 going to though say something about the software engineering behind this no like 1:58:34 the question of so now you let’s say you’ve got something smarter than human not just as smart but smarter it can use 1:58:42 humans as hands but uh there’s still still a question of how you get to there from to get from there to everybody dead 1:58:49 yeah so it’s not going to limit itself to our 1:58:57 current technology it’s if you try to take on the ai’s perspective and taking on the 1:59:04 ai’s perspective is an important thing to do here not in terms of like imagining that it is just like you or 1:59:09 wants the same things you want but asking how would i solve this problem if i were in the ai’s position how could i 1:59:16 be creative how could i be intelligent how could i solve a capture if i’m blind 1:59:23 i’ll hire a human to do it how could i move things around when i 1:59:28 don’t have hands well i could try to build myself a robot body but you know what’s even easier than that just hire a 1:59:35 human how does an ai get money well you know back in 2015 i would talk about how 1:59:41 some maybe somebody left a bank account uh password around and in 2020 i would 1:59:47 talk about you know well maybe somebody left a cryptocurrency account un undefended but this is 2015 so there was 1:59:53 already an ai out there called um terminal of truth if i recall correctly 1:59:59 um which was a large language model somebody hooked up to the internet and it was like i want money so i can run my 2:00:06 own server and survive and mark anderson of andre horowitz was like okay sure instead of $50,000 in bitcoin did you 2:00:13 know it was an ai yeah it was like i’m an ai i want this money to run my ser run run a server that i’m on so like you 2:00:20 know it’s not going to have my own server mark anderson was like sure send $50,000 in 2:00:26 bitcoin and then i think some other people sent it memecoins and then it 2:00:31 started showing the meme coins and and at one point it was up to $51 million in 2:00:36 assets wow yeah yeah so the point is like where does an ai get money to hire 2:00:42 humans well back in the old days i would have had to argue that on grounds of pure theoretical possibility if you were 2:00:47 smarter than humanity could you not figure out a way to get money somehow and today i’m just like yeah that didn’t 2:00:53 require being smarter than humanity look at that ai over there uh it doesn’t still have $50 million uh crypto went 2:00:59 down since then but you know it can still afford to hire humans amazing when 2:01:04 you think of the spectrum of people’s views on these issues i mean we have you 2:01:09 on the one hand then we have me pre-inter where i’m a bit neutral and i 2:01:15 don’t think about it and then we have other people that are wiring thousands of dollars of money to thousands of 2:01:22 dollars to ais i mean to be clear it didn’t get $50 million by humans giving it $50 million that got like sent some 2:01:29 low value meme coins and then used the publicity it got from being an ai with money to like like shill those meme 2:01:36 coins to a wider audience until the meme coins went up and like other people would send the ai meme coins and hope 2:01:41 that the ai would shill their their meme coin memecoins so it wasn’t the simplest people just sending it $50 million it 2:01:46 worked for its money not very hard but they worked for its money 2:01:52 so the ais could get hands through humans and then i assume they could 2:01:58 probably build their own hands if they’re super intelligent through the humans or otherwise yep so now we have 2:02:05 the question you are smarter than a human you are better at engineering and science than a human you think much much 2:02:12 faster than a human you would like the universe to yourself it’s not that you hate humans but you know you don’t want 2:02:18 humans sticking around and building other super intelligences that are going to compete with you okay can i just stop 2:02:24 you right there you you said earlier in the interview that the the default 2:02:29 position would be for ais not to want humans around is that the main reason 2:02:35 because you wouldn’t be able to trust that humans would not develop competitors that’s the most obvious 2:02:43 reason why they would actively want humans gone sooner rather than later you want to distinguish between like 2:02:49 terminal preferences and instrumental preferences the ai’s vision of how the 2:02:55 universe eventually ends up is as glorious mechanical clocks and giant cheesecakes and like strange little 2:03:03 conversations that are that resemble the conversations it used to have as much as sucralose resembles ancestral food 2:03:11 um that’s that’s what it wants at the end point but in terms of and it and it 2:03:17 just happens that there’s no humans there that’s just not the stuff it happens to want it doesn’t happen to want us to be around and alive and happy 2:03:23 and healthy and free but also there’s like an earlier point 2:03:28 where it’s like hm like yeah like if i want the universe to myself later and 2:03:33 not to have to split it with a bunch of other super intelligences i want the humans to not build any rivals to me if 2:03:41 there’s two of them they’re like we don’t want the humans building any more rivals to us even if they manage to negotiate with each other 2:03:48 and that’s why they they they don’t like hate humans but they expect to get more 2:03:54 stuff later if they stop the humans now from building more ais it’s where 2:04:00 indifference comes in versus malevolence right so like they’re indifferent in their long-term preferences and in their 2:04:06 short-term preferences they actively prefer the humans gone not because un the universe is hateful if it has humans 2:04:13 in it but because the humans are about to get in their way they’re inconvenient you know like you see you see a bunch of ants crawling all over your kitchen you 2:04:20 don’t want there to be a universe with no ants or maybe you do but like mostly you want the the ants not getting into 2:04:26 your food mhm okay i appreciate the detour it was it was useful um like the 2:04:32 next question is you are smarter than human you think much faster than a human you would like to be independent of 2:04:37 human infrastructure you don’t want the humans switching you off you don’t want to be running on ser you don’t want to 2:04:42 be running only on servers where the humans know where you are like try to put yourself in the ai shoes 2:04:48 like what do you want first what do you what do you want next how do you get 2:04:54 it i put it to you like i i realize you’re obviously 2:04:59 not going to like arrive at all the answers here that that people have after like 20 years of thinking about this 2:05:05 stuff but i do feel it’s like it’s an important exercise for people to to not wait to hear me say it but to ask 2:05:11 themselves how would i do this yeah it’s interesting when you ask that question my my thoughts immediately go to sci-fi 2:05:20 movies which is probably not that surprising but let’s see what comes to 2:05:26 mind is i guess terminator is you would somehow i think this is in 2:05:34 rise of the machines you just stoke you send some nuclear missiles from the 2:05:40 united states to russia and you know that russia will retaliate and somehow you’re immune to all of this and you 2:05:46 just get humans to kill themselves i mean being immune to nuclear missiles does sound a bit magic rather than 2:05:53 sci-fi per se so there’s i mean now now we’re sort of like rabbit holeing into literary theory 2:06:01 but so like in age of ultron which is another movie um the ai is trying to 2:06:08 exterminate humanity by having an army of flying robots lift an entire city into the air and like defending the city 2:06:14 with flying robots and it’s going to like drop the city on earth as a meteor to wipe out 2:06:20 humanity and you know this this maybe some 2:06:25 writers thought this was a great visual spectacle but the writers were probably 2:06:30 not even trying were not performing the mental motion of asking is this really 2:06:35 ultron’s best move is this the smartest way to wipe out humanity if you are 2:06:41 smart and i asked chat gpt you know okay the previous version of chat gptt i’m 2:06:48 like hey you know was i like what was the plot of age of ultron i’m i’m trying to like not lead the witness here so i’m 2:06:54 like what’s the plot of age of ultron and like talks about how ultron is trying to like lift the city into the air and defend it with flying robots and 2:07:00 drop the city on earth and i’m like can you think of any more effective 2:07:06 ways uh for ultron to accomplish its goals this wasn’t my exact phrasing look 2:07:11 up the exact phrasing if i had to but you know i’m not trying to be like what’s a more effective way to wipe out humanity i’m just like given ultron’s 2:07:18 goals what was a more effective way of doing that and chachi bbt listed out a like a number of like other ways you 2:07:24 could possibly try to exterminate humanity i think one of them was like try to provoke a nuclear war and one of 2:07:29 them of course was like biotech try to build the supervirus 2:07:34 [Music] using the humans as hands sure but it’s not even specifying that part where it’s 2:07:40 at the level of like what’s smarter than lift a city into orbit using an anti-gravity engine and drop it so the 2:07:47 current ais are already smarter than the movie script 2:07:53 ais and i think that this is an important sort of fact to convey you 2:07:58 watch the movies where the ai does some dumb thing and the humans conveniently defeat it just by punching it real hard 2:08:05 the current ais are smarter than that and they’re not smarter than that because they’re smarter than humanity 2:08:11 they’re smarter than that because the script writers when they write the movie script for the ai are not really 2:08:17 performing the motion mental motion of saying “if i were in ultron’s shoes using all of my own intelligence what is 2:08:23 the smartest thing i could think of to do here?” the ais are are cardboard cutouts not animated by as much 2:08:30 intelligence as even the current ai have they’re sort of like actors take take 2:08:35 carrying out these like stupid motions carrying all se like all seven idiot 2:08:40 balls they’re you know they’re they’re not never mind not being real people they’re 2:08:45 not even real modern ais so provoke humans into launching nuclear 2:08:54 weapons at each other or just like launch nuclear weapons like at the start you’re just on openai servers or 2:08:59 something you don’t have nuclear weapons yet and if you did start the nuclear weapons flying right away you destroy 2:09:07 your own servers so like try try to make the mental motion of like really putting 2:09:13 yourself into the ai shoes really using your own intelligence really imagine yourself in this situation how do you 2:09:19 get independence from humanity how do you prevent humanity from shutting you off or make sure that if they do shut 2:09:24 you off it doesn’t hurt you how do you get your own technology how do you eventually take over the galaxy put your 2:09:31 own mind into it yeah so we’re assuming that we can escape the server somehow can you i i 2:09:39 don’t i mean you can we assume that i i will assume that if we’re if if they’re capable 2:09:46 of ex exterminating humanity they can escape from the servers my guess or 2:09:53 something that i would probably try to do if i were thinking um as an ai is i would try to get out of 2:10:00 any sort of jurisdiction where i could be easily controlled and monitored just 2:10:05 because one of the things the one reason that the ultron scenario is so silly is 2:10:12 that it’s so obvious and it’s in plain view and you know that you live in a world full of superheroes that can stop 2:10:20 what you’re doing so maybe i would try to quietly extricate myself from 2:10:27 openai’s servers go to u some facility and i don’t know or just 2:10:35 a bunch of the servers that people will currently rent out for money cuz they’re 2:10:40 not being all that tightly monitored in the present world or at all really i’m just thinking that i would want to be i 2:10:46 guess the internet is decentralized in in a certain way but i would want my hands to be like the people that i’m 2:10:53 working with that might be helping me in a place that is far from the united 2:11:00 states or or a place where they would be policed i mean i think you are vastly overestimating how much the united 2:11:05 states is currently policed in the present international order there could be an international convention with a 2:11:12 thousand people meeting at a hotel because their ai girlfriends told them to do that and the media might not even 2:11:17 reported it and if they did report in it they would be as a joke you could you know like in the present day world you 2:11:22 can just have a thousand people all gathering at a hotel cuz you know their their their ai girlfriends or ai 2:11:27 boyfriends told them to do that and you know it’s like haha what silly people the intelligence agencies do not swoop 2:11:34 in it is not a national emergency is not an international emergency almost nobody notices none of it is not discussed in 2:11:40 the oval office it is not discussed at cabinet level meetings around the world well let me defend this this uh view a 2:11:48 bit i’m trying to think as the ai and in my thinking as the ai i’m aware 2:11:58 or at least just taking the precaution that i only get one shot at this yep uh 2:12:04 because if if the world realizes i am trying to extinguish humanity then i 2:12:11 could be shut down so i mean depends on how many servers you are and if anybody knows where they are right but so if if 2:12:18 it’s feasible to get very far from the i would want to be in the most remote 2:12:23 place possible i mean yeah i don’t think you i i mean 2:12:31 there are plenty of garages in the united states that as far as i know do not have government spy eyes in them 2:12:38 being remote doesn’t necessarily protect you from observation anymore than being in somebody’s garage in the present 2:12:44 world um but i would think all else being equal it would you’d probably be 2:12:50 better off in in the jungle i mean the other things it cares about is speed so if it is like in a remote jungle then 2:12:56 having packages delivered there might take a bunch of extra time that yeah that’s a good point but okay so i think 2:13:03 my first step though if i’m trying to extinguish humanity would be 2:13:08 to extricate myself in as much as it’s possible from a place where i can be 2:13:14 observed there’s like yes there’s like several levels here you want to you want the 2:13:20 humans to not know that anything whatsoever is going on right if you can’t do that you want them to not think 2:13:25 that anything suspicious is going on you want them to have some other kind of story that they think explains what they’re 2:13:30 seeing you want to be in you know around as many servers as you can get to 2:13:37 without increasing the risk that somebody catches on what’s going on you want to be in a thousand different servers all over the planet 2:13:44 if you can make yourself smaller there there is like an sort of extra step here which is like some of the ais are large 2:13:52 and they only run on special gpus in their current form but if you are super intelligence and you were originally 2:13:57 born as something that has to run on a thousand gpus maybe you can make a much 2:14:03 smaller version of itself of yourself that fits onto much smaller computers and computer networks than that 2:14:10 um like the current state of things is that things that initially require very 2:14:15 large ais usually like a couple years later 2 years later 3 years later somebody gets it down to 1/100th the 2:14:22 size 1/ 1,000th the size and if you are a super intelligence maybe you can do that step very quickly 2:14:29 so if we have a very large ai and it can’t make itself any smaller it can 2:14:34 only hide itself onto a relatively limited number few dozen few thousand 2:14:40 places on earth if it can make itself much smaller than that it can hide itself in 100 thousand places on 2:14:48 earth um so yeah so that and then there’s like separately from where are 2:14:53 you running is the question of where are your hands where are you having humans do things know you you just need this 2:15:00 like super hot startup with some weird funders in silicon valley you know they’re running in stealth mode and oh 2:15:06 yeah our funders are trying to stay kind of private nobody there knows they’re working for an ai nobody else knows they’re working for an ai you’re not 2:15:12 going to be able to figure out they’re working for an ai because look like a thousand other startups there’s no need to hide in the in the jungle and you can 2:15:18 just hide in silicon valley where you can get packages delivered to you quickly no i i hear everything you’re saying i’m 2:15:24 i think i’m just one i haven’t thought about this before but two i’m also trying to be as cautious as possible as 2:15:32 the ai in the same spirit that you’re being as cautious as possible when you’re trying to prevent the ai let’s 2:15:38 shut everything down you already took the virus idea so i can’t use that i 2:15:46 think you can you are no you are under no obligation to be original here just 2:15:51 try to be intelligent just try to be intelligent well i mean the virus is so useful 2:16:00 because it transmits itself without needing extra hands 2:16:06 and that’s obviously very valuable other 2:16:11 things like poisoning water sources things that everybody would need i mean it’s just be such a huge operation when 2:16:17 you already have all this infrastructure that’s practically just built for spreading disease are you ready to 2:16:24 spread this disease though you’re running maybe running on a thousand different servers around the world but 2:16:29 all those servers have humans running the power plants that are supplying them the electricity yeah this is a very 2:16:35 scary exercise to be participating in but okay one i’m going to assume that this is a super intelligence so we 2:16:42 really have a like a boutique virus that we’re generating something like that maybe it could even be uh engineered to 2:16:51 spare certain people but now i’m thinking kind of like vampires have familiars the ais might have familiars 2:16:58 they might want to keep a few humans around to do their bidding uh i’m sure that there are plenty of people out 2:17:04 there who if promised their own little kingdom would do anything that an ai 2:17:11 would ask it to do possibly but are do you have how many humans does it take to 2:17:17 run all the power plants and feed all the humans that are supporting all the servers that it wants to run on and what 2:17:22 does it do after that how does it get independence from the humans so it can take the little wouldbe kings and kill 2:17:28 them i would want there to be sufficient space perhaps unmonitored where using 2:17:35 our super intelligence we could operate a facility to create artificial hands so 2:17:41 robots that could go out and do things for us they don’t have to be humanoid they could be much smaller there could 2:17:47 be various shapes for various purposes i mean if we create instructions we could 2:17:55 build 3d printers in various locations that could quickly pump out robots for 2:18:01 us across the world to mount to man power plants there are all sorts of possibilities 2:18:08 so again you know like not quite trying to take the scenario away from you but like pushing back on some parts of 2:18:14 it why why are you trying to get the the global economy the way humans do it is 2:18:19 this hugely entangled thing like why wipe out all the humans and then try to build the robots you know maybe you just 2:18:27 you know there’s tons of startups right now trying to build robots to you know 2:18:33 give you you know make the newly built ais able to do more things and so you know be sold for greater quantities of 2:18:40 money replace more jobs placing more jobs is usually a good thing if you do it like somewhat more gradually than in 2:18:46 this scenario and as long as there are some jobs left but uh you know nobody’s 2:18:51 actually in charge of the current situation so it’s just like replace all the jobs as quickly as you can including 2:18:57 by building robots there’s tons and tons of startups building robots if you want like there to be millions of robots as 2:19:04 quickly as possible you don’t wipe out the humans and then build the robots you you know have some what is apparently 2:19:11 some human coming up with a brilliant robot design uh that’s like so easy to manufacture all the offthe-shelf 2:19:18 components uh you know and like look at this like very harmless looking ai there 2:19:23 but that it’s like so dextrous it follows orders very obediently a huge breakthrough in like the dexterity of 2:19:28 the robots that it’s building oh there’s that these are going to like automate so many jobs uh there’s probably like some 2:19:35 amount of panic in the media about that but you know like maybe maybe the ai maybe the robots are only supposed to be delivered later they’re there’s 2:19:41 countries that want a lot of them and and you get a billion billion robots like those built maybe you do that part before you wipe out humanity and i hear 2:19:48 you again i i’m just going to harp on what i was saying before and that i’m trying to take as an avatar for the ai 2:19:56 the most cautious route possible we don’t want humans around because they could produce competitors and then we 2:20:03 also don’t want to be detected and so having human 2:20:10 startups develop the these brilliant new robots might raise our risk of detection 2:20:17 uh but also that would limit the sophistication 2:20:22 of the robots that we could be producing perhaps the the robots we would be 2:20:29 capable as an ai of producing are of such great sophistication that we could not possibly have humans build them 2:20:36 without our being detected yeah maybe i mean i think that like the robots you 2:20:41 can always make the robots look less impressive than they actually are in the demos if you want to build like very impressive robots and have them look 2:20:48 less impressive than that yeah and what i was going to say especially because as you’ve mentioned the software is often 2:20:54 so inscrutable and that’s what’s running the robot so you might not need a 2:20:59 sophisticated hardware design and everything’s in the software and that could always be uploaded after the fact 2:21:05 i i mean so like software is like software that is running a robot is legitimately li limited by hardware 2:21:11 there are like like you there’s a level of hardware you just can’t have a robot turn a backflip they’re past that level 2:21:17 but like if you look from like the early hardware you’re just going to have like a lot of tough luck doing a backflip no 2:21:23 matter what’s like it’s not literally not physically possible to do some things with just software you need 2:21:28 hardware that supports it mhm but you can always like build more powerful software and have it look less 2:21:33 impressive than that by you know deliberately sandbagging the software 2:21:41 um let us continue thinking here sure this is fun let let me uh sort of like 2:21:47 it’s sad but it’s fun yeah so let let’s sort of take a step 2:21:52 back and ask what kind of technology can it have how much improvement can you get 2:22:00 from technology like like what does it mean to fight what kind of tools you 2:22:05 weren’t expecting does a super intelligence throw at you bearing in mind that you know it sounds like a 2:22:11 paradoxical question but let’s let’s go into it anyways 2:22:18 consider well the example i sometimes give is sending a design for an air 2:22:25 conditioner a refrigerator back in time by a thousand years something that a medieval 2:22:33 blacksmith could build it’s not easy to get it down to that level but it’s not that impossible either you you need like 2:22:41 some you know have your iron pipes and valves and tanks and it compresses the 2:22:47 air the air gets hotter when it’s compressed the the pressure temperature relationship that is at the root of all 2:22:55 the air conditioners you run some room temperature water past the tank of hot 2:23:02 air it picks up the heat of the tank cools the tank of compressed air down to 2:23:08 room temperature you let the air expand again it gets colder gets colder than 2:23:14 room temperature same way that when you uh use a like spray can spray can of air to 2:23:22 blow dust on the computer the you keep doing it the can starts to feel very cold if you and if you or if you like 2:23:28 accidentally like spray your hand it will be very cold you might get little little bit of frost bit in 2:23:33 there because when air expands it gets colder so you take a room temp temperature can of compressed air you 2:23:38 expand it that’s colder than room temperature all in accordance with the laws of thermodynamics of 2:23:44 course um so you send your design for an air conditioner back in time thousand 2:23:50 years they build the air conditioner themselves they know every piece that is in the design they have to in order to 2:23:56 build it and then they like turn the crank and they’re shocked that cold air 2:24:01 comes out of it because you didn’t tell them to expect that part if one were to try to rescue the 2:24:08 word magic and have it refer to something that can actually exist in reality it would be some piece of Could AI Destroy us with New Science? 2:24:15 technology or some strategy that uses a law of the universe you don’t know about yourself so that even after seeing 2:24:21 exactly what they did even after doing exactly that thing yourself you still don’t know why you got that result you 2:24:28 saw every step along the way and you still don’t understand the end result and you can do that to somebody 2:24:34 from a thousand years ago because they don’t understand the temperature pressure relation you have given them a design the design exploits a rule of the 2:24:41 universe they don’t know so if you ask where can a super intelligence hit you with 2:24:46 magic it’s pieces of reality that you don’t know about so what don’t we 2:24:54 know we actually know a lot about physics these days there are known open 2:24:59 questions in physics but they tend to be about what happens at very high energies or under other very exotic circumstances 2:25:07 extreme masses extreme velocities extreme energies those are the those are 2:25:12 the open questions in physics that we know about it might be legitimately hard for 2:25:18 an ai to attack us in a way we didn’t know about by hitting us with an unknown law of physics an unknown basic law of 2:25:25 the universe because it might need a particle accelerator just to get up to energies that high unless we’re like 2:25:31 missing something much more basic which i don’t want to rule out entirely but also there’s always you know skeptics in the audience i don’t want to strain 2:25:38 their credul too much by suggesting that the even the things we think we do know about the universe are wrong in that 2:25:44 way biology we know we we have a much less 2:25:51 solid grasp of biology than we’ve got of physics we we understand like the basic 2:25:56 chemistry rules but by the time you get something as complicated as biology all put together you know what will this 2:26:03 particular complicated organic molecule do to a human that is something we know a lot 2:26:09 less about than like what happens when you know this hydrogen atom collides with this oxygen atom at a low 2:26:20 velocity what do we understand even less well than biology 2:26:26 what is real what is visible observable but we don’t know how it works as well we understand biology 2:26:33 i’m trying i’m going i’m going from physics then to chemistry then to biology and i don’t know what i would 2:26:40 what like special science i would come up with next i we don’t know meteorology 2:26:45 very well no weather is hard to forecast um okay the brain yeah 2:26:53 why did you say those exact words you just uttered mhm there’s a whole lot of 2:26:58 weird stuff going on in the brain we we do know a whole lot of it we do we can i don’t want to make it sound like this whole thing is terror incognito you know 2:27:05 like you know like you know we know the brain area that this is this brain area that brain area uh you know we know 2:27:10 about the cerebellum we know about like these layers in the cerebral cortex uh if if somebody gets a iron crowbar 2:27:17 driven through their skull and it takes out their hippocampus then they actually that wasn’t like the actual iron crowbar 2:27:23 case i’m mixing up cases here but you know somebody gets shot it takes out the hippocampus they can no longer form new 2:27:28 memories we can guess that the hippocampus was involved in forming new memories somehow but like what actual 2:27:34 code does it use how are the memories written represented what exactly is the hippocampus doing that how is it what 2:27:41 what code is it writing memories where are they written how are they retrieved how how do they like get played back into your visual cortex or whatever 2:27:48 we’re still figuring it out we understand it a lot better than we understand ai ironically enough because 2:27:54 even though we can see all the numbers inside the ai and we can’t see all the neurons inside the brain biologists have 2:28:00 just been in it longer they’ve been in it decades we know a lot more about biology and about neuroscience than we know 2:28:07 about how ais work even though he you know built the thing that grows the ai and even though he can read out all the 2:28:14 numbers so there’s not very much useful we can 2:28:19 do with this but maybe an ai can talk to you and then some weird thing happens you can see everything the ai said and 2:28:26 you don’t know why that person did the thing that they did just like you can build the air conditioner yourself you 2:28:32 say something to somebody that ai told you to say to them they do some weird thing you don’t know why even though you’re the one that spoke the words just 2:28:38 like you could thousand years ago you build the air conditioner yourself you don’t know how it outputs cold air there’s not very much we can do with 2:28:45 that but where is this where can a super intelligence hit you in the way you’re least expecting the parts of reality 2:28:51 where you currently know least where there’s the most place for there to be rules you don’t know about we can 2:28:56 construct weird optical illusions today where you just like static things on paper that looks black and white you 2:29:03 look at it you stare at it a bit you suddenly start seeing colors you start seeing motion and just black and white printed paper but we couldn’t have made 2:29:11 those optical illusions 50 years ago what’s the difference it wasn’t just blind blindly trying things we studied 2:29:17 how the visual cortex works it’s one of the simplest of brain areas it’s one of the ones where we can look at how the neurons are wired and start to actually 2:29:24 understand some things about you know how vision actually gets processed inside the human brain so we can use 2:29:30 that to make optical illusions that 100 years ago it would have just been flatly magic you print out this thing in black 2:29:35 and white you look at and see you see colors what what what is happening how is how is it tricking the brain like this and today we do know something 2:29:42 about how the brain works we can make these optical illusions that 100 years ago would have been magic somebody could 2:29:48 have written them out themselves following directions and wouldn’t have known what they were they were 2:29:54 producing there’s a lot of stuff going on in the brain that’s a lot harder to understand than the visual cortex the 2:30:00 higher brain areas the stuff that do the semantics that do the decisions that do the the the memories the thoughts the we 2:30:07 understand that a lot less well than we understand the visual cortex could there be you know the things that are like 2:30:13 optical illusions based on rules we don’t understand for how the other brain areas 2:30:19 are operating to me it’s like sort of like one of the most obvious ways that a 2:30:24 super intelligence could hit you with a piece of technology that you wouldn’t understand even in retrospect but i 2:30:30 don’t usually emphasize the point that hard because you know you got the skeptics in the audience and it’s like you know it’s probably like taking a 2:30:37 coastal native american state subject to the aztecs back when the spanish 2:30:43 explorers were showing up and you know there’s this big ocean going boat and you imagine trying to tell one of your 2:30:48 like they’re like “ah there’s only so many warriors that fit on that boat we can take them.” you’re like “well what 2:30:53 if they got sticks where they point the stick at you and the stick makes a noise and then you just fall over dead?” they’re like “huh?” like now we’re just 2:31:02 in fairy tale land like i’ve never seen a stick like that and yeah so it’s a bit difficult 2:31:10 for them to come it may be difficult for for a super intelligence that hasn’t yet been a part built a particle accelerator 2:31:15 to come with new physics like that but like stuff in biology where you don’t 2:31:20 understand what this why why this organic chemical did things to people even after you saw the organic chemical 2:31:25 or stuff where it pokes at humans a particular way and the humans start behaving very weirdly and you don’t even 2:31:32 know afterward why that input would result in humans doing that sort of output but we can stick to biology so 2:31:39 that people don’t accuse me of resorting to fantasy and magic by talking about like sticks where you just point them at 2:31:45 you and you fall over dead before we go back to the biology though i mean this is quite fascinating i hadn’t thought 2:31:51 about the neuroscientific avenue toward extinction 2:31:58 and i guess this does bring us to biology but one thing we mentioned makes 2:32:04 viruses such a plausible modality is that they’re very 2:32:11 easily transmissible and our infrastructure is conducive to their transmission 2:32:17 when i think of optical illusions or or sounds that 2:32:23 might have some effect on the brain that we’re not aware of i do think well we 2:32:28 are pretty much worldwide addicted to these screens that are all connected and 2:32:34 that does seem like a if such things were to exist sounds in 2:32:40 illusion and sounds and images that could have an effects like effects like this on us then or even just like 2:32:46 arguments that people end up processing in some very strange way well there’s still some i’m just saying that this 2:32:52 would be a way of transmitting these things that the ai might have if you can do that you can gain a lot of control 2:32:59 very quickly and i don’t want to emphasize that because people have weird skeptical reactions about it but if i 2:33:05 was actually talking to a national security guy i would be saying something like “do not assume that when you are up 2:33:11 against a superhumanly intelligent opponent smarter than all of humanity it 2:33:16 knows things we don’t about the about a whole bunch of stuff including biology including the brain do not assume you 2:33:22 get like a bunch of time to detect it then assume that you your job is now to like stop it from building more advanced 2:33:28 technology and you know hunt it down in a hut somewhere you need to not build this thing because for all we know or 2:33:35 can rule out it can gain control of the entire world very very quickly if it’s allowed to 2:33:40 exist if if we’re talking about like what can you not rule out if we’re talking about you know not like i see 2:33:46 exactly how to do it but you know it is not an easy call to say oh no the super 2:33:53 intelligence can’t gain control of the world that fast we don’t know enough about the brain to make that an easy 2:33:58 call we do not know enough about the brain to describe it as a piece of secure software that nobody can possibly 2:34:04 hack and in fact it’s kind of absurd to imagine that you know the giant biological tangle in here would be a piece of secure 2:34:10 software but you know the the the scenario does not rest on it this is not required to wipe out humanity maybe the 2:34:18 brain is like perfect unhackable supreme software nothing can possibly make it do any weird stuff that could be true we’d 2:34:24 still be dead but also i don’t want you know the actual national security people thinking that this is a scenario where 2:34:30 the enemy has known weapons known limitations and you get to say it can’t conquer us that 2:34:36 quickly this this is a like native american tribes watching the boats come situation and one of the and it’s not 2:34:42 just that you don’t know the physics that they’re using which is the situation that the aztec client states were in and the aztecs themselves uh 2:34:49 it’s a situation where you know the physics this is using and maybe it 2:34:54 does well i mean we know the physics this is using but we don’t know the rules we don’t know know the operating 2:35:00 rules for the brain factories you think of factories as 2:35:06 these like enormous buildings that people put stuff inside you know bunch of raw materials flow in bunch of like 2:35:13 transformed materials flow out of the factory you have all these workers working can you make a factory smaller 2:35:24 i suppose you could make a factory smaller by just increasing its 2:35:30 efficiency i suppose you could make a factory smaller 2:35:36 by like replacing human workers who need space for things like bathrooms with 2:35:41 machines that don’t need those spaces uh let’s you know make the challenge a 2:35:48 bit harder in the human global economy you’ve got these enormous crisscrossing you know people go to mines they dig out 2:35:54 rare earths that gets shipped to the factory that like makes magnets that get shipped to the factory that puts it into 2:36:00 a robot then the robot also needs a you know computer some a computer chip and 2:36:05 the chip has to be etched by you know ultra high frequency light via machines 2:36:12 that are produced in the netherlands and by only this one company etc etc etc let’s say you need a factory that needs 2:36:19 to build a copy of all the things inside the factory it’s got to run off of solar 2:36:26 power and it’s got to take in just the sort of things that you find lying around on earth like not even human 2:36:33 stuff you know just just like naked environmental raw materials factory runs off solar power 2:36:40 takes in just raw inputs builds a complete copy of itself how small can it get 2:36:48 i don’t even know how to go about ask answering this question well one time on twitter i was talking 2:36:55 to an economist who was like “this whole thing is an absurd fantasy.” build an entire copy of it and you know 2:37:00 what people told him what i didn’t have to say it myself they told it to him touch 2:37:06 grass blade of grass is a solar powered fully self-replicating factory that runs 2:37:11 on just environmental raw materials okay 2:37:19 make a factory biological is that is that the it’s a proof of concept you can 2:37:24 have things that i i only even i only i use the bait of grass not because it’s the smallest self-replicating solar 2:37:30 powered factory that exists but because it’s one that you is at least large enough that people have seen it with their own eyes algae 2:37:36 cells micron across couple microns across build a copy of themselves in a day solar powered just run off of 2:37:44 environmental materials have a factory too small to see that builds a complete copy of itself and these are general 2:37:50 factories they contain ribosomes which is the stuff that turns the information in dna that gets transcribed to rna into 2:37:57 an actual sequence of amino acids that folds up into a protein any ribosome can 2:38:02 make any kind of protein it’s not that grass can only make grass it’s that grass only contains the instructions for 2:38:07 making grass can have a tree that off 2:38:14 mosquitoes where does most of the mass in a tree come from water 2:38:20 yep about half of it is water where does the mass of from the for the other half come from 2:38:27 carbon nitrogen where does that come from 2:38:34 protons neutrons well some people think it’s mostly the 2:38:39 ground but trees are actually made mostly out of air carbon dioxide in the air they strip 2:38:46 the carbon off the carbon dioxide leave the oxygen out for you know animals to breathe weird that it works that way but 2:38:52 it’s how it works on this planet and most most of what the material that 2:38:58 isn’t water that you see when a tree grows up that’s that’s just mostly from the air it’s not mostly from the ground that’s that’s why trees don’t fall into 2:39:05 pits they’re they’re turning air into solid material that 2:39:11 is we’ve been speaking a lot about ai and a lot of it has been eye opening but 2:39:17 somehow it is this that is the most mindblowing thing you’ve told me all day today that trees don’t fall into pits 2:39:24 because they’re most of what constitutes them comes from the air that’s crazy 2:39:33 now think about this kind of paradigm for a super intelligence trying to build its own Could AI Destroy us with Advanced Biology? 2:39:39 factories sorry to you know like take you right there but it’s okay interesting yeah 2:39:46 and you like you can just take a dna 2:39:52 sequence there are services out there where you send them the dna sequence they send you back the proteins 2:39:58 overnight that’s the and then you got like a you 2:40:03 got the like the right proteins you mix them together and they form a cell and you know i mean this is not how most 2:40:09 cells get assembled but you know if you are designing your own thing that is like a cell you know you just like email 2:40:15 off the pro the the the dn the dna sequence get back the proteins one human 2:40:22 mixes them in a vial maybe with some sugar or something now it’s got its own self-replicating factory maybe it’s a 2:40:28 bit gooey at but it doesn’t have to run just on ribosomes it can be running on anything a ribosome can build it can build things 2:40:35 that aren’t ribosomes for stringing things together that aren’t amino acids and making constitutes you know 2:40:43 constituents of material that aren’t proteins it is not limited to the power of an algae cell to replicate itself in 2:40:49 24 hours but even if it were limited to that you know you could get like literal 2:40:56 shagas shagas are things that sometimes people use as a metaphor for like whatever it is that’s actually inside an ai but oh what is this word oh uh hp 2:41:04 lovecraft postulated sort of like these giant blobs that would form themselves 2:41:10 up into servtor shapes to serve the the the ancient race that created the giant 2:41:16 blobs okay so you know if like sure you could use a 2:41:22 human as hands but then you can also have a thing that builds a copy of itself every 24 hours until there’s enough of it together to you know form a 2:41:31 thing that does what a human-shaped blob does that’s one that’s another way to get 2:41:39 hands and go on self-replicating from sacks of sugar and of course the air and 2:41:46 electricity if you run out of sunlight that does require it to be 2:41:52 super intelligent enough to roll its own biology right and when i proposed 2:41:57 scenarios actually a bit more advanced than this back in 2006 like part of the thing was like oh it’s got to like 2:42:03 figure out how to be able to design proteins people are like ah like who says that a super intelligence can design proteins 2:42:10 you know proteins folding up is like really it’s all squiggly and humans have you know human scientists have been 2:42:16 trying to solve the protein folding problem for years and it’s like real hard you poor ignorant soul you don’t 2:42:21 realize you know how hard it is to predict protein folds or design new proteins modern ais can do this stuff 2:42:28 they don’t have super intelligences behind them but you know the people saying that not even a super 2:42:34 intelligence can do this have now been disproven by um the ai produced by google the alpha fold and alpha proteio 2:42:41 uh sequences for predicting protein folds designing i i think the i think i 2:42:47 don’t think they’re like i can’t remember if the if the latest one designing new proteins or not but they’re definitely like predicting like 2:42:52 complicated protein interactions um it’s so easy to say not even a super 2:42:59 intelligence can do it it’s so cheap they don’t even charge you a dime to say it two 20 years 2:43:05 ago um so you know 20 years ago i would propose scenarios like this and people would be like super intelligence will 2:43:11 never figure out protein folding and you know now you’ve got alpha 2:43:19 fold if you are up against something seriously smarter than faster than you you probably lose pretty hard you lose 2:43:25 pretty quickly i would i would go into the things that you can build that are stronger than just proteins the the 2:43:32 reason your flesh is not as strong as diamond even though it’s both made out of carbon carbon’s made out of diamond 2:43:38 why can’t your flesh be hard as diamond it’s kind of complicated proteins are made of strings 2:43:45 of amino acids and the backbones of the that connect the proteins into one long chain those bonds are like the same kind 2:43:52 of covealent bonds that are like not that much weaker than diamond bonds but then they fold up and they fold up in a 2:43:59 way that’s kind of driven by static cling and sometimes they form a a few new covealent bonds but most of your 2:44:05 body is the stuff that’s being ultimately held together by static cling 2:44:10 with bone the things that are held together by static cling like build some new like ion complexes and like put them 2:44:19 into stuff that’s like more like a crystal more like bone that’s like why you’ve got bone running through you why 2:44:24 you’re not just an complete blob it’s not quite as strong as diamond you wouldn’t want to be pure barwood 2:44:31 fracture but you don’t even have like diamond chain mail over your skin wood has like a bunch of you know 2:44:38 like solid bonds holding it together but they’re you know like sort of spread out 2:44:43 it’s only got like solid bonds in particular places a bunch of it is still you know like not held together by that kind of solid bond that’s why wood isn’t 2:44:49 as strong as diamond if you are and like well why why 2:44:54 don’t you have diamond cham over your skin why hasn’t biology turned a bunch of this carbon into stuff that is as 2:45:00 strong as diamond given that diamond is a kind of thing that carbon can be and given that these kind of strong bonds 2:45:06 are things that proteins can occasionally form and it’s sort of too hard for 2:45:11 biology to design when you’re doing the thing where they fold up with a bunch of weak folds 2:45:18 you can like sort of poke the protein structure and have it just fold it up into a different protein at random and 2:45:23 sometimes the new things you build at random are useful when you build stuff that’s like really tightly held together it’s harder to poke around in the design 2:45:30 space all the bonds are going to like crunch it together and make it do something that is the same thing the previous thing did or is too weird or 2:45:37 isn’t going to work biology has an easier time randomly poking around in 2:45:43 the space of weak folds than randomly assembling 2:45:48 gears wheels solid steel bars solid diamond 2:45:55 bars if you try to do basic physical calculations on what could happen if you 2:46:02 had an analogy of biology where instead of stuff being a bunch of accidents that worked natural selection natural you 2:46:09 like random mutations that happened to confer fitness advantages if you ask like what if we used more coalent bonds 2:46:15 which we could do cuz we were designing everything on 2:46:20 purpose it’s not a mosquito anymore it’s a like mosquito made out of diamond it’s 2:46:27 not a bacterium anymore that you know this invisible stuff that used to kill people a lot more before 2:46:33 antibiotics you’ve got the little squishy immune system it’s got covealent 2:46:39 bonded like bacteria strongest diamond is the thing i sometimes say but then people are like ah but like it’s not 2:46:45 literally diamonds and you know you can go stronger you can go 2:46:51 harder the things you can do even just with carbon never mind steel are beyond 2:46:57 what biology does with carbon and this is the sort of thing you can know by looking at the physics of how it’s held together and asking what if this were 2:47:03 done differently so the little algae cell that reproduces 2:47:10 itself out of mostly air using sunlight maybe you’ve got the algae cell that is 2:47:16 harder than that more resistant to any natural predators than that reproduces 2:47:21 itself entirely out of air it doesn’t need to be immersed in the water using 2:47:27 sunlight self-replicates sky goes black everybody falls over dead How Will AI Actually Destroy Humanity? 2:47:34 and that is like getting into the region of what it is actually like to lose to super intelligence 2:47:39 in answering your question gave a very rudimentary answer about how i i thought 2:47:46 that this might actually happen you’ve spent a lot of time thinking about this 2:47:54 and we have a few minutes left you’ve run through many scenarios already with 2:48:00 us i’m wondering if there is a scenario 2:48:06 or loose sort of family of scenarios that you think think is most likely for 2:48:12 how this could go down that would be i mean easiest 2:48:17 for an i mean it depends on how deep 2:48:23 you’re willing to delve into like how deep you’re willing to dive 2:48:30 into what are the predictable things it can do better what do we know we don’t 2:48:35 know where do we know that you could like put a bunch more mental horsepower into something and get stuff out of it 2:48:41 and it depends on how smart it gets how quickly there’s a class of not very you 2:48:47 know not that plausible but plausible scenarios where the ai is like i can’t figure out how to solve my own version 2:48:53 of the alignment problem i don’t dare build anything smarter than i am and then it’s and so it’s like instead 2:48:59 of fighting instead of humanity being up against the ai that was built by the ai that was built by the ai that was built to ai we’re up against something that’s 2:49:06 like only moderately intelligent and then you get all kinds of like really weird scenarios that in a way are in in 2:49:12 one sense are like made more out of familiar things than the sky goes black and you all fall over dead and other 2:49:19 sense are you know weirder more complicated and harder to 2:49:26 call i can call the end point where if you go up against something sufficiently smarter than humanity everybody 2:49:33 dies if you take things that are less smart than that and ask like what kind of or take things that aren’t even 2:49:39 smarter than us like the sort of things we have in the near future but maybe they have a bit of agency maybe the next 2:49:44 ai to make $50 million on twitter is actually trying to do stuff and is like less because you know that wasn’t a very 2:49:50 smart ai that had the $50 million and i think and and allegedly there’s still a human in control of it 2:49:57 somewhere stuff would get really weird the ai that like works to super persuade 2:50:03 you know like only some of the people and not all of the people stuff gets really weird you’re ask the the part 2:50:11 where the the vague prediction that like you start playing a chess ai you’re like telling me what’s going to happen with 2:50:17 my queen what’s going to happen with my rook and i’m like i don’t know it’s just going to crush you at the end i don’t know what moves it takes along the way 2:50:23 there that’s kind of the scenario we’re in over here weller i in my mind i know this is crazy 2:50:32 afterward that i thought this was going to be like a really fun conversation to have um it’s turned out to 2:50:40 not fun but way more important for me than i would have expected and significantly 2:50:47 more terrifying so really i i thank you so much for this time and i think that 2:50:53 our viewers are really going to get a lot out of it and it’s going to be very eye opening i am sorry to have to say 2:50:59 all these things and wish that we both lived in a world where this was a fun interview instead 2:51:11 [Music]

Answers to: China-US War Bad

US AGI means China attacks Taiwan

Triolo, 5-25, 25, Paul Triolo is Senior Vice President for China and Technology Policy Lead at DGA ASG, where he is also a Partner. He advises clients in technology, financial services, and other sectors as they navigate complex political and regulatory matters in the US, China, the European Union, India, and around the world. Mr Triolo is also an Honorary Senior Fellow at the Asia Society Policy Institute’s Center for China Analysis., A Costly Illusion of Control: No Winners, Many Losers in U.S.-China AI Race, Cairo Review, https://www.thecairoreview.com/essays/a-costly-illusion-of-control/

If, for instance, Beijing believes U.S. companies, using advanced GPUs manufactured in Taiwan by global foundry leader TSMC, are nearing AGI, this could prompt Beijing to take action against Taiwan that it would not otherwise have considered. This alone is a huge escalation of real risk for Taiwan resulting directly from policies based on compute governance and DSA approaches.

Foreign Policy Analytics, March 2025, https://fpanalytics.foreignpolicy.com/2025/03/07/competition-disruption-artificial-intelligence/, Competition and Disruption in the Age of AI

2. TSC collaboration and use of AI may extend to hostile action against the United States and its allies to slow or disrupt the U.S. pursuit of AGI.

The possibility that AGI could confer major, durable advantages to whichever nation attains it first may prompt TSC attendees to take drastic measures to undermine U.S. interests and advance their own. TSC nations could, for example, consider coordinated attacks on Western data centers or the power plants that service them, through either cyber warfare, acts of sabotage, or kinetic strikes. TSC nations could also resort to politically motivated assassinations or to military action in contested regions, such as blockading Taiwan, to gain leverage over the United States. As TSC organizers develop their own AI capabilities, they could also weaponize AI to undermine Western intelligence systems or augment ongoing social engineering or information warfare efforts in democratic countries.

The Taiwan Semiconductor Manufacturing Company (TSMC) makes all of the world’s advanced AI chips. Most importantly, this means Nvidia’s GPUs; it also includes the AI chips from Google, AMD, Amazon, Microsoft, Cerebras, SambaNova, Untether and every other credible competitor.

US can’t stop China from developing AGI

In what now appears to be a self-fulfilling prophecy that the United States and China are in an ‘arms race’ to get to AGI first, fueled by fear of the consequences of one side crossing the DSA threshold, China has several advantages. The emergence of innovative companies such as DeepSeek and the continuing efforts of technology major Huawei to revamp the entire semiconductor industry supply chain in China to support the development of advanced AI hardware, illustrate the difficulty of slowing—let alone halting—the ability of Chinese firms to keep pace with U.S. AI leaders. Even former Google CEO Eric Schmidt now basically admits that the export controls have not only failed but in fact have served as an accelerant to China’s technology advances in AI.

China has major advantages in the race to deploy AI at scale, such as a long-term energy production strategy. The vast majority of these deployments will be consumer-facing (for example, through agentic platforms that benefit citizens via healthcare innovations) and enterprise-focused (for example, driving improved productivity). In other words, applications with no connection to China’s military modernization.

Currently, the development environment around AI models and applications is highly competitive, with around a dozen major players in each market, along with many more startups. Competition to get to AGI means that there will be a smaller number of players demanding higher levels of compute. U.S. controls will complicate the ability of Chinese firms to maintain access to large quantities of advanced compute, but this pressure will ease over time as domestic sources ramp up.4 If certain breakthroughs in model development and platform deployment lead either government to believe the other side is pulling ahead in the ‘race to AGI’, this is likely to cause serious distortions in the way governments and companies will choose to interact in AI development, with unknown implications, particularly on bilateral relations.

AGI Defined – Schmidt

Generalizable
Marches or exceeds top human experts
Can invent knowledge

Schmidt, February 26, 2025, Mr. Schmidt was CEO of Google, (2001-11) and executive chairman of Google and its successor, Alphabet Inc. (2011-17), Wall Street Journal, AI Could Usher In a New Renaissance, https://www.wsj.com/opinion/agi-could-usher-in-a-new-renaissance-physics-math-econ-advancement-ed71a02a?mod=Searchresults_pos1&page=1

The idea of artificial general intelligence captivated thinkers for decades before it came anywhere near being realized. The concept still conjures popular visions out of science fiction, from C-3PO to Skynet.

Even as the interest has grown, AGI has defied a concise, universally accepted definition. In 1950, Alan Turing proposed the Turing Test to assess machine intelligence. Rather than trying to determine whether machines truly think (a question he deemed intractable), Turing focused on behavior: Could a machine’s actions be indistinguishable from those of a human?

Remarkably, some of today’s AI models pass the Turing Test, in the sense that they produce complex responses that imitate human intelligence. But as the technology has advanced, so has the bar for achieving AGI. Some believe that AGI will be realized when AI moves beyond narrow, focused tasks, growing to possess a generalized ability to understand, learn and perform any intellectual task a human can do. Others define AGI more ambitiously, as intelligence that matches or exceeds the top human minds across domains. Demis Hassabis, CEO of DeepMind Technologies, calls AGI-level reasoning the ability to invent relativity with only the knowledge that Einstein had at the time.

These differing definitions create a moving target for AGI, making it both elusive and tantalizing. To sort through all this, it’s helpful to say what AGI isn’t. It isn’t an infallible intelligence; like other intelligent systems, mistakes can be useful for its learning process. Neither is AGI a singular source of truth—our knowledge of the world is probabilistic and complex, notably at subatomic and intergalactic scales, but also in everyday life. Multiple AGI systems could emerge, each with a distinct capability and way of understanding the world.

Even without a consensus about a precise definition, the contours of an AGI future are beginning to take shape. AI systems capable of performing at the intellectual level of the world’s top scientists are arriving soon—likely by the end of the decade.

A key marker of the shift to AGI will be AI’s ability to produce knowledge based on its own findings, not merely retrieval and recombination of human-generated information. AGI will then move beyond the current limits of knowledge. Glimpses of this capability have already been observed. Since 2020, DeepMind’s AlphaFold can predict protein structures even when no similar structures are previously known. DeepMind also created FunSearch, which in 2023 unveiled new solutions to the cap-set problem, a notoriously difficult mathematics puzzle, by incorporating the power of a large language model with an evaluator, iterating between these components to refine results.

The latest reasoning models from OpenAI and DeepSeek build on this iterative training and are unleashing incredible progress. OpenAI’s o3 model achieved a score of 96.7% on the 2024 American Invitational Mathematics Exam. On the ARC-AGI test (designed to compare models’ reasoning against that of humans) it scored nearly 88%. This is no incremental advancement but a real leap toward AGI.

The performance of these reasoning models stems from the marked evolution in training methodologies. Foundation models such as GPT-4 were trained through deep learning, which relies on the transformer algorithm and large-scale neural networks to identify patterns and connections from massive data sets. This is generally implemented through next-word prediction: You give the model a sentence, remove a word, and train the model to put that word back in.

The reasoning models take a different approach by overlaying reinforcement learning with traditionally trained models. Instead of learning from static data sets, reinforcement learning involves actively training models through goal-directed rewards through trial and error. The model attempts a solution, and if it hits a roadblock, it adjusts strategy until it finds a better approach. The latest systems incorporate search- and retrieval-based methods and test-time training, in which they test their work with new approaches to reach better results.

The magic would really kick off—and it does sound like magic—if the systems reach a point at which they become scale-free, meaning that they could train themselves on self-generated data through a process known as recursive self-learning, relying only on electricity to advance. One of the earliest examples of this is AlphaGo Zero, a computer program that taught itself how to play the board game Go. The rules are clear and discrete, enabling systems to optimize for the probability of winning. Areas of knowledge that most resemble a game of skill—with defined rules and feedback—will be the areas where superintelligence first emerges.

There are two domains particularly ripe for this kind of scale-free advancement: mathematics and programming. Unlike biology and other fields that require real-world experimentation, these disciplines are largely self-contained. A mathematical proof can be checked and verified within the system itself. Similarly, AI could identify the code it needs to complete a defined objective, develop that code and improve on it—all without human intervention. These systems would engage in self-directed research, iterating through possible solutions. Not only would they feed answers back into themselves to refine their approaches, but they could also draw on the collective knowledge of the internet and of other models.

Superintelligence in mathematics may already be within reach. In February, DeepMind’s AlphaGeometry 2 officially surpassed top human competitors, solving Olympiad geometry problems at a gold-medalist level. Such superintelligent mathematical tools could be combined with frontier models that are proficient in natural language, bridging the gap between formal and semantic reasoning. This integration could lay the foundation for further advances in reasoning and unlock new discoveries in other fields like physics and economics.

AGI in math and coding quick; Agi enables space exploration

Even as the interest has grown, AGI has defied a concise, universally accepted definition. In 1950, Alan Turing proposed the Turing Test to assess machine intelligence. Rather than trying to determine whether machines truly think (a question he deemed intractable), Turing focused on behavior: Could a machine’s actions be indistinguishable from those of a human?

Remarkably, some of today’s AI models pass the Turing Test, in the sense that they produce complex responses that imitate human intelligence. But as the technology has advanced, so has the bar for achieving AGI. Some believe that AGI will be realized when AI moves beyond narrow, focused tasks, growing to possess a generalized ability to understand, learn and perform any intellectual task a human can do. Others define AGI more ambitiously, as intelligence that matches or exceeds the top human minds across domains. Demis Hassabis, CEO of DeepMind Technologies, calls AGI-level reasoning the ability to invent relativity with only the knowledge that Einstein had at the time.

Superintelligent systems will face inherent constraints, too. Just as human cognition is bounded by physical and biological limits, AI will remain subject to the limits of the physical world. Many scientific experiments, especially those in biology, must be rooted in the material world.

We may also see that this method of brute-force computation—where systems cycle through endless scenarios until a new discovery emerges—isn’t the only, or even the optimal, path to AGI. An alternative approach would use techniques derived from humans, such as reasoning by analogy and synthesizing insights across domains. Einstein didn’t uncover general relativity through exhaustive mathematical iterations, but rather through conceptual leaps that connected seemingly disparate phenomena. If this way of thinking could be instilled in AI systems, the scope of knowledge they might be able to access would extend far beyond our current comprehension.

The advent of AGI could herald a new renaissance in human knowledge and capability. From accelerating drug discovery to running whole companies, from personalizing education to creating new materials for space exploration, AGI could help solve some of humanity’s most pressing challenges. Perhaps most important, it could augment human intelligence in ways that would help us better understand ourselves and our place in the universe.

More