Writing
Longer essays exploring the AGI transition: what’s happening today, what’s coming next, and what do we do about it?
China Still Trails the US on Existential Risk
None of the US frontier labs manage existential risk as well as one would hope, but the Chinese labs are all doing significantly worse. And although China has a well-developed system of AI “safety” regulation, it focuses on political control and mundane harms rather than existential risk.
There’s some reason for optimism: the Chinese labs have been making progress on safety, and there’s some evidence that the government might be starting to pay more attention.
I spend several hours a day trying to keep up with what’s going on in the parts of AI that I’m interested in. It’s a ridiculous amount of work: I don’t recommend it unless you’re doing something silly like writing a newsletter about AI.
But if you’d like to keep up with AI without spending your entire life on it, I have advice about who to follow. My recommendations center on the areas I’m most interested in: AI safety and strategy, capabilities and evaluations, and predicting the trajectory of AI.
Don’t Cut Yourself on the Jagged Frontier
A conversation with a friend on the bus to Bodega Bay today made me realize that there are some holes in my thinking about safety and superintelligence. I’ve assumed that superintelligence is by definition robustly better than humans at all the things, but that assumption doesn’t always hold.
Without further ado, for your edification and discomfort, The Strawman Players present:
A Disquieting Conversation on a Bus
I expect it’ll take another week or two for everyone to fully digest the significance of Claude Mythos Preview. In the meantime, here are my initial thoughts.
I see a lot of AI safety strategies that don’t fully engage with the complexity of the real world—and therefore are unlikely to succeed in the real world.
To take a simple example: many strategies rely heavily on government playing a leading role through regulation and perhaps even nationalization. That’s a reasonable strategy in the abstract, but the recent conflict between DoW and Anthropic raises serious questions about the real-world viability of that approach. Too many people are stuck thinking about some idealized government they’d like to have, rather than the government we actually have in 2026.
My thinking about AI safety strategy is anchored by six foundational beliefs about the world in which that strategy has to operate:
- Timelines are probably short
- Many open questions have been resolved
- The future is high variance
- We need a portfolio of strategies
- It’s all about the game theory
- Expect tough tradeoffs
My AI editor is essential to my writing flow and has made me a stronger and more consistent writer. I get a lot of questions about my setup, so I’m going to talk about how I think about the role of AI, how I set up my editing workflow, and how to set up your own editor. Not sure if that would be useful to you? The final section of this post is the feedback Claude gave me on my first draft, so you can assess for yourself.
How to Watch an Intelligence Explosion
The cleanest metric for understanding the rate of recursive self improvement (RSI) is AI Futures Project’s R&D progress multiplier, which measures how much AI is speeding up its own development. It’s the right tool for measuring an intelligence explosion, but it doesn’t tell us which capability thresholds carry the greatest risk from misaligned AI.
Ajeya Cotra steps into that gap with an elegant taxonomy of 6 milestones for AI automation. Together, those two concepts let us measure how fast RSI is proceeding, how close we are to a fully automated economy, and when a misaligned AI would be most likely to betray us.
Contra Anil Seth on AI Consciousness
There’s broad (though not universal) agreement that present-day AI is probably not conscious, but very little agreement about whether consciousness is likely to emerge as we move toward AGI. This isn’t an abstract question: AI consciousness has major implications for alignment. Further, a conscious AI might have moral rights that complicate our ability to control it, put it to work, or turn it off.
The debate about AI consciousness has two factions:
- Biological naturalists believe that consciousness is deeply coupled to neurobiology and cannot readily be replicated by a computer.
- Computational functionalists believe that consciousness is the result of computation, which can be performed by a computer just as well as by a brain.
Many biological naturalists argue that because consciousness is inextricably linked to neurobiology, AI consciousness is highly improbable. I’m here today to argue that they’re wrong: biological naturalism may be correct, but the arguments in favor of it aren’t nearly strong enough to confidently rule out AI consciousness.
People have lots of opinions about Anthropic’s Super Bowl ads making fun of the new ads in ChatGPT.
While I agree that the ads were somewhat unfair, they raise valid concerns about OpenAI’s direction. What OpenAI is doing today is entirely ethical, but there are very strong incentives for them to become less ethical over time. Unfortunately, the history of tech doesn’t make me optimistic about their long-term trajectory.
A Closer Look at the “Societies of Thought” Paper
Today I’m going to take a deep dive into an intriguing paper that just came out: Reasoning Models Generate Societies of Thought by Junsol Kim, Shiyang Lai, Nino Scherrer, Blaise Agüera y Arcas and James Evans. Here’s how co-author James Evans explains the core finding:
“These models don't simply compute longer. They spontaneously generate internal debates among simulated agents with distinct personalities and expertise—what we call "societies of thought." Perspectives clash, questions get posed and answered, conflicts emerge and resolve, and self-references shift to the collective "we"—at rates hundreds to thousands of percent higher than chain-of-thought reasoning. There's high variance in Big 5 personality traits like neuroticism and openness, plus specialized expertise spanning physics to creative writing.”
Wearable AI Pins: I’m Skeptical
AI-focused personal devices are back in the news: OpenAI has announced that they’re working on new devices with Jony Ive, and rumor has it that Apple is working on something similar.
I love gadgets and I love AI, and I’m very open to the idea that an AI-first device would look very different from anything that currently exists. But I’m deeply skeptical about the pin form factor.
