Inkhaven
I'm attending the Inkhaven writing residency during April 2026.
During that time I’ll be publishing daily, discussing rationality and writing as well as my usual AI topics. The most AI-relevant pieces will also be on the main writing page.
Agricultural Bioweapons, Part One
Although I’ve long worried about AI biorisk, I’ve come to realize that I was underestimating the scope of the problem. Like most people, I’d equated biorisk with bioterrorism: the potential for AI to empower a nihilistic group or individual to unleash a doomsday plague. I still worry about that, but a recent article by Abishaike Mahajan made me realize that we also need to worry about AI-enabled agricultural bioweapons.
Today’s Inkhaven post is a preview of the Mythos content from tomorrow’s newsletter.
This week’s big story is the limited release of Claude Mythos Preview. The headline is that Mythos is alarmingly good at cybersecurity, with the ability to find and exploit critical vulnerabilities en masse. Anthropic is handling that as responsibly as one could hope for, but the next year or two will be challenging for security. If you haven’t already, now would be a good time to review and improve your personal security practices.
Cybersecurity isn’t the only story here: Mythos is probably the first of the next generation of much larger models. Early data suggest it represents another acceleration of the rate of capability progress, although that’s hard to assess while it’s still in limited release. And from a safety perspective, Anthropic says this is both the most aligned model they’ve ever created and the most dangerous.
I’ve found the current discourse about AI and math deeply confusing: for those of us who aren’t mathematicians, it’s hard to figure out what’s hype and what’s substantive. Does solving an Erdős problem represent a meaningful breakthrough, or does it just mean the AI tracked down a previously published answer to a problem nobody ever cared enough about to investigate?
The answer turns out to be complicated but interesting: frontier models are impressively good at math—and getting better fast—but they’re a long way from putting mathematicians out of work. In many ways, math is like coding: AI is getting quite good at doing many of the mundane things that mathematicians spend their time doing, but it lacks the taste and high-level understanding required to do genuinely novel work.
I expect it’ll take another week or two for everyone to fully digest the significance of Claude Mythos Preview. In the meantime, here are my initial thoughts.
I see a lot of AI safety strategies that don’t fully engage with the complexity of the real world—and therefore are unlikely to succeed in the real world.
To take a simple example: many strategies rely heavily on government playing a leading role through regulation and perhaps even nationalization. That’s a reasonable strategy in the abstract, but the recent conflict between DoW and Anthropic raises serious questions about the real-world viability of that approach. Too many people are stuck thinking about some idealized government they’d like to have, rather than the government we actually have in 2026.
My thinking about AI safety strategy is anchored by six foundational beliefs about the world in which that strategy has to operate:
- Timelines are probably short
- Many open questions have been resolved
- The future is high variance
- We need a portfolio of strategies
- It’s all about the game theory
- Expect tough tradeoffs
Writing With Robots, Part Three
This is the final part of my series about how I use AI as an editor. It covers my voice, how to assess each essay as a whole, and details about my writing style. I also include detailed information about bad habits I’m trying to break, and a final checklist for Claude to use when evaluating a piece.
I’m publishing this as its own piece for Inkhaven, but you should probably just read the final essay, which combines all three pieces.
This is the second of three pieces about how I use AI as an editor. Part One discussed the purpose of my style guide and explained Claude’s role and the goals of my writing. Today I’ll walk through the high-level goals laid out in the style guide. Part Three will discuss voice and low-level stylistic decisions.
Cybersecurity capabilities have crossed a threshold: frontier models can now find important vulnerabilities at scale. Open source projects are being deluged with high-quality bug reports, and we’re seeing an increasing number of serious exploits in the wild. Today’s capabilities are already alarming, but we’re also seeing rapid progress, with doubling times of less than six months. Things are moving fast, and we’re beginning to run out of useful benchmarks.
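As a back-of-the-envelope illustration of that pace, here’s a tiny compound-growth sketch. The sub-six-month doubling time is the only figure taken from above; the horizons are arbitrary, and this is illustrative arithmetic rather than a forecast.

```python
# Back-of-the-envelope: if a capability metric doubles every 6 months,
# how much does it grow over a given horizon? Pure compound-growth
# arithmetic; the 6-month doubling time is the only input from the text.

def growth_factor(months: float, doubling_time_months: float = 6.0) -> float:
    """Multiplicative growth over `months`, given a fixed doubling time."""
    return 2.0 ** (months / doubling_time_months)

for horizon in (6, 12, 24):
    print(f"{horizon:>2} months -> {growth_factor(horizon):.0f}x")
# Prints: 6 months -> 2x, 12 months -> 4x, 24 months -> 16x
```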
Many key decision makers have powerful incentives to favor rapid AI development even if that entails a significant risk of human extinction. Therefore, any pause strategy that relies on convincing those people that rapid AI development is dangerous is doomed to failure.
In the AI safety community, I see lots of good discussion of why a pause would be a good idea, how it might be implemented, and how to convince people that it would be a good idea. But I’d love to see more engagement with the game theory of AI politics. Achieving a useful pause requires overcoming some perverse incentives that don’t get enough attention.
How to Watch an Intelligence Explosion
The cleanest metric for understanding the rate of recursive self-improvement (RSI) is the AI Futures Project’s R&D progress multiplier, which measures how much AI is speeding up its own development. It’s the right tool for measuring an intelligence explosion, but it doesn’t tell us which capability thresholds carry the greatest risk from misaligned AI.
Ajeya Cotra steps into that gap with an elegant taxonomy of six milestones for AI automation. Together, those two concepts let us measure how fast RSI is proceeding, how close we are to a fully automated economy, and when a misaligned AI would be most likely to betray us.
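For intuition about the multiplier itself, here’s a minimal sketch. The simple ratio definition, the function name, and the example numbers are my own illustration, not the AI Futures Project’s actual methodology.

```python
# Toy model of an R&D progress multiplier: the ratio of research progress
# achieved with AI assistance to progress without it. This simple ratio is
# an illustration, not the AI Futures Project's actual methodology.

def rd_progress_multiplier(progress_with_ai: float,
                           progress_without_ai: float) -> float:
    """E.g. 3.0 means AI-assisted R&D moves three times as fast."""
    return progress_with_ai / progress_without_ai

# Hypothetical numbers: 18 "units" of research progress per year with AI
# help versus 6 without; the units cancel, only the ratio matters.
print(f"{rd_progress_multiplier(18.0, 6.0):.1f}x")  # -> 3.0x
```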
My AI editor is essential to my writing flow and has made me a stronger and more consistent writer. I get a lot of questions about my setup, so I’m going to talk about how I think about the role of AI, how I set up my editing workflow, and how to set up your own editor. Not sure if that would be useful to you? The final section of this post is the feedback Claude gave me on my first draft, so you can assess for yourself.
Ezra Klein Interviews Jack Clark, Part 1
Ezra Klein and Jack Clark? Shut up and take my money.
Jack always has interesting thoughts about the larger social impact of AI as well as the trajectory of the frontier models. The whole interview is great, but I want to focus on six topics I found especially interesting and/or surprising:
- Model personality
- Claude’s moral preferences
- Excellent but awkward life advice
- Jobs and employment
- Public policy
- Where we’re headed
Does the Future Need Programmers? Part 1
There’s a common concern that AI may break the programmer pipeline, with junior developers becoming unemployable but senior developers more in demand than ever. I think that’s unlikely: if AI replaces junior developers, it will soon replace their senior colleagues as well.
