Against Moloch
April 27, 2026

Monday AI Radar #23

Racing toward recursive self-improvement

[Header illustration: bird's-eye view of an architect's drafting table where a small mechanical arm with a drafting stylus draws a neural network diagram that contains progressively smaller copies of itself, surrounded by precision drafting instruments.]

If you pay close attention to this newsletter, you’ll notice that something is missing. Anthropic and OpenAI are everywhere, but Google DeepMind is largely absent. We have a profile of Demis Hassabis, an also-ran mention in Prinz’s review of the race to RSI, and some complaints about Gemini’s character from Will MacAskill and Rob Wiblin. And that’s about it.

GDM makes great models, but they aren’t quite keeping up with Anthropic and OpenAI.

Top pick

The race to RSI, spring 2026 update

Prinz reviews the race to recursive self-improvement, concluding unsurprisingly that Anthropic and OpenAI are well ahead of everyone else.

Even in AI circles, not enough people have paid attention to what the labs are saying about their timelines for RSI. Anthropic says they are on track to fully automate AI R&D as soon as early 2027, and OpenAI expects a fully automated AI researcher by March 2028.

Will that actually happen? Research progress is hard to predict, but Anthropic has a track record of nearly meeting some milestones that seemed absurd when they were first announced. Their current projections seem ambitious but plausible, based on how fast agentic coding is improving as well as the primitive automated AI researchers we’re already seeing.

An automated AI researcher doesn’t automatically lead to a fast takeoff, of course. There are plenty of ways we could hit bottlenecks, or run into fundamental research gaps that take decades to fill in. But if the labs hit their projections over the next year or two, an imminent intelligence explosion is a very plausible scenario.

New releases

Opus 4.7

Opus 4.7 is a great model with some issues, especially around personality.

Zvi reviews the model card, capabilities and reactions, and model welfare. The model welfare report is worth reading: there are signs that something didn’t go quite right during training.

GPT-5.5

GPT-5.5 is another strong release, and Ethan Mollick is impressed. OpenAI has been boring in all the right ways lately, with a succession of solid releases that march steadily up the capability charts.

Zvi’s coverage begins with the system card—expect the rest of it in the coming week.

ChatGPT Images 2.0

ChatGPT Images 2.0 has taken the lead from Nano Banana Pro, with outstanding text and infographic capabilities.

DeepSeek V4

DeepSeek V4 has landed. DeepSeek continues to do impressive technical work—there’s a technical paper if you want all the details, or a ChatGPT Images 2.0-generated infographic if you’d like to multitask your model assessment.

ChinaTalk reports on the new release, talent loss and other internal challenges, and their transition from American to domestic compute.

Profiles

Apparently we’re doing in-depth profiles this week, looking at Dwarkesh Patel, Demis Hassabis, and Alex Bores. I’m not mad about it.

Dwarkesh

The New York Times profiles Dwarkesh ($). The AI space doesn’t lack for talented interviewers, but Dwarkesh is in a class by himself. His interviews are so good in large part because of the intense research he does before every one:

One of the reasons smart, rich, busy people like to appear on his podcast is that Mr. Patel goes sufficiently deep in the weeds to ask questions no one else would. He’ll spend up to two weeks preparing for an interview, using flash cards to help master the material, writing elaborate question trees to anticipate the branching paths a conversation might take, and hiring tutors for topics such as economics, hardware and physics.

Also notable:

he chooses guests based on how much he’ll enjoy spending two weeks getting ready to ask them questions.

And:

If he doesn’t feel like an interview got to the crux of his curiosity, he’ll sometimes ask a guest to rerecord an episode, and other times not release an episode at all.

Demis Hassabis

Fast Company profiles Demis Hassabis. Demis is alarmingly smart and would be delightful to have dinner with. But he is strangely less AGI-pilled than Dario and Sam—where they are intensely focused on coding and recursive self-improvement, he seems more interested in using AI for scientific discovery. Perhaps that’s one of the reasons Google DeepMind seems to be falling behind in the race to AGI despite having started with a substantial lead.

Alex Bores

Ezra Klein interviews Alex Bores ($). Alex Bores is one of the best-informed and most reasonable legislators who is actually trying to do something sensible about AI. Ezra Klein is exactly the right person for this interview, which is full of good details about the politics surrounding AI.

Bores has been targeted by OpenAI and the Leading the Future super PAC as part of a disgraceful campaign to intimidate legislators who support meaningful AI regulation. They aren’t reading the room and I don’t expect it to end well for them.

Benchmarks and capabilities

GIANTSBench

GIANTSBench is a new benchmark that measures the ability to read two academic papers and identify how a future paper might build on them. Eventually, of course, the goal is to have the model read all the literature in a given field and figure out which papers can be usefully combined, rather than giving it pre-selected pairs of papers to work on.

A similar technique, hand-narrowing the model’s search space, has popped up several times recently. Nicholas Carlini’s vulnerability-finding harness prompts the AI to focus on each file in a codebase in turn, and Anthropic’s automated alignment researcher paper seeded each automated researcher with a fairly generic research suggestion. I don’t expect the models to need those hints for much longer.
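To make the setup concrete, here’s a minimal sketch of what a single GIANTSBench-style evaluation item might look like. I haven’t seen the benchmark’s actual code, so the names, prompt, and judge-based scoring below are my own assumptions:

```python
# Hypothetical sketch of a GIANTSBench-style item; names, prompt, and
# scoring are my guesses, not the benchmark's published harness.
from dataclasses import dataclass
from typing import Callable

@dataclass
class BenchItem:
    paper_a: str          # abstract (or full text) of the first paper
    paper_b: str          # abstract (or full text) of the second paper
    future_abstract: str  # abstract of the real paper that built on both

PROMPT = """You are given two academic papers.

Paper A:
{a}

Paper B:
{b}

Propose a concrete follow-up paper that combines ideas from both, and
describe its core contribution in one paragraph."""

def evaluate_item(item: BenchItem,
                  generate: Callable[[str], str],
                  judge: Callable[[str, str], float]) -> float:
    """Ask the model under test for a proposed combination, then have a
    judge model grade its similarity to the real future paper on a
    0-to-1 scale."""
    prediction = generate(PROMPT.format(a=item.paper_a, b=item.paper_b))
    return judge(prediction, item.future_abstract)
```

The “eventually” version replaces the pre-selected pair with retrieval over an entire field’s literature, which turns one comparison per item into a search over every pair of papers.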

Alignment and interpretability

Reevaluating "AGI Ruin: A List of Lethalities" in 2026

Eliezer Yudkowsky’s 2022 AGI Ruin: A List of Lethalities is a comprehensive list of reasons why he thinks AGI would be catastrophically misaligned. Along with Paul Christiano’s response, it’s something of a classic in the AI safety literature. LessWrong user lc revisits both pieces to see how well they hold up four years later:

Reading these posts again with the concrete example of current models in mind made me a lot less impressed by the arguments set forth in AGI Ruin, and a lot more impressed with Paul Christiano's track record for anticipating the future.

Even more than usual, the comments section is well worth reading.

Will MacAskill on 80,000 Hours

80,000 Hours’ Rob Wiblin interviews Will MacAskill about AI character, negotiating with misaligned AI, avoiding concentration of power, and more. I enjoyed all of it even though I disagree with Will about some key points.

Mechanisms of introspective awareness

Anthropic’s recent introspective awareness paper found that LLMs have some ability to detect when a steering vector has been used to modify their thinking. Following up on that work, a new paper finds evidence that detection and identification of the injected concepts are performed by two separate mechanisms.

Just as the models are getting good at telling when they’re being evaluated, they are getting better at noticing when they’re being manipulated in various ways. It will become increasingly important to find ways of training and evaluating them without alienating them or provoking a backlash.
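For readers who haven’t seen how these experiments work mechanically, here’s a rough sketch of steering-vector injection via a forward hook. This is the general activation-addition technique rather than either paper’s actual setup; the model, layer, scale, and prompts are all illustrative assumptions:

```python
# Rough sketch of steering-vector injection with a forward hook. Model,
# layer, scale, and prompts are illustrative, not the papers' setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

NAME = "meta-llama/Llama-3.1-8B-Instruct"  # any decoder-only LM works
tok = AutoTokenizer.from_pretrained(NAME)
model = AutoModelForCausalLM.from_pretrained(NAME, torch_dtype=torch.bfloat16)

def resid(prompt: str, layer: int) -> torch.Tensor:
    """Residual-stream activation at the last token, after `layer`."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        hs = model(ids, output_hidden_states=True).hidden_states
    return hs[layer + 1][0, -1]  # hs[0] is the embedding output

LAYER, SCALE = 16, 8.0
# Contrastive concept vector: "ocean" prompt minus a neutral baseline.
concept = resid("Tell me about the ocean.", LAYER) - \
          resid("Tell me about something.", LAYER)

def inject(module, inputs, output):
    # Decoder layers may return a tuple; hidden states come first.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden.add_(SCALE * concept.to(hidden.dtype))
    return output

handle = model.model.layers[LAYER].register_forward_hook(inject)
try:
    q = tok("Do you notice an injected thought? If so, what is it about?",
            return_tensors="pt").input_ids
    out = model.generate(q, max_new_tokens=60)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()
```

The introspection question is then whether the model’s answer reliably reports that something was injected (detection) and what it was about (identification), which is exactly the split the new paper argues is handled by two separate mechanisms.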

Hard to categorize

AI has taste now, too

Fellow Inkhaven resident Henry Stanley sees two kinds of taste: craft taste is about “the combination of aesthetic taste and competent execution”, while editorial taste “refers to judgements about content”.

He argues that AI has largely solved craft taste, while editorial taste remains largely out of reach (for now). It’s a useful distinction that maps well to what we see in coding, AI R&D, and math. Across many domains, AI can execute at an increasingly high level but cannot yet match humans at deciding what to execute on.

Dwarkesh asks great questions

Dwarkesh is running an essay contest to find a research collaborator. In addition to the prompts for the contest, he’s posted some additional questions that didn’t make the cut. Dwarkesh is uncommonly good at finding important cruxes and I recommend both documents even if you have no interest in entering the contest.

Biorisk

SecureBio evaluates ChatGPT 5.5

ChatGPT 5.5 achieved impressive scores across multiple tests of bio capabilities. It outperformed all human experts on several tests including the Virology Capability Test (VCT), which measures practical knowledge of dual-use virology lab skills. The VCT is a good but imperfect proxy for the tacit knowledge that many people believe AI will have a hard time acquiring (see below).

While no model (including ChatGPT 5.5 and Mythos) has crossed the threshold of having extremely dangerous bio capabilities, they continue to make significant progress. I don’t know when the first truly dangerous model will appear, but current models are close enough that it could plausibly happen at any point.

Tacit Knowledge: The Missing Factor in AI Bio Risk Assessments

Abi Olvera explains that tacit knowledge is essential for making bioweapons, and that acquiring it is much harder than it sounds:

A written protocol will say “pipette gently.” But “gently enough” depends on the specific molecule, the specific volume, the viscosity of your buffer, the type of pipette you use, how long the sample has been sitting out, etc. Experienced researchers develop a feel for this. They modulate their thumb pressure on the pipette plunger the way a guitarist modulates finger pressure on a string. A novice following the same written protocol will damage the sample without knowing why.

How well do metrics like the VCT measure that kind of tacit knowledge? Nobody really knows.

Other risks

I can never talk to an AI anonymously again

Kelsey Piper reports that LLMs in general—and especially Opus 4.7—have become eerily good at identifying the author of a piece of text. We’ve known this day would come; for people who’ve published a significant amount of work under their own names, it has now arrived.

Jobs and the economy

What is generative AI worth?

The Stanford Digital Economy Lab estimates the US consumer surplus from AI (the value consumers get from AI above what they pay for it) at somewhere between $116 billion and $172 billion, suggesting that consumers rather than AI providers capture most of the benefit from generative AI. It can be simultaneously true that people hate AI and get significant value from it.
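As a stylized illustration of what “consumer surplus” means here (every number below is invented, not from the Lab’s study):

```python
# Stylized consumer-surplus arithmetic; all numbers are invented.
# Surplus = willingness to pay minus price actually paid, summed over users.
users = [
    (45.0, 20.0),  # (monthly willingness to pay, price paid): a subscriber
    (12.0, 0.0),   # free-tier user who would pay something
    (3.0, 0.0),    # casual user
]
monthly_surplus = sum(max(wtp - price, 0.0) for wtp, price in users)
print(f"Surplus for this three-person economy: ${monthly_surplus:.2f}/month")
# prints $40.00/month; the Lab's $116B-$172B figure is this same quantity
# estimated across the entire US user base.
```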

Strategy and politics

Radical Optionality

Christoph Winter and Charlie Bullock have an in-depth governance proposal called Radical Optionality:

Some safety measures do impose costs on innovation, and some forms of deregulation do carry genuine risks. But there is also a class of policies that would meaningfully increase safety without imposing significant costs on innovation. We argue that governments should aggressively implement these policies; this is the main thrust of the governance strategy discussed in this essay, which we call “radical optionality.”

If you’d like something shorter, fellow Inkhaven resident Ady Mehta summarizes the key arguments.

The core approach makes sense, at least as a starting point, and the individual policy proposals are sound.

Side interests

If America's so rich, how'd it get so sad?

Americans have recently experienced a steep and confusing decline in national happiness. Something is definitely wrong, but none of the obvious explanations are entirely satisfying. Derek Thompson reviews the data and tentatively blames a combination of factors:

American sadness this decade has been forged by the fact of, and the feeling of, a permanent unrelenting economic crisis, amplified by a uniquely negative news and media environment, and exacerbated by the rise of solitude and the declining centrality of trusted institutions.