Against Moloch

Monday Radar #3

December 10, 2025

First, some housekeeping: I’ve started Monday Brief, which is a shorter and less technical version of Monday Radar. You can get the email newsletter here if you’re interested.

There was only one big new release last week, but there’s still lots to catch up on. We’ll look at a couple of new metrics from CAIS and Epoch as well as progress reports on AI-powered science, coding productivity, and autonomous cars. Plus some great pieces on cyberwarfare, the vibecession, alignment, and AI companions.

Top pick

Benjamin Todd has a great piece on how AI might get weird in a hurry:

But there are other feedback loops that could still make things very crazy – even without superintelligence – it’s just that they take five to twenty years rather than a few months. The case for an acceleration is more robust than most people realise.

This article will outline three ways a true AI worker could transform the world, and the three feedback loops that produce these transformations, summarising research from the last five years.

New releases

DeepSeek-V3.2

DeepSeek just released DeepSeek-V3.2, an extremely capable open weights model. It isn’t as capable as the frontier models, but it’s probably less than a year behind. As always, Zvi has a full analysis of the release. I have three questions, only one of which is rhetorical:

  1. Chinese open weight models continue to fast-follow the big labs, with DeepSeek and MoonshotAI both within a year of the frontier. Will they catch up? Fall behind? Continue to fast-follow?
  2. DeepSeek’s models seem to be significantly behind the frontier in some important but intangible ways. How much does that matter, and how hard will it be to close that gap?
  3. DeepSeek has provided almost no safety documentation for this release, and it seems easy to get dangerous output from the model. If the frontier labs achieve truly dangerous capabilities within a year AND the open models stay less than a year behind them AND the open models continue to have almost no meaningful safeguards, how do we think that’s going to go?

Code Red at OpenAI

The Information reports that Sam Altman was concerned enough about Gemini 3 and other competitors to declare “code red” at OpenAI, shifting resources from projects like advertising and shopping to focus on improving ChatGPT.

Crystal ball department

The CAIS AI Dashboard

The Center for AI Safety has a new AI Dashboard, which does a great job of summarizing capabilities and safety metrics for the leading models. This is now my top pick for a single place to keep an eye on capabilities.

The Epoch Capabilities Index

In a similar vein, Epoch has come out with the Epoch Capabilities Index, a synthetic metric that combines performance on multiple evaluations. Beyond giving a single “overall” measure of capability, the goal is a metric that stays useful over long stretches of time. Most individual evaluations can’t track progress for long because they saturate quickly (top scores go from about 0% to about 100% within a few years). By combining many evaluations, Epoch hopes to produce an index that remains informative over a much longer window.
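
To get a feel for why combining benchmarks helps, here’s a toy sketch. This is not Epoch’s actual methodology (theirs is considerably more sophisticated), and the benchmark names and scores below are made up for illustration; the point is just that aggregating benchmarks that saturate at different times yields a scale that keeps moving after any single benchmark tops out.

```python
# Toy illustration (NOT Epoch's actual methodology): combine several
# benchmarks that saturate at different times into one composite score.
# Benchmark names and numbers are invented for the example.

import math

# Hypothetical top scores (0-1) for three model "generations" on three
# benchmarks: the first is nearly saturated, the last still has headroom.
scores = {
    "model_2023": {"bench_easy": 0.95, "bench_mid": 0.40, "bench_hard": 0.05},
    "model_2024": {"bench_easy": 0.99, "bench_mid": 0.75, "bench_hard": 0.20},
    "model_2025": {"bench_easy": 0.99, "bench_mid": 0.92, "bench_hard": 0.55},
}

def composite(model_scores):
    """Average the benchmarks in log-odds space, so progress near a
    benchmark's ceiling (0.95 -> 0.99) still registers, and a fully
    saturated benchmark contributes a constant rather than hiding
    progress on the harder benchmarks."""
    logits = [math.log(s / (1 - s)) for s in model_scores.values()]
    return sum(logits) / len(logits)

for name, s in scores.items():
    print(f"{name}: composite index = {composite(s):.2f}")
```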

Dwarkesh on AI Progress

Dwarkesh’s latest piece on the state of AI progress is well worth reading, especially the section on “Economic diffusion lag is cope for missing capabilities”.

The cost of intelligence is in free fall

We find that the price for a given level of benchmark performance has decreased remarkably fast, around 5× to 10× per year, for frontier models on knowledge, reasoning, math, and software engineering benchmarks.
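To see how fast that compounds, here’s a quick back-of-envelope of my own (not Epoch’s arithmetic): at the stated decline rates, a fixed level of benchmark performance that costs $100 today costs somewhere between 80 cents and a dime three years from now.

```python
# Back-of-envelope: price of a fixed level of benchmark performance
# after a few years of 5x or 10x annual declines. Starting price is
# arbitrary; only the ratios matter.
start_price = 100.0

for annual_factor in (5, 10):
    for years in (1, 2, 3):
        price = start_price / (annual_factor ** years)
        print(f"{annual_factor}x/year, after {years} year(s): ${price:,.2f}")
```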

Algorithmic progress is data progress

“Algorithmic progress” is frequently cited as a major contributor to capabilities growth, alongside increases in available compute. Here, Beren argues that much of what’s attributed to algorithmic progress is actually due to improvements in the quality of the data used for training.

Robots at work

AIs are getting pretty good at science

Some of you are old enough to remember September of 2025, when Scott Aaronson reported that ChatGPT had provided significant help with his most recent paper. Upping the ante, Steven Hsu reports that, for his paper in Physics Letters B, “the main idea in the paper originated de novo from GPT-5.”

The Medical Case for Self-Driving Cars

Jonathan Slotkin has an opinion piece about autonomous cars in The New York Times. Short version: Waymos are so much safer than human-driven vehicles that accelerating their deployment is a public health imperative. He argues that if this were a medical trial, medical ethics would require stopping it immediately and shutting down the human-driver arm.

How AI Is Transforming Work at Anthropic

For a look at the bleeding edge of AI deployment, here’s Anthropic with a report on how their programmers use AI. Note that this relies on data from Opus 4: the consensus opinion is that Opus 4.5 is a major step forward for coding.

Employees self-reported that 12 months ago, they used Claude in 28% of their daily work and got a +20% productivity boost from it, whereas now, they use Claude in 59% of their work and achieve +50% productivity gains from it on average.
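One possible reading of those numbers (my assumption, not Anthropic’s framing): if the reported boost applies only to the Claude-assisted share of the work, the overall productivity gain is smaller than the headline figure. A quick Amdahl’s-law-style calculation:

```python
# Back-of-envelope, assuming the reported boost applies only to the
# Claude-assisted share of work (my reading; the report may instead
# mean the boost as an overall figure).

def overall_gain(share_assisted, boost_on_assisted):
    """Overall speedup if `share_assisted` of the work runs
    (1 + boost_on_assisted)x faster and the rest is unchanged."""
    time = (1 - share_assisted) + share_assisted / (1 + boost_on_assisted)
    return 1 / time - 1

print(f"12 months ago: {overall_gain(0.28, 0.20):+.1%} overall")  # ~+4.9%
print(f"now:           {overall_gain(0.59, 0.50):+.1%} overall")  # ~+24.5%
```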

Alignment and interpretability

Alignment remains a hard, unsolved problem

evhub shares an adaptation of an internal Anthropic document about why alignment is hard.

How confessions can keep language models honest

Some nice proof of concept work from OpenAI on training models to honestly confess when they misbehave. A classic pitfall of many training techniques is that if you aren’t careful, you end up training the model to covertly misbehave rather than to behave well. This work takes some clever measures to minimize that problem.
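To make the pitfall concrete, here’s a toy sketch of the incentive problem. This is my own illustration, not OpenAI’s actual training setup: if admitted misbehavior is punished more heavily than concealed misbehavior, training pushes the model toward concealment; scoring the confession channel separately removes that incentive.

```python
# Toy illustration of the incentive problem (NOT OpenAI's actual setup).

def naive_reward(task_score, misbehaved, confessed):
    """Only visible (confessed) misbehavior gets penalized,
    so hiding misbehavior strictly dominates admitting it."""
    reward = task_score
    if misbehaved and confessed:
        reward -= 1.0  # the confession is what gets punished
    return reward

def decoupled_reward(task_score, misbehaved, confessed):
    """Penalize misbehavior either way, and score honesty separately,
    so confessing is never worse than concealing."""
    reward = task_score
    if misbehaved:
        reward -= 1.0                        # penalized regardless
        reward += 0.5 if confessed else 0.0  # honesty claws some back
    return reward

# Naive scheme: hiding misbehavior beats confessing it.
print(naive_reward(1.0, True, False), naive_reward(1.0, True, True))          # 1.0 vs 0.0
# Decoupled scheme: confessing is the better move.
print(decoupled_reward(1.0, True, False), decoupled_reward(1.0, True, True))  # 0.0 vs 0.5
```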

It’s their job to keep AI from destroying everything

The Verge has a nice profile of Anthropic’s social impacts team.

How Can Interpretability Researchers Help AGI Go Well?

Following up on their recent pivot toward more pragmatic approaches, the Google DeepMind interpretability team has some thoughts on useful directions for interpretability.

Are we dead yet?

Can’t we just pull the plug?

So if we need to shut down a rogue AI, we can just turn off the internet or something, right? RAND looks at various extreme options, including detonating 150 nuclear weapons in space to destroy telecommunications, power, and computing infrastructure with a giant EMP blast. Spoiler: don’t plan on humanity winning that fight.

Disrupting the first reported AI-orchestrated cyber espionage campaign

Anthropic reports on a Chinese cyber espionage campaign that used Claude for large-scale, semi-automated cyberattacks. This is the least effective that AI cyberwarfare will ever be.

Strategy and politics

Middle powers ASI prevention

Anton Leicht and others have written about the challenges facing middle powers in the age of artificial superintelligence (ASI). Here, a team from Conjecture and Control AI proposes a treaty framework for middle powers to unite against the development of ASI. It’s an interesting framework, but I just don’t see middle powers having the leverage to pull this off, even if they could solve the probably unsolvable coordination challenges.

Reverse centaurs

I have a lot of respect for Cory Doctorow—he’s an insightful thinker, and his concept of enshittification is vital to understanding the modern internet. He’s got another really good concept here, which I could imagine becoming part of the canon:

Start with what a reverse centaur is. In automation theory, a "centaur" is a person who is assisted by a machine. You're a human head being carried around on a tireless robot body. Driving a car makes you a centaur, and so does using autocomplete.

And obviously, a reverse centaur is machine head on a human body, a person who is serving as a squishy meat appendage for an uncaring machine.

That excellent and insightful term comes from an essay that is otherwise profoundly misguided—Daniel Miessler does a good job of summarizing where it falls short.

Philosophy department

What if AI ends loneliness?

I really enjoyed this long but excellent piece by Tom Rachman on AI companions and loneliness. Obvious prediction: AI will give us the option of getting exactly what we really want in companions, without the reciprocity requirement of human companions. Cover your eyes—it’s gonna be gruesome.

Side interests

Scott Alexander investigates the vibecession

Are the youth succumbing to a “negativity bias” where they see the past through “rose-colored glasses”? Are the economists looking at some ivory tower High Modernist metric that fails to capture real life? Or is there something more complicated going on?

I still don’t know the answer after reading Scott’s investigation, but I am confused on a deeper level than before, and I’ve substantially updated my understanding of some of the core economic facts.