Monday AI Radar #13
This week’s newsletter in a word: “velocity”. We’ll take a deep look at last week’s big model drops (just a few months after the previous big drops), and try to figure out whether they’ve reached High levels of dangerous capabilities. Nobody’s quite sure, because capabilities are outrunning evaluations.
We also check in on the country of geniuses in a data center (still 2028, according to Dario), contemplate what we should align AI to (assuming we can figure out how to align it to anything), and catch up on the Chinese AI industry.
Top pick
Something Big Is Happening
Matt Shumer’s Something Big Is Happening has been making the rounds this week. It’s a great “you need to wake up” piece for anyone you know who doesn’t understand the magnitude of what’s happening right now.
But it’s time now. Not in an “eventually we should talk about this” way. In a “this is happening right now and I need you to understand it” way. [...]
The experience that tech workers have had over the past year, of watching AI go from “helpful tool” to “does my job better than I do”, is the experience everyone else is about to have. Law, finance, medicine, accounting, consulting, writing, design, analysis, customer service. Not in ten years. The people building these systems say one to five years. Some say less. And given what I’ve seen in just the last couple of months, I think “less” is more likely.
My writing
Ads, Incentives, and Destiny
OpenAI has started showing ads in some tiers of ChatGPT. They’re fine for now, but I worry about where those incentives lead.
New releases
Zvi reports on Claude Opus 4.6
Opus 4.6 is a pretty big deal—it’s a substantial upgrade to Opus 4.5, which was probably already the best overall model (and which just shipped 2 months ago). Not surprisingly, Zvi has lots to say about it.
Claude Opus 4.6 Escalates Things Quickly. It’s a very good model.
System Card Part 1: Mundane Alignment + Model Welfare
Key takeaways:
- Anthropic’s system cards are far better than any other lab’s
- But also, they aren’t good enough
- We are increasingly flying blind: our evaluations simply aren’t able to usefully measure the safety (or lack thereof) of 2026 frontier models
- Like OpenAI, Anthropic is very close to ASL-4 thresholds on multiple fronts
System Card Part 2: Frontier Alignment
I want to end on this note: We are not prepared. The models are absolutely in the range where they are starting to be plausibly dangerous. The evaluations Anthropic does will not consistently identify dangerous capabilities or propensities, and everyone else’s evaluations are substantially worse than those at Anthropic.
Zvi looks at ChatGPT-5.3-Codex
Does Zvi sleep? Nobody knows. ChatGPT-5.3-Codex is an excellent model, and this is a significant upgrade.
GPT‑5.3‑Codex‑Spark
Intriguing: GPT‑5.3‑Codex‑Spark is a less capable version of Codex that can do more than 1,000 tokens / second, which is fast. Like, really fast. Sometimes you need maximum intelligence, but for many applications, model speed is an important rate limiter for productivity. A super-fast, good-enough model might be a game changer for many tasks.
Cursor Composer 1.5
Cursor has upgraded Composer, its in-house agentic coding model, to version 1.5.
Gemini 3 Deep Think
There’s a significant update to Gemini 3 Deep Think, focusing on science, research, and engineering. Simon Willison reports that it raises the bar for bicycle-riding pelicans.
Agents!
We Just Got a Peek at How Crazy a World With AI Agents May Be
Now that the frenzy over OpenClaw and Moltbook has died down, Steve Newman takes a look at what just happened (not all that much, actually) and what it means (a sneak peek at some aspects of the future).
OpenClaw, OpenAI and the future
Well, that didn’t take long. Peter Steinberger (the creator of OpenClaw) is joining OpenAI. OpenClaw will be moving to a foundation.
Benchmarks and Forecasts
Dario Amodei does interviews
Two really good interviews with Dario this week:
- With Dwarkesh Patel. Characteristically long and in-depth, with some really good discussion of exponentials and the timeline to the fabled country of geniuses in a data center. Zvi shares his thoughts.
- With Ross Douthat ($) (who’s been slaying it lately). This one is shorter and more philosophical.
AI Is Getting Scary Good at Making Predictions
AI is getting very good at almost everything, including complex cognitive tasks that require deep understanding and judgment. The Atlantic reports on AI forecasters at recent Metaculus tournaments ($):
Like other participants, the Mantic AI had to answer 60 questions by assigning probabilities to certain outcomes. The AI had to guess how the battle lines in Ukraine would shift. It had to pick the winner of the Tour de France and estimate Superman’s global box-office gross during its opening weekend. It had to say whether China would ban the export of a rare earth element, and predict whether a major hurricane would strike the Atlantic coast before September. […]
The AI placed eighth out of more than 500 entrants, a new record for a bot.
What the hell happened with AGI timelines in 2025?
2025 was a wild year for timelines: exuberance early on, then a substantial lengthening in the middle of the year, and another round of exuberance at the end of the year. Rob Wiblin explores why those shifts happened, with insightful analysis of the underlying trends. It’s a great piece, though it largely ignores the most recent shift.
Takeoff speeds rule everything around me
Much of the timelines discussion focuses on how long it takes to get to AGI, but Ajeya Cotra thinks takeoff speed is the most important crux (i.e., how fast we go from AGI to whatever happens next).
Grading AI 2027’s 2025 Predictions
The AI-2027 team calculates that the rate of AI progress in 2025 was about 65% of what they predicted.
AI is getting much better at hands
Andy Masley checks in on how well AI can draw hands.
Using AI
Tracking the “manosphere” with AI
Very often the question isn’t “how does AI let us do the usual thing cheaper”, but rather “what can we now do that wasn’t practical to do before?”. Nieman Lab reports on a slick tool at the New York Times:
When one of the shows publishes a new episode, the tool automatically downloads it, transcribes it, and summarizes the transcript. Every 24 hours the tool collates those summaries and generates a meta-summary with shared talking points and other notable daily trends. The final report is automatically emailed to journalists each morning at 8 a.m. ET.
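For flavor, here’s a minimal sketch of what a pipeline shaped like that might look like. It is not the Times’s actual code: transcribe(), summarize(), and send_email() are hypothetical stand-ins for whatever speech-to-text model, LLM, and mail service you’d actually wire in, stubbed out here so the structure is visible.

```python
# Hypothetical sketch of a podcast-monitoring pipeline (not the NYT tool).
# The three helpers below are placeholders for real services.
import datetime

def transcribe(audio_path: str) -> str:
    return "..."   # placeholder: run a speech-to-text model on the audio file

def summarize(text: str) -> str:
    return "..."   # placeholder: call an LLM to summarize the text

def send_email(to: str, subject: str, body: str) -> None:
    print(f"To: {to}\nSubject: {subject}\n\n{body}")   # placeholder mailer

episode_summaries: list[dict] = []   # in practice, a database

def on_new_episode(show: str, audio_path: str) -> None:
    """Run whenever a tracked show publishes a new episode."""
    transcript = transcribe(audio_path)
    episode_summaries.append({
        "show": show,
        "date": datetime.date.today().isoformat(),
        "summary": summarize(transcript),
    })

def daily_report(today: str) -> None:
    """Every 24 hours: collate the day's summaries and email a meta-summary."""
    todays = [e["summary"] for e in episode_summaries if e["date"] == today]
    meta = summarize("Identify shared talking points and notable trends:\n\n"
                     + "\n\n".join(todays))
    send_email(to="reporters@example.com",
               subject=f"Daily trends report, {today}", body=meta)
```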
Alignment and interpretability
There’s been some good discussion lately of what we should align AI to (which is separate from and almost as important as how to align it to anything at all).
Oliver Klingfjord believes integrity is a critical component:
Integrity isn’t everything in AI alignment. We want models with domain expertise, with good values, with the wisdom to enact them skillfully. Integrity doesn’t speak to the goodness of values. But it does speak to how deeply they run, how stable they are under pressure. It’s what lets us trust a model in situations we never anticipated.
Richard Ngo goes in a somewhat different direction, arguing for aligning to virtues.
I like that both Oliver and Richard emphasize the importance of generalizing well to unforeseen circumstances, which is a shortcoming of more deontological approaches like OpenAI’s.
Cybersecurity
Claude finds 500 high-severity 0-day vulnerabilities
In a convincing demonstration of AI’s ability to find vulnerabilities at scale, Anthropic uses Opus 4.6 to find more than 500 high-severity zero-day vulnerabilities. The accomplishment is impressive, and the account of how it went about finding them is very interesting. If you’re wondering why both OpenAI and Anthropic believe they’re reaching High levels of cyber capabilities, this is why.
Lockdown Mode in ChatGPT
There is a fundamental tension between capability and security: technology that can do more will necessarily have a larger attack surface. OpenClaw was a great example of going all the way to one extreme, enabling an immense amount of cool capability by taking on a staggering level of risk. At the other end of the spectrum, OpenAI is rolling out Lockdown Mode for ChatGPT. Much like Lockdown Mode on the iPhone, this significantly reduces ChatGPT’s attack surface at the cost of significantly curtailing some useful capabilities. It’s meant for a small number of people who are at elevated risk of targeted cyberattacks.
Jobs and the economy
AI Doesn’t Reduce Work—It Intensifies It
This won’t come as a shock to anyone who’s felt the exhilaration (and compulsion) of having AI superpowers. Aruna Ranganathan and Xingqi Maggie Ye find that hours worked often increase when people get access to AI, with much of the pressure being self-imposed. Their analysis of the issue is great, but I’m less sold on their proposed solutions.
AI and the Economics of the Human Touch
Adam Ozimek argues that concerns about AI’s impacts on jobs are overstated because many jobs require a human touch: we prefer to have humans do those jobs even though we already have the ability to automate them. It’s a good and thoughtful piece, but I think it largely misses the point. We haven’t automated supermarket cashiers not because people love interacting with human cashiers, but because the automated replacements aren’t yet good enough. That will change soon.
Strategy and politics
Dean Ball On Recursive Self-Improvement (Part II)
Dean is characteristically cautious about writing regulations before we understand what we’re regulating. He proposes a system of third-party safety audits (much like our existing system for auditing corporate finances), where certified private auditors perform regular inspections of whether AI developers are following their own safety guidelines.
Did OpenAI violate California’s AI safety law?
Directly related to Dean’s piece, The Midas Project argues that when OpenAI released GPT-5.3-Codex, they appear to have violated California’s SB 53. Briefly: SB 53 takes a light touch to safety regulation, but requires that labs publish and adhere to a safety framework. Midas believes that OpenAI is treating GPT-5.3-Codex as having High capability in cybersecurity, but hasn’t activated the safeguards they said they would activate when that happened. OpenAI is pushing back—it’ll be interesting to see what California decides.
In the meantime, Steven Adler takes a detailed look.
China
Is China Cooking Waymo?
If you live in the US, you likely aren’t aware of how well China is doing with electric vehicles and autonomous vehicles. ChinaTalk takes a deep look at autonomous vehicles, diving into deployments in both the US and China, how the international market is shaping up, and how the supply chain works.
Is China falling behind?
Teortaxes argues that based on the WeirdML benchmark, the Chinese open models are falling further behind the frontier.
China and the US Are Running Different AI Races
Poe Zhao at AI Frontiers looks at the very different economic environment facing AI companies in China (much less private investment, and much less consumer willingness to pay for AI). Those factors shape their strategic choices, driving a focus on international markets, and a heavy emphasis on inference cost in both model and hardware design.
AI psychology
The many masks LLMs wear
One of the big surprises of the LLM era has been how strangely human-like AI can be. (The frequent occasions when it’s shockingly un-humanlike are perhaps stranger but less surprising.) Kai Williams at Understanding AI explores character and personality in LLMs.
Industry news
More on ads in ChatGPT
Zoë Hitzig has an opinion piece in the New York Times:
This week, OpenAI started testing ads on ChatGPT. I also resigned from the company after spending two years as a researcher helping to shape how A.I. models were built and priced, and guiding early safety policies before standards were set in stone.
I once believed I could help the people building A.I. get ahead of the problems it would create. This week confirmed my slow realization that OpenAI seems to have stopped asking the questions I’d joined to help answer.
The Anthropic Hive Mind
Steve Yegge talked to a bunch of Anthropic employees and shares some thoughts about their unique culture.
Technical
microgpt
Wow. Karpathy has built a complete GPT engine in 200 lines of code.
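For a sense of why a couple hundred lines is plausible, here’s a minimal sketch of the core of a decoder-only GPT. To be clear, this is not Karpathy’s code, and it leans on PyTorch rather than building everything from scratch; it just shows how little machinery the basic architecture needs: token and position embeddings, causal self-attention, an MLP, and a language-model head.

```python
# Minimal decoder-only GPT sketch in PyTorch (not microgpt itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # project to queries/keys/values
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split heads: (B, T, C) -> (B, n_heads, T, head_dim)
        q, k, v = (t.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
                   for t in (q, k, v))
        # causal attention: each position attends only to itself and earlier positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(y.transpose(1, 2).reshape(B, T, C))

class Block(nn.Module):
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.attn = CausalSelfAttention(d_model, n_heads)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        x = x + self.attn(self.ln1(x))   # pre-norm residual connections
        return x + self.mlp(self.ln2(x))

class TinyGPT(nn.Module):
    def __init__(self, vocab_size=256, d_model=128, n_heads=4, n_layers=4, max_len=256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.blocks = nn.Sequential(*[Block(d_model, n_heads) for _ in range(n_layers)])
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):   # idx: (B, T) integer token ids
        T = idx.shape[1]
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        return self.head(self.blocks(x))   # logits over the vocabulary, (B, T, vocab_size)

logits = TinyGPT()(torch.randint(0, 256, (1, 32)))   # sanity check: shape (1, 32, 256)
```

Add a tokenizer, a training loop, and sampling, and you’re not far from a complete (if tiny) language model.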
Training compute matters a lot
Really interesting paper on the importance of training compute relative to algorithmic improvements:
At the frontier, 80-90% of performance differences are explained by higher training compute, implying that scale--not proprietary technology--drives frontier advances.
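To be concrete about what “explained by” means here: it’s the share of variance (R²) captured when you regress benchmark performance on log training compute. A toy illustration with purely synthetic numbers, not the paper’s data:

```python
# Illustrative only: synthetic numbers, not the paper's data. Shows what
# "X% of performance differences explained by compute" means as an R^2.
import numpy as np

rng = np.random.default_rng(0)
log_compute = rng.uniform(23, 27, size=40)             # made-up log10(training FLOPs)
score = 4.0 * log_compute + rng.normal(0, 1.5, 40)     # score mostly tracks compute

slope, intercept = np.polyfit(log_compute, score, 1)   # ordinary least squares fit
pred = slope * log_compute + intercept
r2 = 1 - np.sum((score - pred) ** 2) / np.sum((score - score.mean()) ** 2)
print(f"Share of variance explained by compute alone: {r2:.0%}")
```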
How persistent is the inference cost burden?
Toby Ord has recently made a good case that reinforcement learning has scaling challenges that present a significant obstacle to continued rapid improvement in capabilities. Epoch’s JS Denain isn’t entirely convinced:
Toby’s discussion of RL scaling versus inference scaling is useful, and the core observation that RL gains come largely with longer chains of thought is well-taken. But the picture he paints may overstate how much of a bottleneck this will be for AI progress.
Rationality
What Kind Of Apes Are We?
David Pinsof continues his excellent conversation with Dan Williams regarding human nature, the enlightenment, and evolutionary misfit. I love the way this conversation is happening, and I’m learning a lot from it: I’ve significantly updated some key beliefs I hold about how humans are not well evolved to handle the modern environment.
So my response to Dan might be something like, “Yea, maybe humans are kind of confused and maladapted sometimes, but *it’s also really insightful to see humans as savvy animals strategically pursuing their Darwinian goals.*” And Dan might say something like, “Yea, it’s pretty insightful to see humans as savvy animals strategically pursuing their Darwinian goals, but *it’s also really important to recognize that humans are confused and maladapted sometimes.*” It’s basically a disagreement over where to put the italics.
