Monday AI Radar #4
It’s the time of year when people start publishing retrospectives: we have a great review of Chinese AI in 2025, an in-depth look at the year’s technical developments, and a report on the state of enterprise AI deployment. Stand by for more of these over the next few weeks.
If you’re looking for data, we have overviews of when prediction markets think AGI might arrive (hint: soon) and of safety practices at the big labs (hint: not great). Plus: AI crushes another major math contest, guidance on integrating AI into education, and lots more. But let’s ease into it with a fun conversation about model psychology.
As always, Monday AI Brief is a shorter and less technical version of this newsletter.
Top pick
One of many things that makes Anthropic unique is their thoughtful approach to model psychology. Here's a great interview with Amanda Askell, a philosopher at Anthropic who works on Claude's character. Lots of good stuff here, including how you train a model to have good "character" and whether the models are moral patients (i.e., whether they deserve moral consideration).
Until recently, most people—including me—would have said it was pretty unlikely that “model psychology” would be a real thing. But recent frontier models are starting to show some early features that sure seem analogous to human psychology. The correct amount to anthropomorphize current AI is less than 100%, but also more than 0%.
Buckle up, kids. Things are starting to get weird.
New releases
OpenAI releases GPT-5.2
GPT-5.0 in August, GPT-5.1 in November, and now GPT-5.2 in December (plus rumors that 5.3 is scheduled for January). It’s an excellent model, especially for hard thinking and coding, although it isn’t winning any awards for personality. As usual, Zvi has all the details.
Crystal ball department
When Will We Get AGI?
GoodHeart Labs has a nice page that aggregates prediction markets for the arrival of AGI (spoiler: 2031). Per custom, I must now remind you that just a few years ago, “short timelines” meant 20 years.
Gradually, then suddenly
Andy Jones thinks it's gonna happen fast. He has a very insightful discussion of how gradual changes in engine technology led to very abrupt changes in the usefulness of horses.
I very much hope we'll get the two decades that horses did. But looking at how fast Claude is automating my job, I think we're getting a lot less.
Insights into Claude Opus 4.5 from Pokémon
One area where Claude trails the competition is Pokémon: Google’s and OpenAI’s models beat the game months ago, but Claude still hasn't made it all the way through. Opus 4.5 does much better, though. Here's an interesting look at where it does well and what it still struggles with.
CORE-BENCH is solved
Yet another evaluation falls: CORE-Bench has been declared solved after Opus 4.5 + Claude Code got a nearly perfect score. Sayash Kapoor has lots of interesting details, including the surprising importance of scaffolding and why it’s so hard to avoid grading errors in complex evaluations.
AxiomProver crushes the Putnam math contest
Speaking of the sound of benchmarks shattering, AxiomProver just crushed the 2025 Putnam math contest, solving 8 out of 12 problems (plus one more after the time limit). Human scores won't be released until next year, but that score would have been in the top 5 last year (out of 4,000ish contestants).
AI in 2025: gestalt
This overview of 2025 by technicalities is dauntingly long, but full of great information.
Robots at work
We are in the era of Science Slop
Here's a cautionary follow-up to last week's note that Steven Hsu had a paper accepted to Physics Letters B whose key insight came from ChatGPT. Further investigation suggests that the key insight had been found 35 years ago and that the paper contained significant mistakes. Jonathan Oppenheim has the details, plus some thoughts about science slop.
Guidance on AI-Integrated Education & Training
Convergence Analysis has some solid guidance for education in the age of AI. Lots of good ideas here, but no clear answers. Zvi sums the situation up nicely:
- AI is the best tool ever invented for learning.
- AI is the best tool ever invented for not learning.
- Which way, modern man?
If you're a student (hint: you are, or should be), there’s lots of alpha behind door number 1. If you're an educator, you have to grapple with the unfortunate fact that most humans choose door number 2.
Alignment and interpretability
Do We Want Obedience or Alignment?
Beren breaks down one of the fundamental questions of alignment: should an aligned AI do what we tell it to, or should it do what is right? This question seems hard on the surface, and gets harder the closer you look at it. If you want AI to do what it's told, have you thought carefully about who specifically is telling it what to do (hint: not you)? And if you want it to do what is "right", have you thought about the extent to which you’ve come to rely on ethical “flexibility” in yourself and others?
An Ambitious Vision for Interpretability
We've previously talked about GDM's pivot toward a more pragmatic approach to interpretability. Leogao makes the case for the importance and feasibility of ambitious mechanistic interpretability. The feasibility is above my pay grade, but the importance seems beyond doubt and I'm glad there's still active research in this area.
AI Evaluation Should Work With Humans
From a paper by Jan Kulveit, Gavin Leech, Tomáš Gavenciak, and Raymond Douglas:
the AI community should pivot to evaluating the performance of human–AI teams.
This seems important: as AI gets more capable, evaluations need to shift from simple multiple-choice questions to more complex assessments of real-world utility, and one important part of that is the ability to augment humans. Obviously, it’s not trivial to produce high-quality evaluations that measure the performance of human-AI teams on complex tasks.
We argue that this collaborative shift in evaluation will foster AI systems that act as true complements to human capabilities and therefore lead to far better societal outcomes than the current process.
If only it were so simple. If capability growth stays on track, we're going to speedrun the transition from augmentation to replacement, regardless of what evaluations we're using. I'm afraid we aren't many years away from the point where these evaluations will do nothing more than carefully document the fact that solo AIs outperform human-AI teams.
AI psychology
The Evidence for AI Consciousness, Today
I don’t think current AIs are meaningfully conscious, but I’m no longer certain that’s the case and I expect to become much less certain soon. Cameron Berg considers what we do and don’t know:
Researchers are starting to more systematically investigate this question, and they're finding evidence worth taking seriously. Over just the last year, independent groups across different labs, using different methods, have documented increasing signatures of consciousness-like dynamics in frontier models.
Are we dead yet?
AI Safety Index Winter 2025
The Future of Life Institute just released their AI Safety Index Winter 2025. Key takeaways:
- Anthropic leads with a C+ overall and the best score in every category
- Anthropic, OpenAI, and Google get C’s
- Meta, xAI, and the Chinese labs get D’s
- The highest grade for existential risk is a D
This is fine.
AISafety.com
AISafety overhauled their website. It's a great resource for getting involved in AI safety (professionally or casually), with a guide to relevant organizations, events and trainings, communities, and more. For something similar but more focused on professionals, 80,000 Hours remains an excellent resource.
Blood in the Machine
Here are some grim first-person accounts of copywriters losing their jobs to AI. This is a collection of anecdotes, not a rigorous study, but I thought it did a good job of capturing the flavor of what has happened to a few people so far and is about to happen to many more. Expect a lot more of this in the public discourse very soon.
The Normalization of Deviance in AI
This article is not about what I was expecting based on the title.
But it's good nonetheless. Short version: LLMs have serious security challenges (most notably, prompt injection attacks), but we are normalizing the process of deploying them without appropriate safeguards. This is unlikely to end well.
Strategy and politics
Lightcone Infrastructure’s annual fundraiser
Lightcone Infrastructure supports some of the most impactful projects helping humanity navigate the transition to superintelligence. Most of our 2026 giving is going to Lightcone, and I'd encourage you to give to them as well.
Selling H200s to China Is Unwise and Unpopular
Zvi explains why selling H200s to China is unwise and unpopular. Preach.
The AI Whistleblower Initiative
Whistleblower protections are an important tool for increasing transparency around safety practices at frontier labs. We've seen some good progress with both legislation and internal policies lately; the AI Whistleblower Initiative is a new project that promises to provide further support.
Early US policy priorities for AGI
Here’s a guest post by Nick Marsh on the AI Future Project blog (they’re the folks who did AI-2027). Lots of good ideas here, although like almost every other proposal, I think this underestimates the challenges facing any kind of meaningful international coordination right now.
Industry news
Agentic AI Foundation (AAIF)
A group of the big players have come together to create the Agentic AI Foundation (AAIF), which will take over ownership of a couple of core technologies including MCP (Model Context Protocol). This seems unequivocally good, though not game-changing.
A review of Chinese AI in 2025
ChinaTalk provides consistently strong coverage of what's going on in China, and their summary of Chinese AI in 2025 is excellent.
2025: The State of Generative AI in the Enterprise
Menlo Ventures has a report on the state of generative AI in the enterprise. No big surprises, but lots of data about who's buying what, and how they're using it.
Rationality
Principles and Generators of a Rationality Dojo
DaystarEld shares some insights from teaching at rationality summer camps:
When I think of the people I've met who actually seem to be rationalists, rather than just people who like the ideas or the community, there are specific things that stand out to me. Traits and behaviors, yes, but deeper than that. Values, philosophies, and knowledge that's embodied and evident across a variety of actions.
I call these “generators,” and I think they’re more important than any specific beliefs or techniques. If there's a "spark" that makes someone a rationalist, or proto-rationalist, or aspiring rationalist, or whatever, I think these generators (or ones very much like them) are the bits that make up that spark.
Side interests
Derek Thompson’s 26 most important ideas for 2026
Derek Thompson has a great list of 26 important ideas for 2026. I particularly recommend #1 (The end of reading), #6 (Get ready for a wave of anti-AI populism), and #22 (Negativity bias rules everything around me).
Light reading
How to party like an AI researcher
Jasmine Sun went to NeurIPS 2025 (perhaps the most important machine learning conference) and has a fun piece about the vibe of the event.
