Against Moloch

Monday AI Brief #4

December 15, 2025

Welcome to the shorter and less technical version of Monday AI Radar. We’re focusing on model psychology this week with pieces on training “character” at Anthropic, what we do and don’t know about the possibility of AI consciousness, and some hard questions about whether AI should prioritize obedience or virtue. Plus a report card grading the big labs on their safety practices, copywriters talking about losing their jobs to AI, and a lighthearted look at a recent big AI conference.

Model psychology at Anthropic

One of the many things that make Anthropic unique is their thoughtful approach to model psychology. Here's a great interview with Amanda Askell, a philosopher at Anthropic who works on Claude's character. Lots of good stuff here, including how you train a model to have good "character" and whether the models are moral patients (i.e., whether they deserve moral consideration).

Until recently, most people—including me—would have said it was pretty unlikely that “model psychology” would be a real thing. But recent frontier models are starting to show some early features that sure seem analogous to human psychology. The correct amount to anthropomorphize current AI is less than 100%, but also more than 0%.

Buckle up, kids. Things are starting to get weird.

AI Safety Index Winter 2025

The Future of Life Institute just released their AI Safety Index Winter 2025. Key takeaways:

This is fine.

Do We Want Obedience or Alignment?

Beren breaks down one of the fundamental questions of alignment: should an aligned AI do what we tell it to, or should it do what is right? This question seems hard on the surface, and gets harder the closer you look at it. If you want AI to do what it's told, have you thought carefully about who specifically is telling it what to do (hint: not you)? And if you want it to do what is "right", have you thought about the extent to which you’ve come to rely on ethical “flexibility” in yourself and others?

The Evidence for AI Consciousness, Today

I don’t think current AIs are meaningfully conscious, but I’m no longer certain that’s the case and I expect to become much less certain soon. Cameron Berg considers what we do and don’t know:

Researchers are starting to more systematically investigate this question, and they're finding evidence worth taking seriously. Over just the last year, independent groups across different labs, using different methods, have documented increasing signatures of consciousness-like dynamics in frontier models.

Blood in the Machine

Here are some grim first-person accounts of copywriters losing their jobs to AI. This is a collection of anecdotes rather than a rigorous study, but I thought it did a good job of capturing the flavor of something that has happened to a few people so far and is about to happen to many more. Expect a lot more of this in the public discourse very soon.

How to party like an AI researcher

Jasmine Sun went to NeurIPS 2025 (perhaps the most important machine learning conference) and wrote a fun piece about the vibe of the event.