Monday AI Radar #8

January 12, 2026

People continue to lose their minds about Claude Code. We’ll begin this week’s newsletter with a look at what people are using it for and where they think it’s headed. Here’s my short take: Claude Code’s present usefulness is 30% overhyped. A lot of the amazing things people are reporting are genuinely amazing, but they’re quick working prototypes of fairly simple tools. But…

Sometime in the past couple of months, AI crossed a really important capability threshold. By the end of 2025, it was clear to any programmer who was paying attention that our profession has completely changed. By the end of 2026, I think that same thing will be true for many professions. Most people won’t realize it right away, and it may (or may not) take a few years for the changes to really take hold, but the writing is now very clearly on the wall.

You can get this by email or in a shorter and less technical version.

Top pick: How AI Is learning to think in secret

Nicholas Andresen’s piece on how AI Is learning to think in secret is long, but it’s really good. It does a great job of explaining multiple important AI safety concepts in detail but without excessive technical jargon.

Chain of Thought (CoT) reasoning is the reason AI became so much more capable in late 2024, and through an incredibly lucky happenstance it also provides us with one of our best tools for monitoring AI for misbehavior. Andresen explains how CoT works, how it’s used for monitoring, and why we’re in danger of losing that capability.

Losing our minds about Claude Code

Many of the people who’re most excited about Claude Code aren’t using it for coding at all—it’s a really powerful agentic tool for doing almost any kind of knowledge work.

Zvi Mowshowitz: Claude Codes

Pro tip: Zvi is super smart and full of good insights. If he’s written about something, it’s likely to be one of the best and most comprehensive pieces on that topic. He’s also astonishingly prolific and you can go insane trying to read everything he writes. I am here to give you permission to skim his writing and not feel guilty if you stop halfway through.

Here’s Zvi’s excellent piece on Claude Code.

Deal Ball: Among the Agents

Dean has previously suggested that Claude Code + Opus 4.5 counts as AGI, which I just don’t see. Here he proposes the term “infant AGI”, which I think is perfect. I don’t think we’re quite there yet—I’m holding out for continual learning, but I think we’re at a point where reasonable people can disagree about that. As always, Dean’s thoughts are well worth reading.

Steve Newman: Software Too Cheap to Meter

Steve Newman believes we’re approaching the era of software too cheap to meter.

Ethan Mollick: Claude Code and What Comes Next

Ethan Mollick has some helpful thoughts on how non-coders can get started using the desktop app instead of the command line version.

DHH: Promoting AI Agents

Add DHH to the list of people who’ve completely changed their minds about AI coding since mid 2025.

You gotta get in there. See where we're at now for yourself. Download OpenCode, throw some real work at Opus or the others, and relish the privilege of being alive during the days we taught the machines how to think.

Shakeel Hashim: Claude Code is about so much more than coding

Shakeel Hashim:

I have absolutely zero coding experience. But in the past two weeks, I’ve had Claude Code go through my bank statements and invoices to prepare a first draft of my tax filing. (It got everything right.) I asked it to book me theater tickets: it reviewed my calendar, browsed the theater’s website for ticket availability, and picked a date that had good availability and suited my schedule. It built me a series of automation tools that will collectively save the Transformer team about half a day of work each week. It planned a detailed itinerary for a forthcoming vacation, including extracting hundreds of restaurant recommendations from my favorite influencer’s Instagram highlights.

New releases

Cowork: Claude Code for the Rest of Your Work

Just in time for the current frenzy, Anthropic is releasing a research preview of Cowork, which is essentially Claude Code for non-coding work. In addition to a more accessible interface, it includes some nice sandboxing features that reduce but don’t eliminated the safety concerns associated with running powerful agents on your computer. This looks great and I’m excited to take it for a spin. Note that it’s currently only available to Claude Max subscribers.

Simon Willison shares some early thoughts.

ChatGPT Health

OpenAI just announced (but hasn’t released) ChatGPT Health, a new “space” in ChatGPT designed to help answer health questions. It will connect with services like Apple Health as well as your medical records, and is designed to isolate and protect your health information. This seems like a very obvious thing to do, and I expect OpenAI will likely do a pretty good job with it. Electronic medical records in the US are legendarily hard to interface with, and it’ll be interesting to see how much traction OpenAI can get with that.

Crystal ball department

Raising the floor

François Chollet:

GenAI will not replace human ingenuity. It will simply raise the floor for mediocrity so high that being "pretty good" becomes economically worthless.

The second sentence nails it: the floor is going to rise, and there will be a moment when human ingenuity is worth more than ever, but being pretty good is economically worthless. The first sentence is pure cope: obviously the floor will keep rising, until even the most capable and ingenious humans are economically worthless.

2025 in AI predictions

Jessica Taylor continues her tradition of collecting and evaluating predictions about 2025, as well as predictions made during 2025 about future years. This is the way.

Capabilities and impact

Advancements In Self-Driving Cars

If you haven't been paying close attention, you may not realize just how good self-driving cars have gotten. Zvi’s roundup is great: 10/10, no notes. The same is not true, unfortunately, for much of the discourse in the mainstream press.

(Semi) autonomous combat drones

From the New York Times, a look at partial autonomy in combat drones in Ukraine.

Behind the scenes with METR’s time horizon benchmark

The METR time horizons benchmark is possibly the most important single metric in AI right now. Making that metric is much harder than it sounds, especially as time horizons extend from minutes to hours and beyond. METR’s David Rein appears on the AI X-Risk Research Podcast to discuss what the metric does and doesn’t measure, how it was created, challenges with measuring very long horizon tasks, and some interesting digressions on METR’s mission.

The “ChatGPT moment” has arrived for manufacturing

Like self-driving cars, industrial robots have been on the cusp of being great for years without number. And like self-driving cars, it turns out that robots were not a hardware problem, but an AI problem. Now that AI is taking off, expect dramatic advances in robotics. The Economist reports on recent progress.

Are we dead yet?

You will be OK

Boaz Barak offers some reassurance for young people that you may or may not find helpful. I did like this framing:

I do not want to engage here in the usual debate of P[doom]. But just as it makes absolute sense for companies and societies to worry about it as long as this probability is bounded away from 0, so it makes sense for individuals to spend most of their time not worrying about it as long as it is bounded away from 1.

Strategy and politics

China's rare earths chokehold

China's dominance of rare earth elements continues to be a significant strategic liability for the US, and for US technology firms in particular. Here’s ChinaTalk with a primer on where things stand. Of particular relevance: they believe China’s dominance is time-limited, and for that reason they expect China to wield it for maximum advantage while they’re still able to.

The Next Three Phases of AI Politics

2026 promises to be the year AI transitions from being something that lots of people are vaguely grumpy about to being a major political issue. Anton Leicht has been closely tracking the political trends and argues that the most likely time for substantive AI legislation is during a brief window after the midterm elections and before primaries start.

What sort of post-superintelligence society should we aim for?

Will MacAskill:

Viatopia is a waystation rather than a final destination; etymologically, it means “by way of this place”. We can often describe good waystations even if we have little idea what the ultimate destination should be. A teenager might have little idea what they want to do with their life, but know that a good education will keep their options open. Adventurers lost in the wilderness might not know where they should ultimately be going, but still know they should move to higher ground where they can survey the terrain. Similarly, we can identify what puts humanity in a good position to navigate towards excellent futures, even if we don’t yet know exactly what those futures look like.

Yes.

Philosophy department

The technology of liberalism

How to keep superintelligence from killing us all is the most important question we face in the next decade, but it’s not the only important question. Rudolf Laine considers the tradeoffs between utilitarianism and liberalism and argues for the importance of preserving both:

So what we also need are technologies of liberalism, that help maintain different spheres of freedom, even as technologies of utilitarianism increase the control and power that actors have to achieve their chosen ends.

Two mechanisms of decadence

Beren considers the question of decadence: why do companies or civilizations decay over time instead of riding an eternal cycle of compounding returns?

The first mechanism is that success tends to bring rigidity and diminished exploration due to higher global opportunity costs. […] The second mechanism is inherently increasing communication, coordination, and internal misalignment costs which grow with scale and also over time in the form of increasing defection, parasitism, and ultimately cause a form of organizational cancer.

Technical

The inaugural Redwood Research podcast

Redwood Research just put out their first podcast, with Buck Shlegeris and Ryan Greenblatt. It’s dauntingly long (4 hours, or 45,000 words), but super interesting. They cover the history of Redwood, what makes research projects successful (or not), strategies for surviving superintelligence, pros and cons of mechanistic interpretability, weird stuff like acausal trade, and tons more. If this is the kind of thing you like, you’re gonna like this one a lot.

A field guide to sandboxes for AI

Extremely interesting to a small number of people. Agentic coding tools are amazing, but they bring whole new classes of security vulnerabilities to the forefront. Keeping dangerous code in a secure sandbox is more important than ever, but that isn’t as easy as it sounds. Here’s Luis Cardoso with a deep technical guide to sandboxing your AI.

Side interests

Increasing returns to effort are common

Oliver Habryka has been publishing a series of internal memos he wrote to guide the staff at Lightcone Infrastructure. They’re all good, but I particularly enjoyed his thoughts on the increasing returns to effort.

Dan Williams’ top ten essays of 2025

Dan Williams at Conspicuous Cognition is a thoughtful writer about philosophy, politics, and rationality. Here he collects his 10 most popular essays from the past year—I found a couple that I’d previously missed but look forward to digging into.