Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you
This is a link post. Summary: We propose measuring AI performance in terms of the length of tasks AI
I have, over the last year, become fairly well-known in a small corner of the internet tangentially
Note: this is a research note based on observations from evaluating Claude Sonnet 3.7. We’re sharing
Scott Alexander famously warned us to Beware Trivial Inconveniences. When you make a thing easy to do
There's this popular trope in fiction about a character being mind controlled without losing aw
This research was conducted at AE Studio and supported by the AI Safety Grants programme administere
We study alignment audits—systematic investigations into whether an AI is pursuing hidden objectives
The Most Forbidden Technique is training an AI using interpretability techniques. An AI produces a fi
You learn the rules as soon as you’re old enough to speak. Don’t talk to jabberjays. You recite them
Exciting Update: OpenAI has released this blog post and paper which makes me very happy. It's b
LLM-based coding-assistance tools have been out for ~2 years now. Many developers have been reportin
Background: After the release of Claude 3.7 Sonnet,[1] an Anthropic employee started livestreaming C
Note: an audio narration is not available for this article. Please see the original text. The origi
In a recent post, Cole Wyeth makes a bold claim: . . . there is one crucial test (yes this is a crux)
This isn't really a "timeline", as such – I don't know the timings – but this is
This is a critique of How to Make Superbabies on LessWrong. Disclaimer: I am not a geneticist[1], and
This is a link post. Your AI's training data might make it more “evil” and more able to circumve
I recently wrote about complete feedback, an idea which I think is quite important for AI safety. Ho
First, let me quote my previous ancient post on the topic: Effective Strategies for Changing Public O
In a previous book review I described exclusive nightclubs as the particle colliders of sociology—pl