A friend of mine recently recommended that I read through articles from the journal International Security
This is the best sociological account of the AI x-risk reduction efforts of the last ~decade that I've
Hi all. I've been hanging around the rationalist-sphere for many years now, mostly writing about
This is the full text of a post from "The Obsolete Newsletter," a Substack that I write a
Ultimately, I don’t want to solve complex problems via laborious, complex thinking, if we can help it.
Once upon a time, in ye olden days of strange names and before google maps, seven friends needed to
Summary: In this post, we explore different ways of understanding and measuring malevolence and expl
I’m not a natural “doomsayer.” But unfortunately, part of my job as an AI safety researcher is to t
Over the past year and a half, I've had numerous conversations about the risks we describe in Gra
This is a link post. Full version on arXiv | X. Executive summary: AI risk scenarios usually portray
This post should not be taken as a polished recommendation to AI companies and instead should be treated
This is a personal post and does not necessarily reflect the opinion of other members of Apollo Research.
I (and co-authors) recently put out "Alignment Faking in Large Language Models" where we s
Summary and Table of Contents: The goal of this post is to discuss the so-called “sharp left turn”, t
(Many of these ideas developed in conversation with Ryan Greenblatt.) In a shortform, I described some
“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or o
This is the abstract and introduction of our new paper, with some discussion of implications for AI
The Cake: Imagine that I want to bake a chocolate cake, and my sole goal in my entire lightcone and e
This post offers an accessible model of the psychology of character-trained LLMs like Claude. Epistemic
This is a link post. This is a blog post reporting some preliminary work from the Anthropic Alignment