LessWrong (Curated & Popular)

"The 6D effect: When companies take risks, one email can be very powerful." by scasper

2023/11/9

Recently, I have been learning about industry norms, legal discovery proceedings, and incentive stru

"The other side of the tidal wave" by Katja Grace

2023/11/9

I guess there’s maybe a 10-20% chance of AI causing human extinction in the coming decades, but I fe

"Does davidad's uploading moonshot work?" by jacobjabob et al.

2023/11/9

davidad has a 10-min talk out on a proposal about which he says: “the first time I’ve seen a concret

"Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk" by 1a3orn

2023/11/9

I examined all the biorisk-relevant citations from a policy paper arguing that we should ban powerfu

"My thoughts on the social response to AI risk" by Matthew Barnett

2023/11/9

A common theme implicit in many AI risk stories has been that broader society will either fail to an

Comp Sci in 2027 (Short story by Eliezer Yudkowsky)

2023/11/9

This is a linkpost for https://nitter.net/ESYudkowsky/status/1718654143110512741 Comp sci in 2017: Stu

"Thoughts on the AI Safety Summit company policy requests and responses" by So8res

2023/11/3

Over the next two days, the UK government is hosting an AI Safety Summit focused on “the safe and re

"President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence" by Tristan Williams

2023/11/3

This is a linkpost for https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-

[Human Voice] "Book Review: Going Infinite" by Zvi

2023/10/31

Support ongoing human narrations of curated posts: www.patreon.com/LWCurated Previously: Sadly, FTX I d

"We're Not Ready: thoughts on "pausing" and responsible scaling policies" by Holden Karnofsky

2023/10/30

Views are my own, not Open Philanthropy’s. I am married to the President of Anthropic and have a fin

"At 87, Pearl is still able to change his mind" by rotatingpaguro

2023/10/30

Judea Pearl is a famous researcher, known for Bayesian networks (the standard way of representing Ba

"Architects of Our Own Demise: We Should Stop Developing AI" by Roko

2023/10/30

Some brief thoughts at a difficult time in the AI risk debate. Imagine you go back in time to the yea

"AI as a science, and three obstacles to alignment strategies" by Nate Soares

2023/10/30

AI used to be a science. In the old days (back when AI didn't work very well), people were atte

"Thoughts on responsible scaling policies and regulation" by Paul Christiano

2023/10/30

I am excited about AI developers implementing responsible scaling policies; I’ve recently been spend

"Announcing Timaeus" by Jesse Hoogland et al.

2023/10/30

Timaeus is a new AI safety research organization dedicated to making fundamental breakthroughs in te

[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M Davis

2023/10/23

Support ongoing human narrations of curated posts: www.patreon.com/LWCurated Doomimir: Humanity has ma

"Holly Elmore and Rob Miles dialogue on AI Safety Advocacy" by jacobjacob, Robert Miles & Holly_Elmore

2023/10/23

Holly is an independent AI Pause organizer, which includes organizing protests (like this upcoming o

"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.

2023/10/23

Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the me

"Labs should be explicit about why they are building AGI" by Peter Barnett

2023/10/19

Three of the big AI labs say that they care about alignment and that they think misaligned AI poses

[HUMAN VOICE] "Sum-threshold attacks" by TsviBT

2023/10/18

Support ongoing human narrations of curated posts: www.patreon.com/LWCurated How do you affect somethi
