cover of episode “Auditing language models for hidden objectives” by Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Akbir Khan, Euan Ong, Christopher Olah, Fabien Roger, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub

“Auditing language models for hidden objectives” by Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Akbir Khan, Euan Ong, Christopher Olah, Fabien Roger, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub

2025/3/16
logo of podcast LessWrong (Curated & Popular)

LessWrong (Curated & Popular)

AI Chapters
Chapters

Shownotes Transcript

No transcript made for this episode yet, you may request it for free.