“Reducing LLM deception at scale with self-other overlap fine-tuning” by Marc Carauleanu, Diogo de Lucena, Gunnar Zarncke, Judd Rosenblatt, Mike Vaiana, Cameron Berg
2025/3/17
LessWrong (Curated & Popular)
Chapters
What is the Summary of the Research?
How Was the LLM Experimental Setup Designed?
What Were the LLM Experimental Results?
What Impact Did SOO Fine-Tuning Have on LLM Capabilities?
How Did the SOO Method Generalize Across Different Scenarios?
What Are Some Example Outputs of the SOO Fine-Tuned LLMs?
What Conclusions Were Drawn from the Research?