Published on Jul 4, 2023
"Eliciting Latent Knowledge" is a research problem introduced by Paul Christiano, Ajeya Cotra and Mark Xu as part of work conducted at the Alignment Research Center.
Timestamps:
00:00 - Eliciting Latent Knowledge
00:46 - The Alignment Research Center (ARC)
02:25 - The Eliciting Latent Knowledge Problem
04:53 - Ontology identification
07:26 - Toy Scenario: The SmartVault
08:07 - How the SmartVault AI works
10:38 - What could go wrong?
12:57 - Addressing the problem by asking questions
16:18 - A counterexample
20:20 - The direct translator
22:25 - The human simulator
23:51 - Will gradient descent learn the direct translator or the human simulator?
25:14 - Closing thoughts
Link to Google Doc: https://docs.google.com/document/d/1W...
Topics: #safety #ai #alignment #ELK
For related content:
- Twitter: / samuelalbanie
- Research lab: https://caml-lab.com/
- Personal webpage: https://samuelalbanie.com/
- YouTube: / @samuelalbanie1
- TikTok: / samuelalbanie
- Instagram: / samuelalbanie
- LinkedIn: / samuel-albanie
- Discord server for filtir: / discord
(Optional) if you'd like to support the channel:
- https://www.buymeacoffee.com/samuelal...
- / samuel_albanie