Eliciting Latent Knowledge
Samuel Albanie

Published on Jul 4, 2023

"Eliciting Latent Knowledge" is a research problem introduced by Paul Christiano, Ajeya Cotra and Mark Xu as part of work conducted at the Alignment Research Center.

Timestamps:
00:00 - Eliciting Latent Knowledge
00:46 - The Alignment Research Center (ARC)
02:25 - The Eliciting Latent Knowledge Problem
04:53 - Ontology identification
07:26 - Toy Scenario: The SmartVault
08:07 - How the SmartVault AI works
10:38 - What could go wrong?
12:57 - Addressing the problem by asking questions
16:18 - A counterexample
20:20 - The direct translator
22:25 - The human simulator
23:51 - Will gradient descent learn the direct translator or the human simulator?
25:14 - Closing thoughts

Link to Google Doc: https://docs.google.com/document/d/1W...

Topics: #safety #ai #alignment #ELK

For related content:
- Twitter: / samuelalbanie
- Research lab: https://caml-lab.com/
- Personal webpage: https://samuelalbanie.com/
- YouTube: / @samuelalbanie1
- TikTok: / samuelalbanie
- Instagram: / samuelalbanie
- LinkedIn: / samuel-albanie
- Discord server for filtir: / discord

(Optional) if you'd like to support the channel:
- https://www.buymeacoffee.com/samuelal...
- / samuel_albanie
