Eliciting Latent Knowledge

Published On Jul 4, 2023

"Eliciting Latent Knowledge" is a research problem introduced by Paul Christiano, Ajeya Cotra and Mark Xu as part of work conducted at the Alignment Research Center.

Timestamps:
00:00 - Eliciting Latent Knowledge
00:46 - The Alignment Research Center (ARC)
02:25 - The Eliciting Latent Knowledge Problem
04:53 - Ontology identification
07:26 - Toy Scenario: The SmartVault
08:07 - How the SmartVault AI works
10:38 - What could go wrong?
12:57 - Addressing the problem by asking questions
16:18 - A counterexample
20:20 - The direct translator
22:25 - The human simulator
23:51 - Will gradient descent learn the direct translator or the human simulator?
25:14 - Closing thoughts

Link to Google Doc: https://docs.google.com/document/d/1W...

Topics: #safety #ai #alignment #ELK

For related content:
- Twitter:   / samuelalbanie
- Research lab: https://caml-lab.com/
- Personal webpage: https://samuelalbanie.com/
- YouTube:    / @samuelalbanie1
- TikTok:   / samuelalbanie
- Instagram:   / samuelalbanie
- LinkedIn:   / samuel-albanie
- Discord server for filtir:   / discord

(Optional) If you'd like to support the channel:
- https://www.buymeacoffee.com/samuelal...
-   / samuel_albanie
