Premiered Apr 29, 2024
This video covers a novel method from Meta that uses a second LLM, called the AdvPrompter, to generate human-readable adversarial prompts in seconds, about 800× faster than existing optimization-based approaches. The AdvPrompter is trained with a new algorithm that does not require access to the gradients of the TargetLLM.
▶ Become a Patron 🔥 - / fahdmirza
#advprompter #advpromptertrain #targetllm
PLEASE FOLLOW ME:
▶ LinkedIn: / fahdmirza
▶ YouTube: / @fahdmirza
▶ Blog: https://www.fahdmirza.com
RELATED VIDEOS:
▶ Paper: https://arxiv.org/pdf/2404.16873
All rights reserved © 2021 Fahd Mirza