Can a Random Reinforcement Learning Agent Maximize its Score? Soft Actor Critic (SAC) in Tensorflow2

Published On Feb 18, 2021

The Soft Actor Critic (SAC) algorithm is a powerful tool for solving cutting-edge deep reinforcement learning problems in environments with continuous action spaces. It's a variation of the actor critic method that leverages a maximum entropy framework, double Q networks, and target value networks.
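To make the double Q network and target value network ideas a bit more concrete, here is a minimal sketch in plain Python, working on scalars instead of tensors. It assumes the update rules from the original SAC paper (the version with a separate value network); the function names and the default values for `reward_scale`, `gamma`, and `tau` are hypothetical and not taken from the video's code.

```python
def critic_target(reward, done, next_value, reward_scale=2.0, gamma=0.99):
    """Target for both Q networks: scaled reward plus the discounted value
    of the next state, as estimated by the *target* value network.
    The bootstrap term is dropped when the episode has ended."""
    return reward_scale * reward + gamma * (1.0 - float(done)) * next_value


def value_target(q1, q2, log_prob):
    """Target for the value network: take the smaller of the two critic
    estimates (this counters overestimation bias) and subtract the log
    probability of the sampled action (the entropy term)."""
    return min(q1, q2) - log_prob


def soft_update(target_weights, online_weights, tau=0.005):
    """Slowly move the target value network toward the online one."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]
```

In a real agent these would operate on batches of tensors, but the arithmetic is the same. For example, `critic_target(1.0, False, 10.0)` combines a scaled reward of 2.0 with a discounted next-state value of 9.9.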

The entropy is controlled by scaling the reward, with an inverse relationship between the reward scale and the entropy of our agent. A larger reward scale means more deterministic behavior, and a smaller reward scale means more stochastic behavior.
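Here is a toy illustration of that inverse relationship. It is a sketch under a simplifying assumption: a discrete set of three actions with a softmax policy over scaled action values, rather than the continuous Gaussian policy SAC actually uses. The action values and the helper names are made up for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def policy_entropy(action_values, reward_scale):
    """Entropy of a softmax policy when the action values are multiplied
    by reward_scale. Scaling rewards scales the action values, which
    sharpens or flattens the resulting policy."""
    scaled = [reward_scale * q for q in action_values]
    return entropy(softmax(scaled))


# Hypothetical action values for three actions.
action_values = [1.0, 0.5, 0.0]

# Entropy of the policy at a small, medium, and large reward scale.
entropies = {scale: policy_entropy(action_values, scale)
             for scale in (0.1, 1.0, 10.0)}
```

With a small scale the policy is close to uniform (entropy near `math.log(3)`), and with a large scale almost all of the probability lands on the best action, so the entropy drops toward zero.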

We're going to implement this algorithm using the TensorFlow 2 framework, and test it out on the Inverted Pendulum environment found in the PyBullet package.

Code for this video can be found at:
https://github.com/philtabor/Youtube-...

Learn how to turn deep reinforcement learning papers into code:

Get instant access to all my courses, including the new Prioritized Experience Replay course, with my subscription service. $29 a month gives you instant access to 42 hours of instructional content plus access to future updates, added monthly.

Discounts available for Udemy students (enrolled longer than 30 days). Just send an email to [email protected]

https://www.neuralnet.ai/courses

Or pick up my Udemy courses here:

Deep Q Learning:
https://www.udemy.com/course/deep-q-l...

Actor Critic Methods:
https://www.udemy.com/course/actor-cr...

Curiosity Driven Deep Reinforcement Learning:
https://www.udemy.com/course/curiosit...

Natural Language Processing from First Principles:
https://www.udemy.com/course/natural-...

Reinforcement Learning Fundamentals:
https://www.manning.com/livevideo/rei...

Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: https://bit.ly/3fXHy8W
Grokking Deep Learning: https://bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: https://bit.ly/2VNAXql

Come hang out on Discord here:
/ discord

Need personalized tutoring? Help on a programming project? Shoot me an email! [email protected]

Website: https://www.neuralnet.ai
Github: https://github.com/philtabor
Twitter: / mlwithphil
