What is Reinforcement Learning? Lecture with 4 Examples | Intro to Markov Chains and RL
Mihai Nica

Published on Jan 11, 2024

First Intro Class with some examples of Reinforcement Learning

Notes:
Two videos that I played in class were cut from the posted video:
1. At 44:40, the video played was: AlphaGo Zero: Discovering new knowledge
2. At 57:20, the video played was the first 4 minutes of: The Beautiful Math of Snakes and Ladd...

Chapters (Powered by ChapterMe):
00:00 - Learning paradigms and reinforcement learning examples
02:05 - Neural networks transformer, linear regression, reinforcement learning
03:23 - Linear transformations in regression
03:57 - Reopen, put in new answer
05:08 - Linear regression vs reinforcement learning differences
08:09 - Linear regression models, learning paradigms, supervised learning
09:11 - Supervised learning with examples and loss function
13:28 - Machine learning statistics for supervised and neural networks
13:43 - Unsupervised Learning
19:19 - Supervised and unsupervised learning examples
21:08 - Self-supervised learning with word2vec and ChatGPT
21:59 - Method for inventing Ys in self-supervised learning
28:58 - Reinforcement learning training systems to maximize rewards
29:27 - Lyft, AlphaZero, and ChatGPT examples
31:34 - Example 1 Blackjack State Space
35:09 - RL finds Optimal Actions to get Blackjack Basic Strategy Card
36:21 - Example 2 Chess/Go: State Space? Actions? Rewards? Human Knowledge
37:38 - Reinforcement learning for chess training
38:01 - Board states and rules in chess
40:22 - Chess rules and rewards explained in simple terms
44:41 - AlphaGo Zero: Discovering New Knowledge
46:34 - Example 3 Lyft Matching Correspond To Higher Values Learning At Lyft
48:25 - Example 4 ChatGPT Reinforcement Learning with Human Feedback (RLHF)
56:38 - Snakes and ladders state space example
58:03 - Probability problem: roll dice three times, end on square seven
01:09:46 - Ways to Markov chain efficiently
