Transformer Neural Networks - EXPLAINED! (Attention is all you need)
CodeEmporium

Published on Jan 13, 2020

Please subscribe to keep me alive: https://www.youtube.com/c/CodeEmporiu...

BLOG:   / dataemporium  


PLAYLISTS FROM MY CHANNEL
⭕ Reinforcement Learning:    • Reinforcement Learning 101  
⭕ Natural Language Processing:    • Natural Language Processing 101  
⭕ Transformers from Scratch:    • Natural Language Processing 101  
⭕ ChatGPT Playlist:    • ChatGPT  
⭕ Convolutional Neural Networks:    • Convolution Neural Networks  
⭕ The Math You Should Know:    • The Math You Should Know  
⭕ Probability Theory for Machine Learning:    • Probability Theory for Machine Learning  
⭕ Coding Machine Learning:    • Code Machine Learning  


MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: https://imp.i384100.net/MathML
📕 Calculus: https://imp.i384100.net/Calculus
📕 Statistics for Data Science: https://imp.i384100.net/AdvancedStati...
📕 Bayesian Statistics: https://imp.i384100.net/BayesianStati...
📕 Linear Algebra: https://imp.i384100.net/LinearAlgebra
📕 Probability: https://imp.i384100.net/Probability

OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: https://imp.i384100.net/Deep-Learning
📕 Python for Everybody: https://imp.i384100.net/python
📕 MLOps Course: https://imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): https://imp.i384100.net/NLP
📕 Machine Learning in Production: https://imp.i384100.net/MLProduction
📕 Data Science Specialization: https://imp.i384100.net/DataScience
📕 Tensorflow: https://imp.i384100.net/Tensorflow

REFERENCES
[1] The main Paper: https://arxiv.org/abs/1706.03762
[2] Tensor2Tensor has some code with a tutorial: https://www.tensorflow.org/tutorials/...
[3] Transformer very intuitively explained - Amazing: http://jalammar.github.io/illustrated...
[4] Medium Blog on intuitive explanation:   / what-is-a-transformer  
[5] Pretrained word embeddings: https://nlp.stanford.edu/projects/glove/
[6] Intuitive explanation of Layer normalization: https://mlexplained.com/2018/11/30/an...
[7] Paper that gives even better results than transformers (Pervasive Attention): https://arxiv.org/abs/1808.03867
[8] BERT uses transformers to pretrain neural nets for common NLP tasks: https://ai.googleblog.com/2018/11/ope...
[9] Stanford Lecture on RNN: http://cs231n.stanford.edu/slides/201...
[10] Colah’s Blog: https://colah.github.io/posts/2015-08...
[11] Wiki for timeseries of events: https://en.wikipedia.org/wiki/Transfo...)
