NLP Demystified 9: Automatically Finding Topics in Documents with Latent Dirichlet Allocation
Future Mojo
4.61K subscribers
7,985 views

Published on May 17, 2022

Course playlist: Natural Language Processing Demystified

What do you do when you need to make sense of a pile of documents and have no other information? In this video, we'll learn one approach to this problem using Latent Dirichlet Allocation.

We'll cover how it works, then build a model with spaCy and Gensim to automatically discover topics present in a document and to search for similar documents.

Colab notebook: https://colab.research.google.com/git...

Timestamps
00:00:00 Topic modelling with LDA
00:00:21 The two assumptions an LDA topic model makes
00:03:15 Building an LDA Machine to generate documents
00:10:16 The Dirichlet distribution
00:14:43 Further enhancements to the LDA machine
00:17:01 LDA as a generative model
00:20:15 Training an LDA model using Collapsed Gibbs Sampling
00:28:44 DEMO: Discovering topics in a news corpus and searching for similar documents
00:45:24 Topic model use cases and other models
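
For readers who want a feel for the training step listed above before watching, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. It is not the code from the video or the Colab notebook (the demo uses spaCy and Gensim); the function name lda_gibbs, the hyperparameter values for alpha and beta, and the toy corpus are all assumptions made up for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=100, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA.

    docs is a list of tokenized documents (lists of strings).
    alpha and beta are symmetric Dirichlet hyperparameters (assumed values).
    Returns per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens in doc d assigned to topic k
    topic_word = [Counter() for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                       # total tokens assigned to topic k

    # Start by assigning every token to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Weight of topic t for this token, given all other assignments:
                # (how much doc d likes t) * (how much t likes word w).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]

                # Sample a new topic and put the token back into the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


# Hypothetical toy corpus, already tokenized.
toy_docs = [
    ["goal", "match", "team", "goal", "league"],
    ["election", "vote", "party", "vote"],
    ["team", "league", "match", "coach"],
    ["party", "election", "policy", "vote"],
]

doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2)

for k, counts in enumerate(topic_word):
    print(f"topic {k}:", [w for w, _ in counts.most_common(3)])
```

The sampler is stochastic, so which words end up grouped together depends on the seed and the number of iterations. On a real corpus a library implementation such as Gensim's is the practical choice.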

This video is part of Natural Language Processing Demystified, a free, accessible course on NLP.

Visit https://www.nlpdemystified.org/ to learn more.
