Topic Modeling Workshop for Beginners in Python
Prodramp
[Beginners Workshop in Python] This workshop walks you through topic modeling in Python, combining Gensim, spaCy, NLTK, and a few other Python libraries. The workshop has the following 6 steps:
- Reading and Loading Data
  - NIPS papers as our source dataset
- Data Preparation
  - Select required columns
  - Remove punctuation marks
- Exploratory Data Analysis
  - Word Cloud
  - Document Term Matrix
- Data Modeling and Tokenization
  - Stop words removal
  - Bigrams and trigrams
  - Vectorization and tokenization
- Model Building
  - LDA modeling
- Model Evaluation
  - LDA model visualization
  - Coherence score
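As a rough illustration of the data preparation and document term matrix steps listed above, here is a minimal sketch that uses only the Python standard library. It is not the notebook's code: the function names, the tiny stop-word list, and the two sample sentences are assumptions made for this example, and the workshop itself relies on library-provided stop-word lists and tooling.

```python
import re
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch;
# the workshop uses library-provided lists instead).
STOP_WORDS = {"the", "a", "of", "and", "in", "for", "is", "to"}


def clean_text(text):
    """Remove punctuation, drop words containing digits, and lowercase."""
    # Delete every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Drop digits and any word that contains a digit (e.g. "2nd").
    text = re.sub(r"\w*\d\w*", "", text)
    return text.lower()


def tokenize(text):
    """Split cleaned text on whitespace and drop stop words."""
    return [w for w in clean_text(text).split() if w not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows


# Hypothetical sample documents, not taken from the NIPS dataset.
docs = [
    "Neural networks for vision, 2nd edition!",
    "Bayesian inference in neural networks.",
]
vocab, rows = document_term_matrix(docs)
print(vocab)
print(rows)
```

Each row of the matrix corresponds to one document and each column to one vocabulary term, which is the same shape of data that the word cloud and LDA steps build on.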
GitHub notebook for this workshop: https://github.com/prodramp/DeepWorks/tree/main/TopicModelling/BaseWorkshop
▬▬▬▬▬▬ ⏰ TUTORIAL TIME STAMPS ⏰ ▬▬▬▬▬▬
- (00:00) Tutorial Starts
- (00:12) Topic Modeling Intro
- (03:10) Workshop Environment
- (03:26) Content location at GitHub
- (03:50) Dataset used in this workshop
- (04:40) LDA Intro
- (05:35) Topic Modeling Use Cases
- (05:55) 6 Steps in this Workshop
- (06:06) Step 1: Loading Data
- (09:20) Step 2: Data Preparation
- (09:41) Step 2.1: Removing Punctuation
- (10:22) Step 2.2: Removing digits and words with digits
- (10:43) Step 2.3: Lowercasing all content
- (11:08) Step 3: EDA
- (11:27) Step 3.1: Word Cloud
- (12:28) Step 3.2: Document Term Matrix
- (14:50) Step 4: Data Modeling
- (15:21) Step 4.1: Stop words removal
- (21:09) Step 4.2: Creating Bigram and Trigram
- (23:26) Step 4.3: Lemmatization
- (26:03) Step 4.4: Tokenization
- (28:14) Step 5: LDA Topic Modeling
- (32:53) Step 6: Topic Modeling Performance and analysis
- (34:29) Step 6.1: Topic visualization
- (37:32) Step 6.2: Coherence Score
- (39:30) Saving notebook to GitHub
- (39:55) Recap
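For readers who want a feel for Step 5 (LDA topic modeling) and Step 6.2 (coherence score) before watching, the following is a minimal sketch with Gensim. It is not the notebook's code: the helper name `fit_lda`, the sample documents, the number of topics, and the choice of the `u_mass` coherence measure are all assumptions made for this example, and the workshop may use different settings.

```python
# Hypothetical, already tokenized documents (not from the NIPS dataset).
sample_docs = [
    ["neural", "network", "training", "gradient"],
    ["neural", "network", "layer", "gradient"],
    ["bayesian", "inference", "prior", "posterior"],
    ["bayesian", "prior", "sampling", "posterior"],
]


def fit_lda(tokenized_docs, num_topics=2, seed=0):
    """Fit a small LDA model and return (model, u_mass coherence)."""
    # Imported inside the function so this file loads even without gensim.
    from gensim.corpora import Dictionary
    from gensim.models import CoherenceModel, LdaModel

    # Map each unique token to an integer id.
    dictionary = Dictionary(tokenized_docs)
    # Convert each document to a bag-of-words list of (token_id, count).
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]

    lda = LdaModel(
        corpus=corpus,
        id2word=dictionary,
        num_topics=num_topics,
        random_state=seed,  # fixed seed for repeatable runs
        passes=10,
    )

    # u_mass needs only the bag-of-words corpus; this choice is an
    # assumption of the sketch, other coherence measures exist.
    coherence = CoherenceModel(
        model=lda,
        corpus=corpus,
        dictionary=dictionary,
        coherence="u_mass",
        topn=5,
        processes=1,
    ).get_coherence()
    return lda, coherence
```

Calling `fit_lda(sample_docs)` returns the fitted model and a single coherence number; `lda.print_topics()` then lists the top words per topic. On a corpus this small the topics are only illustrative.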
Please visit:
- Prodramp LLC | https://prodramp.com | @prodramp
- https://www.linkedin.com/company/prodramp
Content Creator: Avkash Chauhan (@avkashchauhan)
Tags: #python #lda #nlp #spacy #gensim #nlp #ml #ai #aicloud #h2oai #driverlessai #machinelearning #cloud #mlops #model #collaboration #deeplearning #modelserving #modeldeployment #keras #tensorflow #pytorch #datarobot #datahub #aiplatform #aicloud #modelperformance #modelfit #modeleffect #modelimpact #modelbias #modeldeployment #modelregistery #modelpipeline #neptuneai #streamlit #pythonapps #deepchecks #modeltesting #codeartifact #dataartifact #modelartifact #onnx #aws #supervisor #supervisord #kaggle #keplergl #mapbox #lightgbm #xgboost #classification #regression #dataengineering #pandas #keras #tensorflow #tensorboard #mnist #cnn #convnet #alexnet #prodramp #avkashchauhan #cnnexplainer #gnn #graph #graphneuralnetwork #pyg #networkx ... https://www.youtube.com/watch?v=EZiPAXez4KE