Workshop: Topic Modeling Text with Python (intermediate level)

March 20, 1:00-3:00 PM


As part of its training activities, the Hot Source! project, funded by AHRC, is organizing an online workshop on March 20th, 2024, from 1:00 pm to 3:00 pm.

This workshop is designed for researchers who are interested in learning about topic modeling using Latent Dirichlet Allocation (LDA). Topic modeling uncovers hidden thematic structures in text data, offering valuable insights for various research domains. LDA is a statistical model used in natural language processing and machine learning that represents each document as a mixture of topics and each topic as a distribution over words.

Attendees will gain a comprehensive understanding of topic modeling and its applications. The workshop will begin with an introduction to topic modeling, explaining its importance and relevance in various fields. Participants will then be introduced to LDA, a popular algorithm for topic modeling.

Next, participants will learn how to pre-process text data for LDA. They will explore techniques for cleaning and preparing text data so that it is suitable for topic modeling. Attendees will then implement LDA using Gensim, a Python library for topic modeling, training LDA models and extracting topics from text data.
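To give a flavour of the pre-processing step, here is a minimal sketch in plain Python. It is not the workshop's own material: the stop-word list, the sample sentences, and the helper names (preprocess, build_vocabulary, to_bag_of_words) are assumptions made for illustration. It lowercases and tokenizes each document, removes stop words and very short tokens, and converts the result into (token id, count) pairs, the same kind of bag-of-words representation that Gensim's corpora.Dictionary and its doc2bow method produce.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real projects use a fuller list.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "are", "for", "on", "with"}


def preprocess(text, min_length=3):
    """Lowercase the text, keep alphabetic tokens, drop stop words and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) >= min_length]


def build_vocabulary(tokenized_docs, min_doc_freq=1):
    """Assign an integer id to every token found in at least min_doc_freq documents."""
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each token once per document
    kept = sorted(t for t, n in doc_freq.items() if n >= min_doc_freq)
    return {token: idx for idx, token in enumerate(kept)}


def to_bag_of_words(tokens, vocabulary):
    """Convert a token list into sorted (token_id, count) pairs."""
    counts = Counter(vocabulary[t] for t in tokens if t in vocabulary)
    return sorted(counts.items())


# Hypothetical sample documents, used only to show the shape of the output.
docs = [
    "The archive holds letters and diaries of the merchants.",
    "Merchants traded spices; the letters describe the trade routes.",
]
tokenized = [preprocess(d) for d in docs]
vocabulary = build_vocabulary(tokenized)
corpus = [to_bag_of_words(t, vocabulary) for t in tokenized]
```

In practice, steps such as lemmatization or filtering very rare and very common words are often added; which ones help depends on the corpus.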

In addition, participants will learn how to evaluate the performance of LDA models. They will explore various metrics and techniques for assessing the quality and coherence of topics generated by LDA. 
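As an illustration of what a coherence metric measures, the sketch below computes a simplified UMass-style score in plain Python: for each pair of a topic's top words, it takes the log of how often the two words appear in the same document, relative to how often the higher-ranked word appears at all. The function name, the sample documents, and the choice to skip words absent from the corpus are assumptions made for illustration. Gensim provides its own implementation through CoherenceModel, which may differ in details such as averaging.

```python
import math


def umass_coherence(top_words, tokenized_docs):
    """Simplified UMass-style coherence: sum of log((D(w_m, w_l) + 1) / D(w_l))
    over ordered pairs of top words, where D counts documents containing the words."""
    doc_sets = [set(doc) for doc in tokenized_docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # assumption: ignore words that never occur in the corpus
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / denom)
    return score


# Hypothetical tokenized documents, used only for illustration.
docs = [
    ["letters", "merchants", "trade"],
    ["letters", "merchants", "routes"],
    ["poems", "sonnets", "verse"],
]
coherent = umass_coherence(["letters", "merchants"], docs)  # words that co-occur
mixed = umass_coherence(["letters", "poems"], docs)         # words that never co-occur
```

Higher (less negative) scores indicate that a topic's top words tend to appear together, which is one signal, though not the only one, that a topic is interpretable.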

Finally, the workshop will cover different techniques for visualizing LDA topics and interpreting the results of topic modeling.

This workshop is suitable for researchers who are new to topic modeling or who want to enhance their skills in this area, and it assumes basic to intermediate Python programming knowledge. By the end of the workshop, participants will have the knowledge and skills needed to apply LDA for topic modeling in their research projects.

Additional resources from intermediate-level workshops are available on topics such as Introduction to NER, Introduction to Pandas, and Sentiment Analysis.

If you’re looking for more text extraction resources to reinforce your learning, you can also join the courses Printed text extraction with ABBYY FineReader software (beginner level) and Handwritten text extraction with Transkribus Lite app (beginner level).

In addition to this workshop, participants can register for a 1:1 ‘Practical Surgery’ session on 27th March.