Topic-Modelling-with-LSA-and-LDA

This project presents an overview of topic modelling, a classical unsupervised learning problem in Natural Language Processing (NLP), by studying and comparing two algorithms that uncover latent structure in text: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Both techniques are applied to a public dataset, ‘A Million News Headlines’, a corpus of more than one million news headlines published by ABC (Australian Broadcasting Corporation) News over a period of 17 years.

Dataset link: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL