Topic-Modelling-with-LSA-and-LDA

This project presents an overview of topic modelling, a classical unsupervised learning problem in Natural Language Processing (NLP), by studying and comparing two latent-variable techniques: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Both techniques are applied to a public dataset, ‘A Million News Headlines’, a corpus of more than one million news headlines published by ABC (Australian Broadcasting Corporation) News over a period of 17 years.

Dataset link: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL

Repository files:
LSA - LDA.ipynb
README.md
Top words in headlines dataset (excluding stop words).png
t-SNA Clustering of 15 LDA topics.png
t-SNA Clustering of 15 LSA topics.png


ilmseeker/Topic-Modelling-with-LSA-and-LDA
