ilmseeker/Topic-Modelling-with-LSA-and-LDA

Topic-Modelling-with-LSA-and-LDA

This project presents an overview of topic modelling, a classic unsupervised learning problem in Natural Language Processing (NLP), by studying and comparing two techniques for uncovering latent topics: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Both are applied to a public dataset, ‘A Million News Headlines’, a corpus of more than one million news headlines published by ABC (Australian Broadcasting Corporation) News over a period of 17 years.

Dataset link: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL
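
To make the comparison concrete, here is a minimal, dependency-free sketch of what the two techniques compute. It is not the code used in this project. The six toy headlines, the function names `lsa_top_component` and `lda_gibbs`, and all hyperparameter values are assumptions chosen for illustration. LSA is normally run as a truncated SVD over a (often TF-IDF weighted) document-term matrix; the sketch recovers only the first component from raw counts via power iteration. LDA is fitted here with collapsed Gibbs sampling, which is one of several possible inference methods.

```python
import math
import random
from collections import Counter

# Hypothetical toy headlines standing in for the real corpus.
HEADLINES = [
    "rain floods farm crops",
    "drought hits farm crops",
    "rain eases drought",
    "team wins cup final",
    "coach praises team win",
    "cup final draws crowd",
]


def lsa_top_component(docs, iters=200):
    """Leading LSA component: top right singular vector of the count matrix."""
    vocab = sorted({w for d in docs for w in d})
    col = {w: j for j, w in enumerate(vocab)}
    A = [[0.0] * len(vocab) for _ in docs]  # document-term counts
    for i, d in enumerate(docs):
        for w in d:
            A[i][col[w]] += 1.0

    def matvec(x):  # computes A @ x
        return [sum(a * b for a, b in zip(row, x)) for row in A]

    v = [1.0] * len(vocab)
    for _ in range(iters):  # power iteration on A^T A
        Av = matvec(v)
        u = [sum(A[i][j] * Av[i] for i in range(len(A))) for j in range(len(vocab))]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    sigma = math.sqrt(sum(x * x for x in matvec(v)))  # matching singular value
    return vocab, v, sigma


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, sweeps=200, seed=0):
    """LDA fitted by collapsed Gibbs sampling (assumed symmetric priors)."""
    rng = random.Random(seed)
    n_vocab = len({w for d in docs for w in d})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Start from a random topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in d] for d in docs]
    for d, words in enumerate(docs):
        for w, k in zip(words, assign[d]):
            update(d, w, k, +1)

    for _ in range(sweeps):
        for d, words in enumerate(docs):
            for i, w in enumerate(words):
                update(d, w, assign[d][i], -1)  # remove the token's current topic
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                update(d, w, k, +1)

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(words) + n_topics * alpha) for c in doc_topic[d]]
        for d, words in enumerate(docs)
    ]
    return theta, topic_word


docs = [h.split() for h in HEADLINES]

vocab, v, sigma = lsa_top_component(docs)
print("LSA top terms:", sorted(zip(vocab, v), key=lambda p: -p[1])[:4])

theta, topic_word = lda_gibbs(docs)
for headline, row in zip(HEADLINES, theta):
    print([round(p, 2) for p in row], headline)
```

The design difference shows up in the outputs: LSA yields a signed, real-valued weight per term from a linear decomposition, while LDA yields a probability distribution over topics for each headline. On the full dataset a library implementation would be the practical choice.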
