pkrouth/topic-model-poc

Topic model for semantic search

Topic model on Legal text: Judgements

Topic modeling is an unsupervised machine learning technique that automatically identifies clusters of words that tend to occur together in a corpus. It is frequently used to discover the hidden semantic patterns in a collection of text.

Figure: Topic Model Schematic
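To make the idea concrete, here is a minimal sketch of one common topic model, latent Dirichlet allocation (LDA), fitted with collapsed Gibbs sampling in plain Python. This is an illustration only, not necessarily the method used in this repo: the function name `fit_lda`, the hyperparameter values, and the tiny corpus are all assumptions made up for the example.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # all tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic mixtures and the top words of each topic.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [w for w, c in topic_word[t].most_common(3) if c > 0]
        for t in range(num_topics)
    ]
    return theta, top_words


# Invented toy corpus: two court-like and two billing-like documents.
docs = [
    "court appeal judge appeal verdict".split(),
    "judge court verdict appeal court".split(),
    "contract invoice payment invoice contract".split(),
    "payment contract invoice payment".split(),
]
theta, top_words = fit_lda(docs)
for d, mixture in enumerate(theta):
    print(d, [round(p, 2) for p in mixture])
print(top_words)
```

Each row of `theta` is a document's mixture over topics, and `top_words` lists the most frequent words assigned to each topic. On a real corpus you would more likely reach for an established library than a hand-written sampler; the sketch is only meant to show the mechanics.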

David Blei, co-author of Latent Dirichlet Allocation (LDA), one of the most widely used topic models, has also written a well-explained article titled Probabilistic Topic Models. Even in the age of ChatGPT, topic models remain practical for exploratory data analysis (EDA) and for gaining insights from data.

Common applications of topic models:

  • Real-time analysis of unstructured text data.
  • Automated review of business documents (unstructured data), segmenting them by underlying topic, e.g. determining whether a document is an invoice, complaint, or contract.
  • Extracting topics (an understanding of the data) from documents in various formats.

In this repo, I am open-sourcing some of the work done to create a semantic search engine driven by topic models trained on judgement text. An example visualization of topic models (using LDAvis) for a different corpus is shown here (download for graphics).
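One way a topic model can drive semantic search is to compare the topic mixture of a query with the topic mixture of each document and return the closest matches. The sketch below illustrates that idea with the Hellinger distance. It is an assumption about how such a search could work, not a description of this repo's implementation; the function names, document IDs, and topic mixtures are all hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 means identical)."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2
    )


def search(query_topics, doc_topics, top_n=2):
    """Return the IDs of the top_n documents whose topic mixture is closest to the query."""
    ranked = sorted(
        doc_topics,
        key=lambda doc_id: hellinger(query_topics, doc_topics[doc_id]),
    )
    return ranked[:top_n]


# Hypothetical topic mixtures, as a trained topic model might infer them.
doc_topics = {
    "judgement_a": [0.90, 0.05, 0.05],
    "judgement_b": [0.10, 0.80, 0.10],
    "judgement_c": [0.70, 0.20, 0.10],
}
query_topics = [0.85, 0.10, 0.05]

results = search(query_topics, doc_topics)
print(results)
```

Here `judgement_a` ranks first because its mixture is nearly the same as the query's, followed by `judgement_c`. Because the comparison happens in topic space rather than on raw words, a document can match a query without sharing its exact vocabulary.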
