A dashboard for analysing entities (people and organisations) mentioned in recent news articles, along with the sentiment of their media coverage.
It uses the LatestNews API for getting news content. You can find the deployed website here!
- Named entities (person or organisation) can be filtered by number to obtain graphs and wordclouds.
- Selecting a particular entity finds other trending topics in the news related to it, and groups the news sources covering the topic to measure the intensity of their sentiment using the AFINN lexicon.
- The `sentiment.py` file creates the dashboard by loading the relevant files. It uses NLTK's part-of-speech tagger and chunks the tokens based on their POS tags to find named entities.
- The `ner.ipynb` file contains previous attempts to solve the problem using scikit-learn's CountVectorizer.
- The dashboard currently works only on a small subset of data for the purpose of this experiment. Expanding it to cover news articles daily is left as future work.
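
As a rough illustration of the POS-based chunking idea mentioned above, the sketch below groups consecutive proper-noun tokens into candidate entities and then filters them by count. It is not the project's code: the tagged sentence is hard-coded and made up, and `chunk_proper_nouns` and `filter_by_count` are hypothetical helper names. In practice the `(token, tag)` pairs would come from a tagger such as `nltk.pos_tag`.

```python
from collections import Counter

# Hypothetical pre-tagged sentence (Penn Treebank tags). In a real pipeline
# these pairs would be produced by a POS tagger such as nltk.pos_tag.
TAGGED = [
    ("Angela", "NNP"), ("Merkel", "NNP"), ("met", "VBD"),
    ("officials", "NNS"), ("from", "IN"), ("the", "DT"),
    ("European", "NNP"), ("Union", "NNP"), ("in", "IN"),
    ("Brussels", "NNP"), (".", "."),
]

PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def chunk_proper_nouns(tagged_tokens):
    """Group runs of consecutive proper-noun tokens into entity strings."""
    entities, current = [], []
    for token, tag in tagged_tokens:
        if tag in PROPER_NOUN_TAGS:
            current.append(token)
        elif current:
            # A non-proper-noun token closes the current chunk.
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


def filter_by_count(entities, min_count=1):
    """Keep only entities mentioned at least `min_count` times."""
    counts = Counter(entities)
    return {name: n for name, n in counts.items() if n >= min_count}


entities = chunk_proper_nouns(TAGGED)
print(entities)  # ['Angela Merkel', 'European Union', 'Brussels']
print(filter_by_count(entities, min_count=1))
```

This simple rule cannot tell people from organisations or places; it only finds candidate spans, which is why a dedicated chunker or NER model is usually layered on top.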
- `nltk`: POS tagging to perform NER by chunking tokens.
- `pandas`: formatting and cleaning the data.
- `afinn`: lexicon to measure coverage sentiment.
- `plotly express`: visualisations.
- `streamlit`: web framework.
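
To make the lexicon-based sentiment step more concrete, here is a minimal sketch of scoring articles with a word-to-score lexicon and averaging the scores per news source. The tiny `TOY_LEXICON` stands in for AFINN and its scores are illustrative only, not copied from the real lexicon. The article data, source names and function names are likewise assumptions made for this example. With the `afinn` package installed, `Afinn().score(text)` would replace `score_text`.

```python
from collections import defaultdict

# Toy stand-in for the AFINN lexicon; entries and scores are illustrative.
TOY_LEXICON = {"good": 3, "great": 3, "bad": -3, "crisis": -3, "win": 4}

# Hypothetical (source, text) pairs; real data would come from the news API.
ARTICLES = [
    ("Source A", "A great win for the team."),
    ("Source A", "Bad weather ahead."),
    ("Source B", "The crisis deepens."),
]


def score_text(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of all words; unknown words score 0."""
    words = (w.strip(".,!?;:\"'") for w in text.lower().split())
    return sum(lexicon.get(w, 0) for w in words)


def mean_sentiment_by_source(articles):
    """Average the per-article sentiment score for each news source."""
    scores = defaultdict(list)
    for source, text in articles:
        scores[source].append(score_text(text))
    return {source: sum(vals) / len(vals) for source, vals in scores.items()}


print(mean_sentiment_by_source(ARTICLES))
# {'Source A': 2.0, 'Source B': -3.0}
```

Averaging per source keeps outlets with many articles from dominating the comparison; a sum or median would be equally reasonable depending on what the dashboard is meant to show.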
The live project is deployed on https://newsentity.herokuapp.com/.
You must have Python 3.6 or higher to run the file.
- Create a new virtual environment for running the application. You can follow the instructions here.
- Navigate to the virtual environment and activate it.
- Install the dependencies using `pip install -r requirements.txt`.
- Run the `news.py` file with `streamlit run news.py`.



