Twitter Sentiment Analysis: Project Overview

Analyzed tweets from Kaggle competition data to determine whether their sentiment is positive, negative, or neutral.

Resources Used:

Python Version: 3.6

Packages: pandas, numpy, plotly, SpaCy, nltk

Distribution of sentiments in training data

To evaluate my models I used the Jaccard index, which measures the similarity of two texts as the number of distinct words they share divided by the number of distinct words across both.
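
A minimal sketch of a word-level Jaccard index, for illustration only. It assumes lowercasing and whitespace tokenisation, neither of which is specified in this overview, and the function name `jaccard` is hypothetical. The score is the size of the intersection of the two word sets divided by the size of their union.

```python
def jaccard(text_a: str, text_b: str) -> float:
    """Word-level Jaccard index: |A ∩ B| / |A ∪ B|."""
    # Assumption: lowercase and split on whitespace to get word sets.
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    if not union:
        # Assumption: two empty texts are treated as identical.
        return 1.0
    return len(words_a & words_b) / len(union)


if __name__ == "__main__":
    # Shared words: {"love", "this"}; distinct words overall: 4.
    print(jaccard("I love this movie", "love this"))
```

A score of 1.0 means both texts contain exactly the same set of words, and 0.0 means they share none.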

Here are the distributions of Jaccard scores between the training tweets and their selected parts.

List of the most common words (after removal of stopwords)
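
One way such a list could be produced is sketched below with `collections.Counter`. The stopword set here is a tiny hand-written stand-in so the example stays self-contained; in practice a fuller list, such as the one shipped with nltk, would likely be used. The names `STOPWORDS` and `most_common_words` are hypothetical.

```python
from collections import Counter

# Assumption: a tiny illustrative stopword set, not the list used in the project.
STOPWORDS = {"the", "a", "an", "is", "to", "and", "i", "my", "of", "in", "it"}


def most_common_words(texts, n=10, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens as (word, count) pairs."""
    counts = Counter()
    for text in texts:
        # Assumption: lowercase and split on whitespace.
        tokens = text.lower().split()
        counts.update(tok for tok in tokens if tok not in stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    tweets = ["I love the sun", "the sun is bright"]
    print(most_common_words(tweets, n=1))
```

With a pandas DataFrame, the same function could be applied to a text column by passing it as the `texts` argument.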

I used SpaCy to train my named entity recogniser. My steps: