Added new lexicons used in CryptoTwitter #81
JaimeBadiola wants to merge 1 commit into cjhutto:master
Updated the lexicon to include common expressions used in CryptoTwitter.
|
Thanks Jaime! This looks very promising. However, I'm having difficulty with the review: according to the diff analysis on GH, you've changed every single line in each of the three most critical files used for VADER sentiment. Can you point me to the specific changes? (I think just some additions, correct?)
cjhutto left a comment
The diff shows that you basically deleted every single line of code and replaced it (for 3 of the most critical files). I can't easily see which lines you've actually modified, so I cannot determine the impact of your suggested changes.
|
@cjhutto The files in the pull request use DOS line endings, which is why it looks like absolutely everything has been changed. Here is the diff with line endings converted to Unix:
diff -ur vaderSentiment.org/vader_lexicon.txt vaderSentiment.jaime/vader_lexicon.txt
--- vaderSentiment.org/vader_lexicon.txt 2020-08-19 16:07:36.382822907 +0300
+++ vaderSentiment.jaime/vader_lexicon.txt 2020-08-19 16:10:17.397542016 +0300
@@ -7514,4 +7514,26 @@
}:( -2.0 0.63246 [-3, -1, -2, -1, -3, -2, -2, -2, -2, -2]
}:) 0.4 1.42829 [1, 1, -2, 1, 2, -2, 1, -1, 2, 1]
}:-( -2.1 0.7 [-2, -1, -2, -2, -2, -4, -2, -2, -2, -2]
-}:-) 0.3 1.61555 [1, 1, -2, 1, -1, -3, 2, 2, 1, 1]
\ No newline at end of file
+}:-) 0.3 1.61555 [1, 1, -2, 1, -1, -3, 2, 2, 1, 1]
+bulls 1.9 1.86
+bull 1.8 1.682
+bullish 2.3 1.5798
+whales -1.1 1.9138
+support 1 1.9322
+resistance 0.3 2.1756
+bear -1.3 1.8797
+bearish -1.4 1.2042
+short -0.8 1.5213
+long 1.3 1.6375
+bounce 1.1 1.6854
+rekt -2.2 2.4404
+arbitrage 0.4 1.9633
+manipulation -2.7 1.2721
+bot -0.9 2.1833
+strategy 1.5 1.9679
+SEC 0 1.4142
+regulations -1.2 1.6865
+FUD -1.9 1.912
+ICO -0.4 2.1705
+CNBC -2.1 2.0276
+hodl 0 2.357
\ No newline at end of file
diff -ur vaderSentiment.org/vaderSentiment.py vaderSentiment.jaime/vaderSentiment.py
--- vaderSentiment.org/vaderSentiment.py 2020-08-19 16:07:36.378823238 +0300
+++ vaderSentiment.jaime/vaderSentiment.py 2020-08-19 16:10:17.397542016 +0300
@@ -18,7 +18,6 @@
import json
from itertools import product
from inspect import getsourcefile
-from io import open
# ##Constants##
@@ -72,7 +71,10 @@
"back handed": -2, "blow smoke": -2, "blowing smoke": -2,
"upper hand": 1, "break a leg": 2,
"cooking with gas": 2, "in the black": 2, "in the red": -2,
- "on the ball": 2, "under the weather": -2}
+ "on the ball": 2, "under the weather": -2, "Bull Market": 2.3,
+ "All time high": 2.3, "Trading analysis": 1, "Short squeeze": 0.6,
+ "Closing a long": 1.6, "Closing a short": -0.1, "Opening a long": 1.3,
+ "Opening a short": 0.9, "flip a coin": 0.6}
# check for special case idioms containing lexicon words
SPECIAL_CASE_IDIOMS = {"the shit": 3, "the bomb": 3, "bad ass": 1.5, "yeah right": -2,
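For anyone who hits the same problem, the comparison above (ignore CRLF vs. LF, show only real changes) can also be sketched in Python with the standard `difflib` module. This is a minimal illustration, not part of the pull request: the function name `diff_ignoring_line_endings` and the sample rows are made up for the example.

```python
import difflib


def diff_ignoring_line_endings(old_text, new_text, old_name="old", new_name="new"):
    """Return unified-diff lines, treating CRLF and LF line endings as equal."""
    # str.splitlines() splits on \n, \r\n and \r (among other line boundaries)
    # and drops the terminator, so a file that was merely re-saved with DOS
    # line endings compares as unchanged.
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


# Made-up sample data: the "submitted" text uses DOS line endings and adds one row.
original = "alpha\t1.0\nbeta\t-1.0\n"
submitted = "alpha\t1.0\r\nbeta\t-1.0\r\ngamma\t0.5\r\n"

for line in diff_ignoring_line_endings(original, submitted):
    print(line)
```

Only the added row shows up as a `+` line; the two rows whose sole difference is the line ending are reported as unchanged context.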
|
Updated the lexicon to include common expressions used in CryptoTwitter. Sentiment was scored by 10 independent users who are familiar with the expressions.
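As a rough sketch of how ten individual ratings turn into a lexicon row: the existing rows in the diff above (for example `}:(`) are consistent with the second column being the mean of the raw ratings and the third being their population standard deviation. The helper below assumes that convention and a tab-delimited layout. The name `lexicon_line` is hypothetical and is not part of vaderSentiment.

```python
from statistics import mean, pstdev


def lexicon_line(token, ratings):
    """Build a tab-delimited row: token, mean, population std, raw ratings."""
    # Assumption: columns are mean (1 decimal) and population standard
    # deviation (5 decimals), which reproduces the existing "}:(" row.
    avg = round(float(mean(ratings)), 1)
    std = round(pstdev(ratings), 5)
    return f"{token}\t{avg}\t{std}\t{list(ratings)}"


# Raw ratings copied from the existing "}:(" row shown in the diff.
print(lexicon_line("}:(", [-3, -1, -2, -1, -3, -2, -2, -2, -2, -2]))
```

The new rows in this pull request carry only the mean and standard deviation. Including the raw ratings list as a fourth column, as the existing rows do, would let a reviewer recompute both values.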