diff --git a/README.md b/README.md new file mode 100644 index 0000000..a4de65f --- /dev/null +++ b/README.md @@ -0,0 +1,304 @@ +# VADER Sentiment Analysis UI + +A web-based interface for VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis, built with Gradio. This tool allows you to analyze the sentiment of both text input and web content. + +## Dependencies + +Core Dependencies: +- `gradio>=5.9.1` - Web interface framework +- `vaderSentiment` - Sentiment analysis engine +- `plotly` - Interactive visualization and charts +- `pandas` - Data manipulation and analysis +- `beautifulsoup4` - HTML parsing for URL analysis +- `requests` - HTTP requests for URL fetching + +Additional Requirements: +- `numpy` - Numerical computations +- `markdown-it-py` - Markdown processing +- `jinja2` - Template engine for web interface +- `httpcore` - HTTP client +- `click` - Command line interface tools +- `uvicorn` - ASGI web server +- `rich` - Terminal formatting +- `huggingface-hub` - Model and component management +- `httpx` - HTTP client +- `typer` - CLI builder +- `safehttpx` - Secure HTTP client +- `gradio-client` - Gradio client utilities +- `fastapi` - Web API framework + +Optional Dependencies: +- `python-multipart` - File upload handling +- `pyyaml` - YAML file processing +- `orjson` - Fast JSON processing +- `websockets` - WebSocket support +- `aiofiles` - Asynchronous file operations + +## Python Version Requirements +- **Required**: Python 3.8 or higher +- **Recommended**: Python 3.9+ for optimal performance +- **Tested on**: Python 3.8, 3.9, 3.10, and 3.11 + +To check your Python version: +```bash +python --version # Windows +python3 --version # macOS/Linux +``` + +## Virtual Environment Setup + +### Creating a Virtual Environment +It's recommended to run this project in a virtual environment to avoid conflicts with other Python packages. 
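Before creating the environment, it can help to confirm the interpreter actually meets the 3.8+ requirement noted above. A minimal, illustrative check (not part of the project files):

```python
import sys

# vader_ui targets Python 3.8+ (3.9+ recommended); exit early with a
# clear message if the interpreter is too old.
if sys.version_info < (3, 8):
    raise SystemExit(f"Python 3.8+ required, found {sys.version.split()[0]}")
print("Python", sys.version.split()[0], "is supported")
```

Run it with the same interpreter you will use for the virtual environment (`python` on Windows, `python3` on macOS/Linux).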
+ +### Windows +```bash +# Create a new virtual environment +python -m venv venv + +# Activate the virtual environment +.\venv\Scripts\activate + +# Verify activation (should show virtual environment path) +where python +``` + +### macOS/Linux +```bash +# Create a new virtual environment +python3 -m venv venv + +# Activate the virtual environment +source venv/bin/activate + +# Verify activation (should show virtual environment path) +which python +``` + +### Deactivation +When you're done, you can deactivate the virtual environment: +```bash +deactivate +``` + +### Version Compatibility Notes +- Python 3.8: All features supported +- Python 3.9+: Recommended for best performance +- Python 3.11+: Some dependencies may require specific versions +- Python 3.12: Limited testing, may require dependency updates + +## Installation + +### Core Dependencies +```bash +pip install "gradio>=5.9.1" vaderSentiment plotly pandas beautifulsoup4 requests +``` + +### Additional Requirements +```bash +pip install numpy markdown-it-py jinja2 httpcore click uvicorn rich huggingface-hub httpx typer safehttpx gradio-client fastapi +``` + +### Optional Dependencies +```bash +pip install python-multipart pyyaml orjson websockets aiofiles +``` + +### All-in-One Installation +To install all dependencies (recommended): +```bash +pip install "gradio>=5.9.1" vaderSentiment plotly pandas beautifulsoup4 requests numpy markdown-it-py jinja2 httpcore click uvicorn rich huggingface-hub httpx typer safehttpx gradio-client fastapi python-multipart pyyaml orjson websockets aiofiles +``` + +## Components + +### 1. vader_ui.py +The main application that provides a web interface for sentiment analysis.
Features include: +- Text analysis with direct input +- URL analysis for web content +- Visual sentiment scoring with gauges and charts +- Paragraph-by-paragraph sentiment breakdown +- Real-time analysis updates + +To run the interface: +```bash +python vader_ui.py +``` +Then open `http://127.0.0.1:7860` in your browser. + +### 2. test_vader.py +Basic sentiment analysis test suite that verifies: +- Core VADER functionality +- Sentiment scoring accuracy +- Text processing capabilities +- Error handling + +To run the tests: +```bash +python test_vader.py -v +``` + +### 3. test_vader_ui_components.py +Comprehensive UI component test suite that checks: +- Interface structure and layout +- Component properties and interactions +- Real-time updates and callbacks +- Input validation and error handling +- Concurrent analysis capabilities + +To run the UI tests: +```bash +python test_vader_ui_components.py -v +``` + +## Features + +### Text Analysis +- Direct text input analysis +- Sentiment scores (-1 to +1) +- Positive/Neutral/Negative distribution +- Paragraph-level breakdown + +### URL Analysis +- Web content sentiment analysis +- HTML parsing and cleaning +- Full page sentiment overview +- Key section analysis + +### Visualization +- Sentiment gauge charts +- Distribution bar charts +- Detailed data tables +- Real-time updates + +## Usage Examples + +### Text Analysis +```python +from vader_ui import analyze_text + +# Analyze text +text = "This is a great example! I love it." +gauge, distribution, paragraphs = analyze_text(text) +``` + +### URL Analysis +```python +from vader_ui import analyze_url + +# Analyze website +url = "https://example.com" +gauge, distribution, paragraphs = analyze_url(url) +``` + +## Interface Screenshots + +### Main Interface +![Main Interface](screenshots/main_interface.png) +The main interface shows two tabs: one for text analysis and another for URL analysis. Each tab provides specific input options and visualization components. 
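The Overall Sentiment gauge shown in these views classifies the compound score using the conventional VADER cutoffs detailed under Interpreting Results (±0.05). A hypothetical standalone helper, included here only to make those thresholds concrete (it is not part of vader_ui.py):

```python
def label_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label,
    using the conventional >= 0.05 / <= -0.05 cutoffs."""
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"

for score in (0.8, 0.02, -0.6):
    print(score, "->", label_compound(score))
```

The loop prints Positive, Neutral, and Negative for the three scores respectively.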
+ +### Text Analysis Example +![Text Analysis](screenshots/text_analysis_example.png) +Text analysis interface showing: +- Text input area +- Sentiment gauge showing overall score +- Distribution chart of positive/neutral/negative sentiments +- Paragraph-by-paragraph breakdown + +### URL Analysis Example +![URL Analysis](screenshots/url_analysis_example.png) +URL analysis interface demonstrating: +- URL input field +- Website content sentiment analysis +- Visual sentiment indicators +- Detailed content breakdown + +## Interpreting Results + +### Sentiment Gauge +The sentiment gauge displays a compound score ranging from -1 (extremely negative) to +1 (extremely positive): +- **Positive sentiment**: Score ≥ 0.05 +- **Neutral sentiment**: -0.05 < Score < 0.05 +- **Negative sentiment**: Score ≤ -0.05 + +Example interpretations: +- 0.8 to 1.0: Very positive sentiment +- 0.3 to 0.7: Moderately positive +- -0.3 to 0.3: Neutral or mixed +- -0.7 to -0.3: Moderately negative +- -1.0 to -0.8: Very negative + +### Distribution Chart +The bar chart shows the proportion of three sentiment categories: +- **Positive (Green)**: Percentage of positive words/phrases +- **Neutral (Gray)**: Percentage of neutral words/phrases +- **Negative (Red)**: Percentage of negative words/phrases + +Key insights: +- High positive + low negative = Overall positive sentiment +- Similar positive and negative = Mixed or conflicting sentiment +- High neutral = Objective or factual content + +### Paragraph Analysis Table +The table breaks down sentiment by paragraph: +- **Paragraph**: Sequential identifier +- **Text**: First 100 characters of the paragraph +- **Compound Score**: Individual sentiment score for that paragraph + +Use this to: +- Identify sentiment shifts across the text +- Locate particularly positive/negative sections +- Understand sentiment flow and context + +### Common Patterns + +1. 
**Strong Agreement/Positivity**: + - High compound score (> 0.5) + - Large green bar in distribution + - Consistent positive paragraph scores + +2. **Strong Disagreement/Negativity**: + - Low compound score (< -0.5) + - Large red bar in distribution + - Consistent negative paragraph scores + +3. **Mixed Opinions**: + - Neutral compound score (near 0) + - Similar-sized green and red bars + - Varying paragraph scores + +4. **Objective Content**: + - Neutral compound score + - Large gray bar in distribution + - Consistent neutral paragraph scores + +### Special Cases + +1. **Sarcasm and Irony**: + - VADER may not catch subtle sarcasm + - Look for contradicting paragraph scores + - Consider context when interpreting + +2. **Technical Content**: + - Often appears more neutral + - Domain-specific terms may be missed + - Focus on clear sentiment indicators + +3. **Multiple Languages**: + - Best accuracy with English text + - May underestimate sentiment in other languages + - Consider using language-specific tools for non-English text + +## Running Tests +To run all tests: +```bash +python -m unittest discover -v +``` + +To run specific test files: +```bash +python test_vader.py -v +python test_vader_ui_components.py -v +``` + +## License +MIT License - See LICENSE file for details \ No newline at end of file diff --git a/screenshots/main_interface.png b/screenshots/main_interface.png new file mode 100644 index 0000000..d88384a Binary files /dev/null and b/screenshots/main_interface.png differ diff --git a/screenshots/text_analysis_example.png b/screenshots/text_analysis_example.png new file mode 100644 index 0000000..6ee0f9c Binary files /dev/null and b/screenshots/text_analysis_example.png differ diff --git a/screenshots/url_analysis_example.png b/screenshots/url_analysis_example.png new file mode 100644 index 0000000..6f7aef5 Binary files /dev/null and b/screenshots/url_analysis_example.png differ diff --git a/test_vader.py b/test_vader.py new file mode 100644 index 
0000000..d88c104 --- /dev/null +++ b/test_vader.py @@ -0,0 +1,32 @@ +from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer + +# Initialize the VADER sentiment analyzer +analyzer = SentimentIntensityAnalyzer() + +# Test sentences +test_sentences = [ + "I love this product! It's amazing and works perfectly.", + "This is the worst experience ever. Terrible service.", + "The movie was okay, nothing special.", + "The food was great, but the service was a bit slow.", + "😊 This makes me so happy! 🌟", +] + +# Analyze each sentence +for sentence in test_sentences: + # Get the sentiment scores + scores = analyzer.polarity_scores(sentence) + + print("\nSentence:", sentence) + print("Sentiment scores:", scores) + + # Interpret the compound score + compound = scores['compound'] + if compound >= 0.05: + sentiment = "Positive" + elif compound <= -0.05: + sentiment = "Negative" + else: + sentiment = "Neutral" + + print("Overall sentiment:", sentiment) \ No newline at end of file diff --git a/test_vader_ui.py b/test_vader_ui.py new file mode 100644 index 0000000..7318818 --- /dev/null +++ b/test_vader_ui.py @@ -0,0 +1,196 @@ +import unittest +from vader_ui import analyze_text, analyze_url +import pandas as pd +import plotly.graph_objects as go +import plotly.express as px +import requests +from unittest.mock import patch +from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer + +class TestVaderUI(unittest.TestCase): + def setUp(self): + self.test_text = """This is a very positive text! I love it. + This is a negative sentence. I hate this part. 
+ This is a neutral statement that just states facts.""" + + self.test_url = "https://example.com" + + def test_analyze_text_returns(self): + """Test if analyze_text returns the correct types""" + gauge, dist, df = analyze_text(self.test_text) + + # Check return types + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + def test_analyze_text_content(self): + """Test if analyze_text produces expected content""" + gauge, dist, df = analyze_text(self.test_text) + + # Check gauge data + self.assertIn('value', gauge.data[0]) + self.assertTrue(-1 <= gauge.data[0]['value'] <= 1) # Compound score should be between -1 and 1 + + # Check distribution data + sentiment_labels = dist.data[0]['x'] + self.assertEqual(set(sentiment_labels), {'Positive', 'Neutral', 'Negative'}) + + # Check DataFrame columns + expected_columns = ["Paragraph", "Text", "Compound Score"] + self.assertEqual(list(df.columns), expected_columns) + + def test_analyze_text_empty(self): + """Test handling of empty text""" + gauge, dist, df = analyze_text("") + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertTrue(df.empty) + + def test_analyze_text_special_chars(self): + """Test handling of special characters""" + special_text = "This has special chars: !@#$%^&*()_+ 😊 and emojis 🎉" + gauge, dist, df = analyze_text(special_text) + self.assertIsNotNone(gauge) + self.assertIsNotNone(dist) + self.assertIsInstance(df, pd.DataFrame) + + def test_analyze_url_invalid(self): + """Test handling of invalid URL""" + gauge, dist, df = analyze_url("http://invalidurl.thisisnotreal") + self.assertIsNone(gauge) + self.assertIsNone(dist) + self.assertIsNone(df) + + def test_analyze_text_long(self): + """Test handling of long text""" + long_text = "This is a test. 
" * 1000 + gauge, dist, df = analyze_text(long_text) + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + def test_sentiment_scores(self): + """Test if sentiment scores are reasonable""" + # Test positive text + pos_text = "This is excellent! I love it! Amazing work!" + gauge_pos, _, _ = analyze_text(pos_text) + self.assertGreater(gauge_pos.data[0]['value'], 0) + + # Test negative text + neg_text = "This is terrible! I hate it! Awful work!" + gauge_neg, _, _ = analyze_text(neg_text) + self.assertLess(gauge_neg.data[0]['value'], 0) + + # Test neutral text + neu_text = "This is a statement. It contains information." + gauge_neu, _, _ = analyze_text(neu_text) + self.assertTrue(-0.1 <= gauge_neu.data[0]['value'] <= 0.1) + + def test_mixed_sentiment(self): + """Test text with mixed sentiments""" + mixed_text = """Great product but terrible service. + The quality is amazing however the price is too high. + I love the design but hate the color.""" + + gauge, dist, df = analyze_text(mixed_text) + + # Mixed sentiment should have both positive and negative components + compound_score = gauge.data[0]['value'] + + # Get the distribution scores directly from the analyzer + analyzer = SentimentIntensityAnalyzer() + scores = analyzer.polarity_scores(mixed_text) + + # Verify that we have both positive and negative components + self.assertGreater(scores['pos'], 0, "Should have some positive sentiment") + self.assertGreater(scores['neg'], 0, "Should have some negative sentiment") + self.assertTrue(len(df) >= 1, "Should have at least one paragraph") + + def test_url_content_parsing(self): + """Test URL content parsing with mock response""" + mock_html = """ + + +

<html>
<body>
<p>This is a great website! Amazing content.</p>
<div>
<p>More positive content here.</p>
</div>
</body>
</html>
+ + + + """ + + with patch('requests.get') as mock_get: + mock_get.return_value.text = mock_html + mock_get.return_value.raise_for_status.return_value = None + + gauge, dist, df = analyze_url("https://example.com") + + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + # Should have positive sentiment (due to mock content) + self.assertGreater(gauge.data[0]['value'], 0) + + def test_multilingual_text(self): + """Test handling of non-English text""" + multilingual_text = """ + Hello this is English! Great day! + ¡Hola esto es español! ¡Excelente día! + Bonjour c'est le français! Belle journée! + """ + gauge, dist, df = analyze_text(multilingual_text) + + # Should still produce valid outputs + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + def test_html_in_text(self): + """Test handling of text with HTML tags""" + html_text = """ +

<p>This is a paragraph with <b>bold</b> text!</p>
<div>Another great section here.</div>
+ """ + gauge, dist, df = analyze_text(html_text) + + # Should handle HTML content without errors + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + @patch('requests.get') + def test_network_timeout(self, mock_get): + """Test handling of network timeout""" + mock_get.side_effect = requests.exceptions.Timeout + + gauge, dist, df = analyze_url("https://example.com") + + # Should return None values on timeout + self.assertIsNone(gauge) + self.assertIsNone(dist) + self.assertIsNone(df) + + def test_paragraph_segmentation(self): + """Test paragraph segmentation logic""" + text_with_paragraphs = """ + First paragraph that is long enough to be counted as a real paragraph with more than 50 characters. + + Second paragraph that should also be counted due to its length being more than the threshold. + + Short line. + + Third substantial paragraph with sufficient length to be included in the analysis. + """ + + _, _, df = analyze_text(text_with_paragraphs) + + # Should have exactly 3 paragraphs (ignoring the short line) + self.assertEqual(len(df), 3) + + # Verify paragraph numbering + self.assertTrue(all(df['Paragraph'].str.contains('Paragraph [1-3]'))) + +if __name__ == '__main__': + unittest.main(verbosity=2) \ No newline at end of file diff --git a/test_vader_ui_components.py b/test_vader_ui_components.py new file mode 100644 index 0000000..75d3b06 --- /dev/null +++ b/test_vader_ui_components.py @@ -0,0 +1,265 @@ +import unittest +import gradio as gr +from vader_ui import demo, analyze_text, analyze_url +import numpy as np +import time +import plotly.graph_objects as go +import pandas as pd +from unittest.mock import patch +import requests + +class TestVaderUIComponents(unittest.TestCase): + @classmethod + def setUpClass(cls): + """Set up the Gradio interface for testing""" + cls.interface = demo + + def test_interface_structure(self): + """Test that the interface has all required components""" + 
# Check basic interface properties + self.assertIsInstance(self.interface, gr.Blocks) + self.assertEqual(self.interface.title, "VADER Sentiment Analysis") + + # Get all components + components = self.interface.blocks.values() + + # Check for presence of key components + component_types = [type(comp) for comp in components] + + # Verify text inputs exist + text_inputs = [c for c in components if isinstance(c, gr.Textbox)] + self.assertEqual(len(text_inputs), 2) # URL input and Text input + + # Verify buttons exist + buttons = [c for c in components if isinstance(c, gr.Button)] + self.assertEqual(len(buttons), 2) # Analyze URL and Analyze Text buttons + + # Verify plots exist + plots = [c for c in components if isinstance(c, gr.Plot)] + self.assertEqual(len(plots), 4) # 2 gauge plots and 2 distribution plots + + # Verify dataframes exist + dataframes = [c for c in components if isinstance(c, gr.Dataframe)] + self.assertEqual(len(dataframes), 2) # 2 paragraph analysis tables + + def test_text_input_properties(self): + """Test properties of text input components""" + components = self.interface.blocks.values() + + # Find text inputs + text_inputs = [c for c in components if isinstance(c, gr.Textbox)] + url_input = next(c for c in text_inputs if c.label == "Website URL") + text_input = next(c for c in text_inputs if c.label == "Text to analyze") + + # Test URL input properties + self.assertEqual(url_input.placeholder, "https://example.com") + self.assertEqual(url_input.lines, 1) + + # Test text input properties + self.assertEqual(text_input.placeholder, "Enter your text here...") + self.assertEqual(text_input.lines, 5) + + def test_button_properties(self): + """Test properties of button components""" + components = self.interface.blocks.values() + + # Find buttons + buttons = [c for c in components if isinstance(c, gr.Button)] + url_button = next(c for c in buttons if c.value == "Analyze URL") + text_button = next(c for c in buttons if c.value == "Analyze Text") + + # 
Verify button properties + self.assertIsNotNone(url_button) + self.assertIsNotNone(text_button) + + def test_plot_properties(self): + """Test properties of plot components""" + components = self.interface.blocks.values() + plots = [c for c in components if isinstance(c, gr.Plot)] + + # Check plot labels + gauge_plots = [p for p in plots if p.label == "Overall Sentiment"] + dist_plots = [p for p in plots if p.label == "Sentiment Distribution"] + + self.assertEqual(len(gauge_plots), 2) + self.assertEqual(len(dist_plots), 2) + + def test_dataframe_properties(self): + """Test properties of dataframe components""" + components = self.interface.blocks.values() + dataframes = [c for c in components if isinstance(c, gr.Dataframe)] + + for df in dataframes: + self.assertEqual(df.label, "Paragraph Analysis") + self.assertEqual(df.headers, ["Paragraph", "Text", "Compound Score"]) + + def test_tab_structure(self): + """Test the tab structure of the interface""" + components = self.interface.blocks.values() + tabs = [c for c in components if isinstance(c, gr.Tab)] + + # Verify we have two tabs + tab_items = [c for c in components if isinstance(c, gr.TabItem)] + self.assertEqual(len(tab_items), 2) + + # Verify tab names + tab_names = [t.label for t in tab_items] + self.assertIn("URL Analysis", tab_names) + self.assertIn("Text Analysis", tab_names) + + def test_interface_markdown(self): + """Test the markdown components""" + components = self.interface.blocks.values() + markdowns = [c for c in components if isinstance(c, gr.Markdown)] + + # Verify markdown content + markdown_texts = [m.value for m in markdowns] + self.assertIn("# VADER Sentiment Analyzer", markdown_texts) + self.assertIn("Analyze sentiment from a website URL or direct text input", markdown_texts) + + def test_interface_layout(self): + """Test the layout structure""" + components = self.interface.blocks.values() + + # Verify row components for plots + rows = [c for c in components if isinstance(c, gr.Row)] + 
self.assertGreaterEqual(len(rows), 2) # At least 2 rows for plots + + def test_text_analysis_interaction(self): + """Test the text analysis functionality""" + components = self.interface.blocks.values() + + # Find text input and button + text_inputs = [c for c in components if isinstance(c, gr.Textbox)] + text_input = next(c for c in text_inputs if c.label == "Text to analyze") + + # Test positive text + test_text = "This is amazing! I love this test!" + gauge, dist, df = analyze_text(test_text) + + # Verify outputs + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + # Check sentiment is positive + self.assertGreater(gauge.data[0]['value'], 0) + + # Test negative text + test_text = "This is terrible! I hate this test!" + gauge, dist, df = analyze_text(test_text) + + # Check sentiment is negative + self.assertLess(gauge.data[0]['value'], 0) + + def test_url_analysis_interaction(self): + """Test the URL analysis functionality""" + mock_html = """ + +

<html>
<body>
<p>This is a wonderful test page! Amazing content here.</p>
<div>
<p>Everything is working perfectly!</p>
</div>
</body>
</html>
+ + """ + + with patch('requests.get') as mock_get: + # Setup mock + mock_get.return_value.text = mock_html + mock_get.return_value.raise_for_status.return_value = None + + # Test URL analysis + gauge, dist, df = analyze_url("https://test.com") + + # Verify outputs + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + # Should be positive sentiment due to mock content + self.assertGreater(gauge.data[0]['value'], 0) + + def test_input_validation(self): + """Test input validation and error handling""" + components = self.interface.blocks.values() + + # Test empty text + gauge, dist, df = analyze_text("") + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertTrue(df.empty) + + # Test invalid URL + with patch('requests.get') as mock_get: + mock_get.side_effect = requests.exceptions.RequestException + gauge, dist, df = analyze_url("invalid_url") + self.assertIsNone(gauge) + self.assertIsNone(dist) + self.assertIsNone(df) + + def test_component_callbacks(self): + """Test that components have proper callbacks attached""" + components = self.interface.blocks.values() + + # Find buttons and their associated inputs/outputs + buttons = [c for c in components if isinstance(c, gr.Button)] + text_inputs = [c for c in components if isinstance(c, gr.Textbox)] + plots = [c for c in components if isinstance(c, gr.Plot)] + dataframes = [c for c in components if isinstance(c, gr.Dataframe)] + + url_button = next(c for c in buttons if c.value == "Analyze URL") + text_button = next(c for c in buttons if c.value == "Analyze Text") + + # Verify buttons have click events attached + self.assertTrue(hasattr(url_button, 'click')) + self.assertTrue(hasattr(text_button, 'click')) + + # Test that the analyze functions work with the components + test_text = "This is a test!" 
+ gauge, dist, df = analyze_text(test_text) + self.assertIsInstance(gauge, go.Figure) + self.assertIsInstance(dist, go.Figure) + self.assertIsInstance(df, pd.DataFrame) + + def test_real_time_updates(self): + """Test that outputs update in real-time with input changes""" + components = self.interface.blocks.values() + + # Get different sentiment texts + texts = [ + "This is amazing!", + "This is terrible!", + "This is neutral.", + "I love this!", + "I hate this!" + ] + + # Test that each text produces different sentiment scores + scores = [] + for text in texts: + gauge, _, _ = analyze_text(text) + scores.append(gauge.data[0]['value']) + + # Verify we get different scores for different texts + unique_scores = len(set(scores)) + self.assertGreater(unique_scores, 1, "Different texts should produce different sentiment scores") + + def test_concurrent_analysis(self): + """Test handling multiple analyses in quick succession""" + test_texts = [ + "First test text that is positive!", + "Second test text that is negative!", + "Third test text that is neutral." 
+ ] + + results = [] + for text in test_texts: + gauge, dist, df = analyze_text(text) + results.append(gauge.data[0]['value']) + + # Verify all analyses completed + self.assertEqual(len(results), len(test_texts)) + + # Verify results are different + self.assertNotEqual(results[0], results[1]) + +if __name__ == '__main__': + unittest.main(verbosity=2) \ No newline at end of file diff --git a/vader_ui.py b/vader_ui.py new file mode 100644 index 0000000..f93f685 --- /dev/null +++ b/vader_ui.py @@ -0,0 +1,135 @@ +import gradio as gr +import requests +from bs4 import BeautifulSoup +from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer +import plotly.graph_objects as go +import plotly.express as px +import pandas as pd + +def analyze_text(text): + try: + # Initialize VADER + analyzer = SentimentIntensityAnalyzer() + + # Get overall sentiment scores + scores = analyzer.polarity_scores(text) + + # Create sentiment gauge + fig_gauge = go.Figure(go.Indicator( + mode = "gauge+number", + value = scores['compound'], + domain = {'x': [0, 1], 'y': [0, 1]}, + title = {'text': "Sentiment Score"}, + gauge = { + 'axis': {'range': [-1, 1]}, + 'bar': {'color': "darkblue"}, + 'steps': [ + {'range': [-1, -0.05], 'color': "red"}, + {'range': [-0.05, 0.05], 'color': "gray"}, + {'range': [0.05, 1], 'color': "green"} + ], + } + )) + + # Create sentiment distribution bar chart + sentiment_dist = pd.DataFrame({ + 'Sentiment': ['Positive', 'Neutral', 'Negative'], + 'Score': [scores['pos'], scores['neu'], scores['neg']] + }) + + fig_dist = px.bar(sentiment_dist, x='Sentiment', y='Score', + color='Sentiment', + color_discrete_map={ + 'Positive': 'green', + 'Neutral': 'gray', + 'Negative': 'red' + }, + title='Sentiment Distribution') + fig_dist.update_traces(x=['Positive', 'Neutral', 'Negative']) + + # Split text into paragraphs + paragraphs = [p.strip() for p in text.split('\n') if len(p.strip()) > 50] + para_sentiments = [] + + for i, para in enumerate(paragraphs[:5], 1): + 
para_scores = analyzer.polarity_scores(para) + para_sentiments.append({ + 'Paragraph': f'Paragraph {i}', + 'Text': para[:100] + "...", + 'Compound Score': para_scores['compound'] + }) + + # Create paragraph analysis table + df_paragraphs = pd.DataFrame(para_sentiments) if para_sentiments else pd.DataFrame(columns=["Paragraph", "Text", "Compound Score"]) + + return fig_gauge, fig_dist, df_paragraphs + except Exception as e: + return None, None, None + +def analyze_url(url): + try: + # Fetch and parse website content + response = requests.get(url) + response.raise_for_status() + soup = BeautifulSoup(response.text, 'html.parser') + + # Remove script and style elements + for script in soup(["script", "style"]): + script.decompose() + + # Get text content + text = soup.get_text() + lines = (line.strip() for line in text.splitlines()) + chunks = (phrase.strip() for line in lines for phrase in line.split(" ")) + text = ' '.join(chunk for chunk in chunks if chunk) + + return analyze_text(text) + except Exception as e: + return None, None, None + +# Create Gradio interface +with gr.Blocks(title="VADER Sentiment Analysis") as demo: + gr.Markdown("# VADER Sentiment Analyzer") + gr.Markdown("Analyze sentiment from a website URL or direct text input") + + with gr.Tabs(): + with gr.TabItem("URL Analysis"): + url_input = gr.Textbox(label="Website URL", placeholder="https://example.com") + analyze_url_button = gr.Button("Analyze URL") + + with gr.Row(): + url_gauge_plot = gr.Plot(label="Overall Sentiment") + url_dist_plot = gr.Plot(label="Sentiment Distribution") + + url_paragraph_table = gr.Dataframe( + headers=["Paragraph", "Text", "Compound Score"], + label="Paragraph Analysis" + ) + + analyze_url_button.click( + analyze_url, + inputs=[url_input], + outputs=[url_gauge_plot, url_dist_plot, url_paragraph_table] + ) + + with gr.TabItem("Text Analysis"): + text_input = gr.Textbox(label="Text to analyze", placeholder="Enter your text here...", lines=5) + analyze_text_button = 
gr.Button("Analyze Text") + + with gr.Row(): + text_gauge_plot = gr.Plot(label="Overall Sentiment") + text_dist_plot = gr.Plot(label="Sentiment Distribution") + + text_paragraph_table = gr.Dataframe( + headers=["Paragraph", "Text", "Compound Score"], + label="Paragraph Analysis" + ) + + analyze_text_button.click( + analyze_text, + inputs=[text_input], + outputs=[text_gauge_plot, text_dist_plot, text_paragraph_table] + ) + +if __name__ == "__main__": + demo.launch() \ No newline at end of file