Commit cc77e5e

feat: implement categorical comparison analysis on real-world data
1 parent f52550b commit cc77e5e

1 file changed

+148 -14 lines changed

examples/README.md

Lines changed: 148 additions & 14 deletions
@@ -81,27 +81,49 @@ uv run python advanced_training.py
 - Training parameter tuning
 - Model performance comparison
 
-### 5. [MLflow Integration](mlflow_integration.py)
-Demonstrates MLflow integration for experiment tracking and model management:
-- Training and logging models to MLflow
-- Experiment tracking with metrics and parameters
-- Model versioning and registry
-- Loading models from MLflow for inference
-- Model artifact management
+### 5. [Categorical Comparison](categorical_comparison.py)
+Compares model performance with and without categorical features:
+- Loading real-world data (Sirene dataset)
+- Feature engineering and preprocessing
+- Model comparison with statistical analysis
+- Performance evaluation and visualization
 
 **Run the example:**
 ```bash
 cd examples
-pip install mlflow # Install MLflow first
-uv run python mlflow_integration.py
+uv run python categorical_comparison.py
 ```
 
 **What you'll learn:**
-- MLflow experiment tracking setup
-- Model logging and versioning
-- Loading models for inference
-- Model registry management
-- Reproducible ML workflows
+- Real-world data handling
+- Feature impact analysis
+- Statistical model comparison
+- Data preprocessing techniques
+
+### 6. [Simple Explainability](simple_explainability_example.py)
+Demonstrates model explainability with ASCII histogram visualizations:
+- Training a FastText classifier with enhanced data
+- Word-level contribution analysis
+- ASCII histogram visualization in terminal
+- Interactive mode for custom text analysis
+- Real-time prediction explanations
+
+**Run the example:**
+```bash
+cd examples
+# Regular mode - analyze predefined examples
+uv run python simple_explainability_example.py
+
+# Interactive mode - analyze your own text
+uv run python simple_explainability_example.py --interactive
+```
+
+**What you'll learn:**
+- Model explainability and interpretation
+- Word importance analysis
+- Interactive prediction tools
+- ASCII-based data visualization
+- Real-time model analysis
 
 ## 🚀 Quick Start

@@ -191,6 +213,118 @@ Class distribution: Negative=5, Neutral=5, Positive=5
 Final Accuracy: 3/6 = 0.500
 ```
 
+### Simple Explainability
+```
+🔍 Simple Explainability Example
+
+🔍 Testing explainability on 5 examples:
+============================================================
+
+📝 Example 1:
+Text: 'This product is amazing!'
+Prediction: Positive
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+This | ██████████████████████████████ 0.3549
+product | █████████████ 0.1651
+is | ████████████████████████ 0.2844
+amazing! | ████████████████ 0.1956
+--------------------------------------------------
+✅ Analysis completed for example 1
+
+📝 Example 2:
+Text: 'Poor quality and terrible service'
+Prediction: Negative
+⚠️ Explainability failed:
+✅ Analysis completed for example 2
+
+📝 Example 3:
+Text: 'Great value for money'
+Prediction: Positive
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+Great | ██████████████████████████████ 0.3287
+value | ████████████████████ 0.2220
+for | ██████████████████████████ 0.2929
+money | ██████████████ 0.1564
+--------------------------------------------------
+✅ Analysis completed for example 3
+
+📝 Example 4:
+Text: 'Completely disappointing and awful experience'
+Prediction: Negative
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+Completely | ██████████ 0.1673
+disappointing | ██████████████████████████████ 0.4676
+and | █████ 0.0910
+awful | ███████ 0.1225
+experience | █████████ 0.1516
+--------------------------------------------------
+✅ Analysis completed for example 4
+
+📝 Example 5:
+Text: 'Love this excellent design'
+Prediction: Positive
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+Love | ██████████████████ 0.2330
+this | ████████████████████ 0.2525
+excellent | ██████████████████████████████ 0.3698
+design | ███████████ 0.1447
+--------------------------------------------------
+✅ Analysis completed for example 5
+
+🎉 Explainability analysis completed for 5 examples!
+
+💡 Tip: Use --interactive flag to enter interactive mode for custom text analysis!
+Example: uv run python examples/simple_explainability_example.py --interactive
+```
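
The bar lengths in the sample output above are consistent with scaling each score against the largest score in the sentence and truncating the result to a bar of at most 30 characters. The following is a minimal sketch of that rendering. The function name `render_histogram`, the `width=30` default, and the truncation rule are assumptions inferred from the printed output, not taken from `simple_explainability_example.py`.

```python
def render_histogram(words, scores, width=30):
    """Return one 'word | bar score' line per word.

    Assumption: bars are scaled relative to the largest score and
    truncated (not rounded) to an integer number of block characters.
    """
    top = max(scores) if scores else 0.0
    lines = []
    for word, score in zip(words, scores):
        # Guard against division by zero when every score is 0.
        length = int(score / top * width) if top > 0 else 0
        lines.append(f"{word} | {'█' * length} {score:.4f}")
    return lines


if __name__ == "__main__":
    # Scores copied from Example 4 in the sample output.
    words = ["Completely", "disappointing", "and", "awful", "experience"]
    scores = [0.1673, 0.4676, 0.0910, 0.1225, 0.1516]
    for line in render_histogram(words, scores):
        print(line)
```

Scaling against the per-sentence maximum means the top word always gets a full-width bar, so bars are comparable within one sentence but not across sentences.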
+
+### Interactive Explainability Mode
+```
+============================================================
+🎯 Interactive Explainability Mode
+============================================================
+Enter your own text to see predictions and explanations!
+Type 'quit' or 'exit' to end the session.
+
+💬 Enter text: Amazing product quality!
+
+🔍 Analyzing: 'Amazing product quality!'
+🎯 Prediction: Positive
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+Amazing | ██████████████████████████████ 0.5429
+product | ██████████████ 0.2685
+quality! | ██████████ 0.1886
+--------------------------------------------------
+💡 Most influential word: 'Amazing' (score: 0.5429)
+
+--------------------------------------------------
+💬 Enter text: Terrible customer support
+
+🔍 Analyzing: 'Terrible customer support'
+🎯 Prediction: Negative
+
+📊 Word Contribution Histogram:
+--------------------------------------------------
+Terrible | ██████████████████████████████ 0.5238
+customer | ███████████ 0.1988
+support | ███████████████ 0.2774
+--------------------------------------------------
+💡 Most influential word: 'Terrible' (score: 0.5238)
+
+--------------------------------------------------
+💬 Enter text: quit
+👋 Thanks for using the explainability tool!
+```
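
The session above suggests a simple read, explain, report loop that stops on `quit` or `exit` and names the highest-scoring word. A minimal sketch of that control flow follows. The `explain` callback, `interactive_loop`, `most_influential`, and `parse_args` are hypothetical names introduced for illustration; only the `--interactive` flag and the printed messages come from the output shown in this diff.

```python
import argparse


def most_influential(words, scores):
    """Return the (word, score) pair with the highest contribution."""
    return max(zip(words, scores), key=lambda pair: pair[1])


def interactive_loop(explain, read=input, write=print):
    """Prompt for text until the user types 'quit' or 'exit'.

    `explain` is a hypothetical callback: text -> (label, words, scores).
    `read` and `write` are injectable so the loop can be tested without a terminal.
    """
    while True:
        text = read("Enter text: ").strip()
        if text.lower() in {"quit", "exit"}:
            write("Thanks for using the explainability tool!")
            break
        if not text:
            continue  # ignore empty input and prompt again
        label, words, scores = explain(text)
        write(f"Prediction: {label}")
        word, score = most_influential(words, scores)
        write(f"Most influential word: '{word}' (score: {score:.4f})")


def parse_args(argv=None):
    """Parse the --interactive flag; all other options are omitted here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--interactive", action="store_true")
    return parser.parse_args(argv)
```

A script could call `parse_args()` at startup and run `interactive_loop(...)` only when `args.interactive` is set, falling back to the predefined examples otherwise.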
+
 ## 🛠️ Customizing Examples
 
 ### Modify Data
