ArgFine

The ArguAI framework uses fine-grained argumentation mining models and NLP techniques to analyze comments on online participation portals. It includes Flair-based label (argument type) classifiers trained on CeR, CMV, AM2, and PE datasets, with optional eNRC emotion features.

Environment

Recommended: use the enrc conda environment (from ArgStanceNRC or your own), which provides flair, nrclex, enrc, and related dependencies.
Alternative: from project root, install with Poetry (see src/pyproject.toml):
```
cd src && poetry install
```

Then activate before running scripts:

```
conda activate enrc   # if using conda
cd /path/to/ArguAI/src
```

Datasets

Place or link JSON datasets under ArguAI/datasets/:

| Dataset | File | Text column | Label column |
|---|---|---|---|
| CeR | `df_CeR.json` | `text` | `type` |
| CMV | `dfCMV-v2.json` | `op_EDU` | `Label` |
| AM2 | `dfAM2-v1.json` | `texts` | `types` |
| PE | `dfPE_stance-v1.json` | `EDU` | `semanticType` |

These are used by train_flair.py (and by argfine.py when called from notebooks).
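Before training, it can help to confirm that the four files are actually in place. The following is a minimal sketch using only the standard library; `missing_datasets` is a hypothetical helper, not part of the repository, and the file names are copied from the table above.

```python
from pathlib import Path

# Expected file name per dataset, copied from the table above.
EXPECTED_FILES = {
    "CeR": "df_CeR.json",
    "CMV": "dfCMV-v2.json",
    "AM2": "dfAM2-v1.json",
    "PE": "dfPE_stance-v1.json",
}


def missing_datasets(root):
    """Hypothetical helper: return the dataset names whose JSON file is absent under root."""
    root = Path(root)
    return sorted(
        name for name, filename in EXPECTED_FILES.items()
        if not (root / filename).is_file()
    )


if __name__ == "__main__":
    # Assumes the script is run from ArguAI/src, so the datasets sit one level up.
    absent = missing_datasets("../datasets")
    print("All datasets found." if not absent else f"Missing: {', '.join(absent)}")
```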

Running the scripts

1. Train Flair label classifier (`train_flair.py`)

Trains a Flair TextClassifier for argument-type (label) classification on the datasets above. Optionally adds NRC or eNRC emotion features (8-dim) to the transformer embedding.
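To make the idea of an 8-dimensional emotion feature concrete, here is a toy sketch in plain Python. It is not the code used by train_flair.py: the tiny lexicon, the per-token normalisation, and the choice to concatenate the eight values onto the embedding are all assumptions made for illustration. The only fixed point is the list of eight emotions covered by the NRC Emotion Lexicon.

```python
# The eight basic emotions covered by the NRC Emotion Lexicon.
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
]

# Toy lexicon for illustration only (assumption); the real lexicon is far larger.
TOY_LEXICON = {
    "threat": {"anger", "fear"},
    "hope": {"anticipation", "joy", "trust"},
    "waste": {"disgust"},
}


def emotion_vector(text):
    """Return 8 values: the share of tokens associated with each emotion."""
    tokens = text.lower().split()
    counts = [0] * len(EMOTIONS)
    for token in tokens:
        for emotion in TOY_LEXICON.get(token, ()):
            counts[EMOTIONS.index(emotion)] += 1
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [count / total for count in counts]


def with_emotion_features(embedding, text):
    """Append the 8 emotion values to an existing embedding (assumed strategy)."""
    return list(embedding) + emotion_vector(text)


if __name__ == "__main__":
    # A 2-dim stand-in for a transformer embedding grows to 10 dimensions.
    print(len(with_emotion_features([0.1, 0.2], "hope beats threat")))
```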

Basic usage (no emotion features):

```
cd src
python train_flair.py --dataset CeR --model roberta-base --epochs 10
```

With eNRC features (requires ArgStanceNRC repo and enrc package; uses cache under ArguAI/nrc_cache/):

```
python train_flair.py --dataset CeR --use-enrc --threshold 0.4 --epochs 10
```

All datasets, one model per dataset:

```
python train_flair.py --dataset all --use-enrc --threshold 0.4
```

Single model on all four datasets combined:

```
python train_flair.py --dataset all --combined --use-enrc --output-dir ../resources/taggers/flair_label_all_combined_enrc
```

Main options:

| Option | Description | Default |
|---|---|---|
| `--dataset` | `AM2`, `CeR`, `CMV`, `PE`, or `all` | `AM2` |
| `--model` | HuggingFace transformer name | `roberta-base` |
| `--epochs` | Training epochs | `10` |
| `--lr` | Learning rate | `5e-5` |
| `--batch-size` | Mini-batch size | `4` |
| `--use-nrc` | Use NRC emotion features (nrclex) | off |
| `--use-enrc` | Use expandNRC (eNRC) from ArgStanceNRC | off |
| `--threshold` | eNRC threshold | `0.8` |
| `--output-dir` | Directory for saved model | `../resources/taggers/flair_label_<dataset>_enrc` |
| `--combined` | With `--dataset all`, train one combined model | off |

Trained models are written under ArguAI/resources/taggers/ (or --output-dir), including final-model.pt.
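When running several configurations, it may be convenient to assemble the command lines programmatically. The sketch below builds argument lists from the flags documented above and nothing else; `build_command` is a hypothetical helper, not part of the repository, and it only prints the commands rather than running them.

```python
import sys

DATASETS = ("AM2", "CeR", "CMV", "PE")


def build_command(dataset, use_enrc=False, threshold=None, epochs=10,
                  model="roberta-base"):
    """Hypothetical helper: return the argv list for one train_flair.py run."""
    if dataset not in DATASETS + ("all",):
        raise ValueError(f"unknown dataset: {dataset}")
    cmd = [
        sys.executable, "train_flair.py",
        "--dataset", dataset,
        "--model", model,
        "--epochs", str(epochs),
    ]
    if use_enrc:
        cmd.append("--use-enrc")
        if threshold is not None:
            cmd += ["--threshold", str(threshold)]
    return cmd


if __name__ == "__main__":
    # Print one command per dataset; to execute, pass each list to
    # subprocess.run(cmd, check=True) from inside the src directory.
    for name in DATASETS:
        print(" ".join(build_command(name, use_enrc=True, threshold=0.4)))
```

Building the command as a list rather than a single string avoids shell quoting issues if a model name or path contains spaces.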

2. ArguFine utilities (`argfine.py`)

argfine.py is a library used from notebooks (e.g. Demo_argfine_public-participants.ipynb). It provides:

- `visualize_column_counts`, `modify_flat`, `ready_for_flair`, `split_df` for data prep
- `train_argfine(train_s, dev_s, test_s, model_name, learning_rate, mini_batch_size, max_epoch, label_type, base_path)` to train a Flair text classifier

Example pattern from a notebook or script:

```python
import pandas as pd

from argfine import modify_flat, ready_for_flair, split_df, train_argfine

# Load data, filter to valid labels (e.g. fact/value/policy), flatten
df = pd.read_json("../datasets/df_CeR.json")
valid = {"fact", "value", "policy"}
df_flat = modify_flat(df, valid, text_col="text", type_col="type")
df_ready = ready_for_flair(df_flat, valid, "text", "type")
train_s, dev_s, test_s = split_df(df_ready, "text", "type")

# Fine-tune a Flair text classifier and save it under base_path
train_argfine(
    train_s, dev_s, test_s,
    model_name="roberta-base",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epoch=10,
    label_type="label",
    base_path="../resources/taggers/argfine_CeR",
)
```
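For readers unfamiliar with Flair's input conventions: Flair can read classification data in the FastText line format, `__label__<label> <text>`, one example per line. Whether `ready_for_flair` produces exactly this format is not stated here, so the following is only a sketch of the kind of filtering and formatting such a step might perform. `to_fasttext_lines` is a hypothetical name and not part of argfine.py.

```python
def to_fasttext_lines(rows, valid_labels):
    """Format (text, label) pairs as FastText-style lines, skipping other labels.

    Hypothetical helper for illustration; not part of argfine.py.
    """
    lines = []
    for text, label in rows:
        if label not in valid_labels:
            continue  # drop rows whose label is outside the valid set
        clean = " ".join(str(text).split())  # collapse newlines and repeated spaces
        lines.append(f"__label__{label} {clean}")
    return lines


if __name__ == "__main__":
    rows = [
        ("We should  build\nmore parks.", "policy"),
        ("n/a", "other"),
    ]
    for line in to_fasttext_lines(rows, {"fact", "value", "policy"}):
        print(line)
```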

Run the demo notebook for a full pipeline:

```
cd src
jupyter notebook Demo_argfine_public-participants.ipynb
```

Project layout

```
ArguAI/
├── README.md
├── datasets/          # df_CeR.json, dfCMV-v2.json, dfAM2-v1.json, dfPE_stance-v1.json
├── nrc_cache/         # Cached NRC/eNRC features (created by train_flair.py)
├── resources/taggers/ # Saved Flair models (final-model.pt, etc.)
└── src/
    ├── argfine.py         # Data prep + train_argfine()
    ├── train_flair.py     # CLI for label classification (with optional eNRC)
    ├── pyproject.toml
    └── Demo_argfine_public-participants.ipynb
```
