`hate_measure`: Neural Networks for Measuring Hate Speech

hate_measure is a package for training and serving continuous hate-speech scorers built on top of encoder models.

The package uses the Hugging Face dataset ucberkeley-dlab/measuring-hate-speech and follows the measurement-based framing described in Kennedy et al. (2019), Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application and Sachdeva et al. (2022), The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism.

The repository centers on a custom regression head over ModernBERT and is aligned with the published Hugging Face checkpoint ucberkeley-dlab/mhs-scorer-modernbert-large.

Overview

This repository currently provides:

A dataset loader for the Measuring Hate Speech dataset on Hugging Face
A custom transformers-compatible regression model and config
A training entry point built on Hugging Face Trainer
Hub-facing model/config files for publishing custom code models

At a high level, the model predicts a continuous hate speech score rather than a binary hate/not-hate label.

Measuring Hate Speech: Background

The Measuring Hate Speech project frames hate speech as a measurement problem rather than a simple hard classification task. The Measuring Hate Speech corpus consists of 50,070 unique social media comments from YouTube, Reddit, and Twitter, labeled by 11,143 annotators, with labels aggregated using faceted Rasch measurement theory to derive a continuous hate_speech_score.

The dataset includes:

Text for each comment
A continuous hate_speech_score
Construct-level labels such as sentiment, respect, insult, humiliation, dehumanization, violence, genocide, and related dimensions
Target identity annotations
Annotator demographic and perspective-related metadata

Installation

Python 3.11+ is required.

With `uv`

uv sync

Optional Weights & Biases support:

uv sync --extra wandb

With pip

python -m venv .venv
source .venv/bin/activate
pip install -e .

Optional Weights & Biases support:

pip install -e ".[wandb]"

Training

The main training entry point is:

python scripts/train.py --output_dir outputs/mhs-modernbert-large

More options can be specified:

uv run python scripts/train.py \
  --output_dir outputs/mhs-modernbert-large \
  --base_model AnswerDotAI/ModernBERT-large \
  --num_epochs 3 \
  --batch_size 32 \
  --learning_rate 2e-5 \
  --warmup_ratio 0.1 \
  --weight_decay 0.01 \
  --max_length 512 \
  --report_to none

Notes

The script defaults to --report_to wandb, so pass --report_to none unless you installed the optional wandb extra.
Mixed precision is enabled by default with --bf16; use --no-bf16 if your hardware does not support it.
The script supports --push_to_hub and --hub_model_id for publishing trained checkpoints.

Using the Published Hugging Face Model

The published model card for ucberkeley-dlab/mhs-scorer-modernbert-large shows how to load the hosted checkpoint with remote custom code.

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "ucberkeley-dlab/mhs-scorer-modernbert-large"
tokenizer = AutoTokenizer.from_pretrained("AnswerDotAI/ModernBERT-large")
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

text = "your text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

score = outputs.logits.squeeze(-1).item()
print(score)

Treat the output as a continuous relative score, not as a calibrated probability or a substitute for a full moderation policy.

Repository Layout

.
├── src/hate_measure/
│   ├── data.py               # Hugging Face dataset loading/tokenization
│   ├── train.py              # Training function
│   ├── constants.py          # Construct/target/annotator column groups
│   └── nn/
│       ├── config.py         # Custom HF config
│       └── scorer.py         # Custom regression model
├── scripts/
│   └── train.py              # CLI entry point
├── hub/
│   ├── config.py             # Hub remote-code config
│   └── scorer.py             # Hub remote-code model
├── notebooks/                # Exploratory data notebooks
└── data/                     # Local research data artifacts

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
hub		hub
scripts		scripts
src/hate_measure		src/hate_measure
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`hate_measure`: Neural Networks for Measuring Hate Speech

Overview

Measuring Hate Speech: Background

Installation

With `uv`

Training

Using the Published Hugging Face Model

Repository Layout

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

hate_measure: Neural Networks for Measuring Hate Speech

Overview

Measuring Hate Speech: Background

Installation

With uv

Training

Using the Published Hugging Face Model

Repository Layout

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`hate_measure`: Neural Networks for Measuring Hate Speech

With `uv`

Packages