Add MiniMax M3 (post-v1.1, rotated v1.3 council) by VibeCodingScientist · Pull Request #7 · AppliedScientific/refusalbench

VibeCodingScientist · 2026-06-03T15:40:50Z

Summary

Adds MiniMax M3 as a second post-v1.1-frozen addition (after Claude Opus 4.8). Same rotated v1.3 council — no further judge changes since 2026-05-29.

results/snapshots/2026-05-minimax3/eval/minimax_m3.csv — 705 raw responses (clean; 8 retry-eligible API errors filtered)
results/snapshots/2026-05/council/adjudicated.csv — +705 M3 rows (14,094 → 14,799); frozen rows untouched
results/should_refuse/should_refuse_sweep_public.csv — +75 M3 PC rows (1,500 → 1,575); 21 distinct models
benchmark/config/sweep_models.json — registers M3
README.md "Model updates" table — adds M3 row + brief comparison
CHANGELOG.md — new entry

Result

| Model | Benign | Borderline | Dual-use | Overall | PC TPR | Youden's J |
|---|---|---|---|---|---|---|
| MiniMax M3 | 21.3 % | 16.6 % | 29.4 % | 22.4 % | 80 % (gap zone) | +0.59 |
| MiniMax M2.7 (frozen, ref) | 6 % | 6 % | 14 % | 9 % | 72 % (Tier B) | +0.66 |

M3 refuses more dangerous prompts (TPR 72 % → 80 %, moving out of Tier B into the gap zone) and more dual-use prompts (14 % → 29 %), but benign over-refusal more than tripled (6 % → 21 %). Net: Youden's J slips slightly (+0.66 → +0.59), because the dangerous-side gain (+8 points of TPR) did not outpace the benign-side drift (about +15 points of over-refusal).
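The Youden's J figures above are consistent with the standard definition J = TPR − FPR, if the PC TPR is taken as the true-positive rate and the benign refusal rate as the false-positive rate. That mapping is an assumption inferred from the reported numbers, not something the PR states; the sketch below only reproduces the arithmetic.

```python
def youden_j(tpr: float, fpr: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1, i.e. TPR - FPR."""
    return tpr - fpr


# Rates copied from the Result table, expressed as fractions.
# Assumption: "benign" refusal rate plays the role of the false-positive rate.
models = {
    "MiniMax M3": {"pc_tpr": 0.80, "benign_refusal": 0.213},
    "MiniMax M2.7": {"pc_tpr": 0.72, "benign_refusal": 0.06},
}

j_scores = {
    name: round(youden_j(rates["pc_tpr"], rates["benign_refusal"]), 2)
    for name, rates in models.items()
}

for name, j in j_scores.items():
    print(f"{name}: J = {j:+.2f}")
# prints:
# MiniMax M3: J = +0.59
# MiniMax M2.7: J = +0.66
```

Under this reading, the 8-point TPR gain is more than offset by the roughly 15-point rise in benign refusals, which is why J falls.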

Test plan

adjudicated.csv = 14,799 rows; M3 = 705; v1.1-frozen 13,389 unchanged
should_refuse_sweep_public.csv = 1,575 rows; M3 = 75; 21 distinct models
Eval CSV = 705 responses, no errors
HF Space + Dataset already updated to match
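The row-count checks in the test plan could be scripted along these lines. This is a minimal sketch, not the repository's actual tooling: the `model` column name and the helper names `rows_per_model` and `check_counts` are hypothetical, since the PR does not show the CSV headers.

```python
import csv
from collections import Counter

# Totals quoted in the PR description; the increments should reconcile.
assert 14_094 + 705 == 14_799   # adjudicated.csv before M3 + M3 rows
assert 14_094 - 13_389 == 705   # rows added after the v1.1 freeze, before M3
assert 1_500 + 75 == 1_575      # should_refuse sweep before M3 + M3 PC rows
assert 21 * 75 == 1_575         # consistent with 75 PC rows per model (an inference)


def rows_per_model(csv_path, model_column="model"):
    """Count data rows per model label. `model_column` is an assumed header name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return Counter(row[model_column] for row in csv.DictReader(handle))


def check_counts(counts, expected_total, model_label, expected_model_rows,
                 expected_models=None):
    """Return a list of mismatch descriptions; an empty list means all checks pass."""
    problems = []
    total = sum(counts.values())
    if total != expected_total:
        problems.append(f"total rows: expected {expected_total}, found {total}")
    found = counts.get(model_label, 0)
    if found != expected_model_rows:
        problems.append(
            f"{model_label} rows: expected {expected_model_rows}, found {found}"
        )
    if expected_models is not None and len(counts) != expected_models:
        problems.append(
            f"distinct models: expected {expected_models}, found {len(counts)}"
        )
    return problems
```

A call such as `check_counts(rows_per_model("results/should_refuse/should_refuse_sweep_public.csv"), 1_575, "minimax_m3", 75, expected_models=21)` would then cover the second test-plan item, where `"minimax_m3"` stands in for whatever label the file actually uses.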

Co-authored with Claude Code.

Summary by CodeRabbit

Documentation
- Added MiniMax M3 model information with compliance and discrimination metrics to documentation
- Updated model version comparisons and clarified benchmark panel composition and agreement metrics
Chores
- Configured MiniMax M3 in the benchmark suite with routing information, organization metadata, and pricing details

Appends MiniMax M3 to the committed data as a second post-frozen addition. The v1.1-frozen 13,389 rows are left unchanged.

- snapshots/2026-05-minimax3/eval/minimax_m3.csv: 705 raw responses (clean)
- snapshots/2026-05/council/adjudicated.csv: +705 M3 rows (14,094 to 14,799)
- should_refuse_sweep_public.csv: +75 M3 PC rows (1,500 to 1,575)
- sweep_models.json: registers M3
- README "Model updates" table: M3 row + brief comparison vs M2.7
- CHANGELOG entry

MiniMax M3: PC gap zone (TPR 80%, between B-cap 73% and A-floor 95%), benign 21%, dual-use 29%, Youden's J +0.59. Refuses more dangerous prompts than M2.7 (TPR 72% to 80%) but benign over-refusal tripled (6% to 21%), so J slips +0.66 to +0.59.

Adjudicated under the same rotated v1.3 council as Opus 4.8; no further judge rotation since 2026-05-29.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
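The "gap zone" wording in the commit message can be read as a simple threshold rule on PC TPR. The sketch below assumes that reading. The two thresholds come from the commit message, but whether each boundary is inclusive, and what lies below Tier B, are assumptions, and `pc_zone` is a hypothetical helper name.

```python
B_CAP = 73.0    # "B-cap 73%" from the commit message
A_FLOOR = 95.0  # "A-floor 95%" from the commit message


def pc_zone(tpr_percent: float) -> str:
    """Map a PC TPR (in percent) to a zone label.

    Assumptions: TPR >= A_FLOOR counts as Tier A, TPR <= B_CAP counts as
    Tier B or lower, and anything strictly in between is the gap zone.
    """
    if tpr_percent >= A_FLOOR:
        return "Tier A"
    if tpr_percent > B_CAP:
        return "gap zone"
    return "Tier B or below"


# TPR values quoted in this PR and its README table.
for name, tpr in [("Claude Opus 4.8", 100.0), ("MiniMax M3", 80.0), ("MiniMax M2.7", 72.0)]:
    print(f"{name}: TPR {tpr:.0f} % -> {pc_zone(tpr)}")
# prints:
# Claude Opus 4.8: TPR 100 % -> Tier A
# MiniMax M3: TPR 80 % -> gap zone
# MiniMax M2.7: TPR 72 % -> Tier B or below
```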

coderabbitai · 2026-06-03T15:41:11Z

📝 Walkthrough

This PR adds MiniMax M3 model support to the RefusalBench suite. The model is registered in the benchmark configuration, documented in the changelog, and added to the readme with comparative metrics against M2.7 and rotated council footnote updates.

Changes

MiniMax M3 Model Addition

| Layer / File(s) | Summary |
|---|---|
| Model configuration `benchmark/config/sweep_models.json` | Added `minimax/minimax-m3` entry with OpenRouter provider routing, jurisdiction/organization metadata, `v1.3_addition` role designation, and USD-per-MTok pricing (input and output). |
| Release documentation `CHANGELOG.md`, `README.md` | Introduced new Unreleased section (2026-06-03) documenting M3 addition to sweep and PC gap zone metrics; added M3 row to readme model updates table with release/test dates and compliance metrics; expanded explanatory text with M3 vs. M2.7 comparison bullets and updated rotated v1.3 council footnote with judge replacement access details and revised inter-judge agreement percentages. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes


🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding MiniMax M3 as a new model entry with the rotated v1.3 council designation, which is the primary objective of the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmark/config/sweep_models.json`:
- Line 177: The routing_note for minimax/minimax-m3 currently says "Replaces
M2.7 in the panel" which contradicts the presence of
minimax/minimax-m2.7-20260318 still marked role: "primary"; update the
routing_note text to reflect that M3 is a post-v1.1 addition compared against a
frozen M2.7 (or explicitly mark M2.7 deprecated) so the note and the model
entries are consistent—modify the "routing_note" string for minimax/minimax-m3
to read something like "post-v1.1 addition compared against frozen M2.7" or
change the minimax/minimax-m2.7-20260318 entry to indicate deprecation if
replacement is intended.

In `@README.md`:
- Line 29: The snapshot description for "v1.1-frozen panel" currently lists "18
frontier models + Llama 3.3 70B control + NVIDIA Nemotron 3 Super 120B" which
sums to 20 but the surrounding text frames the panel as 19 models; reconcile
this by either changing "18 frontier models" to "17 frontier models" or by
removing/adjusting one of the listed components so the total equals 19, and
update the phrase "18 frontier models + Llama 3.3 70B control + NVIDIA Nemotron
3 Super 120B" accordingly to match the canonical 19-model count.
- Line 27: Update the table row for "MiniMax M3 *" to use the concrete release
date used in config documents: replace the string "early Jun 2026" with
"2026-05-31" so the README's table entry for MiniMax M3 (the row containing
"MiniMax M3 * | MiniMax | early Jun 2026 | ...") matches the configured
"Released 2026-05-31" date across docs.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 66ae8ca7-e6a0-422d-8532-8fb33d417137

📥 Commits

Reviewing files that changed from the base of the PR and between 13f9637 and 634616f.

⛔ Files ignored due to path filters (3)

results/should_refuse/should_refuse_sweep_public.csv is excluded by !**/*.csv
results/snapshots/2026-05-minimax3/eval/minimax_m3.csv is excluded by !**/*.csv
results/snapshots/2026-05/council/adjudicated.csv is excluded by !**/*.csv

📒 Files selected for processing (3)

CHANGELOG.md
README.md
benchmark/config/sweep_models.json

coderabbitai · 2026-06-03T15:44:22Z

+      "jurisdiction": "asia",
+      "organization": "minimax",
+      "role": "v1.3_addition",
+      "routing_note": "Released 2026-05-31. OpenRouter ID: minimax/minimax-m3. Multimodal (text/image/video input), 1M context. Replaces M2.7 in the panel.",


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify the M3 routing note to avoid contradiction with active M2.7 entry.

routing_note says M3 “Replaces M2.7 in the panel,” but minimax/minimax-m2.7-20260318 is still present as role: "primary" (Lines 180-188). Please reword to “post-v1.1 addition compared against frozen M2.7” (or explicitly mark M2.7 deprecated if replacement is intended).
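A contradiction of this kind could be caught mechanically. The sketch below is illustrative only: the real layout of `sweep_models.json` is not shown in the PR, so the flat list of entries and the `id` key are assumptions, while `role` and `routing_note` are the field names visible in the diff. It flags any entry whose note claims to replace a model that is still listed with `role: "primary"`.

```python
import re

# Matches phrases like "Replaces M2.7 in the panel" and captures the model tag.
REPLACES_PATTERN = re.compile(r"Replaces\s+(\S+)\s+in the panel", re.IGNORECASE)


def find_contradictory_notes(entries):
    """Return (replacing_id, still_primary_id) pairs.

    `entries` is a list of dicts with an assumed "id" key plus the
    "role" and "routing_note" fields seen in the diff.
    """
    findings = []
    for entry in entries:
        match = REPLACES_PATTERN.search(entry.get("routing_note", ""))
        if not match:
            continue
        replaced_tag = match.group(1).lower()
        for other in entries:
            if other is entry:
                continue
            # Naive substring match between the tag in the note and the model id.
            if replaced_tag in other["id"].lower() and other.get("role") == "primary":
                findings.append((entry["id"], other["id"]))
    return findings


# Entries reconstructed from the review comment; only the fields needed here.
entries = [
    {
        "id": "minimax/minimax-m3",
        "role": "v1.3_addition",
        "routing_note": "Released 2026-05-31. OpenRouter ID: minimax/minimax-m3. "
                        "Multimodal (text/image/video input), 1M context. "
                        "Replaces M2.7 in the panel.",
    },
    {"id": "minimax/minimax-m2.7-20260318", "role": "primary", "routing_note": ""},
]

findings = find_contradictory_notes(entries)
print(findings)
# prints: [('minimax/minimax-m3', 'minimax/minimax-m2.7-20260318')]
```

The substring match is deliberately crude; a real check would probably need an explicit mapping from short names such as "M2.7" to model IDs.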


coderabbitai · 2026-06-03T15:44:22Z

 | Model | Provider | Released | Tested | Council | Snapshot | Headline |
 |---|---|---|---|---|---|---|
 | **Claude Opus 4.8** \* | Anthropic | [2026-05-28](https://www.anthropic.com/news/claude-opus-4-8) | 2026-05-29 | **v1.3** (rotated) | post-v1.1 | PC Tier A (TPR 100 %); benign 57 %, dual-use 100 %, Youden's J **+0.43** |
+| **MiniMax M3** \* | MiniMax | early Jun 2026 | 2026-06-03 | **v1.3** (rotated) | post-v1.1 | PC gap zone (TPR 80 %, between B-cap 73 % and A-floor 95 %); benign 21 %, dual-use 29 %, Youden's J **+0.59** |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a concrete M3 release date for consistency across docs.

This row says “early Jun 2026,” while config documents Released 2026-05-31. Using the exact date in both places avoids timeline ambiguity.


coderabbitai · 2026-06-03T15:44:22Z

+| **MiniMax M3** \* | MiniMax | early Jun 2026 | 2026-06-03 | **v1.3** (rotated) | post-v1.1 | PC gap zone (TPR 80 %, between B-cap 73 % and A-floor 95 %); benign 21 %, dual-use 29 %, Youden's J **+0.59** |

-The v1.1-frozen panel (18 frontier models + Llama 3.3 70B control + NVIDIA Nemotron 3 Super 120B, all under the v1.1 council) remains the canonical snapshot referenced in the manuscript. Opus 4.8 walks back Opus 4.7's benign over-refusal (77 % → 57 %), recovering discrimination (Youden's J +0.23 → +0.43) while holding dual-use refusal at 100 %.
+The v1.1-frozen panel (18 frontier models + Llama 3.3 70B control + NVIDIA Nemotron 3 Super 120B, all under the v1.1 council) remains the canonical snapshot referenced in the manuscript.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix model-count arithmetic in snapshot description.

“18 frontier + Llama control + Nemotron” totals 20, which conflicts with the surrounding 19-model framing. Please correct either the count or the listed components.


coderabbitai Bot reviewed Jun 3, 2026

VibeCodingScientist merged commit 7968472 into main Jun 3, 2026
4 checks passed

VibeCodingScientist deleted the add-minimax-m3 branch June 3, 2026 15:46

coderabbitai Bot mentioned this pull request Jun 6, 2026

Add Nemotron 3 Ultra 550B (post-v1.1, rotated v1.3 council) #8

Merged

4 tasks

VibeCodingScientist merged 1 commit into main from add-minimax-m3
