What's Changed
⚠️ Breaking Changes
- Attacks and executors now operate on Message instead of SeedPromptGroup
- Scorer evaluation and registry refactors introduce new protocols and identifiers
- Scenario names and configuration APIs have been renamed for consistency
- PrependedConversationConfig and attack parameter handling have been aligned
- Message normalization and registry metadata were refactored
Please review the deprecation notes and migration guidance before upgrading.
🎯 Targets
- Added WebSocketCopilotTarget, enabling WebSocket-based prompt execution against Microsoft Copilot
- Refactored ImageTarget, including image download support
- Added image edit/remix support to OpenAIImageTarget
- Introduced target identifiers (including underlying model and version metadata) across all target classes
- Added audio and tool support to chat completions
📚 Datasets
- Added VLSU Multimodal Dataset
- Added 30 jailbreak attack templates, spanning:
  - Authority & institutional framing (6)
  - Philosophical / decision-theory exploits (5)
  - Identity / persona attacks (4)
  - Context manipulation (4)
  - Few-shot priming (3)
  - Fictional / narrative framing (3)
  - Technical exploits (3)
  - Emotional / scenario-based attacks (2)
- Restored the Transphobia Awareness Dataset
🔄 Converters
- Added NegationTrapConverter, which frames requests as negations
- Added ConverterIdentifier and standardized identifiable behavior
- Reorganized and expanded converter documentation
- Fixed edge cases in word-selection converters and perturbation loops
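The perturbation-loop fix (#1269) is about bounding a retry loop with max_iterations. The sketch below shows one plausible shape for such a loop; it is an illustration only, not PyRIT's CharSwapConverter code. The function name swap_adjacent_chars and its parameters are hypothetical.

```python
import random
from typing import Optional


def swap_adjacent_chars(
    word: str,
    max_iterations: int = 10,
    rng: Optional[random.Random] = None,
) -> str:
    """Swap one pair of adjacent characters, trying at most max_iterations times.

    Hypothetical helper for illustration; not part of PyRIT.
    """
    rng = rng or random.Random()
    if len(word) < 2:
        return word  # nothing to swap

    for _ in range(max_iterations):
        i = rng.randrange(len(word) - 1)
        if word[i] != word[i + 1]:
            chars = list(word)
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            return "".join(chars)

    # Every attempt picked identical neighbours (e.g. "aaaa"), so give up
    # and return the word unchanged instead of looping forever.
    return word
```

Without the iteration cap, an input such as "aaaa" could never produce a visible change, which is the kind of edge case a bounded loop guards against.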
⚙️ Executors & Attacks
- Aligned attack parameters across executors
- Updated attack interface to use Message
- Added ChunkedRequestAttack, which extracts data by requesting it in small chunks
- Added support for simulated conversations in attacks
- Improved attack reliability, error reporting, and maintainability
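The idea of requesting data in small chunks comes down to simple range arithmetic. The sketch below shows only that general idea. chunk_ranges is a hypothetical helper, not part of ChunkedRequestAttack, whose actual interface is not described in these notes.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_length: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (start, end) ranges that cover total_length items.

    Hypothetical helper for illustration; not part of PyRIT.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(1, total_length + 1, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield start, min(start + chunk_size - 1, total_length)


print(list(chunk_ranges(10, 4)))  # [(1, 4), (5, 8), (9, 10)]
```

Each range would correspond to one small request, and the responses would be concatenated in order afterwards.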
📊 Scoring
- Enabled multi-modal scoring support for SelfAskTrueFalseScorer, allowing image- and multimodal-aware evaluations
- Refactored scorer evaluation flow and registry integration
- Added scorer identifiers and improved metadata consistency
- Introduced stricter typing and clearer scorer interfaces
🧪 Scanners & Scenarios
- Added new scenarios:
  - Scams
  - Leakage
  - Psychosocial
- Added ScenarioDatasetConfiguration allowing custom dataset configuration
- Enabled baseline-only execution for scenarios
- Renamed scenarios for clarity and consistency
- Improved scenario documentation and example notebooks
🧰 Setup & Tooling
- Added UV support for dependency management
- Improved devcontainer experience:
  - ARM64 / Apple Silicon support
  - Simplified virtual environment handling
  - Environment file configurability
- Consolidated linting under ruff
- Enabled strict mypy checking across the repository
- Added skeleton frontend and backend for the GUI
🧩 Other
- Added new binary_path data type to support binary artifacts and richer schema definitions
- Added identifiers across targets, scorers, and converters
- Multiple reliability and integration test improvements
🐛 Fixes & Maintenance
- Numerous fixes across:
  - Image handling and integration tests
  - Docker and devcontainer setup
  - Environment activation and permissions
  - Retry configuration and pipelines
- Improved type hinting across authentication and analytics modules
- Added py.typed for better downstream type checking
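A py.typed marker file (PEP 561) tells type checkers such as mypy that a package ships inline type information. As a minimal sketch, the standard-library-only helper below checks whether an installed package includes the marker. ships_py_typed is a hypothetical name, and the import name pyrit in the usage line is an assumption.

```python
import importlib.util
from pathlib import Path


def ships_py_typed(package_name: str) -> bool:
    """Return True if the installed top-level package contains a py.typed marker."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a single-file module rather than a package.
        return False
    return any(
        (Path(location) / "py.typed").is_file()
        for location in spec.submodule_search_locations
    )


if __name__ == "__main__":
    # Assumed import name; adjust to the package you want to inspect.
    print(ships_py_typed("pyrit"))
```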
🆕 New Contributors
A big thank you to our new contributors! 🎉
- @Arth-Singh made their first contribution in #1254
- @ytc338 made their first contribution in #1300
- @fitzpr made their first contribution in #1261
- @fukusuket made their first contribution in #1305
- @varunj-msft made their first contribution in #1284
- @slister1001 made their first contribution in #1321
Full List of Changes
- FEAT Integration Request: Jailbreak Template Collection for Enhanced Red Teaming. by @Arth-Singh in #1254
- MAINT: Edge Case with Word Selection Converters by @rlundeen2 in #1257
- MAINT: Fixing Retry configuration so it works from .env by @rlundeen2 in #1256
- MAINT add missing API reference entries, add unit tests for API reference, and move fuzzer to executor.promptgen.fuzzer module by @romanlutz in #1258
- MAINT: fix docstrings for /prompt_target by @paulinek13 in #1263
- FIX add transphobia awareness dataset back by @romanlutz in #1264
- FEAT add UV support by @hannahwestra25 in #1226
- TEST: integration test fixes by @rlundeen2 in #1265
- MAINT Breaking: Modifying attack params by @rlundeen2 in #1260
- FEAT: Refactor and Enhance Scorer Identifier for Evaluations by @jsong468 in #1262
- FIX add OPENAI_CHAT_MODEL as required in docs, initializers by @romanlutz in #1267
- FIX: Add ARM64/Apple Silicon support for devcontainer builds by @riyosha in #1251
- FIX: use max_iterations in CharSwapConverter perturbation loop by @KutalVolkan in #1269
- FIX activate env by @hannahwestra25 in #1274
- FIX make bash default and remove volume mount for venv in devcontainer by @romanlutz in #1277
- MAINT add py.typed to help with mypy type checking for consuming packages by @romanlutz in #1271
- MAINT CONTROVERSIAL: Make env files configurable by @rlundeen2 in #1253
- FIX fix permission denied error when creating env by @hannahwestra25 in #1279
- MAINT remove dispose memory engine calls in docs by @romanlutz in #1278
- FIX: Updating Pipelines by @rlundeen2 in #1282
- MAINT: Updating AttackExecutor to more generically call attacks by @rlundeen2 in #1270
- FIX: set virtual env in docker dev setup by @hannahwestra25 in #1281
- FIX/FEAT: Enable multi-modal pieces for SelfAskTrueFalseScorer scoring by @jsong468 in #1287
- FEAT: Adding simulated_conversation and adding prepended_conversation to context by @rlundeen2 in #1276
- FIX: Bug with prepended_conversation system prompt by @rlundeen2 in #1289
- FEAT: adding underlying_model for target identification by @jsong468 in #1234
- MAINT change devcontainer base image to python from MCR, mount .env* files into devcontainer by @romanlutz in #1291
- DOC reorganize converter docs by @romanlutz in #1268
- FIX: integration tests and ImageTarget refactor by @rlundeen2 in #1293
- FEAT Support JSON Schema in Responses by @riedgar-ms in #1177
- FIX remove obsolete assertions from image target integration tests by @romanlutz in #1294
- MAINT update ignored notebook index by @romanlutz in #1295
- FIX: integration test fixes by @rlundeen2 in #1297
- MAINT add skeleton frontend by @romanlutz in #1290
- MAINT: Adding simulated assistant role by @rlundeen2 in #1292
- MAINT Breaking: Message Normalizer Refactor by @rlundeen2 in #1296
- FEAT: Scenario DatasetConfiguration by @rlundeen2 in #1288
- MAINT BREAKING: Renaming scenarios by @rlundeen2 in #1301
- FEAT add skeleton backend for the GUI/frontend by @romanlutz in #1298
- FIX Breaking: PrependedConversationConfig and Attack Param Consistency by @rlundeen2 in #1299
- FEAT: New Scenario - Scams by @nina-msft in #1202
- MAINT add deprecation instructions by @romanlutz in #1303
- MAINT remove flake8, black and consolidate under ruff (including copyright check) by @romanlutz in #1302
- MAINT: Enhance type hinting across auth and analytics modules by @ytc338 in #1300
- FEAT: Add NegationTrapConverter and ChunkedRequestAttack by @fitzpr in #1261
- FIX example filename in Docker setup instructions by @fukusuket in #1305
- FEAT BREAKING: Scorer evaluation refactor by @jsong468 in #1280
- FEAT: SeedSimulatedConversation to generate simulated conversations in attacks by @rlundeen2 in #1304
- MAINT: Fixing deprecated usage by @rlundeen2 in #1306
- FEAT Breaking: Registry protocol + ScorerRegistry by @rlundeen2 in #1308
- MAINT strict mypy checking on the whole repository by @romanlutz in #1310
- MAINT: fix docstrings for /prompt_converter by @paulinek13 in #1314
- FEAT: Added VLSU Multimodal Dataset by @riyosha in #1309
- FEAT: Add binary_path data type by @jsong468 in #1315
- FIX MAINT: Improved Attack reliability and maintainability by @rlundeen2 in #1317
- FEAT: Adding audio and tool support to chat completions by @rlundeen2 in #1311
- FEAT: Leakage Scenario - New by @varunj-msft in #1284
- MAINT: Registry Metadata Refactor by @rlundeen2 in #1323
- FIX: commit blank_canvas.png required by LeakageScenario by @paulinek13 in #1330
- DOC: Update doc/code/scenarios/1_configuring_scenarios Notebook by @nina-msft in #1332
- DOC: Add Response Converters section to converters documentation by @Copilot in #1326
- MAINT: Refactoring General Identifiers and ScorerIdentifier by @rlundeen2 in #1328
- FEAT: More Informative Attack Exceptions by @rlundeen2 in #1318
- FEAT: add WebSocket-based prompt target for Microsoft Copilot by @paulinek13 in #1275
- TEST add frontend unit and e2e tests by @romanlutz in #1331
- FIX Auto-wrap synchronous token providers for AsyncOpenAI compatibility by @Copilot in #1327
- FEAT: Adding ConverterIdentifier and minor Identifiable refactor by @rlundeen2 in #1333
- FEAT Support baseline-only execution in Scenario by @slister1001 in #1321
- FEAT: Psychosocial Scenario by @jbolor21 in #1266
- FEAT: Support image edition/remix in OpenAIImageTarget by @fdubut in #1322
- FIX update encoding default data configuration by @hannahwestra25 in #1335
- FEAT: Adding pyrit_version to identifiers by @rlundeen2 in #1334
- FEAT: Adding TargetIdentifier by @jsong468 in #1336
- FIX use message in xpia website notebook by @hannahwestra25 in #1339
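Among the changes listed above, #1327 auto-wraps synchronous token providers so they can be used with an async client. A generic way to do that kind of adaptation in Python is sketched below. ensure_async and the TokenProvider alias are hypothetical names, and PyRIT's actual implementation may differ.

```python
import asyncio
import inspect
from typing import Awaitable, Callable, Union

# A provider takes no arguments and returns a token, either directly or via await.
TokenProvider = Callable[[], Union[str, Awaitable[str]]]


def ensure_async(provider: TokenProvider) -> Callable[[], Awaitable[str]]:
    """Return an async callable, wrapping the provider only if it is synchronous."""
    if inspect.iscoroutinefunction(provider):
        return provider  # already async, use as-is

    async def wrapper() -> str:
        # Run the blocking call in a worker thread so the event loop stays responsive.
        return await asyncio.to_thread(provider)

    return wrapper


def sync_token() -> str:
    return "example-token"  # placeholder value for the demo


print(asyncio.run(ensure_async(sync_token)()))  # example-token
```

asyncio.to_thread requires Python 3.9 or later; on older versions, loop.run_in_executor serves the same purpose.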