    Repositories list

    • JavaScript
      29000 Updated Jan 9, 2026
    • OpenRT

      Public
      Open-source red teaming framework for MLLMs with 37+ attack methods
      Python
      014700 Updated Jan 8, 2026
    • AIGC-Identification-Toolkit

      Public
      Jupyter Notebook
      1510 Updated Dec 26, 2025
    • Python
      01100 Updated Dec 16, 2025
    • IS-Bench

      Public
      Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
      Python
      23410 Updated Nov 24, 2025
    • Code repository for the paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability"
      Python
      43800 Updated Nov 24, 2025
    • AIGC_detection

      Public
      Makefile
      0005 Updated Oct 13, 2025
    • CodeAttack

      Public
      [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
      Python
      85810 Updated Oct 1, 2025
    • Vue
      0100 Updated Sep 29, 2025
    • VLSBench

      Public
      [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety
      Python
      15200 Updated Jul 21, 2025
    • .github

      Public
      0000 Updated Jul 18, 2025
    • TELLME

      Public
      Self-Explainability Enhancement of LLMs’ Representations
      Python
      01300 Updated Jul 2, 2025
    • [ICML 2025] ReflectionBench: Evaluating Epistemic Agency in Large Language Models
      Python
      21310 Updated Jun 24, 2025
    • MLLMGuard

      Public
      Python
      44430 Updated Jun 19, 2025
    • CELLO

      Public
      Python
      0100 Updated Feb 13, 2025
    • MORE

      Public
      JavaScript
      0000 Updated Feb 13, 2025
    • Python
      0000 Updated Feb 13, 2025
    • CaLM

      Public
      Python
      0100 Updated Feb 13, 2025
    • ADCE

      Public
      The official code for the paper "Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability".
      Python
      1100 Updated Feb 13, 2025
    • Python
      1011821 Updated Feb 3, 2025
    • Python
      0200 Updated Jan 17, 2025
    • ESC-Eval

      Public
      Python
      01000 Updated Jan 17, 2025
    • Python
      0100 Updated Jan 17, 2025
    • Python
      0300 Updated Jan 17, 2025
    • modpo

      Public
      Python
      0100 Updated Jan 17, 2025
    • [ACL 2024] SALAD benchmark & MD-Judge
      Python
      0200 Updated Jan 16, 2025
    • REEF

      Public
      The repository for the paper "REEF: Representation Encoding Fingerprints for Large Language Models", which aims to protect the IP of open-source LLMs.
      Python
      87410 Updated Jan 16, 2025
    • CredID

      Public
      Python
      3300 Updated Dec 26, 2024
    • T2ISafety

      Public
      0300 Updated Nov 25, 2024
    • DEAN

      Public
      Python
      21000 Updated Oct 25, 2024