SWE-bench

Organization for maintaining the SWE-bench/agent projects

Pinned

  1. SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?

    Python · 2.5k stars · 425 forks

  2. experiments Public

    Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell · 147 stars · 150 forks

Repositories

Showing 6 of 6 repositories
  • experiments Public

    Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell · 147 stars · 150 forks · 4 open issues · 14 open pull requests · Updated Feb 25, 2025
  • sb-cli Public

    Run SWE-bench evaluations remotely

    Python · 5 stars · MIT license · 0 forks · 3 open issues · 0 open pull requests · Updated Feb 25, 2025
  • swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML · 1 star · 4 forks · 1 open issue · 1 open pull request · Updated Feb 25, 2025
  • SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?

    Python · 2,499 stars · MIT license · 425 forks · 32 open issues · 5 open pull requests · Updated Feb 25, 2025
  • .github Public
    0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Oct 24, 2024
  • humanevalfix-results Public

    Evaluation data + results for SWE-agent inference on HumanEvalFix task

    Jupyter Notebook · 0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Jul 11, 2024
