Requesting model to test? #3

TomLucidor · 2024-11-15T07:54:22Z

Neither the Code Editing nor the Refactoring benchmark includes enough small LLMs, such as the Phi-3 / Phi-3.1 / Phi-3.5 family of models, which are claimed to be both small and capable. I noticed this gap while looking at dashboards such as BigCodeBench and EvalPlus.

paul-gauthier · 2024-11-15T13:51:47Z

Thanks for trying aider and filing this issue.

PRs are welcome!

https://aider.chat/docs/leaderboards/#contributing-benchmark-results
