7 changes: 7 additions & 0 deletions _posts/2024-08-29-style-control.md
@@ -18,6 +18,13 @@ authors:
url: "https://infwinston.github.io/"
---

### Updated Notes (2025/05/15):
- As of May 15, 2025, Style Control is the default view on [lmarena.ai/leaderboard](https://lmarena.ai/leaderboard).
- We change the anchor model to the current #1 model on the non-style-control leaderboard (specifically, we set `gemini-2.5-pro-preview-05-06` to 1446). This does not change the interpretation of the scores, because Bradley-Terry coefficients are translation invariant: adding the same constant to every score leaves all pairwise win probabilities unchanged. We apply this shift to keep the style-control and non-style-control leaderboards consistent.
- Moving forward, the non-style-control leaderboard will be referred to as the "raw leaderboard", and the default will be the style-control leaderboard. We will no longer explicitly mention style control.
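
The translation-invariance claim above can be checked with a minimal sketch. It assumes the usual Elo-style presentation of Bradley-Terry scores (base 10, scale 400); the model names, the scores, and the helper functions `win_prob` and `reanchor` are hypothetical and used only for illustration, not taken from the leaderboard code.

```python
def win_prob(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """P(a beats b) under an Elo-scaled Bradley-Terry model (assumed base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def reanchor(scores: dict, anchor_model: str, anchor_value: float) -> dict:
    """Shift every score by the same constant so that anchor_model lands on anchor_value."""
    shift = anchor_value - scores[anchor_model]
    return {model: score + shift for model, score in scores.items()}


# Hypothetical scores, not real leaderboard values.
scores = {"model-a": 1300.0, "model-b": 1250.0, "model-c": 1180.0}
shifted = reanchor(scores, "model-a", 1446.0)

# Every pairwise win probability is the same before and after the shift,
# because only score differences enter the model.
for a in scores:
    for b in scores:
        before = win_prob(scores[a], scores[b])
        after = win_prob(shifted[a], shifted[b])
        assert abs(before - after) < 1e-12
```

Because only score differences enter the win probability, the choice of anchor affects the displayed numbers but not the ranking or the gaps between models.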

---

Why is GPT-4o-mini so good? Why does Claude rank so low, when anecdotal experience suggests otherwise?

We have answers for you. We controlled for the effect of length and markdown, and indeed, _the ranking changed_. This is just a first step towards our larger goal of disentangling **substance** and **style** in the Chatbot Arena leaderboard.