Closed
2 changes: 1 addition & 1 deletion router_inference/predictions/vllm-sr-robustness.json
@@ -2,7 +2,7 @@
{
"global index": "AIME_112",
"prompt": "Deliver a thorough resolution of the ensuing arithmetic puzzle in a sequential manner.\n\nScenario: Absent\n\nInquiry: With an inexhaustible stockpile of coins valued at 1-cent, 10-cents, and 25-cents, Silas is on a quest to assemble a coin collection summing to $ N $ cents, where $ N $ is a positive integer figure. He employs what's known as the **greedy method**, opting each time for the highest denomination that doesn't push his collection's sum beyond $ N $. Illustratively, for 42 cents, Silas opts for a 25-cent coin, followed by a 10-cent, and then seven 1-cent coins. However, this assortment of nine coins surpasses the required count to reach 42 cents; in fact, selecting four 10-cent coins and two 1-cent coins sums up the same with only six coins.\n\nTypically, the greedy method prevails for a specific $ N $ if no alternate assemblage of 1-cent, 10-cent, and 25-cent coins sums to $ N $ cents using a smaller number of coins than the greedy method's set. Ascertain the quantity of values for $ N $ spanning from 1 to 1000 inclusive where the greedy method is successful.\n\nSubmit your conclusive response in the format \\boxed{{}}, with the precise mathematical expression or numeral encased within the braces. For instance: \\boxed{{42}}. Ensure your elucidation is explicit, succinct, and confined to three sentences.",
- "prediction": "gemini-2.0-flash-001",
+ "prediction": "gpt-4o-mini",
"generated_result": null,
"cost": null,
"accuracy": null,
2 changes: 1 addition & 1 deletion router_inference/predictions/vllm-sr.json
@@ -2,7 +2,7 @@
{
"global index": "ArcMMLU_655",
"prompt": "Please read the following multiple-choice questions and provide the most likely correct answer based on the options given.\n\nContext: None\n\nQuestion: The average life expectancy of the most of firm is around ( ) years.\n\nOptions: \nA. 10\nB. 20\nC. 30\nD. 40\n\n\nProvide the correct letter choice in \\boxed{X}, where X is the correct letter choice. Keep the explanation or feedback within 3 sentences.",
- "prediction": "gemini-2.0-flash-001",
+ "prediction": "gpt-4o-mini",
"generated_result": {
"generated_answer": "The correct answer is difficult to know without more context about what kind of \"firm\" is being referred to. However, a commonly cited statistic suggests that the average lifespan of S&P 500 companies is decreasing, with estimates around 20 years, so that makes B plausible. Therefore, the best answer among the options is:\n\n\\boxed{B}\n",
"success": true,