
feat: Add Vaka LoCoMo benchmark scripts #1501

Closed
PowerfulLxx wants to merge 1 commit into volcengine:main from
PowerfulLxx:feat/vaka-locomo-benchmark

Conversation

@PowerfulLxx
Contributor

@PowerfulLxx PowerfulLxx commented Apr 16, 2026

Description

Related Issue

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


lixiong.124 does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 85
🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

JSON Parsing Fragility

The function passes the judge LLM's output to json.loads directly, so minor JSON irregularities (trailing commas, extra text, etc.) raise an exception. Consider using json-repair to handle malformed JSON gracefully.

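import json

def extract_json_object(content: str) -> dict:
    # Slice from the first "{" to the last "}" and parse that span as JSON.
    start_idx = content.find("{")
    end_idx = content.rfind("}")
    if start_idx == -1 or end_idx == -1 or end_idx < start_idx:
        raise ValueError(f"No JSON object found in judge response: {content}")
    # json.loads raises json.JSONDecodeError if the span is not strict JSON.
    return json.loads(content[start_idx : end_idx + 1])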
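
To illustrate the fragility noted above, here is a minimal stdlib-only sketch of a more tolerant parser. It is not the PR's code and it is not json-repair; it only shows two cheap fallbacks (ignoring text after the first object, and dropping trailing commas). The name `extract_json_object_lenient` and the `label` key in the usage line are hypothetical.

```python
import json
import re

# Matches a comma that directly precedes a closing brace or bracket.
# Caveat: this is a blunt regex and would also alter such a sequence
# if it appeared inside a string value.
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def extract_json_object_lenient(content: str) -> dict:
    """Parse the first JSON object in `content`, tolerating trailing text
    and trailing commas. Hypothetical helper, not part of the PR."""
    start_idx = content.find("{")
    if start_idx == -1:
        raise ValueError(f"No JSON object found in judge response: {content}")

    candidate = content[start_idx:]
    decoder = json.JSONDecoder()
    try:
        # raw_decode stops at the end of the first complete JSON value,
        # so anything after the closing brace is ignored.
        parsed, _ = decoder.raw_decode(candidate)
    except json.JSONDecodeError:
        # Second attempt: strip trailing commas, then parse again.
        cleaned = _TRAILING_COMMA.sub(r"\1", candidate)
        try:
            parsed, _ = decoder.raw_decode(cleaned)
        except json.JSONDecodeError as exc:
            raise ValueError(
                f"Could not parse judge response: {content}"
            ) from exc
    return parsed


print(extract_json_object_lenient('Verdict: {"label": "CORRECT",} thanks'))
# → {'label': 'CORRECT'}
```

A dedicated repair library, as the reviewer suggests, would cover many more failure modes (unquoted keys, single quotes, truncated output) than this sketch does.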

Typo in Field Name

The field name "command_abbility" is misspelled; it should be "command_ability".

"command_abbility",

@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Apr 16, 2026
@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.

