add flaggems test script #1022

tianxiao-baai · 2025-10-30T03:14:51Z

OP Test

New Feature

Description

Add a unified benchmark summary script for operator testing.

gemini-code-assist · 2025-10-30T03:15:06Z

Summary of Changes

Hello @tianxiao-baai, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a new, unified benchmark summary script for operator testing within the FlagGems framework. The script automates the execution of both accuracy and performance tests, leveraging parallel processing across specified GPUs to efficiently gather comprehensive data. Its primary goal is to provide a streamlined method for evaluating operator performance and correctness, consolidating all critical metrics into an easily digestible CSV report.

Highlights

New Test Script: Introduces run_flaggems_tests.sh, a new bash script designed to automate the end-to-end testing of operators within the FlagGems framework.
Parallel Execution: The script enables concurrent accuracy and performance testing of operators by distributing them across multiple specified GPUs, significantly improving testing efficiency.
Unified Reporting: It generates a comprehensive CSV summary of test results, including accuracy status (passed, failed, skipped) and detailed speedup metrics for various data types such as float16, float32, bfloat16, int16, int32, bool, and cfloat.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new shell script for running FlagGems operator tests in parallel across multiple GPUs. The script is a valuable addition for automating the testing process. My review focuses on improving the script's robustness, efficiency, and maintainability. I've identified several areas for improvement, including unsafe variable handling that could lead to errors, inefficient calculations within loops, and brittle logic for file and output parsing. Applying the suggested changes will make the script more reliable and performant.

run_flaggems_tests.sh

gemini-code-assist · 2025-10-30T03:17:18Z

run_flaggems_tests.sh

+                sum=0; cnt=0
+                for v in "$float16_speedup" "$float32_speedup" "$bfloat16_speedup" "$int16_speedup" "$int32_speedup" "$bool_speedup" "$cfloat_speedup"; do
+                    if [[ $v =~ $is_number_regex ]]; then
+                        # Only count values greater than 0
+                        gt=$(awk -v x="$v" 'BEGIN{ print (x+0>0) ? 1 : 0 }')
+                        if [ "$gt" -eq 1 ]; then
+                            sum=$(awk -v a="$sum" -v b="$v" 'BEGIN{ printf "%.12f", a + b }')
+                            cnt=$((cnt+1))
+                        fi
+                    fi
+                done


This block calculates a fallback average speedup by calling awk multiple times inside a for loop. Spawning a new process for each arithmetic operation and comparison is highly inefficient. This entire calculation can be performed in a single, much faster awk command that processes all values at once.

Suggested change

-                sum=0; cnt=0
-                for v in "$float16_speedup" "$float32_speedup" "$bfloat16_speedup" "$int16_speedup" "$int32_speedup" "$bool_speedup" "$cfloat_speedup"; do
-                    if [[ $v =~ $is_number_regex ]]; then
-                        # Only count values greater than 0
-                        gt=$(awk -v x="$v" 'BEGIN{ print (x+0>0) ? 1 : 0 }')
-                        if [ "$gt" -eq 1 ]; then
-                            sum=$(awk -v a="$sum" -v b="$v" 'BEGIN{ printf "%.12f", a + b }')
-                            cnt=$((cnt+1))
-                        fi
-                    fi
-                done
+                local values="$float16_speedup $float32_speedup $bfloat16_speedup $int16_speedup $int32_speedup $bool_speedup $cfloat_speedup"
+                read sum cnt < <(echo "$values" | awk '{
+                    s=0; c=0;
+                    for(i=1; i<=NF; i++) {
+                        if ($i > 0) { s+=$i; c++ }
+                    }
+                    print s, c
+                }')
+                sum=${sum:-0}
+                cnt=${cnt:-0}
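
One caveat with a bare `$i > 0` test in awk: a field that does not look numeric (for example `N/A`) is compared to the constant as a string, so it can slip through and be counted. A minimal sketch of the same single-pass idea with an explicit numeric check follows. The function name `avg_positive`, the `N/A` placeholder, and the sample values are hypothetical and not taken from the PR; the pattern also assumes plain decimal values (no signs or scientific notation).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): sums and averages the strictly
# positive, plain-decimal arguments in a single awk process.
# Prints "sum count average"; the average is "N/A" when nothing qualifies.
avg_positive() {
    printf '%s\n' "$@" | awk '
        # Accept only lines that are plain decimals such as 1, 0.5 or 2.25,
        # and only when the value is greater than zero.
        /^[0-9]+(\.[0-9]+)?$/ && $1 + 0 > 0 { s += $1; c++ }
        END {
            if (c > 0) printf "%.6f %d %.6f\n", s, c, s / c
            else       print "0 0 N/A"
        }'
}

# Non-numeric, empty, and zero entries are ignored.
result=$(avg_positive "1.5" "N/A" "" "0" "2.5")
echo "$result"
```

In bash, the three fields can then be split with `read -r sum cnt avg <<< "$result"`, which keeps the calling code close to the shape of the suggestion above.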

run_flaggems_tests.sh

gemini-code-assist · 2025-10-30T03:17:18Z

run_flaggems_tests.sh

+            float16_speedup=$(echo "$float16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            float32_speedup=$(echo "$float32_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            bfloat16_speedup=$(echo "$bfloat16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            int16_speedup=$(echo "$int16_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            int32_speedup=$(echo "$int32_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            bool_speedup=$(echo "$bool_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            cfloat_speedup=$(echo "$cfloat_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')
+            avg_speedup=$(echo "$avg_speedup" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//')


These lines repeat the same sed command for eight different variables. This violates the DRY (Don't Repeat Yourself) principle and makes the code harder to maintain. Furthermore, the awk script's trim function (lines 108-113) already removes surrounding quotes and whitespace, which makes these sed calls redundant. I recommend removing these lines for cleaner code. If they are intended as a defensive measure, they should be refactored into a single function.
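
If the cleanup is kept as a defensive measure, the refactor could look roughly like the sketch below. The helper name `trim_field` and the sample values are hypothetical; the `sed` expression is copied unchanged from the lines under review so the behavior stays the same.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): strips one leading and one trailing
# double quote, then surrounding whitespace, using the same sed expression
# as the reviewed lines, but defined in one place.
trim_field() {
    printf '%s\n' "$1" | sed 's/^"//;s/"$//;s/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Sample values standing in for fields parsed from the benchmark output.
float16_speedup='"1.23"'
avg_speedup='  0.98  '

float16_speedup=$(trim_field "$float16_speedup")
avg_speedup=$(trim_field "$avg_speedup")

echo "[$float16_speedup] [$avg_speedup]"
```

In bash, the eight assignments could be collapsed further by looping over the variable names with indirect expansion (`${!name}`) and `printf -v`, at the cost of some readability.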

gemini-code-assist · 2025-10-30T03:17:18Z

run_flaggems_tests.sh

+# ---------------- Run tests ----------------
+for gpu_id in "${GPUS[@]}"; do
+    {
+        for op in ${GPU_TASKS[$gpu_id]}; do


The loop for op in ${GPU_TASKS[$gpu_id]} relies on word splitting to iterate through operator names. If an operator name contains whitespace, it will be incorrectly split into multiple parts, causing test failures. To make this more robust, you could store operators as a newline-separated string (e.g., GPU_TASKS[$gpu_id]+="${OPS[$i]}"$'\n' on line 220) and then iterate using a while read loop: while IFS= read -r op; do [[ -n "$op" ]] && run_op_test "$op" "$gpu_id"; done <<< "${GPU_TASKS[$gpu_id]}".
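
The newline-separated approach described in this comment might be sketched as follows. Everything here is an illustration under assumptions: `GPUS`, `OPS`, and `GPU_TASKS` mirror names from the review, but their contents, the round-robin assignment, the `run_gpu_queue` wrapper, and the stub `run_op_test` are hypothetical, and the operator name containing a space is invented purely to show the difference.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the script's real data and test runner.
GPUS=(0 1)
OPS=("add" "layer norm" "softmax")
GPU_TASKS=()   # indexed by GPU id; assumed to hold one operator per line

run_op_test() {   # stub: the real function would launch the actual tests
    echo "gpu=$2 op=$1"
}

# Assumed round-robin assignment, appending one operator per line.
for i in "${!OPS[@]}"; do
    gpu_id=${GPUS[$((i % ${#GPUS[@]}))]}
    GPU_TASKS[$gpu_id]+="${OPS[$i]}"$'\n'
done

# Read the queue line by line so names are never word-split.
run_gpu_queue() {
    local gpu_id=$1 op
    while IFS= read -r op; do
        # Skip the empty line left by the trailing newline.
        if [[ -n "$op" ]]; then
            run_op_test "$op" "$gpu_id"
        fi
    done <<< "${GPU_TASKS[$gpu_id]}"
}

for gpu_id in "${GPUS[@]}"; do
    run_gpu_queue "$gpu_id"
done
```

The queues run sequentially here to keep the output order predictable; a parallel variant would start each `run_gpu_queue` call in the background with `&` and follow the loop with `wait`.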

meinie0826 · 2025-10-30T06:22:57Z

Please install pre-commit and format your code. For details, see https://github.com/FlagOpen/FlagGems/blob/master/docs/code_countribution.md.

meinie0826 · 2025-10-30T06:23:31Z

Gemini's review includes some useful suggestions that are worth considering.

CLAassistant · 2025-11-18T01:26:45Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

add flaggems test script

1554c4e

gemini-code-assist bot reviewed Oct 30, 2025

tianxiao-baai added 4 commits November 3, 2025 09:54

code format

43c83bf

fix：make pytest invocations safe with GPU env and log paths

36f8db9

fix:delete benchmark result file before run

6ec9ba0

bugfix

04b3bd0

meinie0826 previously approved these changes Nov 5, 2025

Refactor run_flaggems_tests.sh for clarity and updates

850b710

Signed-off-by: tianxiao <[email protected]>

tianxiao-baai dismissed meinie0826’s stale review via 850b710 November 5, 2025 06:12

meinie0826 previously approved these changes Nov 5, 2025

code format

0a4c31c

tianxiao-baai dismissed meinie0826’s stale review via 0a4c31c November 7, 2025 06:33

tianxiao-baai added 3 commits November 7, 2025 15:50

Clean up summary output in run_flaggems_tests.sh

de8ad1e

Removed summary header and timestamp from the summary file. Signed-off-by: tianxiao <[email protected]>

Update summary header in run_flaggems_tests.sh

dc8d083

Signed-off-by: tianxiao <[email protected]>

Merge branch 'flagos-ai:master' into master

289723d

3 participants
