🏆 Competition | 📄 Technical Report
Songliang Cao1,2 *, Jiele Zhao1 *, Yuru Wang1, Hao Li1, Daqi Liu1, Zehan Zhang1 †
Fangzhen Li1 †, Yu Wang, Yue Zhang, Bing Wang1, Guang Chen1, Hao Lu2, Hangjun Ye1
1Xiaomi EV 2Huazhong University of Science and Technology
*This work was done during internship at Xiaomi. †Project leader.
We recommend using Conda for environment management:
conda create -n refAV python=3.10
conda activate refAV
All of the required libraries and packages can be installed with:
pip install -r requirements.txt
export PYTHONPATH=.
For more information, see the Argoverse User Guide.
conda install s5cmd -c conda-forge
export DATASET_NAME="sensor" # sensor, lidar, motion_forecasting or tbv.
export TARGET_DIR="$HOME/data/datasets" # Target directory on your machine.
s5cmd --no-sign-request cp "s3://argoverse/datasets/av2/$DATASET_NAME/*" "$TARGET_DIR"
The RefAV scenario mining dataset can be downloaded from Hugging Face using:
pip install huggingface_hub
hf auth login
export TARGET_DIR="$(pwd)/scenario_mining_downloads"
hf download CainanD/RefAV --repo-type dataset --local-dir=$TARGET_DIR
Alternatively, you can download the scenario-mining add-on from Argoverse without signing in to Hugging Face:
export TARGET_DIR="$(pwd)/scenario_mining_downloads"
s5cmd --no-sign-request cp "s3://argoverse/tasks/scenario_mining/*" $TARGET_DIR
Our method is based on 3D tracking results. You can either generate tracking results using a SOTA method, or download existing results (our choice):
- Option 1 — Generate your own tracking results. See the LT3D repository for information on training a baseline detector and tracker on the Argoverse 2 dataset.
- Option 2 — Download existing tracking results (recommended). We use the tracking outputs from previous winning Argoverse submissions:
  - Test set: Google Drive
  - Val set: Hugging Face or Google Drive
Before running the code, please configure the relevant paths in refAV/path.py.
The provided 3D tracking results contain a certain amount of noise, so we refine them before scenario mining.
Note: The validation-set tracking results contain a significant number of over-detections, which slow down the pipeline and degrade final performance. We use a VLM for grounding to filter out most of these over-detections.
# use Qwen3.5 for grounding
git clone https://github.com/QwenLM/Qwen3-VL.git grounding/Qwen3-VL
python grounding/batch_grounding_av2.py
# correct the tracking results using the grounded results
python grounding/correct_tracking.py
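The idea of using grounded boxes to drop over-detections can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic of grounding/correct_tracking.py: it assumes each track has already been projected to 2D image boxes per frame, and the names `filter_tracks`, `iou_thr`, and `min_support` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 2D boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_tracks(
    tracks: Dict[str, Dict[int, Box]],   # track id -> {frame index -> projected box}
    grounded: Dict[int, List[Box]],      # frame index -> boxes returned by the VLM
    iou_thr: float = 0.5,                # hypothetical matching threshold
    min_support: float = 0.5,            # hypothetical fraction of frames that must match
) -> Dict[str, Dict[int, Box]]:
    """Keep only tracks that a grounded box confirms in enough frames."""
    kept = {}
    for track_id, frames in tracks.items():
        if not frames:
            continue
        hits = sum(
            1
            for frame, box in frames.items()
            if any(iou(box, g) >= iou_thr for g in grounded.get(frame, []))
        )
        if hits / len(frames) >= min_support:
            kept[track_id] = frames
    return kept
```

A track with no grounded counterpart in most of its frames is treated as an over-detection and removed; both thresholds are placeholders that would need tuning.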
For both the test and val tracking results, we additionally apply a unified track-refinement step to mitigate trajectory noise (ID switches, overlapping boxes, etc.):
python grounding/run_tracker.py \
--val_dir /path/to/val \
--output_dir /path/to/output \
--mode stitch
For convenience, we also provide the refined tracking results so you can skip this step: TODO.
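To illustrate what repairing ID switches can look like, here is a minimal greedy stitching sketch. It is an assumption-laden example and not the algorithm in grounding/run_tracker.py: the `Fragment` type, the `stitch` function, and the `max_gap` and `max_dist` thresholds are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Fragment:
    track_id: str
    start: int                      # first frame index of the fragment
    end: int                        # last frame index of the fragment
    first_xy: tuple[float, float]   # position at `start` (metres)
    last_xy: tuple[float, float]    # position at `end` (metres)


def stitch(fragments: list[Fragment], max_gap: int = 5, max_dist: float = 2.0) -> dict[str, str]:
    """Map every fragment id to the id of the track it is merged into.

    A fragment is appended to an earlier track when it starts within
    `max_gap` frames after that track ends and within `max_dist` metres
    of its last position; the closest candidate wins.
    """
    ordered = sorted(fragments, key=lambda f: f.start)
    merged = {f.track_id: f.track_id for f in ordered}
    tails: list[list] = []  # each entry: [root id, last frame, last position]
    for frag in ordered:
        best = None
        for tail in tails:
            gap = frag.start - tail[1]
            dist = hypot(frag.first_xy[0] - tail[2][0], frag.first_xy[1] - tail[2][1])
            if 0 < gap <= max_gap and dist <= max_dist:
                if best is None or dist < best[0]:
                    best = (dist, tail)
        if best is None:
            tails.append([frag.track_id, frag.end, frag.last_xy])
        else:
            tail = best[1]
            merged[frag.track_id] = tail[0]
            tail[1], tail[2] = frag.end, frag.last_xy
    return merged
```

Fragments that map to the same root id would then be relabelled as one track. A real refinement step would likely also use heading, size, and category, which this sketch ignores.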
Based on the refined tracking results, we perform scenario mining and generate the scenario description files.
A. Description augmentation
python desc_aug.py \
--batch_file tools/log_prompt_pairs_test_unique.json \
--output_json output/desc_aug/desc_aug_test.json \
--model claude-opus-4-6
B. Generate base code with double check
python double_check_desc_aug.py \
--aug_file output/desc_aug/desc_aug_test.json \
--output_dir output/llm_code_predictions/RefAV/double_check_test \
--include_original \
--workers 12
This produces generated code for each description-augmentation variant, written to double_check_test/, double_check_test_aug1/, and double_check_test_aug2/ respectively.
C. Self-iterative refinement
For each augmentation variant, we run a self-iterative feedback loop that re-prompts the coder model with execution feedback to refine the generated code. Run the command below once per variant by changing the --base argument to double_check_test, double_check_test_aug1, and double_check_test_aug2 in turn.
# Repeat with --base double_check_test_aug1 and --base double_check_test_aug2.
bash iterative_feedback.sh \
--base double_check_test \
--rounds 5 \
--split test \
--tracker Le3DE2D_tracking_tz_adjustment_refine \
--atomic_func_name atomic_functions_0529 \
--batch_file tools/log_prompt_pairs_test_unique.json \
--log_prompts /data/Scenario_Mining_Challenge/av2_sm_downloads/log_prompt_pairs_test.json \
--coder claude-opus-4-6
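If you prefer not to edit the command by hand three times, a small driver along these lines can loop over the variants. It is a convenience sketch rather than part of the repository: the flags are copied from the command above, while `build_command` and `run_all` are hypothetical helper names, and the `--log_prompts` path should be adjusted to your setup.

```python
import shlex
import subprocess

# The three code-generation outputs produced in step B.
VARIANTS = ["double_check_test", "double_check_test_aug1", "double_check_test_aug2"]


def build_command(base: str) -> list[str]:
    """Assemble the iterative_feedback.sh call for one variant."""
    return [
        "bash", "iterative_feedback.sh",
        "--base", base,
        "--rounds", "5",
        "--split", "test",
        "--tracker", "Le3DE2D_tracking_tz_adjustment_refine",
        "--atomic_func_name", "atomic_functions_0529",
        "--batch_file", "tools/log_prompt_pairs_test_unique.json",
        "--log_prompts", "/data/Scenario_Mining_Challenge/av2_sm_downloads/log_prompt_pairs_test.json",
        "--coder", "claude-opus-4-6",
    ]


def run_all(dry_run: bool = True) -> list[str]:
    """Print each command and, unless dry_run is set, run them one after another."""
    printed = []
    for base in VARIANTS:
        cmd = build_command(base)
        printed.append(shlex.join(cmd))
        print(printed[-1])
        if not dry_run:
            # check=True stops the loop if one variant fails.
            subprocess.run(cmd, check=True)
    return printed
```

Calling `run_all()` only prints the three commands; `run_all(dry_run=False)` executes them sequentially from the repository root.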
D. Ensemble prediction
We ensemble the refined predictions from all three augmentation variants to produce the final submission.
python pkl_ensemble.py
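One common way to combine predictions from several variants is majority voting. The sketch below shows that idea only; the combination rule actually used by pkl_ensemble.py is not described here, and the representation of a prediction as a set of (track id, timestamp) pairs, as well as the `majority_vote` name and `min_votes` parameter, are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, Set


def majority_vote(predictions: Iterable[Set[Hashable]], min_votes: int = 2) -> Set[Hashable]:
    """Keep every item that at least `min_votes` variants predicted."""
    votes: Counter = Counter()
    for pred in predictions:
        # Each prediction is a set, so a variant votes at most once per item.
        votes.update(pred)
    return {item for item, count in votes.items() if count >= min_votes}


# Toy predictions for one (log, description) pair from the three variants.
base = {("track_1", 0), ("track_1", 1)}
aug1 = {("track_1", 0), ("track_2", 0)}
aug2 = {("track_1", 0), ("track_2", 0), ("track_3", 5)}
final = majority_vote([base, aug1, aug2])
```

With three variants and `min_votes=2`, an item survives only if two variants agree on it, which trades some recall for fewer spurious matches.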
Our work is built upon the foundational framework provided by the RefAV repository. We extend our sincere gratitude to the authors for making their code public.
