xiaomi-research/automine

AutoMine Solution for AV2 2026 Scenario Mining Challenge

🏆 Competition   |   📄 Technical Report

Songliang Cao1,2 *, Jiele Zhao1 *, Yuru Wang1, Hao Li1, Daqi Liu1, Zehan Zhang1 †
Fangzhen Li1 †, Yu Wang, Yue Zhang, Bing Wang1, Guang Chen1, Hao Lu2, Hangjun Ye1

1Xiaomi EV   2Huazhong University of Science and Technology

*This work was done during an internship at Xiaomi.   †Project leader.

AutoMine pipeline

1. Installation

We recommend using Conda for environment management:

conda create -n refAV python=3.10
conda activate refAV

All required libraries and packages can be installed with:

pip install -r requirements.txt
export PYTHONPATH=.

2. Data Preparation

2.1 Argoverse2 Sensor Dataset

For more information, see the Argoverse User Guide.

conda install s5cmd -c conda-forge

export DATASET_NAME="sensor"  # sensor, lidar, motion_forecasting or tbv.
export TARGET_DIR="$HOME/data/datasets"  # Target directory on your machine.

s5cmd --no-sign-request cp "s3://argoverse/datasets/av2/$DATASET_NAME/*" $TARGET_DIR

2.2 Argoverse2 Scenario Mining Dataset

The RefAV scenario mining dataset can be downloaded from Hugging Face using:

pip install huggingface_hub
hf auth login

export TARGET_DIR="$(pwd)/scenario_mining_downloads"
hf download CainanD/RefAV --repo-type dataset --local-dir=$TARGET_DIR

Alternatively, you can download the scenario mining add-on from Argoverse without signing in to Hugging Face:

export TARGET_DIR="$(pwd)/scenario_mining_downloads"
s5cmd --no-sign-request cp "s3://argoverse/tasks/scenario_mining/*" $TARGET_DIR

2.3 Tracking Predictions

Our method is based on 3D tracking results. You can either generate tracking results using a SOTA method, or download existing results (our choice):

  • Option 1 — Generate your own tracking results. See the LT3D repository for information on training a baseline detector and tracker on the Argoverse 2 dataset.
  • Option 2 — Download existing tracking results (recommended). We use the tracking outputs from previous winning Argoverse submissions:

3. Running

Before running the code, please configure the relevant paths in refAV/path.py.

3.1 Track Refinement

The provided 3D tracking results contain a certain amount of noise, so we refine them before scenario mining.

Note: The validation-set tracking results contain a significant number of over-detections, which slow down the pipeline and degrade final performance. We use a VLM for grounding to filter out most of these over-detections.

# use Qwen3.5 for grounding
git clone https://github.com/QwenLM/Qwen3-VL.git grounding/Qwen3-VL
python grounding/batch_grounding_av2.py

# correct the tracking results using the grounded results
python grounding/correct_tracking.py
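
As a rough illustration of the filtering idea, the sketch below keeps a tracked box only if it overlaps at least one VLM-grounded 2D box. It does not reproduce the logic of grounding/correct_tracking.py: the box format (x1, y1, x2, y2), the function names, and the 0.3 IoU threshold are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels; assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 2D boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_over_detections(
    tracked: List[Box], grounded: List[Box], min_iou: float = 0.3
) -> List[Box]:
    """Keep tracked boxes confirmed by at least one grounded box (hypothetical rule)."""
    return [t for t in tracked if any(iou(t, g) >= min_iou for g in grounded)]


# Toy data: the second tracked box has no grounded counterpart and is dropped.
tracked_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
grounded_boxes = [(1, 1, 11, 11)]
kept = filter_over_detections(tracked_boxes, grounded_boxes)
print(kept)
```

In practice the 3D tracked boxes would first have to be projected into the camera image before any 2D comparison; that step is omitted here.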

For both the test and val tracking results, we additionally apply a unified track-refinement step to mitigate trajectory noise (ID switches, overlapping boxes, etc.):

python grounding/run_tracker.py \
  --val_dir /path/to/val \
  --output_dir /path/to/output \
  --mode stitch
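
One common way to repair ID switches is to stitch track fragments: if one track ends shortly before another begins and the two endpoints are close in space, both are assigned a single ID. The sketch below illustrates that idea only. The Track structure, the thresholds, and the greedy matching are assumptions for this example, not a description of what run_tracker.py does internally.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (timestamp_s, x_m, y_m); assumed layout


@dataclass
class Track:
    track_id: int
    points: List[Point]  # samples sorted by time


def stitch_tracks(
    tracks: List[Track], max_gap_s: float = 1.0, max_dist_m: float = 2.0
) -> Dict[int, int]:
    """Return a mapping old_id -> stitched_id using greedy end-to-start matching."""
    ordered = sorted(tracks, key=lambda t: t.points[0][0])
    id_map: Dict[int, int] = {}
    tails: Dict[int, Point] = {}  # stitched_id -> last point of the merged track
    for t in ordered:
        start = t.points[0]
        best = None  # (stitched_id, distance) of the closest compatible tail
        for sid, end in tails.items():
            gap = start[0] - end[0]
            dist = hypot(start[1] - end[1], start[2] - end[2])
            if 0 < gap <= max_gap_s and dist <= max_dist_m:
                if best is None or dist < best[1]:
                    best = (sid, dist)
        sid = best[0] if best else t.track_id
        id_map[t.track_id] = sid
        tails[sid] = t.points[-1]
    return id_map


# Track 2 starts 0.4 s after track 1 ends, 0.5 m away, so it inherits ID 1.
fragments = [
    Track(1, [(0.0, 0.0, 0.0), (0.5, 1.0, 0.0)]),
    Track(2, [(0.9, 1.5, 0.0), (1.4, 2.5, 0.0)]),
    Track(3, [(0.2, 30.0, 30.0), (1.0, 31.0, 30.0)]),
]
id_map = stitch_tracks(fragments)
print(id_map)
```

A production stitcher would typically also compare object category, heading, and velocity before merging; distance and time alone are used here to keep the example short.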

For convenience, we also provide the refined tracking results so you can skip this step: TODO.

3.2 Scenario Mining

Based on the refined tracking results, we perform scenario mining and generate the scenario description files.

A. Description augmentation

python desc_aug.py \
  --batch_file tools/log_prompt_pairs_test_unique.json \
  --output_json output/desc_aug/desc_aug_test.json \
  --model claude-opus-4-6

B. Generate base code with double check

python double_check_desc_aug.py \
  --aug_file output/desc_aug/desc_aug_test.json \
  --output_dir output/llm_code_predictions/RefAV/double_check_test \
  --include_original \
  --workers 12

This produces generated code for each description-augmentation variant, written to double_check_test/, double_check_test_aug1/, and double_check_test_aug2/ respectively.

C. Self-iterative refinement

For each augmentation variant, we run a self-iterative feedback loop that re-prompts the coder model with execution feedback to refine the generated code. Run the command below once per variant by changing the --base argument to double_check_test, double_check_test_aug1, and double_check_test_aug2 in turn.

# Repeat with --base double_check_test_aug1 and --base double_check_test_aug2.
bash iterative_feedback.sh \
  --base double_check_test \
  --rounds 5 \
  --split test \
  --tracker Le3DE2D_tracking_tz_adjustment_refine \
  --atomic_func_name atomic_functions_0529 \
  --batch_file tools/log_prompt_pairs_test_unique.json \
  --log_prompts /data/Scenario_Mining_Challenge/av2_sm_downloads/log_prompt_pairs_test.json \
  --coder claude-opus-4-6
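
The general shape of an execution-feedback loop can be sketched as follows: run the candidate code, and if it fails, hand the error text back to the coder and try again, up to a fixed number of rounds. This is a minimal sketch and not the contents of iterative_feedback.sh. The function names are hypothetical, and `stub_coder` is a stand-in for the LLM call so that the example runs offline.

```python
import traceback
from typing import Callable, Optional, Tuple


def run_candidate(code: str) -> Optional[str]:
    """Execute candidate code; return None on success or the error text on failure."""
    try:
        exec(code, {})  # illustrative only: generated code should run in a sandbox
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def refine(
    code: str, coder: Callable[[str, str], str], rounds: int = 5
) -> Tuple[str, int]:
    """Re-prompt `coder` with execution feedback until the code runs or rounds run out."""
    for used in range(rounds):
        error = run_candidate(code)
        if error is None:
            return code, used
        code = coder(code, error)
    # The last revision is returned without being re-checked.
    return code, rounds


def stub_coder(code: str, error: str) -> str:
    """Stand-in for an LLM call: fixes one known typo so the example runs offline."""
    return code.replace("lenn(", "len(")


final_code, rounds_used = refine("n = lenn([1, 2, 3])", stub_coder)
print(rounds_used, final_code)
```

Here "feedback" means only a Python exception. A real loop could equally feed back empty results or other runtime signals, depending on what the pipeline records.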

D. Ensemble prediction

We ensemble the refined predictions from all three augmentation variants to produce the final submission.

python pkl_ensemble.py
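
The ensembling rule used by pkl_ensemble.py is not described here. As one plausible illustration, the sketch below applies majority voting: a track ID is kept for a (log, description) pair if at least two of the three variants predict it. The prediction layout, the `majority_vote` name, and the vote threshold are assumptions for this example.

```python
from collections import Counter
from typing import Dict, Hashable, List, Set

Predictions = Dict[Hashable, Set[str]]  # key -> predicted track IDs; assumed layout


def majority_vote(variants: List[Predictions], min_votes: int = 2) -> Predictions:
    """Keep each track ID that at least `min_votes` variants predict for a key."""
    merged: Predictions = {}
    keys = set().union(*(v.keys() for v in variants))
    for key in keys:
        votes = Counter(tid for v in variants for tid in v.get(key, set()))
        merged[key] = {tid for tid, n in votes.items() if n >= min_votes}
    return merged


# Toy predictions from the three variants for a single (log, description) pair.
key = ("log_a", "vehicle turning left")
base = {key: {"t1", "t2"}}
aug1 = {key: {"t1"}}
aug2 = {key: {"t2", "t3"}}
merged = majority_vote([base, aug1, aug2])
print(sorted(merged[key]))
```

Raising `min_votes` to 3 would make the ensemble an intersection (higher precision), while lowering it to 1 would make it a union (higher recall).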

Acknowledgements

Our work is built upon the foundational framework provided by the RefAV repository. We extend our sincere gratitude to the authors for making their code public.
