RTC adjustments. Bug fix & Alex Soare optimization #2499
base: main
Conversation
Thanks @helper2424 for opening this PR! I have a few comments which will hopefully clarify my article's guidance.
The main points are:
- I think you meant `variance_clipping_factor` to be σ_d from my article, judging by its default value. I think the name needs revising. See my inline comments.
- `max_guidance_weight` should probably default to `num_steps`.
- You don't need any `use_soare_optimization` guards. σ_d = 1.0 covers the default RTC implementation; σ_d < 1.0 (and a good value might be 0.2) covers my article.
I'm also available offline to discuss :)
```python
time,
original_denoise_step_partial,
execution_horizon=None,
num_flow_matching_steps=None,
```
I would suggest not passing this here, as the total number of denoising steps shouldn't be the concern of a single denoising step. My article's guidance is to set max_guidance_weight = num_steps. I'm fairly convinced that is the correct thing to do, enough so that I would recommend forcing it to default to this unless the user explicitly provides max_guidance_weight. I also note that you have already defaulted it to 10, which is not the value of 5 that the original RTC paper suggests (which is fine IMO, but it shows there are already deviations from the original specification, so we might as well ground it in my article's guidance).
Got it. I used 10 because by default `num_inference_steps: int = 10`, so 5 won't work well for LeRobot policies with the default config. Probably PI used 5 when testing RTC.
I made the change, then reverted the logic: pi0.x passes num steps as a parameter to `predict_action_chunk`.
```python
use_soare_optimization: bool = True
variance_clipping_factor: float = 0.2
```
I think you meant `sigma_d: float = 1.0` instead of `variance_clipping_factor` (or, if you prefer a more descriptive parameter, `prior_variance`, but be careful because "variance" is `sigma_d ** 2`). That parameter is used in all cases: when it is 1.0 you are not using the improvement suggested in my article, otherwise you are. You can therefore drop `use_soare_optimization` altogether, and don't need to guard any code with `if use_soare_optimization`.
Note, as per my article, it's `max_guidance_weight` that you would set equal to `num_flow_matching_steps`. The RTC paper doesn't give guidance for that, and just suggests setting it to 5.0.
```python
tau_tensor = torch.as_tensor(tau)
squared_one_minus_tau = (1 - tau_tensor) ** 2
inv_r2 = (squared_one_minus_tau + tau_tensor**2) / (squared_one_minus_tau)
if self.config.use_soare_optimization:
```
Based on my comments above, this whole block won't need the if/else guard for `use_soare_optimization`. You just need:

```python
inv_r2 = (squared_one_minus_tau + tau_tensor ** 2 * sigma_d ** 2) / (squared_one_minus_tau * sigma_d ** 2)
```

or, if you are going to call it `prior_variance` instead, since that's already σ², it would be:

```python
inv_r2 = (squared_one_minus_tau + tau_tensor ** 2 * prior_variance) / (squared_one_minus_tau * prior_variance)
```
Then setting sigma_d = 1.0 reverts to the original RTC implementation.
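A quick numerical check of this claim (a torch-free sketch using plain floats): the single expression with `sigma_d` factors reduces exactly to the original `inv_r2` when `sigma_d = 1.0`.

```python
def inv_r2_unified(tau: float, sigma_d: float = 1.0) -> float:
    # Single expression covering both cases.
    one_minus_tau_sq = (1 - tau) ** 2
    return (one_minus_tau_sq + tau**2 * sigma_d**2) / (one_minus_tau_sq * sigma_d**2)

def inv_r2_original(tau: float) -> float:
    # The original RTC formula (no sigma_d).
    one_minus_tau_sq = (1 - tau) ** 2
    return (one_minus_tau_sq + tau**2) / one_minus_tau_sq

# With sigma_d = 1.0 the unified formula matches the original exactly.
for tau in (0.1, 0.5, 0.9):
    assert abs(inv_r2_unified(tau, 1.0) - inv_r2_original(tau)) < 1e-12
```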
Btw this is just eqn 8 in my article
Fixed
**`inference_delay`**: How many timesteps of inference latency your system has. This is passed to `predict_action_chunk()` rather than the config, since it may vary at runtime.
**`sigma_d`**: The variance of the prior distribution. This is a hyperparameter that can be tuned to balance the smoothness of the transitions and the reactivity of the policy.
nit: sigma is not "variance" but rather "standard deviation".
**`max_guidance_weight`**: How strongly to enforce consistency with the previous chunk. This is a hyperparameter that can be tuned to balance the smoothness of the transitions and the reactivity of the policy. For 10-step flow matching (SmolVLA, Pi0, Pi0.5), 10.0 is an optimal value.
**`max_guidance_weight`**: How strongly to enforce consistency with the previous chunk. This is a hyperparameter that can be tuned to balance the smoothness of the transitions and the reactivity of the policy.
This is a clipping parameter, not the actual guidance weight. You might modify the sentence to include something like: "a clipping parameter on the computed guidance weight; ensures stability."
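The distinction can be sketched as follows (illustrative names only; the actual guidance-weight formula lives in the RTC implementation and is not shown here):

```python
def clip_guidance_weight(raw_weight: float, max_guidance_weight: float) -> float:
    # max_guidance_weight clips the computed guidance weight; it is not
    # the weight itself. Clipping keeps the update stable when the raw
    # weight spikes (e.g. as the denoising time approaches 1).
    return min(raw_weight, max_guidance_weight)

assert clip_guidance_weight(25.0, 10.0) == 10.0  # large raw weight gets clipped
assert clip_guidance_weight(3.0, 10.0) == 3.0    # small raw weight passes through
```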
examples/rtc/eval_dataset.py (outdated)
```python
    },
)

sample_correlation_shift: int | None = field(
```
Note that a good value here is something less than the chunk size. For example, you might want to simulate a chunk size of 50 where you begin inference for the next chunk at the 25th step.
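The overlap being described can be sketched numerically (hypothetical numbers from the comment: chunk size 50, next-chunk inference started at step 25):

```python
chunk_size = 50
sample_correlation_shift = 25  # a good value is less than the chunk size

# Steps of the previous chunk still executing while the next chunk is inferred.
overlap = chunk_size - sample_correlation_shift

assert 0 < sample_correlation_shift < chunk_size
assert overlap == 25
```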
What this does
Implements the suggestion from https://alexander-soare.github.io/robotics/2025/08/05/smooth-as-butter-robot-policies.html.
Two important changes:
- `max_guidance` param: use the number of flow matching steps as the basic clipping parameter.
- `sigma_d`: the default value is 1.0, so the default behavior is equal to the original paper, but library users can adjust this value.

How it was tested
- `pytest tests/policies/pi0_pi05`
- `pytest tests/policies/smolvla`
- `pytest tests/policies/rtc`

Some reports after test script runs
Check - https://huggingface.co/spaces/helper2424/rtc_tests
SmolVLA; n_step=2; sigma_d=0.1


SmolVLA; n_step=5; sigma_d=1.0


SmolVLA; n_step=50; sigma_d=0.2


Pi0.5; n_steps=5, sigma=0.8


Pi0.5; n_steps=10, sigma=0.2

