
@holly-hacker holly-hacker commented Oct 24, 2025

This PR adds a function on embassy_rp::pio::StateMachineTx to allow pushing continuous data to a PIO state machine through a ping-pong/double buffering mechanism, allowing one buffer to be filled while the other is being sent. This method uses 2 DMA channels with CHAIN_TO allowing no downtime between transfers (assuming the user fills their buffers in time using the given callback).

Example usage to generate a 440Hz sine wave over 8 digital pins at a desired sample rate:

const SAMPLE_RATE: u32 = 16_000; // 16kHz

let tx = pio.sm0.tx();

let mut data_1 = [0x0u8; 128];
let mut data_2 = [0x0u8; 128];

let mut sample_index = 0usize;
tx.dma_push_ping_pong(
    p.DMA_CH0.reborrow(),
    p.DMA_CH1.reborrow(),
    &mut data_1,
    &mut data_2,
    |buf| {
        info!("In start of fill callback, index={}", sample_index);
        if sample_index > 100_000 {
            return core::ops::ControlFlow::Break(());
        }

        for b in buf.iter_mut() {
            let time = sample_index as f32 / SAMPLE_RATE as f32;
            let wave = fast_sin(time * 440. * core::f32::consts::PI * 2.);
            *b = ((wave + 1.) / 2. * 256.) as u8;

            sample_index += 1;
        }

        core::ops::ControlFlow::Continue(())
    },
)
.await;

// push a zero to reset the pin state
tx.dma_push(p.DMA_CH0, &[0u8; 1], false).await;

At 16 kHz, each 128-byte buffer holds ~8 ms of samples, which looks clean (or as clean as you can get with a messy 8-bit R-2R "DAC" 😄) on the oscilloscope:
[Image: oscilloscope capture]
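
The buffer duration quoted above follows from the buffer length and the sample rate. A minimal host-side sketch of that arithmetic, along with the sample scaling used in the callback, is given here; the helper names `buffer_duration_ms` and `sample_to_u8` are hypothetical and not part of the PR.

```rust
/// Time covered by one buffer in milliseconds, assuming one byte per sample.
fn buffer_duration_ms(buffer_len: usize, sample_rate_hz: u32) -> f32 {
    buffer_len as f32 / sample_rate_hz as f32 * 1000.0
}

/// Maps a sine value in [-1.0, 1.0] to a byte, using the same expression as
/// the example callback. A float-to-integer `as` cast saturates in Rust, so
/// an input of exactly 1.0 (256.0 before the cast) becomes 255.
fn sample_to_u8(wave: f32) -> u8 {
    ((wave + 1.0) / 2.0 * 256.0) as u8
}
```

With the values from the example, `buffer_duration_ms(128, 16_000)` comes out at about 8 ms, which is the deadline the fill callback has to meet for each buffer.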


Some open questions:

  • I haven't added any compiler fences like the ones I noticed in dma_push. I'm not knowledgeable enough about compiler internals to know how and when the compiler reorders instructions, so I don't know where to insert them. Feedback is appreciated, but the current implementation seems to work.
  • I haven't looked at task priority yet, so in a program with many concurrent tasks, another task may take the CPU time needed to fill the buffer. I'm not sure if this is something that should be handled on embassy's side.
  • Users can't send partial buffers, which may be desired when ending a transfer. I could fix this by changing ControlFlow to ControlFlow::<(), Option<u32>> to let the user specify how much of their buffer they want to send, but that makes the API a bit ugly. Not sure what to do here, open to feedback.
  • I haven't touched any code in embassy_rp::dma, which seems to contain some code duplicated in embassy_rp::pio::StateMachineTx. I'm not sure how the maintainers want to approach this.
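
To make the partial-buffer idea in the third bullet concrete, here is a host-side sketch of how a callback returning `ControlFlow<(), Option<u32>>` could be interpreted. It is only an illustration under stated assumptions: `run_ping_pong` is a hypothetical stand-in that appends to a `Vec` instead of starting DMA transfers, and the meaning given to `Continue(None)` (send the whole buffer) and `Continue(Some(n))` (send the first `n` bytes) is one possible reading, not the PR's API.

```rust
use core::ops::ControlFlow;

/// Hypothetical model of the ping-pong loop. "Sending" a buffer means
/// appending it to `sink`; no hardware is involved.
fn run_ping_pong<F>(buf_a: &mut [u8], buf_b: &mut [u8], sink: &mut Vec<u8>, mut fill: F)
where
    F: FnMut(&mut [u8]) -> ControlFlow<(), Option<u32>>,
{
    let mut use_a = true;
    loop {
        // Alternate between the two buffers on every iteration.
        let buf: &mut [u8] = if use_a { &mut *buf_a } else { &mut *buf_b };
        match fill(&mut *buf) {
            // The callback asked to stop; nothing more is sent.
            ControlFlow::Break(()) => break,
            // `None` sends the whole buffer, `Some(n)` only the first `n`
            // bytes (clamped to the buffer length).
            ControlFlow::Continue(len) => {
                let n = len.map_or(buf.len(), |n| (n as usize).min(buf.len()));
                sink.extend_from_slice(&buf[..n]);
            }
        }
        use_a = !use_a;
    }
}
```

In this reading, a caller would return `Continue(Some(n))` for its last partially filled buffer and `Break(())` on the following call.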

I haven't added any examples yet. If desired, I can do that when the general design of this function is approved.

I'm pretty new to embedded dev in general, so I may have missed something obvious. Thorough code review would be appreciated.

Fixes #4190

@felipebalbi

Maybe add your example above as an actual example in the repository (i.e. as part of examples/rp235x).

@holly-hacker
Author

I've added the rp235x example and tested it locally to ensure it works, but I haven't tested with the rp2040 yet because I'd prefer to get some feedback on this PR first.

@holly-hacker holly-hacker force-pushed the pio-dma-ping-pong branch 2 times, most recently from d30909d to c1e89aa Compare November 28, 2025 20:24
Comment on lines +600 to +602
if let ControlFlow::Break(()) = fill_buffer_callback(data1) {
    break;
}

I'm not sure this is sound. If the callback is too slow and ch1 is triggered while it's still executing, we're holding onto a mutable borrow while another immutable access (by the DMA) is happening.

Perhaps the channel should be disabled for the duration of this callback? Or the function should be unsafe?

If you went with marking the function as unsafe, I'm not sure how you could even guarantee soundness without analyzing the whole system, as other executing tasks could block the CPU for long enough to make the callback miss its deadline.

}

trace!("Waiting for pong transfer");
Transfer::new(ch2.reborrow()).await;

How does this behave if the callback runs for too long? Can it potentially hang forever?
Can it immediately fall through even though ch2 is currently running (because we missed it finishing)?

break;
}

trace!("Waiting for pong transfer");

Technically you should have a fence here, before the other transfer is started. Though I'm quite sure that an await point acts as a fence anyway, so perhaps it's not an issue.

If you disabled the channel for the duration of the callback, then you do need a fence before re-enabling the channel. Otherwise the compiler could reorder some writes from inside the callback to after the re-enable.
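
A minimal sketch of the ordering described in this comment, assuming the channel is paused around the callback. Everything except `compiler_fence` is hypothetical: a plain `bool` written with `write_volatile` stands in for the channel's enable bit, and `refill_then_enable` is not a function from the PR.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical model: `enabled` stands in for a DMA channel's enable bit.
fn refill_then_enable(buf: &mut [u8], enabled: &mut bool, value: u8) {
    // Pause the (modelled) channel so nothing reads the buffer while it is
    // being rewritten.
    unsafe { core::ptr::write_volatile(enabled, false) };

    // The work the fill callback would do.
    buf.fill(value);

    // Ask the compiler not to move the buffer writes above past this point,
    // so they are emitted before the channel is re-enabled.
    compiler_fence(Ordering::SeqCst);

    // Re-enable the (modelled) channel.
    unsafe { core::ptr::write_volatile(enabled, true) };
}
```

On real hardware the enable step would be a register write rather than a `bool`, and whether a compiler fence alone is sufficient depends on the target's memory system; that question is left open here.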


Development

Successfully merging this pull request may close these issues.

RP2040 DMA: Enhance PIO RX DMA support for continuous hardware-chained ping-pong buffering via CHAIN_TO

3 participants