Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PIO I2C race condition #617

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from
Open

Conversation

dteal
Copy link

@dteal dteal commented Feb 21, 2025

This is a (hacky) fix for a race condition bug in the PIO I2C code that breaks I2C reads.

In many cases when reading data over the I2C bus via PIO, the correct data is received over the wire (as seen on a logic analyzer) but is incorrectly reported by the PIO state machine, with the result that I2C reads don't work. This problem has already been observed in the following threads and issue:

[1] https://forums.raspberrypi.com/viewtopic.php?t=356883
[2] https://forums.raspberrypi.com/viewtopic.php?t=340111
[3] #168

Testing suggests the (incorrect) bytes reported by the PIO are the expected I2C byte stream but shifted by 7 (!) bits. For example, the thread [2] reported the following examples of incorrectly read byte sequences. The I2C bus sees an 8 bytes for the address + read bit, then 32 bytes of data:

Data on I2C Bus        Data in receive buffer
0xED 0xFF 0x8B 0x00    0x77 0xDB 0xFF 0x16
0xD5 0xFF 0x8B 0x00    0x77 0xAB 0xFF 0x16
0xE1 0xFF 0x8B 0x00    0x77 0xC3 0xFF 0x16
0xD5 0xFF 0x7B 0x00    0x77 0xAB 0xFE 0xF6

This makes the following sequences in binary form, where we can see the PIO data is offset 7 bits from what is visible on the I2C bus:

bus 1:        11101101111111111000101100000000
pio 1: 01110111110110111111111100010110
bus 2:        11010101111111111000101100000000
pio 2: 01110111101010111111111100010110
bus 3:        11100001111111111000101100000000
pio 3: 01110111110000111111111100010110
bus 4:        11010101111111110111101100000000
pio 4: 01110111101010111111111011110110

I have replicated all this with a BH1750 light sensor and FDC2112 capacitance sensor on both a RP2040 and RP2350, and I can further confirm the mysterious first 7 bits reported by the PIO appear to be the last 7 bits of the (address + read bit) byte sent over the I2C bus prior to the expected data. The PIO is correctly reading the bus but has somehow shifted the result by 7 bits.

My best explanation for this is as follows:

First, I suspect the PIO program itself works as expected. Every time it reads or writes a byte on the bus, it puts 8 bits in the ISR (note this happens for both reads and writes, which the PIO treats identically). Autopush should be enabled, moving 8 bits at a time into the RX FIFO. To write a byte over I2C, the main core should put a byte into the state machine TX FIFO, wait for the state machine, then read and discard a byte from the RX FIFO. To read a byte, the main core should but 0xff into the TX FIFO, then read the result from the RX FIFO.

However, the C program running on the main core messes with this. In pio_i2c.c, the function pio_i2c_write_blocking() first calls pio_i2c_rx_enable(), which disables the PIO state machine autopush via direct register manipulation. This is done so that the PIO doesn't put anything on the RX FIFO, so bytes don't have to be read then discarded. The function pio_i2c_write_blocking() re-enables autopush.

Crucially, enabling/disabling autopush is not synced to the PIO state machine clock.

I haven't been able to figure out exactly when things break (I don't know why the shift seems is reliably 7 bits instead of a random number depending on the setup), but the following is a possibility:

  • Autopush is disabled.
  • 1 byte is placed in the TX FIFO, then put onto the bus. The state machine ISR gets 8 bits and the input shift counter is increased to 8. Nothing else happens because autopush is disabled.
  • 1 more byte is placed in the TX FIFO. However, after the first bit is put on the bus (making the shift counter 9), autopush is re-enabled and immediately PUSHes the ISR to the RX FIFO and resets the input shift counter. But then the remaining 7 bits are sent and put in the ISR.
  • Every subsequent byte in the PIO state machine has its first bit autopushed with the last 7 bits of the previous byte.

Even though the exact race condition hasn't been pinned down, there's enough evidence to create solutions that seem to work:

  • A. Remove pio_i2c_rx_enable() and have pio_i2c_write_blocking() read and discard the resulting RX FIFO dummy bytes. I confirmed this seemed to work in at least one case. This is similar to the solution noted in [2].
  • B. Ensure the PIO state machine bytes are always aligned by continually resetting the ISR. This has the advantage that any other race condition bugs in the code will at most make a single byte on the I2C bus incorrect; the next bytes will be automatically re-aligned. This is the solution proposed in [1] and [3].
  • C. More thoroughly rewrite pio_i2c.c with more caution for potential race conditions.

I chose to implement solution B by adding a mov isr, null instruction to the PIO assembly since this protects against all race conditions. Since this is a patch rather than fixing the ultimate root issue, it is possible there exist some setups where there is still edge case broken behavior, but I was not able to find any. Further implementing solution A might be wise in the future because it probably fixes whatever the root issue is. Solution C might be helpful in the long term as there are some other suspicious bits of code (e.g., emptying the RX FIFO buffer via while (!pio_sm_is_rx_fifo_empty(pio, sm)){(void)pio_i2c_get(pio, sm);} at the start of pio_i2c_read_blocking() seems like it should be unnecessary).

@peterharperuk peterharperuk added this to the 2.1.2 milestone Feb 21, 2025
@peterharperuk
Copy link
Contributor

Thanks for the analysis. I'll see if I can reproduce the problem with a BH1750.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants