WIP - Make FreeRTOS a first-class citizen #3063

Draft: wants to merge 16 commits into base: master

earlephilhower (Owner)

With the expanded performance and memory of the Pico 2, having a full operating system with threads could really improve developer life and get the best possible algorithm performance.

Start by moving FreeRTOS checks from a global bool set by weak function linkage checks to a compiler definition. This can allow the build process to differ between bare metal and FreeRTOS builds (i.e. async_context implementations) and save a few bytes of program space.
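A rough sketch of the difference, not the code in this PR. The macro `SKETCH_USE_FREERTOS` and the flag `sketch_rtos_detected` are hypothetical names made up for illustration; the PR text does not name the real definition.

```cpp
#include <string>

// Old style (simplified): a global flag filled in at startup and tested at run time.
// Both branches are always compiled and linked into the image.
bool sketch_rtos_detected = false;  // hypothetical name

std::string context_runtime() {
    return sketch_rtos_detected ? "freertos" : "baremetal";
}

// New style: a definition passed by the build (e.g. -DSKETCH_USE_FREERTOS).
// Only one branch is ever compiled, so the build itself can differ per mode.
#ifdef SKETCH_USE_FREERTOS
std::string context_compiletime() { return "freertos"; }
#else
std::string context_compiletime() { return "baremetal"; }
#endif
```

With the definition absent, the FreeRTOS branch never reaches the compiler, which is where the small program-space saving would come from.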

@earlephilhower force-pushed the freesanity branch 2 times, most recently from 0e85745 to 3aabdc9 on August 8, 2025 17:26

Implement the LWIP task (the only task allowed to make actual LWIP calls)
using a work queue accessible from any other task/core.
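The single-owner work queue pattern could look roughly like the sketch below. It uses standard C++ threads in place of FreeRTOS tasks and queues so it can run on a desktop; `SketchWorker` and `call` are hypothetical names, not the PR's API.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the LWIP task: the only thread that runs queued work.
class SketchWorker {
public:
    SketchWorker() : thread_([this] { run(); }) {}
    ~SketchWorker() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_one();
        thread_.join();
    }

    // Callable from any thread: queue the call, then block until the worker ran it.
    int call(std::function<int()> fn) {
        std::packaged_task<int()> task(std::move(fn));
        std::future<int> result = task.get_future();
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(std::move(task));
        }
        cv_.notify_one();
        return result.get();
    }

    std::thread::id id() const { return thread_.get_id(); }

private:
    void run() {
        for (;;) {
            std::packaged_task<int()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                if (q_.empty()) return;  // stop requested and nothing left to do
                task = std::move(q_.front());
                q_.pop();
            }
            task();  // executes on the worker thread only
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<int()>> q_;
    bool stop_ = false;
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

Any thread can do `worker.call([] { return 0; })` and the lambda body runs on the worker thread while the caller sleeps, which is the property the LWIP task relies on.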

Move FreeRTOS into the main cores/rp2040 directory to allow for easier
core usage.

Dynamically build the proper async_context for raw or FreeRTOS in
the IDE, not at libpico time.

The CYW43 driver can come up and start processing data.  Unfortunately, when
it needs to send data out through LWIP we have a deadlock.

There is a CYW43 async_context semaphore owned by the calling task.

In this case, the task is the periodic callback in "asyn_con" (i.e. the
background timer).

1. When the timeout hits, the async context task is woken up and
the first thing it does is take the async_context semaphore.

2. During background processing (sys_check_timeouts) an LWIP call
is made.

3. The LWIP call sends a message to the LWIP task and wakes it up.
The ASYN_CON task is now suspended waiting for the LWIP task done
notification.  It holds the ASYN_CON semaphore while asleep.

4. LWIP does a bunch of stuff and tries to do an ethernet_output
to send bits over the wire (i.e. accept DHCP or something).

5. Eventually LWIP's netif call stack goes to the CYW43 object
(while still in the LWIP task), which tries to acquire the
CYW43 semaphore and fails (because it's already held by the async
task that's sleeping).

6. There is no 6, it's deadlocked at this point.
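The steps above can be reproduced in miniature with standard C++ threads. This is a sketch under assumptions: `std::timed_mutex` stands in for the CYW43 async_context semaphore, `std::async` stands in for the message to the LWIP task, and a 50 ms timeout replaces the real blocking take so the failure is observable instead of hanging. All names are hypothetical.

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Returns true if the "LWIP" thread managed to take the driver lock.
bool sketch_lwip_can_lock(bool caller_holds_lock) {
    std::timed_mutex driver_lock;
    if (caller_holds_lock) driver_lock.lock();  // step 1: async task takes the semaphore

    // Steps 2-3: the caller hands work to the LWIP thread and waits for the result.
    std::future<bool> done = std::async(std::launch::async, [&] {
        // Steps 4-5: the LWIP thread reaches the driver and needs the same lock.
        if (!driver_lock.try_lock_for(std::chrono::milliseconds(50))) return false;
        driver_lock.unlock();
        return true;
    });
    bool ok = done.get();  // the caller still holds the lock while it waits here

    if (caller_holds_lock) driver_lock.unlock();
    return ok;
}
```

With `caller_holds_lock` set, the second thread cannot get the lock until the first finishes waiting on it, and the first never finishes waiting without the second; with an untimed take that is the hang described in step 6.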

Deadlocks/pauses seem to be unavoidable when running Ethernet packet
receive outside of the LWIP thread.

Add a callback queue message to the LWIP thread to allow for it to
run whatever ethernet processing needed.  These messages still get stuffed
by another simple periodic task.

IRQs will send a work queue callback instead of actually doing
anything.  Disable all IRQs when the work is noted, and then re-enable
them when done.

The IRQ callback routine ended up passing in a stack variable address.  For main
app code this is legal because the app blocks until the LWIP call returns.  But
an IRQ returns immediately (it can't block) and the stack address we
passed in will no longer be valid.  Use a dumb static (non-stack) variable for now.
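A small sketch of why the storage class matters when the poster cannot block. `SketchRequest`, `sketch_irq_post`, and `sketch_drain` are hypothetical names; a plain `std::queue` stands in for the LWIP work queue.

```cpp
#include <queue>

struct SketchRequest { int pin; bool handled; };  // hypothetical message payload

static std::queue<SketchRequest*> sketch_queue;   // stands in for the LWIP work queue

// IRQ-style poster: it must return immediately, so the request cannot live on its stack.
void sketch_irq_post(int pin) {
    static SketchRequest req;  // static storage outlives this function call
    req = SketchRequest{pin, false};
    sketch_queue.push(&req);
    // A local `SketchRequest bad{pin, false};` pushed by address would dangle
    // as soon as this function returned.
}

// Worker side: runs later, long after the poster has returned.
int sketch_drain() {
    int last_pin = -1;
    while (!sketch_queue.empty()) {
        SketchRequest* r = sketch_queue.front();
        sketch_queue.pop();
        r->handled = true;
        last_pin = r->pin;
    }
    return last_pin;
}
```

A single static slot would be overwritten if a second IRQ posted before the worker drained the first, so this only holds up when further IRQs are held off while work is pending.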

W5100 is now running with multiple AdvancedWebServer clients in parallel with
a WiFiClient MOTD process on core 1.

Rewriting the CYW43 driver to work with this is TBD.

Trying not to end up on thedailywtf.com

It is possible under heavy load with multithreading that a connection gets a
tcp_abort just before tcp_accept will be called.  When we abort, we clear the
this pointer, and so when we try and recover our object we crash.

Check for aborted connections (and ignore them) in WiFiServer, as we
already do in WiFiClient.
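The guard amounts to a null check before recovering the object. The sketch below uses a made-up callback shape rather than the real lwIP accept signature, and assumes the abort path clears the callback argument to `nullptr`; `SketchServer` and `sketch_on_accept` are hypothetical.

```cpp
struct SketchServer { int accepted = 0; };  // hypothetical stand-in for WiFiServer

// Hypothetical accept-style callback. `arg` carries the server's `this` pointer,
// which the abort path is assumed to have cleared to nullptr.
// Returns 0 when the connection was accepted, -1 when it was ignored.
int sketch_on_accept(void* arg) {
    if (arg == nullptr) return -1;  // connection was aborted first: ignore it
    auto* server = static_cast<SketchServer*>(arg);
    server->accepted++;
    return 0;
}
```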