Summary
ESP32-S3 CSI nodes crash with LoadProhibited in the WiFi driver's interrupt handler when promiscuous mode captures frames at high rates. The crash is inside Espressif's closed-source binary blob and cannot be fixed at the application level without reducing the WiFi hardware interrupt rate.
Crash signature
Guru Meditation Error: Core 0 panic'ed (LoadProhibited)
EXCVADDR: 0x00000004
Decoded backtrace (xtensa-esp32s3-elf-addr2line):
_xt_lowint1 ← WiFi hardware interrupt
wDev_ProcessFiq ← WiFi driver FIQ (closed-source blob)
spi_flash_restore_cache ← SPI flash cache restore
cache_ll_l1_resume_icache ← L1 ICache resume ← NULL deref here
Root cause
The WiFi MAC hardware generates a Level 1 interrupt for every frame captured by the promiscuous filter. wDev_ProcessFiq handles these interrupts and at some point calls spi_flash_restore_cache, which calls cache_ll_l1_resume_icache. At high interrupt rates (crashes were observed from ~50 Hz upward, while ~10 Hz was stable), this function dereferences a NULL pointer. EXCVADDR 0x00000004 is consistent with a load from offset 4 of a NULL base pointer. The likely cause is that the SPI flash cache is left in an inconsistent state by a concurrent flash operation (display QSPI, NVS write, etc.).
The crash is not in application code. It's inside the ESP-IDF WiFi binary blob (libpp.a).
Controlled experiments
All tests on ESP32-S3 (QFN56 rev v0.2, 8MB PSRAM, MAC 80:b5:4e:c1:be:b8), ESP-IDF v5.4, WiFi SSID Spiridonovi1 ch2.
| # | Build | Display | Promiscuous filter | Effective CSI rate | Crash point | Result |
|---|---|---|---|---|---|---|
| 1 | Our build | ON | MGMT+DATA | ~500 Hz | ~2400 cb (~70s) | Crash |
| 2 | Our build | OFF | MGMT+DATA | ~500 Hz | ~5300 cb (~90s) | Crash (slower) |
| 3 | Our build | OFF | MGMT-only | ~10 Hz | 2700+ cb (4.7 min) | Stable |
| 4 | Our build | ON | MGMT-only | ~10 Hz | 2400+ cb (4 min+) | Stable |
| 5 | Our build + 50Hz callback gate | ON | MGMT+DATA | ~50 Hz | ~1300 cb | Crash |
| 6 | Ruv's v0.6.1-esp32 release | OFF | MGMT+DATA | ~100 Hz | ~8200 cb | Crash (19x in 2 min) |
Key findings:
- Display OFF roughly doubles the callback count before the crash (~2400 vs ~5300 cb) but doesn't prevent it (test 2 vs 1, test 6)
- Callback-level rate limiting does NOT help because the WiFi HW interrupt fires for every captured frame regardless of callback execution (test 5)
- MGMT-only filter is the only mitigation that worked in these tests, because it reduces the hardware interrupt rate itself (tests 3, 4)
- v0.6.1-esp32 release crashes with the same bug (test 6)
What doesn't work
Callback rate limiting
A 50 Hz early gate in wifi_csi_callback that returns immediately for excess frames does not help. The crash occurs in wDev_ProcessFiq which runs before the callback is invoked. Reducing callback execution time has no effect on interrupt rate.
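For illustration, here is a minimal host-side sketch of such a gate and of why it cannot help. It is not the code from our build: csi_gate_allow, csi_gate_simulate, and csi_gate_stats_t are hypothetical names, frames are assumed to arrive evenly spaced, and on target the timestamp would presumably come from esp_timer_get_time(). The point it shows is that the gate is evaluated once per captured frame, so the number of invocations (and therefore hardware interrupts) stays the same; only the work done afterwards shrinks.

```c
#include <stdbool.h>
#include <stdint.h>

#define CSI_GATE_MIN_INTERVAL_US 20000 /* 1 s / 50 Hz */

/* Hypothetical helper: true if this frame should be processed.
 * last_pass_us < 0 means no frame has passed the gate yet. */
static bool csi_gate_allow(int64_t now_us, int64_t *last_pass_us)
{
    if (*last_pass_us >= 0 &&
        now_us - *last_pass_us < CSI_GATE_MIN_INTERVAL_US) {
        return false; /* excess frame: return early */
    }
    *last_pass_us = now_us;
    return true;
}

typedef struct {
    uint32_t invoked;   /* one per captured frame, i.e. one per HW interrupt */
    uint32_t processed; /* frames that got past the gate */
} csi_gate_stats_t;

/* Feed evenly spaced frames through the gate (frame_hz must be non-zero). */
static csi_gate_stats_t csi_gate_simulate(uint32_t frame_hz, uint32_t duration_s)
{
    csi_gate_stats_t stats = {0, 0};
    int64_t last_pass_us = -1;
    const int64_t period_us = 1000000 / frame_hz;
    const int64_t end_us = (int64_t)duration_s * 1000000;

    for (int64_t t = 0; t < end_us; t += period_us) {
        stats.invoked++; /* the interrupt has already fired by this point */
        if (csi_gate_allow(t, &last_pass_us)) {
            stats.processed++;
        }
    }
    return stats;
}
```

Calling csi_gate_simulate(500, 1) models one second of MGMT+DATA capture: the gate passes 50 frames but is still invoked 500 times, which matches the observation in test 5 that the crash persists.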
SPIRAM XIP (CONFIG_SPIRAM_FETCH_INSTRUCTIONS + CONFIG_SPIRAM_RODATA)
In theory this eliminates the SPI flash cache race entirely (instructions served from PSRAM, cache never suspended). In practice, manual sdkconfig edits with CONFIG_SPIRAM_MODE_QUAD=y produced an IllegalInstruction crash-loop from boot — likely because this board's PSRAM is Octal, not Quad. Needs proper idf.py menuconfig validation with the correct PSRAM mode for this hardware.
Additional IRAM options
CONFIG_ESP_WIFI_EXTRA_IRAM_OPT=y and CONFIG_ESP_WIFI_SLP_IRAM_OPT=y were tested as part of the SPIRAM build but couldn't be isolated due to the PSRAM misconfiguration.
What works
MGMT-only promiscuous filter
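wifi_promiscuous_filter_t filt = {
    .filter_mask = WIFI_PROMIS_FILTER_MASK_MGMT,  // was MGMT | DATA
};
// Apply the narrower filter to the WiFi driver
ESP_ERROR_CHECK(esp_wifi_set_promiscuous_filter(&filt));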
Reduces WiFi hardware interrupt rate from ~100-500 Hz to ~10 Hz (beacon/probe frames only). Tested stable for 4+ minutes with display ON, zero crashes.
Trade-off: CSI data rate drops from ~100-500 frames/sec to ~10 frames/sec. However, 10 Hz is sufficient for presence detection, breathing rate (10-30 BPM), and heart rate detection. Edge processing adaptive calibration completes successfully at this rate.
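The sufficiency claim can be sanity-checked with the Nyquist limit: a signal sampled at f Hz can resolve periodic components strictly below f/2 Hz, i.e. below 30·f BPM. The sketch below is only that arithmetic; max_resolvable_bpm and rate_covers_bpm are hypothetical helper names, and it assumes uniformly spaced samples, which beacon-driven CSI only approximates.

```c
/* Highest periodic rate, in BPM, resolvable at a given sampling rate
 * (Nyquist limit: half the sampling rate, converted from Hz to per-minute). */
static double max_resolvable_bpm(double sample_hz)
{
    return sample_hz / 2.0 * 60.0;
}

/* Non-zero if a periodic signal at `bpm` lies strictly below the Nyquist limit. */
static int rate_covers_bpm(double sample_hz, double bpm)
{
    return bpm < max_resolvable_bpm(sample_hz);
}
```

At 10 Hz the limit is 300 BPM, so the 10-30 BPM breathing band quoted above sits well inside it. This says nothing about signal-to-noise at the lower rate, only that the sampling rate itself is not the constraint.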
Proper fix path (not yet tested)
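SPIRAM XIP is the most likely platform-level fix, though it is unverified on this board. With CONFIG_SPIRAM_FETCH_INSTRUCTIONS=y and CONFIG_SPIRAM_RODATA=y both set, instructions and read-only data are served from PSRAM, so the SPI flash cache should not need to be suspended during flash operations, which would remove the race. This requires: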
- Determine correct PSRAM mode for this board (Quad vs Octal) — check Waveshare ESP32-S3 datasheet
- Configure via idf.py menuconfig (not manual sdkconfig edits) to get all dependencies right
- Test with full MGMT+DATA promiscuous at 500 Hz
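Because the earlier manual sdkconfig edits failed silently until boot, it may help to have the firmware report which of these options the build actually picked up. The following is a sketch under assumptions: csi_xip_enabled, csi_psram_mode, and the CSI_* macros are hypothetical names, and it relies only on the usual ESP-IDF behaviour that an option set to y appears as a defined CONFIG_* macro in the generated sdkconfig.h. On a host build without ESP-IDF it simply falls back to "not enabled".

```c
#ifdef ESP_PLATFORM
#include "sdkconfig.h" /* generated by the ESP-IDF build; absent on a host build */
#endif

/* XIP from PSRAM needs both options, per the requirement listed above. */
#if defined(CONFIG_SPIRAM_FETCH_INSTRUCTIONS) && defined(CONFIG_SPIRAM_RODATA)
#define CSI_XIP_FROM_PSRAM 1
#else
#define CSI_XIP_FROM_PSRAM 0
#endif

/* Which PSRAM interface mode the build was configured for. */
#if defined(CONFIG_SPIRAM_MODE_OCT)
#define CSI_PSRAM_MODE "octal"
#elif defined(CONFIG_SPIRAM_MODE_QUAD)
#define CSI_PSRAM_MODE "quad"
#else
#define CSI_PSRAM_MODE "none"
#endif

/* Hypothetical helpers, e.g. for a single log line at boot. */
static int csi_xip_enabled(void)
{
    return CSI_XIP_FROM_PSRAM;
}

static const char *csi_psram_mode(void)
{
    return CSI_PSRAM_MODE;
}
```

Logging both values once at startup would make it obvious whether a given crash log came from a build that really had XIP enabled and which PSRAM mode it assumed.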
Also affected: node_id clobber
Separately from the crash, the g_nvs_config.node_id clobber (#390) is confirmed on our hardware. Ruv's v0.6.1 late capture at csi_collector_init() works on some boots but not all — we proved wifi_init_sta() corrupts the struct before the capture runs. Our early capture (csi_collector_set_node_id() called before wifi_init_sta()) is the reliable fix. See PR #393 comments.
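The early-capture idea can be sketched as follows. This is an illustration, not the code from PR #393: the uint8_t type, the signature of csi_collector_set_node_id(), and the csi_collector_get_node_id() getter are all assumptions. The principle is to copy the value into private static storage before wifi_init_sta() runs, and to read only that copy afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Private copy that later writes to the shared config struct cannot reach. */
static uint8_t s_node_id;
static bool s_node_id_captured;

/* Call before wifi_init_sta(), while the config value is still intact.
 * (Signature assumed for illustration.) */
void csi_collector_set_node_id(uint8_t node_id)
{
    s_node_id = node_id;
    s_node_id_captured = true;
}

/* Hypothetical getter: returns the captured value, or `fallback`
 * if nothing was captured yet. */
uint8_t csi_collector_get_node_id(uint8_t fallback)
{
    return s_node_id_captured ? s_node_id : fallback;
}
```

With this shape, code that builds CSI packets would use the getter rather than reading g_nvs_config.node_id directly, so a later clobber of the struct no longer affects the reported ID.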
Hardware
- Board: Waveshare ESP32-S3 AMOLED 1.8" (SH8601 368x448 QSPI display)
- Chip: ESP32-S3 (QFN56) rev v0.2, 8MB PSRAM (AP_3v3), 16MB flash (Boya)
- ESP-IDF: v5.4
- WiFi: Spiridonovi1 ch2, WPA2-PSK
Refs