Firmware: Add an additional layer of DMA-assisted sample buffering by martinling · Pull Request #1601 · greatscottgadgets/hackrf

martinling · 2025-10-20T15:31:58Z

This PR attempts to implement the buffer scheme described in #1363 (comment).

The M0 SGPIO code continues to use the existing 32KB sample buffer, which has special placement.

An additional 32KB USB buffer is placed in ram_local1, above the .text section.

The GPDMA controller is used to transfer samples between the sample buffer and USB buffer.

The initial commits in this PR set up the new buffer scheme, whilst using memcpy running synchronously on the M4 core as a placeholder for the DMA operations. This works correctly up to around 8Msps.

The final commit switches to the DMA implementation.

mossmann · 2025-11-14T23:58:32Z

On my Linux test host, this improves maximum shortfall-free RX sample rate from 19.2 Msps to 21.8 Msps and improves TX from 20.1 Msps to 21.8 Msps.

mossmann

I haven't spotted the bug, but I'm seeing TX with hackrf_transfer -n cut short by a few milliseconds. The transmissions look to be about 9000 samples short.

mossmann · 2025-11-15T00:22:56Z

Transmission duration does not change linearly with the number of samples.

Testing with hackrf_transfer -a 1 -x 31 -c 127 -f 2500000000 -n 32768 -s 2000000

32768 samples: 8.88 ms (expected 16.38 ms)
32769 samples: 12.29 ms (expected 16.38 ms)
33000 samples: 12.41 ms (expected 16.50 ms)
40000 samples: 12.93 ms (expected 20.00 ms)
40960 samples: 13.03 ms (expected 20.48 ms)
40961 samples: 16.85 ms (expected 20.48 ms)
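
For reference, the "expected" figures above follow from dividing the sample count by the sample rate (2 Msps in the test command). A minimal sketch of that arithmetic; the function name is hypothetical and not part of hackrf_transfer:

```c
#include <stdint.h>

/* Hypothetical helper: ideal on-air duration in milliseconds for
 * n_samples complex samples sent at sample_rate_hz. */
double expected_tx_ms(uint64_t n_samples, double sample_rate_hz)
{
        return (double) n_samples / sample_rate_hz * 1000.0;
}
```

With -s 2000000, 32768 samples gives 16.384 ms and 40000 samples gives 20 ms, matching the expected column.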

martinling · 2025-11-15T09:29:56Z

I believe this is due to the current behaviour of hackrf_enable_tx_flush.

When that function is used, the host will send 32KB of zeroes after the end of the provided TX data, before making the control request to switch transceiver mode back to idle.

The theory behind that approach was that, since there's only 32KB of space to store samples, once you've sent that many zeroes and the device has accepted them, the earlier samples you care about must have been transmitted.

With this PR, we double the buffer space, so the host would now need to send 64KB of zeroes to achieve the same. If you change DEVICE_BUFFER_SIZE in libhackrf/src/hackrf.c to 65536, I think that will stop the behaviour you're seeing.
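
The reasoning can be put as a simple bound. This is a sketch under the assumption that the device behaves as a plain FIFO holding at most device_capacity bytes; the function name is hypothetical:

```c
#include <stddef.h>

/* Hypothetical model: once the device has accepted zero_padding bytes of
 * zeroes after the real data, at most device_capacity bytes are still
 * queued, and the newest zero_padding of them are zeroes. Whatever is
 * left over is an upper bound on real data not yet transmitted. */
size_t max_unsent_real_bytes(size_t device_capacity, size_t zero_padding)
{
        if (zero_padding >= device_capacity) {
                return 0;
        }
        return device_capacity - zero_padding;
}
```

With 32KB of capacity and 32KB of zeroes the bound is 0. With 64KB of capacity and the same 32KB of zeroes, up to 32KB of real samples may still be queued when the mode change arrives.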

Unfortunately, we hardcoded that figure rather than adding a vendor request to query it from the device. My recollection is that we discussed that decision verbally at the time, but since I'd already made some attempts to increase the buffer size without success, we thought that 32KB figure wasn't ever going to change and it wasn't worth adding this query.

Our options are:

1. Bump the DEVICE_BUFFER_SIZE in libhackrf. This is no good, because users will get short TX if running older software with newer firmware, just as you're seeing now.
2. Add a vendor request to query the buffer size, defaulting to 32KB if the request is not supported. Same problem, because older software won't make the request. We would have needed to add this at the same time we added hackrf_enable_tx_flush.
3. Make the firmware emulate the previous behaviour, as follows: once the host has sent the mode change request to leave TX mode, stop making transfers into the USB buffer, but move all existing data from the USB buffer into the sample buffer before actually shutting down TX.

I'll implement the last option.

martinling · 2025-11-17T13:53:35Z

Ensuring that all data in the USB buffer is moved to the sample buffer before stopping TX has fixed the non-linearity you were seeing, but I'm now seeing timing coming up short by what looks like a consistent 4.096ms, i.e. 8KB of samples or one DMA transfer. I'll see about tracking down the remaining bug.

martinling · 2025-11-18T13:47:41Z

Some experimentation with looking at ramped signals on a scope shows that the transmission includes the first 32KB preloaded into the sample buffer; is missing the next 8KB, which is the first to be transferred by DMA; then all subsequent samples are transmitted correctly. This is true from the memcpy version onwards, regardless of whether DMA_TRANSFER_SIZE is 16KB or 8KB.

martinling · 2025-11-18T13:59:14Z

Fixed that - still some issues with small sample counts.

martinling · 2026-01-05T16:46:23Z

I've rebased this on main and done some further testing of the forward/back compatibility.

Everything seems to be working as expected, so this is ready for review again.

The forward/back compatibility logic is:

Host: if API version < 1.10, assume buffer is 32KB. If API version >= 1.10, ask for buffer size. The rest of the flush logic is unchanged (send enough zeroes, then assume all samples were transmitted).

Firmware: if the host didn’t ask for our buffer size, NAK any request to stop TX until all samples are transmitted.

This should result in everything working optimally, except in the case of old host software with new firmware, in which case the device takes a little longer to leave TX.
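
The two rules can be sketched roughly as below. This is not the code in the PR: the function names are hypothetical, the 1.10 threshold is taken from the description above, and its encoding as BCD 0x0110 is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEGACY_BUFFER_SIZE 32768u       /* size older hosts assume */
#define API_WITH_BUFFER_QUERY 0x0110u   /* assumed BCD encoding of 1.10 */

/* Host side: how many zero bytes to send when flushing TX.
 * queried_size is whatever the device reported, and is only
 * meaningful when the API version supports the query. */
uint32_t host_flush_size(uint16_t api_version, uint32_t queried_size)
{
        if (api_version < API_WITH_BUFFER_QUERY) {
                return LEGACY_BUFFER_SIZE;
        }
        return queried_size;
}

/* Firmware side: whether to NAK a request to leave TX mode. */
bool firmware_should_nak_tx_stop(bool host_queried_size, uint32_t bytes_pending)
{
        if (host_queried_size) {
                return false; /* host pads enough zeroes itself */
        }
        return bytes_pending > 0; /* hold off until everything is sent */
}
```

Under these assumptions, an old host talking to new firmware never makes the query, so the firmware holds off the mode change until its buffers are empty.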

mossmann · 2026-01-07T19:50:03Z

With this change I'm seeing RX hang when adjusting the sample rate mid-stream, for example with the flowgraph in #1629 (comment)

martinling · 2026-03-16T19:37:02Z

This has been rebased on main. I can no longer reproduce the RX hang seen in #1601 (comment), which I think was due to the concurrency problems fixed in #1648. As such I think this is ready for a fresh review.

These problems have been fixed.

mossmann

I'm still able to induce hangs with a sample rate slider. I've seen this in both TX and RX, but so far I've only induced RX failure with a PortaPack installed. The TX failure also seems easier to reproduce with a PortaPack.

I tried merging #1689, hoping it would avoid PortaPack concurrency problems, but it didn't help.

martinling · 2026-03-17T15:08:38Z

Bummer. OK, I'll see if I can reproduce with a PortaPack. In the meantime, I'm putting this one back to draft.

martinling · 2026-03-25T17:06:17Z

As discussed yesterday, I was able to reproduce the RX hang with a PortaPack attached, and examine the situation with a debugger.

It turned out the hang was caused by a race condition in the following code:

volatile uint32_t dma_pending;

// ...

void transceiver_start_dma(...) {
        // ...
        gpdma_channel_enable(DMA_CHANNEL);
        dma_pending = size;
}

void dma_isr(void)
{
        gpdma_channel_disable(DMA_CHANNEL);
        GPDMA_INTTCCLEAR = (1 << DMA_CHANNEL);
        m0_state.m4_count += dma_pending;
        dma_pending = 0;
}

Because transceiver_start_dma() called gpdma_channel_enable(DMA_CHANNEL) before updating dma_pending, it was possible for an interrupt to occur between those steps. If that interrupt also took long enough to complete, the DMA transfer could complete in the background, and dma_isr would execute next, setting dma_pending to zero, only to then have it reset back to size by the last line of transceiver_start_dma(). The buffer management code would then be blocked from making further DMA transfers, as it would believe one was still pending.

The PortaPack made the bug easier to reproduce, because the USB ISR would update the baseband LPF cutoff on the LCD display. Once I realised this, I was able to reproduce the race more quickly by rapidly alternating between sample rates that caused different LPF settings to be chosen.

This race has been fixed, and the branch rewritten with a cleaner implementation and history. I'm no longer able to reproduce the RX hang and am reopening this for review.
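
The thread does not show the fix itself. One common way to close this kind of race is to record the pending size before enabling the channel. The sketch below models that on a host machine: every name is a stand-in, and the enable stub simulates the worst case by running the ISR before returning.

```c
#include <stdint.h>

/* Stand-ins for the firmware state; names are hypothetical. */
volatile uint32_t dma_pending_model;
volatile uint32_t m4_count_model;

void model_reset(void)
{
        dma_pending_model = 0;
        m4_count_model = 0;
}

/* Stand-in for the DMA completion ISR. */
void dma_isr_model(void)
{
        m4_count_model += dma_pending_model;
        dma_pending_model = 0;
}

/* Stand-in for enabling the channel. Worst case: the transfer
 * completes and the ISR runs before the caller's next statement. */
void channel_enable_model(void)
{
        dma_isr_model();
}

/* Original ordering: the ISR sees 0, then size is written afterwards. */
void start_dma_racy(uint32_t size)
{
        channel_enable_model();
        dma_pending_model = size;
}

/* Reordered: the pending size is visible before the ISR can fire. */
void start_dma_ordered(uint32_t size)
{
        dma_pending_model = size;
        channel_enable_model();
}
```

In this model the racy ordering leaves dma_pending_model stuck at size with m4_count_model unchanged, which matches the hang described above. The reordered version ends with dma_pending_model at zero and the count advanced. Real firmware may additionally need a memory barrier or a brief interrupt mask around the two steps.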

mossmann · 2026-03-25T21:41:29Z

I'm no longer able to reproduce the hang in RX, but I can still reproduce it quite easily in TX.

martinling · 2026-03-25T22:36:20Z

OK, the TX hang must be a different bug then. I'll investigate.

martinling · 2026-03-30T15:50:45Z

I was able to reproduce the TX hang, and worked out that it was due to mistakes in the calculations of space_in_use and samp_buf_margin in the TX path through start_dma_if_possible().
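
The actual calculations are not shown in this thread. As a general illustration of the kind of bookkeeping involved, here is a sketch of occupancy accounting for a ring buffer tracked by free-running byte counters. The names and the 32KB size are assumptions for illustration, not the firmware's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 32768u /* assumed buffer size in bytes */

/* Bytes written but not yet consumed. Unsigned subtraction stays
 * correct when the free-running counters wrap past 2^32. */
uint32_t ring_space_in_use(uint32_t write_count, uint32_t read_count)
{
        return write_count - read_count;
}

/* Bytes that can still be written without overrunning the reader. */
uint32_t ring_space_free(uint32_t write_count, uint32_t read_count)
{
        return RING_SIZE - ring_space_in_use(write_count, read_count);
}

/* A transfer into the ring may start only if it fits entirely. */
bool ring_can_accept(uint32_t write_count, uint32_t read_count, uint32_t transfer_size)
{
        return ring_space_free(write_count, read_count) >= transfer_size;
}
```

An error in this kind of calculation can leave the transfer-start check permanently false, which would show up as a hang rather than as corrupted samples.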

I've fixed that, and rebased the branch to resolve new conflicts with the merge of #1701.

martinling · 2026-04-10T21:37:33Z

Rebased on main.

It occurs to me we could also apply this improvement to sweep mode.

martinling force-pushed the extra-buffer branch 5 times, most recently from d863307 to 8c54161 Compare October 21, 2025 14:22

martinling force-pushed the extra-buffer branch 2 times, most recently from be432d4 to 912d993 Compare November 11, 2025 17:07

martinling marked this pull request as ready for review November 11, 2025 17:08

mossmann self-requested a review November 11, 2025 17:13

martinling linked an issue Nov 11, 2025 that may be closed by this pull request

hackrf_transfer -c produces possible buffer underruns #1503

Open

martinling mentioned this pull request Nov 11, 2025

hackrf_transfer -c produces possible buffer underruns #1503

Open

mossmann previously requested changes Nov 15, 2025

martinling force-pushed the extra-buffer branch 3 times, most recently from ffc5ad4 to 33bf9ad Compare November 17, 2025 16:56

martinling force-pushed the extra-buffer branch from 33bf9ad to b1b2e25 Compare November 18, 2025 13:54

martinling force-pushed the extra-buffer branch from b1b2e25 to 10a9836 Compare December 1, 2025 23:43

martinling changed the base branch from main to praline December 18, 2025 07:51

martinling force-pushed the extra-buffer branch from 6e596a1 to f78ad7f Compare December 18, 2025 07:51

martinling force-pushed the extra-buffer branch from f78ad7f to 10a7582 Compare January 5, 2026 15:40

martinling changed the base branch from praline to main January 5, 2026 16:36

martinling requested a review from mossmann January 5, 2026 16:40

martinling force-pushed the extra-buffer branch from 483ad5f to bba9a13 Compare February 24, 2026 19:44

martinling force-pushed the extra-buffer branch 2 times, most recently from 8bb59b7 to 8f22c35 Compare March 16, 2026 19:17

mossmann requested changes Mar 17, 2026

martinling marked this pull request as draft March 17, 2026 15:09

martinling force-pushed the extra-buffer branch 2 times, most recently from 11b7d98 to 13d53ea Compare March 24, 2026 20:30

martinling marked this pull request as ready for review March 25, 2026 17:06

martinling force-pushed the extra-buffer branch from 5908711 to 3da9326 Compare March 30, 2026 15:44

martinling added 2 commits April 10, 2026 22:35

Rename usb_bulk_buffer to usb_samp_buffer.

b99fe52

Add new usb_bulk_buffer.

b487a7b

martinling force-pushed the extra-buffer branch from 3da9326 to dd937a0 Compare April 10, 2026 21:35

martinling added 9 commits April 10, 2026 22:44

Use both buffers, with memcpy in place of DMA for now.

c101b96

Reduce memory transfer size to 8KB.

773eb6b

Flush both buffers before leaving TX mode.

ff3630b

Read USB API version in hackrf_open() and cache it.

9618f33

NAK the request that takes us out of TX mode until buffer flushed.

d7eef3f

Add vendor request to retrieve buffer size.

bd75296

For API >= 1.12, get buffer size when opening device.

4bb5b07

If the host knows our buffer size, don't auto-flush.

f943f3a

Use DMA for transfers.

92ce978

martinling force-pushed the extra-buffer branch from dd937a0 to 92ce978 Compare April 10, 2026 21:51
