Audio: STFT Process: Add Xtensa HiFi function versions by singalsu · Pull Request #10638 · thesofproject/sof

singalsu · 2026-03-20T15:53:13Z

This patch adds HiFi3 versions of the higher-complexity functions stft_process_apply_window() and stft_process_overlap_add_ifft_buffer() to stft_process-hifi3.c.

The functions with no clear HiFi optimization benefit are moved from stft_process-generic.c to stft_process_common.c. These functions mostly move data and do practically no processing on the samples.

This change saves 17 MCPS (from 63 MCPS to 46 MCPS, about 27%). The measurement was made by running these scripts:

scripts/rebuild-testbench.sh -p mtl
scripts/sof-testbench-helper.sh -x -m stft_process_1024_256_ \
    -p profile-stft_process.txt

The above STFT used FFT length 1024 with hop 256.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

Copilot

Pull request overview

Adds HiFi3 SIMD implementations for STFT hot-path helpers and refactors shared, non-SIMD-specific routines into a common compilation unit to reduce MCPS.

Changes:

- Add HiFi3 intrinsic implementations of stft_process_apply_window() and stft_process_overlap_add_ifft_buffer().
- Move source/sink and buffer-fill helper functions from stft_process-generic.c into stft_process_common.c.
- Introduce Kconfig SIMD level selection and update build sources to include the HiFi3 unit.
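For readers unfamiliar with the two optimized steps, the following is a minimal scalar sketch of windowing and overlap-add in Q1.31 fixed point. It is not the code from this PR: the `example_*` names, the flat (non-circular) output buffer, and the Q1.31 format are assumptions made for illustration, and the negative-value rounding relies on arithmetic right shift, which common compilers provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Q1.31 multiply with rounding, saturated at the top. */
static int32_t example_q31_mul(int32_t a, int32_t b)
{
	int64_t p = ((int64_t)a * b + (1LL << 30)) >> 31;

	return p > INT32_MAX ? INT32_MAX : (int32_t)p;
}

/* Hypothetical helper: 32-bit addition that saturates instead of wrapping. */
static int32_t example_sat_add32(int32_t a, int32_t b)
{
	int64_t s = (int64_t)a + b;

	if (s > INT32_MAX)
		return INT32_MAX;
	if (s < INT32_MIN)
		return INT32_MIN;
	return (int32_t)s;
}

/* Multiply each sample of a frame by the matching window coefficient. */
static void example_apply_window(int32_t *frame, const int32_t *window, size_t n)
{
	for (size_t i = 0; i < n; i++)
		frame[i] = example_q31_mul(frame[i], window[i]);
}

/* Accumulate one IFFT output frame into `out`, starting at sample `pos`.
 * The caller advances `pos` by the hop size between frames.
 */
static void example_overlap_add(int32_t *out, size_t out_len, size_t pos,
				const int32_t *frame, size_t n)
{
	assert(pos + n <= out_len);
	for (size_t i = 0; i < n; i++)
		out[pos + i] = example_sat_add32(out[pos + i], frame[i]);
}
```

With FFT length 1024 and hop 256, as in the measurement, each output sample would receive contributions from four consecutive frames, which is why these per-sample loops are worth vectorizing.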

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.


| File | Description |
| --- | --- |
| src/audio/stft_process/stft_process_common.c | Adds shared source/sink and FFT buffer fill helpers (moved from generic). |
| src/audio/stft_process/stft_process-hifi3.c | New HiFi3 intrinsic implementations for windowing + overlap-add. |
| src/audio/stft_process/stft_process-generic.c | Removes moved helpers; wraps generic implementations behind `SOF_USE_HIFI(NONE, ...)`. |
| src/audio/stft_process/Kconfig.simd | Adds Kconfig choice for SIMD optimization level selection. |
| src/audio/stft_process/Kconfig | Includes the new SIMD Kconfig via `rsource`. |
| src/audio/stft_process/CMakeLists.txt | Adds the HiFi3 compilation unit to the build. |

Comments suppressed due to low confidence (1)

src/audio/stft_process/stft_process-hifi3.c:1

The function relies on 64-bit alignment and even-sample constraints but does not enforce either at runtime. Misalignment can cause load/store exceptions or significant penalties depending on the core/config, and “even samples” is already required to avoid the >> 1 infinite-loop hazard. Add an explicit alignment/size assertion (or a guarded scalar fallback when (uintptr_t)obuf->w_ptr is not 8-byte aligned or when the contiguous region before wrap is odd-length) to make failures deterministic and easier to diagnose.

// SPDX-License-Identifier: BSD-3-Clause
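The guarded scalar fallback suggested in this comment could look roughly like the sketch below. It is portable C rather than HiFi3 intrinsics, and every `example_*` name is hypothetical; the pairwise loop only stands in for a 64-bit SIMD load/store of two 32-bit samples, and the addition is plain (no saturation) to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when `ptr` is 8-byte aligned and `samples` is even, so the region
 * can be walked as whole pairs of 32-bit samples.
 */
static bool example_pair_path_ok(const int32_t *ptr, size_t samples)
{
	return ((uintptr_t)ptr % 8 == 0) && (samples % 2 == 0);
}

/* Add `src` into `dst`. Returns true if the pairwise path was taken and
 * false if the scalar fallback handled the request.
 */
static bool example_add_frames(int32_t *dst, const int32_t *src, size_t n)
{
	size_t i;

	if (example_pair_path_ok(dst, n) && example_pair_path_ok(src, n)) {
		/* n >> 1 pairs; safe only because n was checked to be even */
		for (i = 0; i < n >> 1; i++) {
			dst[2 * i] += src[2 * i];
			dst[2 * i + 1] += src[2 * i + 1];
		}
		return true;
	}

	/* Scalar fallback: no alignment or length requirement */
	for (i = 0; i < n; i++)
		dst[i] += src[i];

	return false;
}
```

Whether a fallback or a hard assertion is preferable depends on the component's buffer guarantees; an assertion is cheaper when alignment and even length are already ensured by the caller.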


Copilot · 2026-03-20T16:12:07Z