[SYCL] Implement sycl_ext_oneapi_fp8 for CRI#21568
Conversation
| #ifdef __SYCL_DEVICE_ONLY__ | ||
| // New FP8 builtins | ||
| extern __DPCPP_SYCL_EXTERNAL sycl::half | ||
| __builtin_spirv_ClampConvertE4M3ToFP16INTEL(uint8_t) noexcept; |
There was a problem hiding this comment.
I was just talking with @bashbaug today, and he mentioned that there is a multi-vendor SPIR-V extension for FP8 conversions. It looks like this PR is using the proposed Intel extension. I think this is the multi-vendor extension:
https://github.khronos.org/SPIRV-Registry/extensions/EXT/SPV_EXT_float8.html
@bashbaug do you know if our driver supports the multi-vendor extension? @dklochkov-emb does the multi-vendor extension provide all the functionality we need, or would we still need something from the Intel extension?
There was a problem hiding this comment.
do you know if our driver supports the multi-vendor extension?
I believe there is some experimental support, but the extension is not formally supported currently.
It would be good to understand:
- Whether any functionality is fundamentally missing from the EXT extension. For example, the EXT extension currently does not include any stochastic rounding functionality. Is this needed, and if so, can it be provided by a layered extension?
- Any details about required support with the EXT extension. For example, which rounding modes should be supported, and which other floating-point types should be supported for conversions to/from the fp8 types?
There was a problem hiding this comment.
For example, the EXT extension currently does not include any stochastic rounding functionality. Is this needed, and if so, can it be provided by a layered extension?
Yes, we need stochastic rounding for E5M2 when converting from half, bfloat16, and float. I suppose it could be implemented by a layered extension, but it would be up to use to define such an extension.
For example, which rounding modes should be supported
For E45M3, we need only RNE with saturation.
For E5M2, we need only RNE, but we need both saturation and non-saturation.
and which other floating-point types should be supported for conversions to/from the fp8 types
I think the Intel SPIRV extension supports conversions to/from half, bfloat16, and float.
It seems like we should do one of two things:
-
Do all of the following:
- Fully implement the EXT extension.
- Define a layered SPIRV extension with the missing features.
- Change this PR to use the EXT and that new layered extension.
-
Clean up and publish the existing Intel SPIRV extension.
There was a problem hiding this comment.
I believe we already have everything ready in terms of SPIR-V specs and support -- SPV_INTEL_fp_conversions was published in December 2025 as a layered extension on top of SPV_EXT_float8, and it covers all the functionality you mention.
Specifically:
- RNE without saturation (E5M2):
OpFConvertfromSPV_EXT_float8 - RNE with saturation (E4M3 and E5M2):
OpClampConvertFToFINTELfromSPV_INTEL_fp_conversions - Stochastic rounding (E5M2, with and without saturation):
OpStochasticRoundFToFINTEL/OpClampStochasticRoundFToFINTELfromSPV_INTEL_fp_conversions - Source types (
half,bfloat16,float): supported in both extensions
There was a problem hiding this comment.
I agree that the EXT extension does not support stochastic rounding. It should be OK to use OpStochasticRoundFToFINTEL and OpClampStochasticRoundFToFINTEL for this instead, though.
There was a problem hiding this comment.
SPV_INTEL_fp_conversions was published in December 2025 as a layered extension on top of
SPV_EXT_float8, and it covers all the functionality you mention.
This extension is only "approved for public preview" at present. Since we plan to use it in SYCL, it should be promoted to "approved for publication", and it should be published on the Khronos website.
- RNE with saturation (E4M3 and E5M2):
OpClampConvertFToFINTELfromSPV_INTEL_fp_conversions
It seems like this functionality is also provided by the SPV_EXT_float8 extension via the SaturatedToLargestFloat8NormalConversionEXT decoration. If we want SPV_INTEL_fp_conversions to contain only additional functionality, should we remove OpClampConvertFToFINTEL? CC: @bashbaug
It also seems like there is some functionality required by SYCL that is missing from both SPIRV extensions. The SYCL extension requires saturation (clamped) conversion when converting to integer. The INTEL extension provides OpClampConvertFToSINTEL, but that only works when converting to a signed integer. For some reason, there is no clamped equivalent to OpConvertFToU. Was this an oversight in the SPIRV extension?
I was trying to remember why I specified the SYCL extension to require saturation (clamped) conversion for integers. C++ says that FP to integer conversions are undefined when the value cannot be represented after rounding. I think I required saturation because this is how the CUDA API works. If you scroll way down to "Integer notes", you can see that the PTX cvt instruction guarantees saturation when converting to an integer value.
There was a problem hiding this comment.
Updated to use EXT where it is possible
|
@intel/llvm-reviewers-runtime please, review |
There was a problem hiding this comment.
Pull request overview
This PR introduces initial host-side implementations of OneAPI experimental FP8 types (E4M3, E5M2, E8M0) and adds unit tests validating conversions/encodings plus tests that verify the expected SPIR-V builtin call paths (via mocks) for device-only code paths.
Changes:
- Added
sycl::ext::oneapi::experimentalFP8 type definitions and conversion logic in a new public header. - Added unit tests for value/encoding behavior for fp8 E4M3, E5M2, and E8M0 on CPU.
- Added mocked builtin implementations and tests to confirm which builtins are invoked from constructors/conversions.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
sycl/include/sycl/ext/oneapi/experimental/float_8bit/types.hpp |
Adds the FP8 type implementations, conversion helpers, and builtin hooks. |
sycl/unittests/Extensions/fp8/fp8_e4m3.cpp |
Adds host unit tests for E4M3 encoding/conversion behavior. |
sycl/unittests/Extensions/fp8/fp8_e5m2.cpp |
Adds host unit tests for E5M2 encoding/conversion behavior. |
sycl/unittests/Extensions/fp8/fp8_e8m0.cpp |
Adds host unit tests for E8M0 encoding/conversion behavior. |
sycl/unittests/Extensions/fp8/builtin_mocks.hpp |
Provides mock builtin symbols and counters to validate builtin call paths. |
sycl/unittests/Extensions/fp8/builtin_call_tests.cpp |
Adds tests asserting the expected builtin calls occur (via the mocks). |
sycl/unittests/Extensions/fp8/CMakeLists.txt |
Registers the new FP8 unit test target and sources. |
sycl/unittests/Extensions/CMakeLists.txt |
Adds the new fp8 unit test subdirectory to the Extensions suite. |
gmlueck
left a comment
There was a problem hiding this comment.
Here are my comments so far. I haven't looked at E8M0 yet, or at any of the CPU conversions.
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
sycl_ext_oneapi_fp8 for CRI
Co-authored-by: Copilot <copilot@github.com>
maksimsab
left a comment
There was a problem hiding this comment.
SYCLPostLink part LGTM.
Co-authored-by: Copilot <copilot@github.com>
|
E2E tests were added but it is required driver version with bug fixes to pass on CI. Locally all tests pass |
Co-authored-by: Copilot <copilot@github.com>
|
@gmlueck Please, review it |
| explicit fp8_e8m0_x(half const (&in)[N], rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } | ||
|
|
||
| explicit fp8_e8m0_x(bfloat16 const (&in)[N], rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } | ||
|
|
||
| explicit fp8_e8m0_x(float const (&in)[N], rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } | ||
|
|
||
| explicit fp8_e8m0_x(const marray<half, N> &in, | ||
| rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } | ||
|
|
||
| explicit fp8_e8m0_x(const marray<bfloat16, N> &in, | ||
| rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } | ||
|
|
||
| explicit fp8_e8m0_x(const marray<float, N> &in, | ||
| rounding r = rounding::upward) { | ||
| CheckConstraints(r); | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = detail::ConvertFloatToE8M0_CPU(in[i], r, saturation::finite); | ||
| } |
There was a problem hiding this comment.
Maybe use templates with #define to avoid code duplication in the entire file?
gmlueck
left a comment
There was a problem hiding this comment.
I didn't finish reviewing this yet, but I wanted to send the comments I have so far. So far, I have just reviewed the E4M3 and E5M2 device-side code.
| #ifdef __SYCL_DEVICE_ONLY__ | ||
| _Float16 v{0}; | ||
| if constexpr (std::is_same_v<std::decay_t<T>, sycl::half>) | ||
| v = static_cast<_Float16>(static_cast<float>(h)); |
There was a problem hiding this comment.
I don't understand this line. Isn't half the same as _Float16? Why do we convert from half to float and then back to _Float16?
| if constexpr (std::is_same_v<std::decay_t<T>, sycl::half>) | ||
| v = static_cast<_Float16>(static_cast<float>(h)); | ||
| else | ||
| v = static_cast<_Float16>(h); |
There was a problem hiding this comment.
When T is float, doesn't this have the same two-step conversion problem that is described here: https://github.com/intel-tools/intel-xpu-backend-for-triton/issues/847 ?
I'm not sure if there are also similar problems when T is an integer type. Did you consider this and decide it is safe?
| if constexpr (((std::is_same_v<std::decay_t<Types>, bfloat16>) && ...)) { | ||
| const bfloat16 in[N] = {static_cast<bfloat16>(v)...}; | ||
| for (size_t i = 0; i < N; ++i) | ||
| vals[i] = ConvertBF16ToFP8(in[i]); |
There was a problem hiding this comment.
The SPIR-V instructions __builtin_spirv_ClampConvertFP16ToE4M3INTEL and __builtin_spirv_ClampConvertBF16ToE4M3INTEL allow the input to be a vector. In the N == 2 case, I wonder if we are losing some performance by calling the instruction twice (each with a scalar input) rather than calling it once (with a vector of two elements).
| explicit operator sycl::marray<sycl::half, N>() const { | ||
| sycl::marray<sycl::half, N> ret; | ||
| for (size_t i = 0; i < N; ++i) | ||
| ret[i] = ConvertFromFP8<sycl::half>(vals[i]); |
There was a problem hiding this comment.
Same comment here about performance. I wonder if we are losing some performance by calling this in a loop rather than using the vector versions of the conversion instructions.
| } | ||
| current_seed = *seed.pseed; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
We should either assert or throw an exception on host.
| #ifdef __SYCL_DEVICE_ONLY__ | ||
| uint32_t current_seed = *seed.pseed; | ||
| for (size_t i = 0; i < N; ++i) { | ||
| const _Float16 v = static_cast<_Float16>(static_cast<float>(in[i])); |
There was a problem hiding this comment.
Same question as earlier, why convert from half to float and back to _Float16?
| // Intentionally public to allow access to the raw values. | ||
| uint8_t vals[N]; | ||
| }; | ||
|
|
There was a problem hiding this comment.
I think the deduction guides are missing for all three of these classes. The spec calls for this deduction guide for the fp8_e4m3_x class, for example:
// Deduction guide available only when the size of the pack is greater than zero.
template<typename... Ts>
fp8_e4m3_x(Ts...) -> fp8_e4m3_x<sizeof...(Ts)>;
gmlueck
left a comment
There was a problem hiding this comment.
All these comments are about the SPIR-V intrinsics. We should use the SPV_EXT_float8 extension as much as possible and only use SPV_INTEL_fp_conversions for the things it does not contain. It's not clear to me which SPIR-V instructions are being emitted from these intrinsics, but some of the names make it sound like you may be emitting SPV_INTEL_fp_conversions instead of SPV_EXT_float8.
I've listed the SPIR-V instructions I think we should use for each of the intrinsics below.
Tagging @bashbaug here also to make sure I got this right. :-)
| // FP8 builtins | ||
|
|
||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertFP16ToE4M3INTEL(_Float16) noexcept; |
There was a problem hiding this comment.
I think this should be using OpFConvert with Float8E4M3EXT as the destination FP Encoding.
I see that the name of this builtin ends with "INTEL", so I presume it is not emitting OpFConvert with Float8E4M3EXT now?
| __builtin_spirv_ClampConvertFP16ToE4M3INTEL(_Float16) noexcept; | ||
|
|
||
| extern __DPCPP_SYCL_EXTERNAL _Float16 | ||
| __builtin_spirv_ConvertE4M3ToFP16EXT(char) noexcept; |
There was a problem hiding this comment.
And this should be using OpFconvert with Float8E4M3EXT as the source FP Encoding.
I see that the name of this builtin ends with "EXT". Does that mean it is already emitting this SPIR-V instruction?
| __builtin_spirv_ConvertE4M3ToFP16EXT(char) noexcept; | ||
|
|
||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertBF16ToE4M3INTEL(__bf16) noexcept; |
There was a problem hiding this comment.
This should use OpFconvert with Float8E4M3EXT as the destination FP Encoding and decorated with SaturatedToLargestFloat8NormalConversionEXT.
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertBF16ToE4M3INTEL(__bf16) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL __bf16 | ||
| __builtin_spirv_ConvertE4M3ToBF16EXT(uint8_t) noexcept; |
There was a problem hiding this comment.
This should use OpFConvert with Float8E4M3EXT as the source FP Encoding and BFloat16KHR as the destination FP Encoding.
| extern __DPCPP_SYCL_EXTERNAL __bf16 | ||
| __builtin_spirv_ConvertE4M3ToBF16EXT(uint8_t) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertFP16ToE5M2INTEL(_Float16) noexcept; |
There was a problem hiding this comment.
This should use OpFConvert with Float8E5M2EXT as the destination FP Encoding and the SaturatedToLargestFloat8NormalConversionEXT decoration.
| extern __DPCPP_SYCL_EXTERNAL _Float16 | ||
| __builtin_spirv_ConvertE5M2ToFP16EXT(uint8_t) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertBF16ToE5M2INTEL(__bf16) noexcept; |
There was a problem hiding this comment.
This should use OpFConvert with BFloat16KHR as the source FP Encoding and Float8E5M2EXT as the destination FP Encoding and decorated with SaturatedToLargestFloat8NormalConversionEXT.
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampConvertBF16ToE5M2INTEL(__bf16) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ConvertBF16ToE5M2EXT(__bf16) noexcept; |
There was a problem hiding this comment.
This should use OpFConvert with BFloat16KHR as the source FP Encoding and Float8E5M2EXT as the destination FP Encoding.
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ConvertBF16ToE5M2EXT(__bf16) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL __bf16 | ||
| __builtin_spirv_ConvertE5M2ToBF16EXT(uint8_t) noexcept; |
There was a problem hiding this comment.
This should use OpFConvert with Float8E5M2EXT as the source FP Encoding and BFloat16KHR as the destination FP Encoding.
| __builtin_spirv_ConvertE5M2ToBF16EXT(uint8_t) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampStochasticRoundFP16ToE5M2INTEL(_Float16, uint32_t, | ||
| uint32_t *) noexcept; |
There was a problem hiding this comment.
This should use OpStochasticRoundFToFINTEL with the SaturatedToLargestFloat8NormalConversionEXT decoration.
| uint32_t *) noexcept; | ||
| extern __DPCPP_SYCL_EXTERNAL uint8_t | ||
| __builtin_spirv_ClampStochasticRoundBF16ToE5M2INTEL(__bf16, uint32_t, | ||
| uint32_t *) noexcept; |
There was a problem hiding this comment.
This should use OpStochasticRoundFToFINTEL with BFloat16KHR as the source FP Encoding and the SaturatedToLargestFloat8NormalConversionEXT decoration.
| #include <stdexcept> | ||
| #include <type_traits> | ||
|
|
||
| #define SYCL_EXT_ONEAPI_FP8 1 |
There was a problem hiding this comment.
I think we normally define these macros in "sycl/source/feature_test.hpp.in".
This PR adds FP8 types for CRI docs.
It adds the first part which includes: