Expected behavior
maximum(NaN, x) should return NaN per IEEE 754-2019 §9.6, consistent with NumPy, PyTorch, JAX, and ONNX Runtime.
relu(NaN) should return NaN (since relu = max(x, 0)).
Actual behavior
When NaN is the first operand of T.max / T.min, the result is the second operand instead of NaN. This affects R.maximum, R.minimum, R.nn.relu, and R.clip.
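The root cause is that T.max(a, b) compiles to the x86 maxss/maxps instructions, which return the second source operand whenever either operand is NaN. With NaN as the first operand, the result is therefore the non-NaN second operand. The IEEE 754-2019 maximum and minimum operations require returning NaN when either operand is NaN.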
Reproducer
import numpy as np
import tvm
from tvm import relax
import tvm.relax.op as R
from tvm.relax.transform import LegalizeOps
bb = relax.BlockBuilder()
a = relax.Var("a", relax.TensorStructInfo((4,), "float32"))
b = relax.Var("b", relax.TensorStructInfo((4,), "float32"))
with bb.function("main", [a, b]):
    with bb.dataflow():
        gv = bb.emit_output(bb.emit(R.maximum(a, b)))
    bb.emit_func_output(gv)
mod = bb.finalize()
pipeline = tvm.ir.transform.Sequential([LegalizeOps()])
exe = tvm.relax.build(pipeline(mod), target="llvm")
vm = tvm.relax.VirtualMachine(exe, device=tvm.cpu())
A = np.array([np.nan, 1.0, np.nan, 0.0], np.float32)
B = np.array([1.0, np.nan, np.nan, np.nan], np.float32)
out = vm["main"](
    tvm.runtime.tensor(A, device=tvm.cpu()),
    tvm.runtime.tensor(B, device=tvm.cpu()),
).numpy()
print(out) # [1. nan nan nan] — element 0 is WRONG
print(np.maximum(A, B)) # [nan nan nan nan] — all NaN per IEEE 754
The pattern is operand-order-dependent:
| Expression | TVM | Expected (IEEE 754) |
| --- | --- | --- |
| max(NaN, 1.0) | 1.0 | NaN |
| max(1.0, NaN) | NaN | NaN |
| relu(NaN) = max(NaN, 0) | 0.0 | NaN |
| clip(NaN, -1, 1) | 1.0 | NaN |
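The TVM column can be reproduced with a small pure-Python model. It rests on two assumptions made only for illustration: that max and min behave like a compare-and-select (any comparison against NaN is false, so the second operand is returned), and that clip is lowered as max(min(x, hi), lo), which is consistent with the reported result of 1.0. The helper names are hypothetical and are not TVM APIs.

```python
import math

NAN = float("nan")

def select_max(a, b):
    # Compare-and-select: a comparison with NaN is False, so b is returned.
    return a if a > b else b

def select_min(a, b):
    # Same pattern for min: NaN in either position falls through to b.
    return a if a < b else b

def relu(x):
    # relu modeled as max(x, 0) with x as the first operand.
    return select_max(x, 0.0)

def clip(x, lo, hi):
    # Assumed lowering order: min against hi first, then max against lo.
    return select_max(select_min(x, hi), lo)

print(select_max(NAN, 1.0))   # 1.0
print(select_max(1.0, NAN))   # nan
print(relu(NAN))              # 0.0
print(clip(NAN, -1.0, 1.0))   # 1.0
```

Under these assumptions, NaN is lost whenever it is the first operand and kept only when it is the second.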
Affected operations
R.maximum(a, b) # when a is NaN
R.minimum(a, b) # when a is NaN
R.nn.relu(x) # when x is NaN → returns 0
R.clip(x, lo, hi) # when x is NaN → returns hi
Not affected (correct NaN propagation):
R.add, R.multiply, R.subtract, R.divide — arithmetic propagates NaN correctly
R.nn.leakyrelu — uses comparison path, NaN propagates through multiply
R.nn.silu, R.nn.gelu — sigmoid/erf path propagates NaN
Why this matters
relu is one of the most common activation functions. When an upstream computation produces NaN (e.g., from inf - inf after an overflow, or from 0/0), the NaN should propagate to signal the error. Instead, TVM's relu silently converts NaN to 0, making the error invisible:
# Suppose upstream overflow produces NaN in one element:
x = [[1.0, 2.0, NaN, 4.0]]
relu(x).sum()
# TVM: 7.0 ← NaN silently disappeared
# NumPy: NaN ← correctly signals the problem
This can cause silently wrong results in production models, where NaN detection is a standard debugging and monitoring signal.
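The effect on a NaN monitor can be sketched in pure Python. Here `select_relu` is a hypothetical stand-in for the reported TVM behavior, `propagating_relu` stands in for the NaN-propagating behavior of NumPy, and `has_nan` is an illustrative monitor; none of them are TVM or NumPy APIs.

```python
import math

def select_relu(x):
    # Models the reported behavior: the comparison is False for NaN, so 0.0 is returned.
    return x if x > 0.0 else 0.0

def propagating_relu(x):
    # Models NaN-propagating behavior: NaN in, NaN out.
    return x if math.isnan(x) else max(x, 0.0)

def has_nan(values):
    # A simple monitor that flags any NaN in a list of floats.
    return any(math.isnan(v) for v in values)

x = [1.0, 2.0, float("nan"), 4.0]
lossy = [select_relu(v) for v in x]
faithful = [propagating_relu(v) for v in x]

print(sum(lossy), has_nan(lossy))        # 7.0 False
print(sum(faithful), has_nan(faithful))  # nan True
```

A monitor placed after the lossy relu sees a clean tensor, so the upstream fault is only visible if the check runs before the activation.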
Root cause
In the lowered TIR, maximum becomes T.max(a, b), which LLVM lowers to x86 maxss/maxps. These instructions return the second source operand whenever either operand is NaN, rather than following the IEEE 754-2019 rule "return NaN if either operand is NaN".
The fix would be to emit NaN-aware max/min, e.g.:
select(isnan(a) | isnan(b), NaN, max(a, b))
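The intended semantics of that select expression can be checked with a pure-Python model. This is a sketch of the truth table only, not TVM code, and `nan_aware_max` and `nan_aware_min` are hypothetical names.

```python
import math

def nan_aware_max(a, b):
    # Return NaN if either operand is NaN; otherwise the ordinary maximum.
    if math.isnan(a) or math.isnan(b):
        return float("nan")
    return a if a > b else b

def nan_aware_min(a, b):
    # Return NaN if either operand is NaN; otherwise the ordinary minimum.
    if math.isnan(a) or math.isnan(b):
        return float("nan")
    return a if a < b else b

nan = float("nan")
# Print each operand pair followed by its NaN-aware max and min.
for a, b in [(nan, 1.0), (1.0, nan), (nan, nan), (0.0, 2.0)]:
    print(a, b, nan_aware_max(a, b), nan_aware_min(a, b))
```

The result no longer depends on operand order. The extra isnan checks cost additional comparisons per element, so whether to apply them unconditionally is a trade-off for the maintainers to weigh.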
Environment
- TVM commit: 0b0afd8 (main, 2026-04-24)
- OS: Ubuntu 20.04
- Target: llvm (CPU, x86-64)