Related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106326

According to the Arm C Language Extensions for SVE, when the _x predicate is used,

> The compiler can then pick whichever form of instruction seems to give the best code. This includes using unpredicated instructions, where available and suitable

Because of this, I'm expecting the following to be optimized to a single add instruction, as if a `svptrue_b64()` predicate were used.

```
#include <arm_sve.h>

svfloat64_t add(svfloat64_t a, svfloat64_t b) {
    auto und_ok = svcmpge(svptrue_b64(), a, b);
    return svadd_x(und_ok, a, b);
}
```

However, gcc compiles this as _m and generates

```
ptrue p0.b, all
fcmge p0.d, p0/z, z0.d, z1.d
fadd z0.d, p0/m, z0.d, z1.d
```

In general, is there any reason not to treat an `add_x` (and other side-effect-free functions) with an unknown predicate as an unpredicated one?
This is because performing the addition on the inactive lanes could trigger an IEEE exception. The code is optimised to an unpredicated FADD with -ffast-math.
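To make the concern concrete, here is a rough scalar model in portable C++ rather than SVE intrinsics. It is only a sketch and not how GCC reasons internally. The function name `flags_raised_by_add` and the `honour_predicate` switch are hypothetical, chosen for illustration. It adds the lanes either under the predicate or unconditionally, and reports which IEEE exception flags ended up set.

```cpp
#include <cassert>
#include <cfenv>
#include <cstddef>
#include <limits>

// Hypothetical scalar model of a two-operand add over n lanes.
// With honour_predicate == true only active lanes are added (predicated form);
// with false every lane is added (like an unpredicated FADD).
// Returns the IEEE exception flags raised by the additions.
int flags_raised_by_add(const double* a, const double* b, const bool* pred,
                        std::size_t n, bool honour_predicate) {
    assert(a != nullptr && b != nullptr && pred != nullptr);
    std::feclearexcept(FE_ALL_EXCEPT);
    for (std::size_t i = 0; i < n; ++i) {
        if (honour_predicate && !pred[i]) {
            continue;  // inactive lane: no arithmetic, so no flags
        }
        // volatile keeps the compiler from folding or dropping the add.
        volatile double x = a[i];
        volatile double y = b[i];
        volatile double sum = x + y;
        (void)sum;
    }
    return std::fetestexcept(FE_ALL_EXCEPT);
}
```

With lanes `a = {1.0, -inf}` and `b = {0.5, +inf}`, the predicate `a >= b` is true for lane 0 only. The predicated call leaves `FE_INVALID` clear, while the unpredicated call sets it, because `-inf + inf` is an invalid operation on the lane that was supposed to be inactive. Build this without -ffast-math: that option implies -fno-trapping-math, which tells GCC such flags need not be preserved.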