Suppose we need the minimum of 8 floats in an array. We might write code like this:

    float min(float *x) {
        float ret = x[0];
        for (int i=0; i<8; i++) { // from 0 in this line
            ret = ret<x[i] ? ret : x[i];
        }
        return ret;
    }

If we compile it with

    aarch64-linux-gnu-gcc -O3 -ffast-math -S xx.c

we get:

    ldp     q0, q1, [x0]
    ld1r    {v31.4s}, [x0]            # <-- not needed
    fminnm  v31.4s, v1.4s, v31.4s     # <-- not needed
    fminnm  v0.4s, v31.4s, v0.4s
    fminnmv s0, v0.4s
    ret

The two marked instructions are redundant: x[0] is already covered by the ldp load.

Alternatively, we can start the loop from 1:

    float min(float *x) {
        float ret = x[0];
        for (int i=1; i<8; i++) { // from 1 in this line
            ret = ret<x[i] ? ret : x[i];
        }
        return ret;
    }

The generated code is even worse:

    ldr     q31, [x0, 4]
    ld1r    {v30.4s}, [x0]
    ldp     s0, s29, [x0, 20]
    fminnm  v31.4s, v31.4s, v30.4s
    ldr     s30, [x0, 28]
    fminnm  s0, s0, s29
    fminnmv s31, v31.4s
    fminnm  s31, s30, s31
    fminnm  s0, s0, s31
    ret
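For anyone wanting to reproduce this, here is a minimal sketch that puts both loop variants in one file so their output can be compared. The names min_from0, min_from1 and variants_agree are hypothetical (the report calls both functions min), and the const qualifier is my addition. For inputs without NaNs the two variants should return the same value, since the i=0 iteration only compares x[0] with itself.

```c
/* Variant from the report whose loop starts at index 0.
   The name min_from0 is hypothetical; the report calls it min. */
float min_from0(const float *x) {
    float ret = x[0];
    for (int i = 0; i < 8; i++) {
        ret = ret < x[i] ? ret : x[i];
    }
    return ret;
}

/* Variant from the report whose loop starts at index 1.
   The name min_from1 is hypothetical; the report calls it min. */
float min_from1(const float *x) {
    float ret = x[0];
    for (int i = 1; i < 8; i++) {
        ret = ret < x[i] ? ret : x[i];
    }
    return ret;
}

/* Hypothetical helper: returns 1 when both variants give the same
   result for the 8 floats at x (assumes no NaNs in the input). */
int variants_agree(const float *x) {
    return min_from0(x) == min_from1(x);
}
```

Compiling this file with the same command line as above should show both code sequences side by side in the .s output.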
I thought I saw this before.
Yep, PR 102512.

*** This bug has been marked as a duplicate of bug 102512 ***