Take:
```
void f(short *a, float *b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
}

void f1(float *a, short *b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
}
```
GCC can SLP f1 (which does V4HI->V4SF) but not f (which does V4SF->V4HI). LLVM can though:
```
f:
        ldr     q0, [x1]
        fcvtzs  v0.4s, v0.4s
        xtn     v0.4h, v0.4s
        str     d0, [x0]
        ret
```
The vectorizer has some of these tricks, but the set of intermediate conversions it allows is somewhat hard-coded. I think the C standard says SF -> HI invokes undefined behavior on overflow, so the conversion should be valid.
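To spell out what "intermediate conversion" means at the scalar level, here is a minimal sketch, not taken from GCC or LLVM. The names `f_direct` and `f_via_int` are hypothetical, and it assumes a 16-bit `short` and inputs that fit in that range (anything else is the overflow case mentioned above). The two-step form mirrors the `fcvtzs` (V4SF->V4SI) plus `xtn` (V4SI->V4HI) pair in the LLVM output.

```c
#include <stdint.h>

/* Direct conversion, as in f() above: each float lane goes straight to short. */
void f_direct(short *a, const float *b)
{
  for (int i = 0; i < 4; i++)
    a[i] = (short) b[i];
}

/* Hypothetical two-step form: float -> 32-bit int -> short.
   Step 1 corresponds to fcvtzs, step 2 to xtn.  */
void f_via_int(short *a, const float *b)
{
  for (int i = 0; i < 4; i++)
    {
      int32_t wide = (int32_t) b[i];  /* truncates toward zero */
      a[i] = (short) wide;            /* narrowing; exact when the value fits in short */
    }
}
```

For inputs whose truncated value fits in a short, both functions should store the same results, which is why going through the wider integer type looks like a valid way to vectorize f.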
I have someone working on this.
In fact, GCC is able to vectorize through intermediate conversions if we pass -fno-trapping-math. There's an open bug (PR54192) discussing whether the flag should be set by default.