This is because complx_t is passed and returned in FPRs, but GCC gives
it DImode. We therefore “need” to assemble a DImode pseudo from the
two individual floats, bitcast it to a vector, do the arithmetic,
bitcast it back to a DImode pseudo, then extract the individual floats.
There are many problems here. The most basic is that we shouldn't
use SLP for such a trivial example. But SLP should in principle be
beneficial for more complicated examples, so preventing SLP for the
example above just changes the reproducer needed. A more fundamental
problem is that it doesn't make sense to use single DImode pseudos in a
testcase like this. I have a WIP patch to allow re and im to be stored
in individual SFmode pseudos instead, but it's quite an invasive change
and might end up going nowhere.
A simpler problem to tackle is that we allow DImode pseudos to be stored
in FPRs, but we don't provide any patterns for inserting values into
them, even though INS makes that easy for element-like insertions.
This patch adds some patterns for that.
Doing that showed that aarch64_modes_tieable_p was too strict:
it didn't allow SFmode and DImode values to be tied, even though
both of them occupy a single GPR and FPR, and even though we allow
both classes to change between the modes.
The *aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS> pattern is
especially ugly, but it's not clear what target-independent
code ought to simplify it to, if it was going to simplify it.
We should probably do the same thing for extractions, but that's left
as future work.
After the patch we generate:
ins v0.s[1], v1.s[0]
ins v2.s[1], v3.s[0]
fadd v0.2s, v0.2s, v2.2s
fmov x0, d0
ushr d1, d0, 32
lsr w0, w0, 0
fmov s0, w0
ret
which seems like a step in the right direction.
All in all, there's nothing elegant about this patchh. It just
seems like the least worst option.
gcc/
PR target/109632
* config/aarch64/aarch64.cc (aarch64_modes_tieable_p): Allow
subregs between any scalars that are 64 bits or smaller.
* config/aarch64/iterators.md (SUBDI_BITS): New int iterator.
(bits_etype): New int attribute.
* config/aarch64/aarch64.md (*insv_reg<mode>_<SUBDI_BITS>)
(*aarch64_bfi<GPI:mode><ALLX:mode>_<SUBDI_BITS>): New patterns.
(*aarch64_bfidi<ALLX:mode>_subreg_<SUBDI_BITS>): Likewise.