lower-subreg.c:compute_costs() only cares about the cost of a (set (reg)
(const_int )) move but I think the intention, at least for now, is to
return extra_cost->vect.alu for all the vector operations.
Almost, what we want at the moment is COSTS_N_INSNS (1) +
extra_cost->vect.alu
Thanks Kyrill for the review.
Regression tested on aarch64-linux-gnu with no new regression. Is this
OK for trunk?
Are you sure it's a (set (reg) (const_int)) that's being costed here? I
thought for moves into vecto registers it would be a (set (reg)
(const_vector)) which we don't handle in our rtx costs currently. I
think the correct approach would be to extend the aarch64_rtx_costs
switch statement to handle the CONST_VECT case. I believe you can use
aarch64_simd_valid_immediate to check whether x is a valid immediate for
a simd instruction and give it a cost of extra_cost->vect.alu. The logic
should be similar to the CONST_INT case.
Sorry about the (set (reg) (const_int)) above. But the actual RTL that
is being split at 220r.subreg2 is
(insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0)
(subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
/home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625
800 {*aarch64_simd_movv4sf}
(nil))
And also, if we return RTX cost above COSTS_N_INSNS (1), it will be
split and it dosent recover from there. Therefore we need something like
the below to prevent that happening.