[Bug rtl-optimization/42575] arm-eabi-gcc 64-bit multiply weirdness
ktkachov at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Nov 17 16:23:00 GMT 2014
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42575
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
--- Comment #13 from ktkachov at gcc dot gnu.org ---
So I see this regression still, but only for some -mcpu options.
For example for -mcpu=cortex-a15 we get:
mul r3, r0, r3
strd r4, [sp, #-8]!
umull r4, r5, r0, r2
mla r1, r2, r1, r3
mov r0, r4
add r5, r1, r5
mov r1, r5
ldrd r4, [sp]
add sp, sp, #8
whereas for cortex-a7 we get:
mul r3, r0, r3
mla r3, r2, r1, r3
umull r0, r1, r0, r2
add r1, r3, r1
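For readers following the two listings, here is a rough C sketch of the arithmetic that the four-instruction cortex-a7 sequence performs. It is only an illustration: the function name mul64_by_parts is made up, the original test case is not quoted in this comment, and the register mapping in the comments (r0/r1 holding the low/high words of the first operand, r2/r3 those of the second) assumes a little-endian AAPCS target.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the bug report: computes the low
   64 bits of a 64x64-bit product from 32-bit pieces, mirroring the
   mul / mla / umull / add sequence.  Register names in the comments
   assume r0:r1 = a (low:high) and r2:r3 = b (low:high). */
uint64_t mul64_by_parts(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Cross terms only affect the high word, so 32-bit wrapping
       arithmetic is sufficient here. */
    uint32_t cross = a_lo * b_hi;           /* mul   r3, r0, r3     */
    cross += b_lo * a_hi;                   /* mla   r3, r2, r1, r3 */

    /* Full unsigned 32x32 -> 64 product of the two low words. */
    uint64_t low = (uint64_t)a_lo * b_lo;   /* umull r0, r1, r0, r2 */

    /* Fold the cross terms into the high word of the result. */
    uint32_t hi = (uint32_t)(low >> 32) + cross;   /* add r1, r3, r1 */

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

The low 64 bits of the product are the same for signed and unsigned operands, so this also covers a long long multiply. The cortex-a15 listing computes the same value; its extra instructions are only the r4/r5 save and restore plus the moves back into r0/r1.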
I think the problem here is reload.
If I look at the dump of postreload, for the 'bad' RTL I see:
r0(SI) := r0(SI)
r3(SI) := r0(SI) * r3(SI)
r4(DI) := r0(SI) * r2(SI) //with zero extension (umull)
r1(SI) := r2(SI) * r1(SI) + r3(SI)
r5(SI) := r1(SI) + r5(SI)
r0(DI) := r4(DI)
whereas for the good one I see:
r0(SI) := r0(SI)
r3(SI) := r0(SI) * r3(SI)
r3(SI) := r2(SI) * r1(SI) + r3(SI)
r0(DI) := r0(SI) * r2(SI) //with zero extension (umull)
r1(SI) := r3(SI) + r1(SI)
r0(DI) := r0(DI)
In the good one the final insn is eliminated due to being dead, whereas in
the bad one the final DImode move is split into two moves.
Sched1 changed the order of the mult and mult-accumulate, but it's the register
allocator that causes the bad codegen.