This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077
- From: "rguenther at suse dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 29 Jan 2018 10:24:21 +0000
- Subject: [Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077
- Auto-submitted: auto-generated
- References: <bug-84067-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
>
> --- Comment #3 from ktkachov at gcc dot gnu.org ---
> (In reply to Richard Biener from comment #2)
> > So any hint on whether the code after r257077 is better or worse than before?
>
> Looks worse unfortunately:
> For aarch64 at -O2 it generates:
> foo:
>         mov     w3, 44
>         mov     w2, 40
>         mov     w5, 1
>         mov     w4, 2
>         smull   x3, w1, w3
>         smull   x2, w1, w2
>         str     w5, [x0, x3]
>         add     x2, x2, 400
>         add     x1, x2, x1, sxtw 2
>         str     w4, [x0, x1]
>         ret
>
> whereas with r257077 it generates the shorter:
> foo:
>         mov     w3, 40
>         sxtw    x2, w1
>         mov     w4, 1
>         smaddl  x0, w1, w3, x0
>         mov     w3, 2
>         add     x1, x0, x2, lsl 2
>         str     w4, [x0, x2, lsl 2]
>         str     w3, [x1, 400]
>         ret
So shorter is worse? Might be because I don't understand the
difference between the 'lsl 2' and the 'sxtw 2' or the cost
of the [x1, 400] addressing.
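
For reference, here is a rough C sketch of the address arithmetic in the two quoted listings. It is not taken from the testcase or from GCC; the function names are made up for illustration, and offsets are modelled in bytes relative to the incoming x0. It assumes the usual AArch64 operand semantics: 'x1, sxtw 2' sign-extends the low 32 bits of x1 and then shifts left by 2, while 'x2, lsl 2' shifts the full 64-bit x2 left by 2 (here x2 already holds the sign-extended index, so the two should agree). It says nothing about instruction cost.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets from the incoming x0 for the two stores, following the
   longer listing (two smull instructions).  Hypothetical helper. */
static void offsets_long(int32_t idx, int64_t *first, int64_t *second)
{
    int64_t x3 = (int64_t)idx * 44;      /* smull x3, w1, w3   (w3 = 44) */
    int64_t x2 = (int64_t)idx * 40;      /* smull x2, w1, w2   (w2 = 40) */
    *first = x3;                         /* str w5, [x0, x3] */
    x2 += 400;                           /* add x2, x2, 400 */
    /* add x1, x2, x1, sxtw 2: sign-extend w1, scale by 4, add to x2 */
    *second = x2 + (int64_t)idx * 4;     /* str w4, [x0, x1] */
}

/* The same two offsets, following the shorter listing (one smaddl). */
static void offsets_short(int32_t idx, int64_t *first, int64_t *second)
{
    int64_t x2 = (int64_t)idx;           /* sxtw x2, w1 */
    int64_t x0 = (int64_t)idx * 40;      /* smaddl x0, w1, w3, x0 (minus old x0) */
    int64_t x1 = x0 + x2 * 4;            /* add x1, x0, x2, lsl 2 */
    *first = x0 + x2 * 4;                /* str w4, [x0, x2, lsl 2] */
    *second = x1 + 400;                  /* str w3, [x1, 400] */
}

/* Returns 1 if both listings address the same two locations for idx. */
static int same_offsets(int32_t idx)
{
    int64_t a1, a2, b1, b2;
    offsets_long(idx, &a1, &a2);
    offsets_short(idx, &b1, &b2);
    return a1 == b1 && a2 == b2;
}
```

Under these assumptions both sequences store to x0 + 44*idx and x0 + 44*idx + 400, so any difference between them would be in instruction count and addressing-mode cost rather than in the addresses computed.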