Bug 49526 - extra move instruction for smmul
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.7.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 68536
Blocks:
Reported: 2011-06-24 22:44 UTC by Siarhei Siamashka
Modified: 2021-08-16 08:13 UTC (History)

See Also:
Host:
Target: arm
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-06-27 21:59:03


Description Siarhei Siamashka 2011-06-24 22:44:06 UTC
$ cat test.c

int smmul(int a, int b) { return ((long long)a * b) >> 32; }

$ arm-none-linux-gnueabi-gcc -O2 -S -mcpu=cortex-a8 test.c
$ cat test.s
        .cpu cortex-a8
        .fpu softvfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 2
        .eabi_attribute 18, 4
        .file   "test.c"
        .text
        .align  2
        .global smmul
        .type   smmul, %function
smmul:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        smull   r0, r1, r0, r1
        mov     r0, r1
        bx      lr
        .size   smmul, .-smmul
        .ident  "GCC: (GNU) 4.7.0 20110624 (experimental)"
        .section        .note.GNU-stack,"",%progbits
Comment 1 Siarhei Siamashka 2011-06-24 22:48:46 UTC
And clang 2.9 has no problems optimizing this code:

$ cat test.c

int smmul(int a, int b) { return ((long long)a * b) >> 32; }

$ clang -ccc-host-triple arm-none-linux -O2 -mcpu=cortex-a8 -S test.c
$ cat test.s
        .syntax unified
        .cpu cortex-a8
        .eabi_attribute 6, 10
        .eabi_attribute 7, 65
        .eabi_attribute 8, 1
        .eabi_attribute 9, 2
        .fpu neon
        .eabi_attribute 10, 3
        .eabi_attribute 12, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .file   "test.c"
        .text
        .globl  smmul
        .align  2
        .type   smmul,%function
smmul:
        smmul   r0, r1, r0
        bx      lr
.Ltmp0:
        .size   smmul, .Ltmp0-smmul
Comment 2 Richard Earnshaw 2011-06-27 21:58:26 UTC
Confirmed.  Also need patterns for the accumulate and subtract variants, plus rounding variants.
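For reference, the variants mentioned above can be written out in portable C. This is only a sketch of the arithmetic that the ARM instructions smmul, smmla, smmls and smmulr perform; the function names (smmul_ref and so on) and the high_word helper are made up for illustration, and nothing here claims that GCC or the proposed patch recognizes these exact source forms. Unsigned 64-bit arithmetic is used so the intermediate sums cannot overflow a signed type.

```c
#include <stdint.h>

/* Top 32 bits of a 64-bit value, reinterpreted as a signed 32-bit integer.
   The unsigned-to-signed conversion is implementation-defined for values
   above INT32_MAX; GCC and clang wrap modulo 2^32. */
static int32_t high_word(uint64_t x)
{
    return (int32_t)(uint32_t)(x >> 32);
}

/* smmul: most significant word of the signed 32x32 -> 64 product. */
int32_t smmul_ref(int32_t a, int32_t b)
{
    return high_word((uint64_t)((int64_t)a * b));
}

/* smmla: most significant word of (acc << 32) + a * b. */
int32_t smmla_ref(int32_t acc, int32_t a, int32_t b)
{
    return high_word(((uint64_t)(uint32_t)acc << 32)
                     + (uint64_t)((int64_t)a * b));
}

/* smmls: most significant word of (acc << 32) - a * b. */
int32_t smmls_ref(int32_t acc, int32_t a, int32_t b)
{
    return high_word(((uint64_t)(uint32_t)acc << 32)
                     - (uint64_t)((int64_t)a * b));
}

/* smmulr: rounding variant, adds 0x80000000 before taking the top word. */
int32_t smmulr_ref(int32_t a, int32_t b)
{
    return high_word((uint64_t)((int64_t)a * b) + 0x80000000u);
}
```

Ideally each of these would compile to a single multiply instruction followed by the return, with no extra mov.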
Comment 3 ktkachov 2015-11-09 11:19:00 UTC
A patch for the smmul, smmla and smmls instructions is proposed at:
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html

Adding patterns for the rounding variants seems to expose an LRA ICE that I'll analyse and report separately.
Comment 4 Richard Earnshaw 2017-10-20 13:46:40 UTC
I'm clearly not working on this one...
Comment 5 Matthijs van Duin 2017-11-17 12:19:20 UTC
So... what happened to this patch? Why was it never applied?