This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/82038] New: Very poor optimization of constant multiply on ARM Cortex-M7
- From: "headch at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 30 Aug 2017 15:36:07 +0000
- Subject: [Bug target/82038] New: Very poor optimization of constant multiply on ARM Cortex-M7
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82038
Bug ID: 82038
Summary: Very poor optimization of constant multiply on ARM Cortex-M7
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: headch at gmail dot com
Target Milestone: ---
Consider the following source code:
#include <stdint.h>

int64_t f(int32_t x) {
    return x * 16384LL;
}

int64_t g(int32_t x) {
    return static_cast<int64_t>(x) << 14;
}
Compile it with the following command:
armv7m-none-eabihf-g++ -ffreestanding -Wall -Wextra -O2 -mcpu=cortex-m7 -std=c++17 -c test.cpp
It produces the following code:
00000000 <_Z1fl>:
   0:   b430            push    {r4, r5}
   2:   17c5            asrs    r5, r0, #31
   4:   4603            mov     r3, r0
   6:   0380            lsls    r0, r0, #14
   8:   03a9            lsls    r1, r5, #14
   a:   bc30            pop     {r4, r5}
   c:   ea41 4193       orr.w   r1, r1, r3, lsr #18
  10:   4770            bx      lr
  12:   bf00            nop

00000014 <_Z1gl>:
  14:   4601            mov     r1, r0
  16:   0380            lsls    r0, r0, #14
  18:   1489            asrs    r1, r1, #18
  1a:   4770            bx      lr
The implementation of f could be the same as g, yet it’s really quite awful:
eight instructions instead of four, including a push and pop of r4 and r5 even
though r4 is never used.
Changing -mcpu=cortex-m7 to -mcpu=cortex-m4 doesn’t affect g. It yields rather
better code for f than the M7 build, but still not as good as g.
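
To spell out why the two-shift sequence in g's listing is a complete 64-bit
multiply by 16384 (2^14), here is a minimal sketch that rebuilds the result
from its two 32-bit words. The helper names low_word, high_word and combine are
hypothetical and only mirror the lsls #14 / asrs #18 pair. It assumes that
right-shifting a negative signed value is an arithmetic shift, which is
implementation-defined before C++20 but is what GCC does.

```cpp
#include <cstdint>

// Low 32 bits of x * 2^14: a plain logical left shift, done on the
// unsigned representation so no signed overflow is involved.
constexpr uint32_t low_word(int32_t x) {
    return static_cast<uint32_t>(x) << 14;
}

// High 32 bits of x * 2^14: the top 14 bits of x, sign-extended.
// Assumption: >> on a negative value is an arithmetic shift (true for GCC).
constexpr int32_t high_word(int32_t x) {
    return x >> 18;
}

// Reassemble the 64-bit value as high * 2^32 + low.
constexpr int64_t combine(int32_t x) {
    return static_cast<int64_t>(high_word(x)) * 4294967296LL + low_word(x);
}

static_assert(combine(1) == 16384, "1 * 2^14");
static_assert(combine(-1) == -16384, "-1 * 2^14");
```

Under that assumption, combine(x) should agree with x * 16384LL for every
int32_t x, which is why nothing more than the two shifts is needed.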
I could just use g, but that isn’t really a good option because left-shifting a
negative number is UB in C++17.
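
For illustration, one way to sidestep the UB is to do the shift on the unsigned
representation and convert back. This is only a sketch: the name
shift14_via_unsigned is hypothetical, the conversion back to int64_t is
implementation-defined before C++20 (GCC documents it as reduction modulo
2^64), and I have not checked what code GCC generates for it on Cortex-M7.

```cpp
#include <cstdint>

// Multiply by 2^14 without left-shifting a negative signed value.
constexpr int64_t shift14_via_unsigned(int32_t x) {
    // Sign-extend to 64 bits first, then reinterpret as unsigned.
    const uint64_t bits = static_cast<uint64_t>(static_cast<int64_t>(x));
    // Unsigned left shift is always well defined (it wraps modulo 2^64).
    // Assumption: converting back to int64_t wraps modulo 2^64, as on GCC.
    return static_cast<int64_t>(bits << 14);
}

static_assert(shift14_via_unsigned(1) == 16384, "1 * 2^14");
static_assert(shift14_via_unsigned(-1) == -16384, "-1 * 2^14");
```

Since the true product always fits in 46 bits plus sign, the wrap-around never
loses information here, so the result should match x * 16384LL for all inputs.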