[Bug rtl-optimization/83628] New: performance regression when accessing arrays on alpha
mikulas at artax dot karlin.mff.cuni.cz
gcc-bugzilla@gcc.gnu.org
Sat Dec 30 14:58:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83628
Bug ID: 83628
Summary: performance regression when accessing arrays on alpha
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: mikulas at artax dot karlin.mff.cuni.cz
Target Milestone: ---
The alpha architecture has instructions s4add, s8add, s4sub and s8sub. These
instructions multiply the first argument by 4 or 8 (a left shift by 2 or 3
bits) and add or subtract the second argument.
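As a rough illustration of those semantics, the quadword forms can be modelled in C as below. This is only a sketch: the function names are hypothetical, chosen to mirror the mnemonics, and the operand order follows the disassembly in this report (first operand scaled, second operand added or subtracted).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C models of the Alpha scaled add/subtract instructions
   (64-bit "q" forms). The first operand is scaled, the second is
   added to or subtracted from the scaled value. */
static uint64_t s4addq(uint64_t a, uint64_t b) { return (a << 2) + b; }
static uint64_t s8addq(uint64_t a, uint64_t b) { return (a << 3) + b; }
static uint64_t s4subq(uint64_t a, uint64_t b) { return (a << 2) - b; }
static uint64_t s8subq(uint64_t a, uint64_t b) { return (a << 3) - b; }

/* Small self-check of the models above. */
static void scaled_ops_selfcheck(void)
{
    assert(s4addq(3, 5) == 17); /* 3*4 + 5 */
    assert(s8addq(3, 5) == 29); /* 3*8 + 5 */
    assert(s4subq(3, 5) == 7);  /* 3*4 - 5 */
    assert(s8subq(3, 5) == 19); /* 3*8 - 5 */
}
```

With this model, indexing an array of 4-byte elements corresponds to s4addq(idx, base), and an array of 8-byte elements to s8addq(idx, base).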
GCC versions 6 and 7 are not capable of using these instructions to perform
the shift and the addition together. They always generate these instructions
with the second argument zero and generate a separate add or sub instruction
afterwards. GCC 5 and earlier use these instructions correctly.
These instructions are used to access arrays, and thus this bug causes a
slowdown of any code that works with arrays.
Example:
$ cat index.c
int get_int(int *p, long idx)
{
	return p[idx];
}
long long get_long_long(long long *p, long idx)
{
	return p[idx];
}
$ alpha-linux-gnu-gcc-6 -c -O2 index.c
$ alpha-linux-gnu-objdump -d index.o
0000000000000000 <get_int>:
0: 51 14 20 42 s4addq a1,0,a1
4: 11 04 11 42 addq a0,a1,a1
8: 00 00 11 a0 ldl v0,0(a1)
c: 01 80 fa 6b ret
0000000000000010 <get_long_long>:
10: 51 16 20 42 s8addq a1,0,a1
14: 11 04 11 42 addq a0,a1,a1
18: 00 00 11 a4 ldq v0,0(a1)
1c: 01 80 fa 6b ret
$ cat s4add.c
unsigned long s4a(unsigned long a, unsigned long b)
{
	return a + b * 4;
}
unsigned long s8a(unsigned long a, unsigned long b)
{
	return a + b * 8;
}
$ alpha-linux-gnu-gcc-6 -c -O2 s4add.c
$ alpha-linux-gnu-objdump -d s4add.o
0000000000000000 <s4a>:
0: 51 14 20 42 s4addq a1,0,a1
4: 00 04 30 42 addq a1,a0,v0
8: 01 80 fa 6b ret
c: 00 00 fe 2f unop
0000000000000010 <s8a>:
10: 51 16 20 42 s8addq a1,0,a1
14: 00 04 30 42 addq a1,a0,v0
18: 01 80 fa 6b ret
1c: 00 00 fe 2f unop
With GCC 5 and earlier, optimal code is generated:
$ alpha-linux-gnu-gcc-5 -c -O2 index.c
$ alpha-linux-gnu-objdump -d index.o
0000000000000000 <get_int>:
0: 51 04 30 42 s4addq a1,a0,a1
4: 00 00 11 a0 ldl v0,0(a1)
8: 01 80 fa 6b ret
c: 00 00 fe 2f unop
0000000000000010 <get_long_long>:
10: 51 06 30 42 s8addq a1,a0,a1
14: 00 00 11 a4 ldq v0,0(a1)
18: 01 80 fa 6b ret
1c: 00 00 fe 2f unop
$ alpha-linux-gnu-gcc-5 -c -O2 s4add.c
$ alpha-linux-gnu-objdump -d s4add.o
0000000000000000 <s4a>:
0: 40 04 30 42 s4addq a1,a0,v0
4: 01 80 fa 6b ret
8: 1f 04 ff 47 nop
c: 00 00 fe 2f unop
0000000000000010 <s8a>:
10: 40 06 30 42 s8addq a1,a0,v0
14: 01 80 fa 6b ret
18: 1f 04 ff 47 nop
1c: 00 00 fe 2f unop
I bisected the problem and it is caused by this commit:
commit fabf26080cb4cc3fecd30d409ec9c63f0ec42eff
Author: vekumar <vekumar@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Thu May 7 10:47:54 2015 +0000
2015-05-07 Venkataramanan Kumar <venkataramanan.kumar@amd.com>
* combine.c (make_compound_operation): Remove checks for PLUS/MINUS
rtx type.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@222874
138bc75d-0d04-0410-961f-82ee72b054a4
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-05-07 Venkataramanan Kumar <venkataramanan.kumar@amd.com>
+
+ * combine.c (make_compound_operation): Remove checks for PLUS/MINUS
+ rtx type.
+
2015-05-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/66002
diff --git a/gcc/combine.c b/gcc/combine.c
index c04146ae645..9e3eb030a63 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7723,9 +7723,8 @@ extract_left_shift (rtx x, int count)
We try, as much as possible, to re-use rtl expressions to save memory.
IN_CODE says what kind of expression we are processing. Normally, it is
- SET. In a memory address (inside a MEM, PLUS or minus, the latter two
- being kludges), it is MEM. When processing the arguments of a comparison
- or a COMPARE against zero, it is COMPARE. */
+ SET. In a memory address it is MEM. When processing the arguments of
+ a comparison or a COMPARE against zero, it is COMPARE. */
rtx
make_compound_operation (rtx x, enum rtx_code in_code)
@@ -7745,8 +7744,6 @@ make_compound_operation (rtx x, enum rtx_code in_code)
but once inside, go back to our default of SET. */
next_code = (code == MEM ? MEM
- : ((code == PLUS || code == MINUS)
- && SCALAR_INT_MODE_P (mode)) ? MEM
: ((code == COMPARE || COMPARISON_P (x))
&& XEXP (x, 1) == const0_rtx) ? COMPARE
: in_code == COMPARE ? SET : in_code);
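The effect of the removed lines on the context selection can be sketched in isolation as follows. This is a simplified model, not GCC code: the enum, the function names and the two boolean parameters are hypothetical stand-ins (scalar_int for SCALAR_INT_MODE_P (mode), cmp_zero for the "comparison against zero" test in the original expression).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for GCC's rtx codes; not the real definitions. */
enum code { SET, MEM, PLUS, MINUS, COMPARE };

/* Before the commit: operands of an integer PLUS or MINUS were
   processed as if they were inside a memory address. */
static enum code next_code_old(enum code code, enum code in_code,
                               bool scalar_int, bool cmp_zero)
{
    return code == MEM ? MEM
         : ((code == PLUS || code == MINUS) && scalar_int) ? MEM
         : cmp_zero ? COMPARE
         : in_code == COMPARE ? SET : in_code;
}

/* After the commit: PLUS and MINUS no longer switch to MEM context;
   scalar_int is kept only so both functions share a signature. */
static enum code next_code_new(enum code code, enum code in_code,
                               bool scalar_int, bool cmp_zero)
{
    (void) scalar_int;
    return code == MEM ? MEM
         : cmp_zero ? COMPARE
         : in_code == COMPARE ? SET : in_code;
}

/* Shows the one case where the two versions differ. */
static void next_code_selfcheck(void)
{
    assert(next_code_old(PLUS, SET, true, false) == MEM);
    assert(next_code_new(PLUS, SET, true, false) == SET);
}
```

For an expression such as a + b * 4 outside a memory reference, the outer code is PLUS, so its operands were formerly processed in MEM context and now inherit the caller's context (SET in this model). How that change interacts with the alpha instruction patterns is not modelled here.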