This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/74585] SRA forces parameters to memory causing awful code generation
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 12 Aug 2016 09:30:33 +0000
- Subject: [Bug tree-optimization/74585] SRA forces parameters to memory causing awful code generation
- References: <bug-74585-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Bad case:
(mem/c:BLK (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 64 [0x40])) [1 a+0 S64 A128])
#0 set_decl_rtl (t=<parm_decl 0x2aaaac199000 a>, x=0x2aaaac1ae960)
at /space/rguenther/src/svn/trunk/gcc/emit-rtl.c:1282
#1 0x00000000008ad5ca in set_rtl (t=<parm_decl 0x2aaaac199000 a>,
x=0x2aaaac1ae960) at /space/rguenther/src/svn/trunk/gcc/cfgexpand.c:302
#2 0x00000000008b0236 in set_parm_rtl (parm=<parm_decl 0x2aaaac199000 a>,
x=0x2aaaac1ae960) at /space/rguenther/src/svn/trunk/gcc/cfgexpand.c:1275
#3 0x0000000000a7477b in assign_parm_setup_block (all=0x7fffffffd5c0,
parm=<parm_decl 0x2aaaac199000 a>, data=0x7fffffffd540)
at /space/rguenther/src/svn/trunk/gcc/function.c:3109
#4 0x0000000000a76f92 in assign_parms (
fndecl=<function_decl 0x2aaaac171a00 test_vecd8_rotate_left>)
at /space/rguenther/src/svn/trunk/gcc/function.c:3775
#5 0x0000000000a7afc6 in expand_function_start (
subr=<function_decl 0x2aaaac171a00 test_vecd8_rotate_left>)
at /space/rguenther/src/svn/trunk/gcc/function.c:5211
Good case:
(mem/c:BLK (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 128 [0x80])) [1 a+0 S64 A128])
So no difference apart from the frame offset (64 vs. 128). In the good case we have
;; _1 = a.vx0;
(insn 17 16 18 (set (reg:V2DF 176 [ _1 ])
(vec_select:V2DF (mem/c:V2DF (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 128 [0x80])) [2 a.vx0+0 S16 A128])
(parallel [
(const_int 1 [0x1])
(const_int 0 [0])
]))) t.c:18 -1
(nil))
and in the bad case
;; a$vx0_22 = MEM[(struct *)&a];
(insn 17 16 18 (set (reg:V2DF 191 [ a$vx0 ])
(vec_select:V2DF (mem/c:V2DF (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 64 [0x40])) [1 MEM[(struct *)&a]+0 S16 A128])
(parallel [
(const_int 1 [0x1])
(const_int 0 [0])
]))) -1
(nil))
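Again almost the same: apart from the pseudo number and the frame offset, only the MEM attributes (alias set 1 with MEM[(struct *)&a] instead of alias set 2 with a.vx0) and the missing source location differ. Parameter setup in the bad case: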
(insn 2 15 3 2 (set (reg:V4SI 183)
(reg:V4SI 79 2 [ a ])) t.c:14 -1
(nil))
(insn 3 2 4 2 (set (reg:V4SI 184)
(reg:V4SI 80 3 [ a+16 ])) t.c:14 -1
(nil))
(insn 4 3 5 2 (set (reg:V4SI 185)
(reg:V4SI 81 4 [ a+32 ])) t.c:14 -1
(nil))
(insn 5 4 6 2 (set (reg:V4SI 186)
(reg:V4SI 82 5 [ a+48 ])) t.c:14 -1
(nil))
(insn 6 5 7 2 (set (reg:V4SI 187)
(vec_select:V4SI (reg:V4SI 183)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 64 [0x40])) [1 a+0 S16 A128])
(vec_select:V4SI (reg:V4SI 187)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 8 7 9 2 (set (reg:V4SI 188)
(vec_select:V4SI (reg:V4SI 184)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 9 8 10 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 80 [0x50])) [1 a+16 S16 A128])
(vec_select:V4SI (reg:V4SI 188)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 10 9 11 2 (set (reg:V4SI 189)
(vec_select:V4SI (reg:V4SI 185)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 11 10 12 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 96 [0x60])) [1 a+32 S16 A128])
(vec_select:V4SI (reg:V4SI 189)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 12 11 13 2 (set (reg:V4SI 190)
(vec_select:V4SI (reg:V4SI 186)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 13 12 14 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 112 [0x70])) [1 a+48 S16 A128])
(vec_select:V4SI (reg:V4SI 190)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(note 14 13 17 2 NOTE_INSN_FUNCTION_BEG)
and in the good case:
(insn 2 15 3 2 (set (reg:V4SI 168)
(reg:V4SI 79 2 [ a ])) t.c:14 -1
(nil))
(insn 3 2 4 2 (set (reg:V4SI 169)
(reg:V4SI 80 3 [ a+16 ])) t.c:14 -1
(nil))
(insn 4 3 5 2 (set (reg:V4SI 170)
(reg:V4SI 81 4 [ a+32 ])) t.c:14 -1
(nil))
(insn 5 4 6 2 (set (reg:V4SI 171)
(reg:V4SI 82 5 [ a+48 ])) t.c:14 -1
(nil))
(insn 6 5 7 2 (set (reg:V4SI 172)
(vec_select:V4SI (reg:V4SI 168)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 128 [0x80])) [1 a+0 S16 A128])
(vec_select:V4SI (reg:V4SI 172)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 8 7 9 2 (set (reg:V4SI 173)
(vec_select:V4SI (reg:V4SI 169)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 9 8 10 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 144 [0x90])) [1 a+16 S16 A128])
(vec_select:V4SI (reg:V4SI 173)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 10 9 11 2 (set (reg:V4SI 174)
(vec_select:V4SI (reg:V4SI 170)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 11 10 12 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 160 [0xa0])) [1 a+32 S16 A128])
(vec_select:V4SI (reg:V4SI 174)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 12 11 13 2 (set (reg:V4SI 175)
(vec_select:V4SI (reg:V4SI 171)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(insn 13 12 14 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 176 [0xb0])) [1 a+48 S16 A128])
(vec_select:V4SI (reg:V4SI 175)
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
(const_int 0 [0])
(const_int 1 [0x1])
]))) t.c:14 -1
(nil))
(note 14 13 17 2 NOTE_INSN_FUNCTION_BEG)
Again exactly the same, modulo pseudo register numbers and frame offsets. There must be
downstream effects during RTL optimization that cause the whole issue.
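
The "same modulo pseudo numbers and frame offsets" comparison above can be mechanized. The following is only a rough sketch, not a GCC tool: the function name normalize_rtl and both regular expressions are hypothetical, and they assume the dump syntax seen in this comment (pseudos printed as a bare number, hard and virtual registers printed with a name after the number, frame addresses built from virtual-stack-vars plus a const_int).

```python
import re

# Pseudo registers as printed in the dumps above: "(reg:V4SI 183)" or
# "(reg:V2DF 176 [ _1 ])".  Hard and virtual registers carry a name right
# after the number ("79 2", "150 virtual-stack-vars") and are left alone.
PSEUDO = re.compile(r"\((reg(?:/\w+)*:\w+) (\d+)(?=\)| \[)")

# A frame address: virtual-stack-vars plus a constant offset.
FRAME = re.compile(
    r"(virtual-stack-vars\))\s*\(const_int (-?\d+) \[0x[0-9a-f]+\]\)")


def normalize_rtl(text):
    """Rename pseudos by first appearance and rebase frame offsets."""
    pseudos = {}
    base = None

    def rename_pseudo(m):
        # First pseudo seen becomes P0, the next new one P1, and so on.
        idx = pseudos.setdefault(m.group(2), len(pseudos))
        return f"({m.group(1)} P{idx}"

    def rebase_offset(m):
        # Express every frame offset relative to the first one seen.
        nonlocal base
        off = int(m.group(2))
        if base is None:
            base = off
        return f"{m.group(1)} (const_int BASE+{off - base})"

    text = PSEUDO.sub(rename_pseudo, text)
    text = FRAME.sub(rebase_offset, text)
    # Collapse whitespace so line wrapping does not matter.
    return " ".join(text.split())


bad = """(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 64 [0x40])) [1 a+0 S16 A128])
(vec_select:V4SI (reg:V4SI 187)"""
good = """(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
(const_int 128 [0x80])) [1 a+0 S16 A128])
(vec_select:V4SI (reg:V4SI 172)"""

print(normalize_rtl(bad) == normalize_rtl(good))
```

Running both expand dumps through such a normalization and diffing the results would leave only real differences, such as the MEM attributes on insn 17.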