Test case: -- #include <stdint.h> struct twoints { uint64_t a, b; } foo(); void bar(uint64_t a, uint64_t b); void func() { struct twoints s = foo(); bar(s.a, s.b); } -- $ gcc -save-temps -Wall -c -o testbad.o -msse2 -O3 -fomit-frame-pointer testbad.c $ objdump -d -r -M intel testbad.o testbad.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <func>: 0: 48 83 ec 28 sub rsp,0x28 4: 31 c0 xor eax,eax 6: e8 00 00 00 00 call b <func+0xb> 7: R_X86_64_PC32 foo-0x4 b: 48 89 04 24 mov QWORD PTR [rsp],rax f: 48 89 54 24 08 mov QWORD PTR [rsp+0x8],rdx 14: 48 89 d6 mov rsi,rdx 17: 48 89 44 24 10 mov QWORD PTR [rsp+0x10],rax 1c: 48 89 54 24 18 mov QWORD PTR [rsp+0x18],rdx 21: 48 89 c7 mov rdi,rax 24: 48 83 c4 28 add rsp,0x28 28: e9 00 00 00 00 jmp 2d <func+0x2d> 29: R_X86_64_PC32 bar-0x4 -- As you can see above, rax and rdx are stored to the stack twice, but these stores are unnecessary. $ gcc -v Using built-in specs. Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
Confirmed. We already expand it that way: ;; s = foo (); (insn 5 4 6 t.c:7 (set (reg:QI 0 ax) (const_int 0 [0x0])) -1 (nil)) (call_insn 6 5 7 t.c:7 (set (parallel:BLK [ (expr_list:REG_DEP_TRUE (reg:DI 0 ax) (const_int 0 [0x0])) (expr_list:REG_DEP_TRUE (reg:DI 1 dx) (const_int 8 [0x8])) ]) (call (mem:QI (symbol_ref:DI ("foo") [flags 0x41] <function_decl 0x7ffff5aed500 foo>) [0 S1 A8]) (const_int 0 [0x0]))) -1 (nil) (expr_list:REG_DEP_TRUE (use (reg:QI 0 ax)) (nil))) (insn 7 6 8 t.c:7 (set (reg:DI 60) (reg:DI 0 ax)) -1 (nil)) (insn 8 7 9 t.c:7 (set (reg:DI 61) (reg:DI 1 dx)) -1 (nil)) (insn 9 8 10 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -32 [0xffffffffffffffe0])) [2 S8 A128]) (reg:DI 60)) -1 (nil)) (insn 10 9 11 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -24 [0xffffffffffffffe8])) [2 S8 A64]) (reg:DI 61)) -1 (nil)) (insn 11 10 12 t.c:7 (set (reg:DI 62) (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -32 [0xffffffffffffffe0])) [2 S8 A128])) -1 (nil)) (insn 12 11 13 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -16 [0xfffffffffffffff0])) [2 s+0 S8 A128]) (reg:DI 62)) -1 (nil)) (insn 13 12 14 t.c:7 (set (reg:DI 63) (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -24 [0xffffffffffffffe8])) [2 S8 A64])) -1 (nil)) (insn 14 13 0 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -8 [0xfffffffffffffff8])) [2 s+8 S8 A64]) (reg:DI 63)) -1 (nil)) So we create a stack representation to copy it to the stack temporary (which both are unneeded). We should see that we can avoid the temporary at all as there is no aggregate use of the struct left. At least we should avoid the 2nd temporary. I'm very sure there is a duplicate for this bug somewhere. Also I wonder why RTL DSE cannot remove all the stores to the stack.
RTL DSE doesn't handle this because the call to bar, which isn't a const function, is considered a wild read and thus makes all stores necessary in the global as well as local algorithm. RTL DSE doesn't consider whether a frame based address could have possibly address taken or not and whether a call thus might read it or not. For tail calls before reload, perhaps we could handle tail calls similarly to const calls (be a read of all argument stores) with the addition that it would act as a read for all constant address stores (basically wild read for anything but frame based stores for the global algorithm, given that the stack is unwound before the tail call).
At leas the stores to s have alias info: (insn 12 10 14 2 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 20 frame) (const_int -16 [0xfffffffffffffff0])) [2 s+0 S8 A128]) (reg:DI 60)) 89 {*movdi_1_rex64} (nil)) (insn 14 12 17 2 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 20 frame) (const_int -8 [0xfffffffffffffff8])) [2 s+8 S8 A64]) (reg:DI 61)) 89 {*movdi_1_rex64} (nil)) so RTL DSE could check whether the stack slot is aliased at all. The other memory temporary should be avoided at expansion time already.
This seems to happen even with POD return types: int foo(); void bar(int a); void func() { bar(foo()); } In 32-bit mode it spills the return value to the stack for no reason. It also seems to overallocate the stack (28 bytes allocated, only 4 used): 00000000 <func>: 0: 83 ec 1c sub esp,0x1c 3: e8 fc ff ff ff call 4 <func+0x4> 4: R_386_PC32 foo 8: 89 04 24 mov DWORD PTR [esp],eax b: e8 fc ff ff ff call c <func+0xc> c: R_386_PC32 bar 10: 83 c4 1c add esp,0x1c 13: c3 ret In 64-bit mode there is no store, but it *does* allocate 8 bytes of stack that it never uses: 0000000000000000 <func>: 0: 48 83 ec 08 sub rsp,0x8 4: 31 c0 xor eax,eax 6: e8 00 00 00 00 call b <func+0xb> 7: R_X86_64_PC32 foo+0xfffffffffffffffc b: 48 83 c4 08 add rsp,0x8 f: 89 c7 mov edi,eax 11: e9 00 00 00 00 jmp 16 <func+0x16> 12: R_X86_64_PC32 bar+0xfffffffffffffffc Any idea how hard this bug is to fix?
>In 32-bit mode it spills the return value to the stack for no reason. Huh? arguments are passed via the stack in 32bit mode.
>In 64-bit mode there is no store, but it *does* allocate 8 bytes of stack that it never uses: Oh no that is called aligning the stack to be 16 byte aligned. >It also seems to overallocate the stack (28 bytes allocated, only 4 used): No, it is not over allocating the stack really, it does align it be to 16 byte aligned though.
I must have been on crack when I wrote that last comment. Sorry for the noise. Though I do wonder how difficult the original bug is to fix. This seems to make it more expensive to return structures by value.
I found another test case for this. I thought I'd post it since it's extremely different than the original one. -- class Foo { public: virtual ~Foo() {} virtual void DoSomething() = 0; }; void foo(Foo *f, void (Foo::*member)()) { (f->*member)(); } -- $ g++ -c -O3 -fomit-frame-pointer test.cc $ objdump -M intel -d test.o test.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <_Z3fooP3FooMS_FvvE>: 0: 40 f6 c6 01 test sil,0x1 4: 48 89 74 24 e8 mov QWORD PTR [rsp-0x18],rsi 9: 48 89 54 24 f0 mov QWORD PTR [rsp-0x10],rdx e: 74 10 je 20 <_Z3fooP3FooMS_FvvE+0x20> 10: 48 01 d7 add rdi,rdx 13: 48 8b 07 mov rax,QWORD PTR [rdi] 16: 48 8b 74 30 ff mov rsi,QWORD PTR [rax+rsi*1-0x1] 1b: ff e6 jmp rsi 1d: 0f 1f 00 nop DWORD PTR [rax] 20: 48 01 d7 add rdi,rdx 23: ff e6 jmp rsi -- We spilled rsi and rdx to the stack (in the red zone, it appears) for no reason (AFAICS).
Created attachment 23968 [details] Patch to dse.c to be less conservative with calls. Currently dse kills all stores on a call since call can do a wild read. But calls can not read off frame unless it is a local variable that can escape. This patch ensures frame based stores are not killed on a call if they can't escape. For the first struct return case, this removes the redundant stores to the local variable. Does this look reasonable?
Created attachment 23987 [details] Fixes a bug in the previous patch
Richard, did you have something like that in mind when writing comment #3? What aliasing info do we have at the RTL level now?
Yes, something like that. Though can_escape can be made simpler and more precise by bool can_escape (tree expr) { tree base; if (!expr) return true; base = get_base_address (expr); if (DECL_P (base) && (!may_be_aliased (base) || !pt_solution_includes (&cfun->gimple_df.escaped, base))) return true; return false; }
Richard, did you mean to write static bool can_escape (tree expr) { tree base; if (!expr) return true; base = get_base_address (expr); if (DECL_P (base) && (!may_be_aliased (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base))) return false; return true; } Only case when we know it doesn't escape is if bas is a DECL_P and is not in cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it is sufficient to test just DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base).
On Fri, 15 Apr 2011, eraman at google dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > Easwaran Raman <eraman at google dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |eraman at google dot com > > --- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-04-15 19:18:25 UTC --- > Richard, did you mean to write > > static bool > can_escape (tree expr) > { > tree base; > if (!expr) > return true; > base = get_base_address (expr); > if (DECL_P (base) > && (!may_be_aliased (base) > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > return false; > return true; > } > > Only case when we know it doesn't escape is if bas is a DECL_P and is not in > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it > is sufficient to test just > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base). No, because if the escaped solution for example includes ANYTHING then the test will return true. That !may-aliased variables are not contained in ANYTHING isn't known w/o context. Richard.
(In reply to comment #14) > On Fri, 15 Apr 2011, eraman at google dot com wrote: > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > Easwaran Raman <eraman at google dot com> changed: > > > > What |Removed |Added > > ---------------------------------------------------------------------------- > > CC| |eraman at google dot com > > > > --- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-04-15 19:18:25 UTC --- > > Richard, did you mean to write > > > > static bool > > can_escape (tree expr) > > { > > tree base; > > if (!expr) > > return true; > > base = get_base_address (expr); > > if (DECL_P (base) > > && (!may_be_aliased (base) > > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > > return false; > > return true; > > } > > > > Only case when we know it doesn't escape is if bas is a DECL_P and is not in > > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it > > is sufficient to test just > > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base). > > No, because if the escaped solution for example includes ANYTHING then > the test will return true. That !may-aliased variables are not > contained in ANYTHING isn't known w/o context. > > Richard. Correct me if I am wrong. If I understand you right, just using DECL_P (base) && !pt_solution_includes is conservative since pt_solution_includes may return true if the escaped solution contains ANYTHING. To make it less conservative, you're suggesting if (DECL_P (base) && (!may_be_aliased (base) || !pt_solution_includes (&cfun->gimple_df->escaped, base))) return false; I tried that and most Fortran tests are failing. One of the tests (default_format_1.f90) has the following RTL sequence: (insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame) (const_int -608 [0xfffffffffffffda0])) [2 dt_parm.0.common.flags+0 S4 A64]) (const_int 16512 [0x4080])) default_format_1.inc:56 64 {*movsi_internal} (nil)) (insn 32 30 33 4 (set (reg:DI 5 di) (reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64} (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame) (const_int -608 [0xfffffffffffffda0])) (nil))) (call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write") [flags 0x41] <function_decl 0x7f301ed12e00 _gfortran_st_write>) [0 _gfortran_st_write S1 A8]) (const_int 0 [0])) default_format_1.inc:56 618 {*call_0} (expr_list:REG_DEAD (reg:DI 5 di) (nil)) (expr_list:REG_DEP_TRUE (use (reg:DI 5 di)) (nil))) For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base) returns false, even though its location is passed as a parameter to _gfortran_st_write. I did test with if (DECL_P (base) && (!may_be_aliased (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base))) which has no regressions. Is that what you suggest?
On Fri, 15 Apr 2011, eraman at google dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #15 from Easwaran Raman <eraman at google dot com> 2011-04-15 22:22:15 UTC --- > (In reply to comment #14) > > On Fri, 15 Apr 2011, eraman at google dot com wrote: > > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > > > Easwaran Raman <eraman at google dot com> changed: > > > > > > What |Removed |Added > > > ---------------------------------------------------------------------------- > > > CC| |eraman at google dot com > > > > > > --- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-04-15 19:18:25 UTC --- > > > Richard, did you mean to write > > > > > > static bool > > > can_escape (tree expr) > > > { > > > tree base; > > > if (!expr) > > > return true; > > > base = get_base_address (expr); > > > if (DECL_P (base) > > > && (!may_be_aliased (base) > > > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > > > return false; > > > return true; > > > } > > > > > > Only case when we know it doesn't escape is if bas is a DECL_P and is not in > > > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it > > > is sufficient to test just > > > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base). > > > > No, because if the escaped solution for example includes ANYTHING then > > the test will return true. That !may-aliased variables are not > > contained in ANYTHING isn't known w/o context. > > > > Richard. > > Correct me if I am wrong. If I understand you right, just using DECL_P (base) > && !pt_solution_includes is conservative since pt_solution_includes may return > true if the escaped solution contains ANYTHING. To make it less conservative, > you're suggesting > > if (DECL_P (base) > && (!may_be_aliased (base) > || !pt_solution_includes (&cfun->gimple_df->escaped, base))) > return false; > > I tried that and most Fortran tests are failing. One of the tests > (default_format_1.f90) has the following RTL sequence: > > > (insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame) > (const_int -608 [0xfffffffffffffda0])) [2 > dt_parm.0.common.flags+0 S4 A64]) > (const_int 16512 [0x4080])) default_format_1.inc:56 64 > {*movsi_internal} > (nil)) > > (insn 32 30 33 4 (set (reg:DI 5 di) > (reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64} > (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame) > (const_int -608 [0xfffffffffffffda0])) > (nil))) > > (call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write") > [flags 0x41] <function_decl 0x7f301ed12e00 _gfortran_st_write>) [0 > _gfortran_st_write S1 A8]) > (const_int 0 [0])) default_format_1.inc:56 618 {*call_0} > (expr_list:REG_DEAD (reg:DI 5 di) > (nil)) > (expr_list:REG_DEP_TRUE (use (reg:DI 5 di)) > (nil))) > > For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base) > returns false, even though its location is passed as a parameter to > _gfortran_st_write. > > I did test with > if (DECL_P (base) > && (!may_be_aliased (base) > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > > which has no regressions. Is that what you suggest? No, the version with || should be ok. The dt_parm argument does not escape at the _gfortran_st_write call site because this intrinsic function has a ".wW" fnspec attribute which specifies the arguments do not escape. What you indeed need to do in addition to the escaped solution query is walk over all function arguments and see if there is one that aliases 'base'. That may not be easily possible on RTL though. On the tree level we have a separate points-to set for such call clobbers/uses but we do not preserve it for RTL.
On Sun, Apr 17, 2011 at 3:45 AM, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org> wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> 2011-04-17 10:44:02 UTC --- > On Fri, 15 Apr 2011, eraman at google dot com wrote: > >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 >> >> --- Comment #15 from Easwaran Raman <eraman at google dot com> 2011-04-15 22:22:15 UTC --- >> (In reply to comment #14) >> > On Fri, 15 Apr 2011, eraman at google dot com wrote: >> > >> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 >> > > >> > > Easwaran Raman <eraman at google dot com> changed: >> > > >> > > What |Removed |Added >> > > ---------------------------------------------------------------------------- >> > > CC| |eraman at google dot com >> > > >> > > --- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-04-15 19:18:25 UTC --- >> > > Richard, did you mean to write >> > > >> > > static bool >> > > can_escape (tree expr) >> > > { >> > > tree base; >> > > if (!expr) >> > > return true; >> > > base = get_base_address (expr); >> > > if (DECL_P (base) >> > > && (!may_be_aliased (base) >> > > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) >> > > return false; >> > > return true; >> > > } >> > > >> > > Only case when we know it doesn't escape is if bas is a DECL_P and is not in >> > > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it >> > > is sufficient to test just >> > > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base). >> > >> > No, because if the escaped solution for example includes ANYTHING then >> > the test will return true. That !may-aliased variables are not >> > contained in ANYTHING isn't known w/o context. >> > >> > Richard. >> >> Correct me if I am wrong. If I understand you right, just using DECL_P (base) >> && !pt_solution_includes is conservative since pt_solution_includes may return >> true if the escaped solution contains ANYTHING. To make it less conservative, >> you're suggesting >> >> if (DECL_P (base) >> && (!may_be_aliased (base) >> || !pt_solution_includes (&cfun->gimple_df->escaped, base))) >> return false; >> >> I tried that and most Fortran tests are failing. One of the tests >> (default_format_1.f90) has the following RTL sequence: >> >> >> (insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame) >> (const_int -608 [0xfffffffffffffda0])) [2 >> dt_parm.0.common.flags+0 S4 A64]) >> (const_int 16512 [0x4080])) default_format_1.inc:56 64 >> {*movsi_internal} >> (nil)) >> >> (insn 32 30 33 4 (set (reg:DI 5 di) >> (reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64} >> (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame) >> (const_int -608 [0xfffffffffffffda0])) >> (nil))) >> >> (call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write") >> [flags 0x41] <function_decl 0x7f301ed12e00 _gfortran_st_write>) [0 >> _gfortran_st_write S1 A8]) >> (const_int 0 [0])) default_format_1.inc:56 618 {*call_0} >> (expr_list:REG_DEAD (reg:DI 5 di) >> (nil)) >> (expr_list:REG_DEP_TRUE (use (reg:DI 5 di)) >> (nil))) >> >> For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base) >> returns false, even though its location is passed as a parameter to >> _gfortran_st_write. >> >> I did test with >> if (DECL_P (base) >> && (!may_be_aliased (base) >> && !pt_solution_includes (&cfun->gimple_df->escaped, base))) >> >> which has no regressions. Is that what you suggest? > > No, the version with || should be ok. The dt_parm argument does > not escape at the _gfortran_st_write call site because this > intrinsic function has a ".wW" fnspec attribute which specifies > the arguments do not escape. What you indeed need to do in > addition to the escaped solution query is walk over all function > arguments and see if there is one that aliases 'base'. That > may not be easily possible on RTL though. On the tree level > we have a separate points-to set for such call clobbers/uses > but we do not preserve it for RTL. Is it ok to make calls whose arg(s) have EAF_NOESCAPE kill all locations off the frame in addition to killing all locations that potentially escape (using the || case you suggested)? Will it be better or worse than just checking !may_be_aliased (base) alone? Thanks, Easwaran > -- > Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are on the CC list for the bug. >
On Thu, 21 Apr 2011, eraman at google dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #17 from Easwaran Raman <eraman at google dot com> 2011-04-21 00:20:51 UTC --- > On Sun, Apr 17, 2011 at 3:45 AM, rguenther at suse dot de > <gcc-bugzilla@gcc.gnu.org> wrote: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > --- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> 2011-04-17 10:44:02 UTC --- > > On Fri, 15 Apr 2011, eraman at google dot com wrote: > > > >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > >> > >> --- Comment #15 from Easwaran Raman <eraman at google dot com> 2011-04-15 22:22:15 UTC --- > >> (In reply to comment #14) > >> > On Fri, 15 Apr 2011, eraman at google dot com wrote: > >> > > >> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > >> > > > >> > > Easwaran Raman <eraman at google dot com> changed: > >> > > > >> > > What |Removed |Added > >> > > ---------------------------------------------------------------------------- > >> > > CC| |eraman at google dot com > >> > > > >> > > --- Comment #13 from Easwaran Raman <eraman at google dot com> 2011-04-15 19:18:25 UTC --- > >> > > Richard, did you mean to write > >> > > > >> > > static bool > >> > > can_escape (tree expr) > >> > > { > >> > > tree base; > >> > > if (!expr) > >> > > return true; > >> > > base = get_base_address (expr); > >> > > if (DECL_P (base) > >> > > && (!may_be_aliased (base) > >> > > && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > >> > > return false; > >> > > return true; > >> > > } > >> > > > >> > > Only case when we know it doesn't escape is if bas is a DECL_P and is not in > >> > > cfun->gimple_df->escaped and not aliased, right? Actually, I'm wondering if it > >> > > is sufficient to test just > >> > > DECL_P (base) && !pt_solution_includes (&cfun->gimple_df->escaped, base). > >> > > >> > No, because if the escaped solution for example includes ANYTHING then > >> > the test will return true. That !may-aliased variables are not > >> > contained in ANYTHING isn't known w/o context. > >> > > >> > Richard. > >> > >> Correct me if I am wrong. If I understand you right, just using DECL_P (base) > >> && !pt_solution_includes is conservative since pt_solution_includes may return > >> true if the escaped solution contains ANYTHING. To make it less conservative, > >> you're suggesting > >> > >> if (DECL_P (base) > >> && (!may_be_aliased (base) > >> || !pt_solution_includes (&cfun->gimple_df->escaped, base))) > >> return false; > >> > >> I tried that and most Fortran tests are failing. One of the tests > >> (default_format_1.f90) has the following RTL sequence: > >> > >> > >> (insn 30 29 32 4 (set (mem/s/c:SI (plus:DI (reg/f:DI 20 frame) > >> (const_int -608 [0xfffffffffffffda0])) [2 > >> dt_parm.0.common.flags+0 S4 A64]) > >> (const_int 16512 [0x4080])) default_format_1.inc:56 64 > >> {*movsi_internal} > >> (nil)) > >> > >> (insn 32 30 33 4 (set (reg:DI 5 di) > >> (reg/f:DI 106)) default_format_1.inc:56 62 {*movdi_internal_rex64} > >> (expr_list:REG_EQUAL (plus:DI (reg/f:DI 20 frame) > >> (const_int -608 [0xfffffffffffffda0])) > >> (nil))) > >> > >> (call_insn 33 32 36 4 (call (mem:QI (symbol_ref:DI ("_gfortran_st_write") > >> [flags 0x41] <function_decl 0x7f301ed12e00 _gfortran_st_write>) [0 > >> _gfortran_st_write S1 A8]) > >> (const_int 0 [0])) default_format_1.inc:56 618 {*call_0} > >> (expr_list:REG_DEAD (reg:DI 5 di) > >> (nil)) > >> (expr_list:REG_DEP_TRUE (use (reg:DI 5 di)) > >> (nil))) > >> > >> For the DECL dt_parm, pt_solution_includes (&cfun->gimple_df->escaped, base) > >> returns false, even though its location is passed as a parameter to > >> _gfortran_st_write. > >> > >> I did test with > >> if (DECL_P (base) > >> && (!may_be_aliased (base) > >> && !pt_solution_includes (&cfun->gimple_df->escaped, base))) > >> > >> which has no regressions. Is that what you suggest? > > > > No, the version with || should be ok. The dt_parm argument does > > not escape at the _gfortran_st_write call site because this > > intrinsic function has a ".wW" fnspec attribute which specifies > > the arguments do not escape. What you indeed need to do in > > addition to the escaped solution query is walk over all function > > arguments and see if there is one that aliases 'base'. That > > may not be easily possible on RTL though. On the tree level > > we have a separate points-to set for such call clobbers/uses > > but we do not preserve it for RTL. > > Is it ok to make calls whose arg(s) have EAF_NOESCAPE kill all > locations off the frame in addition to killing all locations that > potentially escape (using the || case you suggested)? Will it be > better or worse than just checking !may_be_aliased (base) alone? It's probably easiest and safest to just check !may_be_aliased for now. Richard.
Author: eraman Date: Tue Jun 14 22:58:20 2011 New Revision: 175063 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175063 Log: 2011-06-14 Easwaran Raman <eraman@google.com> PR rtl-optimization/44194 * dse.c: Include tree-flow.h (insn_info): Add new field non_frame_wild_read. (group_info): Add new fields escaped_n and escaped_p. (kill_on_calls): New variable. (get_group_info): Initialize gi->escaped_n and gi->escaped_p. (dse_step0): Initialize kill_on_calls. (can_escape): New function. (set_usage_bits): Add additional parameter; record information about escaped locations. (record_store): Pass EXPR corresponding to MEM to set_usage_bits. (dse_step2_nospill): Set kill_on_calls based on group->escaped_n and group->escaped_n. (add_wild_read): Refactor into... (reset_active_stores): ... New method, and (free_read_records): ... New method. (add_non_frame_wild_read): New method. (scan_insn): Call add_non_frame_wild_read on non-const calls. (scan_reads_nospill): Handle instructions with non_frame_wild_read. (dse_step5_nospill): Call scan_reads_nospill for instructions marked as non_frame_wild_read. (dse_step7): Free escaped_n, escaped_p and kill_on_calls bitmaps. testsuite/ChangeLog 2011-06-14 Easwaran Raman <eraman@google.com> PR rtl-optimization/44194 * gcc.dg/pr44194-1.c: New test. * gcc.dg/pr44194-2.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr44194-1.c trunk/gcc/testsuite/gcc.dg/pr44194-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/dse.c trunk/gcc/testsuite/ChangeLog
Fixed on trunk. Thanks for tackling this, Easwaran.
The DSE patch still leaves 2 redundant stores. The following patch will enable DSE to remove those two stores. Does this look ok? Index: gcc/testsuite/gcc.dg/pr44194-1.c =================================================================== --- gcc/testsuite/gcc.dg/pr44194-1.c (revision 175082) +++ gcc/testsuite/gcc.dg/pr44194-1.c (working copy) @@ -13,5 +13,5 @@ void func() { struct ints s = foo(); bar(s.a, s.b); } -/* { dg-final { scan-rtl-dump "global deletions = 2" "dse1" } } */ +/* { dg-final { scan-rtl-dump "global deletions = 4" "dse1" } } */ /* { dg-final { cleanup-rtl-dump "dse1" } } */ Index: gcc/calls.c =================================================================== --- gcc/calls.c (revision 175081) +++ gcc/calls.c (working copy) @@ -3005,8 +3005,9 @@ expand_call (tree exp, rtx target, int ignore) tree nt = build_qualified_type (rettype, (TYPE_QUALS (rettype) | TYPE_QUAL_CONST)); - + tree target_expr = create_tmp_reg (rettype, NULL); target = assign_temp (nt, 0, 1, 1); + set_mem_expr (target, target_expr); } if (! rtx_equal_p (target, valreg))
> The DSE patch still leaves 2 redundant stores. OK, I missed this, reopening... > The following patch will enable DSE to remove those two stores. Does this > look ok? Calling into the gimplifier from the RTL expander doesn't look appropriate. More fundamentally, it's a little unfortunate to spill to memory a value returned in registers. Can we try to use emit_group_move_into_temps here instead (under the appropriate circumstances)?
(In reply to comment #22) > > The DSE patch still leaves 2 redundant stores. > > OK, I missed this, reopening... > > > The following patch will enable DSE to remove those two stores. Does this > > look ok? > > Calling into the gimplifier from the RTL expander doesn't look appropriate. > > More fundamentally, it's a little unfortunate to spill to memory a value > returned in registers. Can we try to use emit_group_move_into_temps here > instead (under the appropriate circumstances)? It would be nice if the expander does not spill the return into memory in the first place if possible. On other hand tagging compiler created memory location with temp decls so that aliaser has the symbolic information seems a useful mechanism. Easwaran, can you post the patch to gcc-patches for more comments? Thanks, David
> It would be nice if the expander does not spill the return into memory in the > first place if possible. On other hand tagging compiler created memory > location with temp decls so that aliaser has the symbolic information seems a > useful mechanism. But a poor kludge if you can avoid spilling to memory in the first place.
On Wed, Jun 15, 2011 at 10:26 PM, ebotcazou at gcc dot gnu.org < gcc-bugzilla@gcc.gnu.org> wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #24 from Eric Botcazou <ebotcazou at gcc dot gnu.org> > 2011-06-16 05:26:56 UTC --- > > It would be nice if the expander does not spill the return into memory in > the > > first place if possible. On other hand tagging compiler created memory > > location with temp decls so that aliaser has the symbolic information > seems a > > useful mechanism. > > But a poor kludge if you can avoid spilling to memory in the first place. > No I am talking about more general situation where temporary memory is created in RTL pass. David > > -- > Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are on the CC list for the bug. >
On Wed, 15 Jun 2011, xinliangli at gmail dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > davidxl <xinliangli at gmail dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |xinliangli at gmail dot com > > --- Comment #23 from davidxl <xinliangli at gmail dot com> 2011-06-15 23:14:50 UTC --- > (In reply to comment #22) > > > The DSE patch still leaves 2 redundant stores. > > > > OK, I missed this, reopening... > > > > > The following patch will enable DSE to remove those two stores. Does this > > > look ok? > > > > Calling into the gimplifier from the RTL expander doesn't look appropriate. It also should use create_tmp_var, not create_tmp_reg. But I wonder why memory allocated via assign_temp isn't marked in a way to let dse do its job (I guess dse thinks that memory escapes?). > > More fundamentally, it's a little unfortunate to spill to memory a value > > returned in registers. Can we try to use emit_group_move_into_temps here > > instead (under the appropriate circumstances)? > > It would be nice if the expander does not spill the return into memory in the > first place if possible. On other hand tagging compiler created memory > location with temp decls so that aliaser has the symbolic information seems a > useful mechanism. Sure - but I wonder why assign_temp doesn't do something equivalent that doesn't require a automatic VAR_DECL to be created. Where does the aliaser catch things with the VAR_DECL around that it doesn't without it? Richard.
(In reply to comment #26) > On Wed, 15 Jun 2011, xinliangli at gmail dot com wrote: > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > davidxl <xinliangli at gmail dot com> changed: > > > > What |Removed |Added > > ---------------------------------------------------------------------------- > > CC| |xinliangli at gmail dot com > > > > --- Comment #23 from davidxl <xinliangli at gmail dot com> 2011-06-15 23:14:50 UTC --- > > (In reply to comment #22) > > > > The DSE patch still leaves 2 redundant stores. > > > > > > OK, I missed this, reopening... > > > > > > > The following patch will enable DSE to remove those two stores. Does this > > > > look ok? > > > > > > Calling into the gimplifier from the RTL expander doesn't look appropriate. > > It also should use create_tmp_var, not create_tmp_reg. But I wonder why > memory allocated via assign_temp isn't marked in a way to let dse > do its job (I guess dse thinks that memory escapes?). If the mem rtx doesn't have a tree_expression associated with it, DSE assumes the memory escapes. > > > > More fundamentally, it's a little unfortunate to spill to memory a value > > > returned in registers. Can we try to use emit_group_move_into_temps here > > > instead (under the appropriate circumstances)? > > > > It would be nice if the expander does not spill the return into memory in the > > first place if possible. On other hand tagging compiler created memory > > location with temp decls so that aliaser has the symbolic information seems a > > useful mechanism. > > Sure - but I wonder why assign_temp doesn't do something equivalent > that doesn't require a automatic VAR_DECL to be created. > > Where does the aliaser catch things with the VAR_DECL around that > it doesn't without it? Is it just that when I create a VAR_DECL, TREE_ADDRESSABLE is false and may_be_aliased returns true? > Richard.
On Thu, 16 Jun 2011, eraman at google dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #27 from Easwaran Raman <eraman at google dot com> 2011-06-16 17:14:38 UTC --- > (In reply to comment #26) > > On Wed, 15 Jun 2011, xinliangli at gmail dot com wrote: > > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > > > davidxl <xinliangli at gmail dot com> changed: > > > > > > What |Removed |Added > > > ---------------------------------------------------------------------------- > > > CC| |xinliangli at gmail dot com > > > > > > --- Comment #23 from davidxl <xinliangli at gmail dot com> 2011-06-15 23:14:50 UTC --- > > > (In reply to comment #22) > > > > > The DSE patch still leaves 2 redundant stores. > > > > > > > > OK, I missed this, reopening... > > > > > > > > > The following patch will enable DSE to remove those two stores. Does this > > > > > look ok? > > > > > > > > Calling into the gimplifier from the RTL expander doesn't look appropriate. > > > > It also should use create_tmp_var, not create_tmp_reg. But I wonder why > > memory allocated via assign_temp isn't marked in a way to let dse > > do its job (I guess dse thinks that memory escapes?). > If the mem rtx doesn't have a tree_expression associated with it, DSE assumes > the memory escapes. Hmm, ok. I guess it can't really do better. > > > > More fundamentally, it's a little unfortunate to spill to memory a value > > > > returned in registers. Can we try to use emit_group_move_into_temps here > > > > instead (under the appropriate circumstances)? > > > > > > It would be nice if the expander does not spill the return into memory in the > > > first place if possible. On other hand tagging compiler created memory > > > location with temp decls so that aliaser has the symbolic information seems a > > > useful mechanism. > > > > Sure - but I wonder why assign_temp doesn't do something equivalent > > that doesn't require a automatic VAR_DECL to be created. > > > > Where does the aliaser catch things with the VAR_DECL around that > > it doesn't without it? > > Is it just that when I create a VAR_DECL, TREE_ADDRESSABLE is false and > may_be_aliased returns true? false, yes. Richard.
Is Easwaran's patch a reasonable way to go? David (In reply to comment #28) > On Thu, 16 Jun 2011, eraman at google dot com wrote: > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > --- Comment #27 from Easwaran Raman <eraman at google dot com> 2011-06-16 17:14:38 UTC --- > > (In reply to comment #26) > > > On Wed, 15 Jun 2011, xinliangli at gmail dot com wrote: > > > > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > > > > > > > davidxl <xinliangli at gmail dot com> changed: > > > > > > > > What |Removed |Added > > > > ---------------------------------------------------------------------------- > > > > CC| |xinliangli at gmail dot com > > > > > > > > --- Comment #23 from davidxl <xinliangli at gmail dot com> 2011-06-15 23:14:50 UTC --- > > > > (In reply to comment #22) > > > > > > The DSE patch still leaves 2 redundant stores. > > > > > > > > > > OK, I missed this, reopening... > > > > > > > > > > > The following patch will enable DSE to remove those two stores. Does this > > > > > > look ok? > > > > > > > > > > Calling into the gimplifier from the RTL expander doesn't look appropriate. > > > > > > It also should use create_tmp_var, not create_tmp_reg. But I wonder why > > > memory allocated via assign_temp isn't marked in a way to let dse > > > do its job (I guess dse thinks that memory escapes?). > > If the mem rtx doesn't have a tree_expression associated with it, DSE assumes > > the memory escapes. > > Hmm, ok. I guess it can't really do better. > > > > > > More fundamentally, it's a little unfortunate to spill to memory a value > > > > > returned in registers. Can we try to use emit_group_move_into_temps here > > > > > instead (under the appropriate circumstances)? > > > > > > > > It would be nice if the expander does not spill the return into memory in the > > > > first place if possible. On other hand tagging compiler created memory > > > > location with temp decls so that aliaser has the symbolic information seems a > > > > useful mechanism. > > > > > > Sure - but I wonder why assign_temp doesn't do something equivalent > > > that doesn't require a automatic VAR_DECL to be created. > > > > > > Where does the aliaser catch things with the VAR_DECL around that > > > it doesn't without it? > > > > Is it just that when I create a VAR_DECL, TREE_ADDRESSABLE is false and > > may_be_aliased returns true? > > false, yes. > > Richard.
> Is Easwaran's patch a reasonable way to go? So, in the end, you weren't talking about the general situation, were you? Let's try first to avoid spilling to memory, if possible.
(In reply to comment #30) > > Is Easwaran's patch a reasonable way to go? > > So, in the end, you weren't talking about the general situation, were you? > Let's try first to avoid spilling to memory, if possible. I think these two are totally independent of each other -- one should not be gated against each other. If Eawaran's approach is completely flawed, that is different story. With this change, we at least make incremental improvement. Not familiar with the rtl expander, but I guess the spilling was there probably for a deeper reason. If you have an insight, you can of course point it out. Again, I think these two things are not in conflict. David
On Sat, 18 Jun 2011, xinliangli at gmail dot com wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194 > > --- Comment #29 from davidxl <xinliangli at gmail dot com> 2011-06-18 09:05:10 UTC --- > Is Easwaran's patch a reasonable way to go? The approach looks reasonable (with create_tmp_var instead of _reg), whether it is ok in the calls.c context I didn't look at. Richard.
> I think these two are totally independent of each other -- one should not be > gated against each other. If Eawaran's approach is completely flawed, that is > different story. With this change, we at least make incremental improvement. > Not familiar with the rtl expander, but I guess the spilling was there probably > for a deeper reason. If you have an insight, you can of course point it out. See comment #22. It's an incremental improvement, but maybe we can avoid wasting time and memory by creating RTXes and Trees that will be thrown away immediately after. I don't really see what we risk by trying.
(In reply to comment #33) > > I think these two are totally independent of each other -- one should not be > > gated against each other. If Eawaran's approach is completely flawed, that is > > different story. With this change, we at least make incremental improvement. > > Not familiar with the rtl expander, but I guess the spilling was there probably > > for a deeper reason. If you have an insight, you can of course point it out. > > See comment #22. It's an incremental improvement, but maybe we can avoid > wasting time and memory by creating RTXes and Trees that will be thrown away > immediately after. I don't really see what we risk by trying. yes, of course -- since you have explicit suggestion, it can be tried. Easwaran might have looked into this more .. David
(In reply to comment #33) > > I think these two are totally independent of each other -- one should not be > > gated against each other. If Eawaran's approach is completely flawed, that is > > different story. With this change, we at least make incremental improvement. > > Not familiar with the rtl expander, but I guess the spilling was there probably > > for a deeper reason. If you have an insight, you can of course point it out. > > See comment #22. It's an incremental improvement, but maybe we can avoid > wasting time and memory by creating RTXes and Trees that will be thrown away > immediately after. I don't really see what we risk by trying. There is a comment in calls.c that says /* Handle calls that return values in multiple non-contiguous locations. The Irix 6 ABI has examples of this. */ I don't know if avoiding the copy breaks that ABI in any way so I didn't try that approach. In general, if the TARGET is non-NULL, I don't see how the copy can be avoided (but then, the tree EXPR corresponding to the target hopefully has the addressable flag set). In this particular case though TARGET is NULL. Is it just a matter of setting VALREG and let expand_assignment deal with it? Irrespective of how this case is handled, I think there may be other instances where a store generated during expansion may be redundant, but we don't know it at the point of generation. In such cases, is this approach of associating a tree expr with the temp rtx generated by the expanded reasonable?
> There is a comment in calls.c that says > /* Handle calls that return values in multiple non-contiguous locations. > The Irix 6 ABI has examples of this. */ > > I don't know if avoiding the copy breaks that ABI in any way so I didn't try > that approach. In general, if the TARGET is non-NULL, I don't see how the copy > can be avoided (but then, the tree EXPR corresponding to the target hopefully > has the addressable flag set). In this particular case though TARGET is NULL. > Is it just a matter of setting VALREG and let expand_assignment deal with it? It isn't a matter of avoiding the copy, it is matter of avoiding to spill the copy to memory if possible, i.e. to copy to pseudo registers instead. There may be prerequisites. See comment #22 for a possible approach. > Irrespective of how this case is handled, I think there may be other instances > where a store generated during expansion may be redundant, but we don't know it > at the point of generation. In such cases, is this approach of associating a > tree expr with the temp rtx generated by the expanded reasonable? Probably, yes, albeit as a last resort solution.
*** Bug 49646 has been marked as a duplicate of this bug. ***
Taking it over.
GCC 4.7.0 is being released, adjusting target milestone.
(In reply to comment #38) > Taking it over. Any plans w.r.t the fix to this problem? This PR still fails on alphaec68-unknown-linux-gnu [1], I will be glad to test any proposed change(s). [1] http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02428.html
> Any plans w.r.t the fix to this problem? Yes, I have plans, but less time unfortunately. But I'll definitely tackle it during this stage #1.
Is bug #28831 a dup of this?
> Is bug #28831 a dup of this? Not exactly, PR middle-end/28831 is a generic problem while this one is specific to architectures that can return structures in registers.
Note that the x86 target has been changed in svn to use TImode for 128-bit structures, and structures bigger than 128 bits may not be passed in registers, so triggering this bug may be quite different now.
> Note that the x86 target has been changed in svn to use TImode for 128-bit > structures, and structures bigger than 128 bits may not be passed in registers, > so triggering this bug may be quite different now. That's largely orthogonal, not all structures can have an integer mode.
Author: ebotcazou Date: Fri Sep 14 13:28:44 2012 New Revision: 191302 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191302 Log: PR rtl-optimization/44194 * calls.c (expand_call): In the PARALLEL case, copy the return value into pseudos instead of spilling it onto the stack. * emit-rtl.c (adjust_address_1): Rename ADJUST into ADJUST_ADDRESS and add new ADJUST_OBJECT parameter. If ADJUST_OBJECT is set, drop the underlying object if it cannot be proved that the adjusted memory access is still within its bounds. (adjust_automodify_address_1): Adjust call to adjust_address_1. (widen_memory_access): Likewise. * expmed.c (store_bit_field_1): Call adjust_bitfield_address instead of adjust_address. Do not drop the underlying object of a MEM. (store_fixed_bit_field): Likewise. (extract_bit_field_1): Likewise. Fix oversight in recursion. (extract_fixed_bit_field): Likewise. * expr.h (adjust_address_1): Adjust prototype. (adjust_address): Adjust call to adjust_address_1. (adjust_address_nv): Likewise. (adjust_bitfield_address): New macro. (adjust_bitfield_address_nv): Likewise. * expr.c (expand_assignment): Handle a PARALLEL in more cases. (store_expr): Likewise. (store_field): Likewise. * dse.c: Fix typos in the head comment. Modified: trunk/gcc/ChangeLog trunk/gcc/calls.c trunk/gcc/dse.c trunk/gcc/emit-rtl.c trunk/gcc/expmed.c trunk/gcc/expr.c trunk/gcc/expr.h trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/pr44194-1.c
At long last.
May Shub-Internet not see you as you pass.
Author: ebotcazou Date: Sun Oct 21 12:36:16 2012 New Revision: 192651 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192651 Log: PR rtl-optimization/44194 * calls.c (expand_call): Allow sibling calls in the PARALLEL case. Modified: trunk/gcc/ChangeLog trunk/gcc/calls.c
Author: eraman Date: Fri Nov 2 00:28:40 2012 New Revision: 193085 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193085 Log: Backport 191302 and 192651 from trunk: 2012-09-14 Eric Botcazou <ebotcazou@adacore.com> PR rtl-optimization/44194 * calls.c (expand_call): In the PARALLEL case, copy the return value into pseudos instead of spilling it onto the stack. * emit-rtl.c (adjust_address_1): Rename ADJUST into ADJUST_ADDRESS and add new ADJUST_OBJECT parameter. If ADJUST_OBJECT is set, drop the underlying object if it cannot be proved that the adjusted memory access is still within its bounds. (adjust_automodify_address_1): Adjust call to adjust_address_1. (widen_memory_access): Likewise. * expmed.c (store_bit_field_1): Call adjust_bitfield_address instead of adjust_address. Do not drop the underlying object of a MEM. (store_fixed_bit_field): Likewise. (extract_bit_field_1): Likewise. Fix oversight in recursion. (extract_fixed_bit_field): Likewise. * expr.h (adjust_address_1): Adjust prototype. (adjust_address): Adjust call to adjust_address_1. (adjust_address_nv): Likewise. (adjust_bitfield_address): New macro. (adjust_bitfield_address_nv): Likewise. * expr.c (expand_assignment): Handle a PARALLEL in more cases. (store_expr): Likewise. (store_field): Likewise. * dse.c: Fix typos in the head comment. 2012-10-21 Eric Botcazou <ebotcazou@adacore.com> PR rtl-optimization/44194 * calls.c (expand_call): Allow sibling calls in the PARALLEL case. gcc/testsuite/ChangeLog: 2012-09-14 Eric Botcazou <ebotcazou@adacore.com> * gcc.dg/pr44194-1.c: Check that there are no memory accesses left. Modified: branches/google/gcc-4_7/gcc/ChangeLog.google-4_7 branches/google/gcc-4_7/gcc/calls.c branches/google/gcc-4_7/gcc/dse.c branches/google/gcc-4_7/gcc/emit-rtl.c branches/google/gcc-4_7/gcc/expmed.c branches/google/gcc-4_7/gcc/expr.c branches/google/gcc-4_7/gcc/expr.h branches/google/gcc-4_7/gcc/testsuite/ChangeLog.google-4_7 branches/google/gcc-4_7/gcc/testsuite/gcc.dg/pr44194-1.c