Given the following code: ------------------------------------------------------------------------------ typedef unsigned int size_t; typedef unsigned long long uint64; extern "C" { extern void *memcpy (void *__restrict __dest, __const void *__restrict __src, size_t __n) /*throw ()*/; } extern void foo (void* p); inline uint64 ghtonll(uint64 x) { // __r is allocated the same stack slot as dest below union { unsigned long long int __ll; unsigned long int __l[2]; } __w, __r; __w.__ll = x; __r.__l[0] = ( { register unsigned int __v; __asm__ __volatile__ ("bswap %0" : "=r" (__v) : "0" ((unsigned int) (__w.__l[1]))); __v; }); __r.__l[1] = ( { register unsigned int __v; __asm__ __volatile__ ("bswap %0" : "=r" (__v) : "0" ((unsigned int) (__w.__l[0]))); __v; }); return __r.__ll; } inline uint64 double_2_uint64 (const double *source) { uint64 dest; // allocated the same stack slot as __r above memcpy(&dest, source, sizeof(dest)); return dest; } inline void KeyFromUint64(uint64 fp) { uint64 norder; norder = ghtonll (fp); foo((char*)(&norder)); } void KeyFromDouble(double x) { uint64 n = double_2_uint64 (&x); if (n >= 42) { n += 1; } KeyFromUint64(n); } --- Here is what gcc -O2 -fdump-tree-all (version 4.2) in at the end of the tree passes. Please take note of the assignment to dest after that of norder. ;; Function void KeyFromDouble(double) (_Z13KeyFromDoubled) void KeyFromDouble(double) (x) { uint64 n.50; char * norder.2; uint64 norder; register unsigned int __v; register unsigned int __v; union ._0 __r; union ._0 __w; uint64 dest; uint64 n; <bb 2>: n.50 = VIEW_CONVERT_EXPR<uint64>(x); if (n.50 > 41) goto <L0>; else goto <L5>; <L5>:; n = n.50; goto <bb 5> (<L1>); <L0>:; n = n.50 + 1; <L1>:; __w.__ll = n; __asm__ __volatile__("bswap %0":"=r" __v:"0" __w.__l[1]); __r.__l[0] = __v; __asm__ __volatile__("bswap %0":"=r" __v:"0" __w.__l[0]); __r.__l[1] = __v; norder = __r.__ll; norder.2 = (char *) &norder; dest = n.50; foo (norder.2); return; } ---- Here is part of the RTL expansion: (insn 45 43 47 9 (set (mem/c/i:DI (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -8 [0xfffffff8])) [3 norder+0 S8 A32]) (reg/v:DI 64 [ __r ])) -1 (nil) (nil)) (insn 47 45 49 9 (parallel [ (set (reg:SI 59 [ norder.2 ]) (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -8 [0xfffffff8]))) (clobber (reg:CC 17 flags)) ]) -1 (nil) (nil)) (insn 49 47 51 9 (set (mem/c/i:DI (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -8 [0xfffffff8])) [3 dest+0 S8 A32]) (reg/v:DI 58 [ n.50 ])) -1 (nil) (nil)) Both dest & norder are assigned the same memory location (virtual-stack-vars - 8). So later lifeness analysis thinks that the first assignment is dead and it removes it. norder contains results of bswap but gcc cannot remove the asm statement. However, since the output of the asms are dead so they both got eax as the output register.
Why is the store to dest not being removed?
The address of dest has been passed to memcpy() and the alias analysis considers the varaible to escape. So potentially foo() can see the value of dest if memcpy() stash the pointer somewhere. This is impossible but the compiler cannot prove this so it has to be conservative and treat foo() as a potential reader of dest.
No, it should be marked as non escaping later on.
On the trunk, dest is changed to be non addressable in alias5, why is it not doing it in 4.2?
(In reply to comment #2) > The address of dest has been passed to memcpy() and the alias analysis > considers the varaible to escape. So potentially foo() can see the value of > dest if memcpy() stash the pointer somewhere. This is impossible but the > compiler cannot prove this so it has to be conservative and treat foo() as a > potential reader of dest. We remove the call to memcpy (in 4.2) in .064t.fab and then alias5 should have changed it (dest) to be non ADDRESSABLE but it is not for some reason. I think may_alias is messed up in 4.2.0. Also sink maybe should not be sinking stores.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code Fixing alias analysis in 4.2.0 will make this problem go away but it does not change the underlying issue in the stack local sharing code. 13 Jun 2007 21:42:01 -0000, pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #5 from pinskia at gcc dot gnu dot org 2007-06-13 21:42 ------- > (In reply to comment #2) > > The address of dest has been passed to memcpy() and the alias analysis > > considers the varaible to escape. So potentially foo() can see the value of > > dest if memcpy() stash the pointer somewhere. This is impossible but the > > compiler cannot prove this so it has to be conservative and treat foo() as a > > potential reader of dest. > We remove the call to memcpy (in 4.2) in .064t.fab and then alias5 should have > changed it (dest) to be non ADDRESSABLE but it is not for some reason. > > I think may_alias is messed up in 4.2.0. Also sink maybe should not be sinking > stores. > > > -- > > pinskia at gcc dot gnu dot org changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Known to work|4.3.0 |4.3.0 4.1.1 > Summary|Incorrect stack sharing |[4.2 Regression] Incorrect > |causing removal of live code|stack sharing causing > | |removal of live code > Target Milestone|--- |4.2.1 > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
(In reply to comment #6) > Fixing alias analysis in 4.2.0 will make this problem go away but it > does not change the underlying issue in the stack local sharing code. Is that really true? The underlying issue is sinking the store is causing the variables to in the same "scope". I think that is the real underlying issue and nothing to do with local stack sharing.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code I thought like you do initially. But then I was told that in gimple everything is promoted to function scope. So dest is not out of scope. -Doug 13 Jun 2007 21:53:15 -0000, pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #7 from pinskia at gcc dot gnu dot org 2007-06-13 21:53 ------- > Is that really true? The underlying issue is sinking the store is causing the > variables to in the same "scope". I think that is the real underlying issue > and nothing to do with local stack sharing. > > > -- > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
(In reply to comment #8) > Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live > code > > I thought like you do initially. But then I was told that in gimple > everything is promoted to function scope. So dest is not out of scope. Who said that? We keep around the scopes, how else do we get stack sharing in the first place? Also you cannot turn off stack sharing because that would cause some programs no longer to be useful (LLVM is one, the stack would just blow up).
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code Talked to Dan Berlin and Diego Novillo here at Google. They told me that all locals are promoted to function scope. In generally, the tree passes can do abitrary transformation like hoisting or loop invariant motion that can move a variable out of its original block scope. If we keep the scopes around. There must be code to fix up all block scoping information before the tree passes end. It may be just a matter to find why the scope of dest is not fixed up properly. 14 Jun 2007 00:28:27 -0000, pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > Who said that? We keep around the scopes, how else do we get stack sharing in > the first place? Also you cannot turn off stack sharing because that would > cause some programs no longer to be useful (LLVM is one, the stack would just > blow up). >
Look at the code in cfgexpand.c, expand_used_vars, this looks at the source blocks.
This code has been there since 4.0.0 so this has been true since then.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code Yes, I saw the code there yesterday and I was wondering if the block scope got fixed up somehow. 14 Jun 2007 00:42:23 -0000, pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #11 from pinskia at gcc dot gnu dot org 2007-06-14 00:42 ------- > Look at the code in cfgexpand.c, expand_used_vars, this looks at the source > blocks.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code On 14 Jun 2007 00:46:28 -0000, dougkwan at google dot com <gcc-bugzilla@gcc.gnu.org> wrote: > Yes, I saw the code there yesterday and I was wondering if the block > scope got fixed up somehow. It can't because the debugging information will be messed up. -- Pinski
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code Then the stack local sharing code is very broken. It should do a real live-range analysis instead of using block scoping information form the source tree. -Doug 14 Jun 2007 00:52:45 -0000, pinskia at gmail dot com <gcc-bugzilla@gcc.gnu.org>: > It can't because the debugging information will be messed up. >
The problem is that it needs also source style scoping also: take: int f(int *a); int g(int b) { { int c; f(&c); } { int c1; f(&c1); } } Without source based ones, we don't know if c/c1 can ever be shared. I have code here at Sony where we actually depend on this behavior with large structs (and arrays).
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code Unless the compiler can prove that f() does not save the pointer to c and use it later, it cannot share a stack slot for c & c1. This is true regardless of any block scoping in the source. Yeah, it looks like accessing c outside of the first block was undefined but I was told me that GIMPLE promote c & c1 all the function scope. -Doug 14 Jun 2007 01:02:48 -0000, pinskia at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #16 from pinskia at gcc dot gnu dot org 2007-06-14 01:02 ------- > The problem is that it needs also source style scoping also: > take: > int f(int *a); > int g(int b) > { > { > int c; > f(&c); > } > { > int c1; > f(&c1); > } > } > > Without source based ones, we don't know if c/c1 can ever be shared. > I have code here at Sony where we actually depend on this behavior with large > structs (and arrays). > > > -- > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code On 14 Jun 2007 01:09:16 -0000, dougkwan at google dot com <gcc-bugzilla@gcc.gnu.org> wrote: > Unless the compiler can prove that f() does not save the pointer to c > and use it later, it cannot share a stack slot for c & c1. This is > true regardless of any block scoping in the source. Yeah, it looks > like accessing c outside of the first block was undefined but I was > told me that GIMPLE promote c & c1 all the function scope. Yes but I don't think GIMPLE does that, it might promote register based ones but not addressable ones (as mentioned before). Really if we change the stack sharing, we are going to cause a large number of regressions, I already have some regressions between 4.0 and 4.1 due to stack sharing changes with two large structs not being in the same slot even though we can put them there with one having an offset. And yes I still need to file that bug. -- Pinski
Adding Richard in case he has any comment. I don't agree with comment #17 from Doug. In the code in comment #16, if f saves &c, there is no way that it could validly use it after the block scope exits. In general if we expand the scope of a variable outside of its block scope, then the stack sharing code in cfgexpand.c will do the wrong thing. There is normally no reason for the tree code to expand the scope of a variable. But there is also nothing specifically to stop it from doing so. In this particular case store sinking is extending the life of a variable. This is a meaningless move, as there is no valid way for the stored value to be accessed. It is happening because alias analysis thinks that the function call can refer to the stored variable, although this actually can not happen. Part of the reason alias analysis is wrong is that we retain the list of addressable variables in a statement when we shouldn't. Interestingly, if I apply the tree-ssa-operands.c patch from http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01807.html then the problem goes away. But there are a few complex issues here: 1) Alias analysis thinks that a function can refer to a variable in a different block scope. 2) Nothing systematically prevents the tree code from referencing or setting variables outside of their block scope. 3) The stack sharing code in cfgexpand assumes that variables are only referenced or set within their block scope. This is going to require more investigation. I think we are mainly lucky because the tree optimizers tend to not move variables outside of their block scope. This test case just happens to show a way in which that can happen.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code That was my initial opinion too but Diego and Danny told me there is only one scope in the tree SSA form. So it is okay. About your comment of us being lucky, I believe our luck is running out. As we do more and more things in the tree passes, we will be likely to see this problem more often. 14 Jun 2007 17:58:01 -0000, ian at airs dot com <gcc-bugzilla@gcc.gnu.org>: > > > ------- Comment #19 from ian at airs dot com 2007-06-14 17:57 ------- > I don't agree with comment #17 from Doug. In the code in comment #16, if f > saves &c, there is no way that it could validly use it after the block scope > exits.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code On Thu, 14 Jun 2007, ian at airs dot com wrote: > ------- Comment #19 from ian at airs dot com 2007-06-14 17:57 ------- > This is going to require more investigation. I think we are mainly lucky > because the tree optimizers tend to not move variables outside of their block > scope. This test case just happens to show a way in which that can happen. We may finally need to compute life for on-stack variables instead of relying on something not happening that is not explicitly forbidden in our IL (placement new anyone? ;)) (the other) Richard.
With the test program, gcc 4.1.2 generates correct code and gcc 4.2.0 generates wrong code. Can someone increase the priority and severity of this PR?
Working on this today.
Subject: Bug 32327 Author: dnovillo Date: Fri Jun 15 22:10:09 2007 New Revision: 125748 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125748 Log: PR 32327 * tree-ssa-operands.c (build_ssa_operands): Initially assume that the statement does not take any addresses. testsuite/ChangeLog PR 32327 * g++.dg/tree-ssa/pr32327-1.C: New test. * g++.dg/tree-ssa/pr32327.C: New test. Added: branches/gcc-4_2-branch/gcc/testsuite/g++.dg/tree-ssa/pr32327-1.C branches/gcc-4_2-branch/gcc/testsuite/g++.dg/tree-ssa/pr32327.C Modified: branches/gcc-4_2-branch/gcc/ChangeLog branches/gcc-4_2-branch/gcc/testsuite/ChangeLog branches/gcc-4_2-branch/gcc/tree-ssa-operands.c
Fixed symptoms with http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01081.html. Real fix still being discussed.
(In reply to comment #10) > Talked to Dan Berlin and Diego Novillo here at Google. They told me > that all locals are promoted to function scope. That *only* applies to register variables, not stack variables. We very very much want to preserve scope of stack variables, because we very very much want to share stack space between stack variables of different scopes. Failure to do so causes bad interactions with inlining, and causes stack space consumtion to grow to unacceptable levels. I.e. you can't build kernels anymore.
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code On 6/19/07 1:26 PM, rth at gcc dot gnu dot org wrote: > ------- Comment #26 from rth at gcc dot gnu dot org 2007-06-19 17:26 ------- > (In reply to comment #10) >> Talked to Dan Berlin and Diego Novillo here at Google. They told me >> that all locals are promoted to function scope. > > That *only* applies to register variables, not stack variables. Yes, but our GIMPLE optimizers don't know that and we do not have a way of enforcing that in GIMPLE. There just are no scope boundaries in the SSA web. So, the end result is that in SSA we only have function scope. > We very very much want to preserve scope of stack variables, because > we very very much want to share stack space between stack variables > of different scopes. Failure to do so causes bad interactions with > inlining, and causes stack space consumtion to grow to unacceptable > levels. I.e. you can't build kernels anymore. Agreed. We have to find a way of either tying the hands of code motion transformations by making them use the SSA web *and* the TREE_BLOCKs, or make stack slot sharing aware of live ranges. Right now we have the unfortunate situation that an optimizer is making a valid transformation which breaks the scope assumption that stack sharing uses. I am not sure if it would be practical to force code motion optimizations to use anything other than the SSA web to do their analysis. Perhaps we could force scopes by building barriers on the SSA web itself (say, by putting PHI nodes at scope boundaries, or something). Alternatively, we could incorporate live-range analysis to stack slot sharing. That way, if code motion creates undue stack slot pressure, we would merely emit suboptimal code, instead of wrong code.
Won't solve the problem currently, but I think the long term solution is to do stack analysis when out-of-ssa and expand have been merged into a single entity. The live range info out-of-ssa calculates can be used to do a decent job of stack sharing regardless of scoping info. Short term, we might be able to check to see if a stack variable is used outside its scope when going out of ssa, and flag it for no sharing, or promote it to file level, or the next appropriate scope level. That would work as long as we don't end up promoting too many things in the wrong routines :-). I'm not fond of trying to make the optimization passes aware of scopes on top of everything else they are aware of, Id rather see it taken care of in one place, and not dependent on scoping info.
Diego, does this problem still exist on the 4.2 branch? I see that you checked in a patch, which you suggest may not be a complete fix, but does this test case still fail?
Subject: Re: [4.2 Regression] Incorrect stack sharing causing removal of live code On 7/3/07 11:28 PM, mmitchel at gcc dot gnu dot org wrote: > ------- Comment #29 from mmitchel at gcc dot gnu dot org 2007-07-04 03:28 ------- > Diego, does this problem still exist on the 4.2 branch? I see that you checked > in a patch, which you suggest may not be a complete fix, but does this test > case still fail? Yes, the problem still exists on 4.2 and mainline. The patch is valid in that it fixes a codegen deficiency, but it only works around this bug. The test case does not fail anymore, and getting another one to fail may be tricky (it's a fairly rare bug, unfortunately). A real fix is in the works, but it's not clear when it might be ready.
Change target milestone to 4.2.3, as 4.2.2 has been released.
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.
4.2.4 is being released, changing milestones to 4.2.5.
Closing 4.2 branch, fixed in 4.3. (There's a lot of discussion here of a possible underlying problem that might or might not be fixed but isn't 4.2-only; if still present, please open a separate bug report for it.)