Bug 32327 - [4.2 Regression] Incorrect stack sharing causing removal of live code
Summary: [4.2 Regression] Incorrect stack sharing causing removal of live code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.2.0
: P1 normal
Target Milestone: 4.3.0
Assignee: Diego Novillo
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2007-06-13 19:57 UTC by Diego Novillo
Modified: 2009-03-30 22:06 UTC (History)
8 users (show)

See Also:
Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu
Build: x86_64-unknown-linux-gnu
Known to work: 4.3.0 4.1.1
Known to fail: 4.2.0 4.2.5
Last reconfirmed: 2007-06-15 13:27:21


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Diego Novillo 2007-06-13 19:57:21 UTC
Given the following code:

------------------------------------------------------------------------------
typedef unsigned int size_t;
typedef unsigned long long uint64;

extern "C" {
extern void *memcpy (void *__restrict __dest,
      __const void *__restrict __src, size_t __n) /*throw ()*/;
}

extern void foo (void* p);

inline uint64
ghtonll(uint64 x)
{
 // __r is allocated the same stack slot as dest below
 union { unsigned long long int __ll;
         unsigned long int __l[2]; } __w, __r;
 __w.__ll = x;
 __r.__l[0] = (
   {
     register unsigned int __v;
     __asm__ __volatile__ ("bswap %0" : "=r" (__v) :
                           "0" ((unsigned int) (__w.__l[1])));
     __v; });

 __r.__l[1] = (
   {
     register unsigned int __v;
     __asm__ __volatile__ ("bswap %0" : "=r" (__v) :
                           "0" ((unsigned int) (__w.__l[0])));
     __v; });

 return __r.__ll;
}

inline uint64
double_2_uint64 (const double *source)
{
 uint64 dest;  // allocated the same stack slot as __r above
 memcpy(&dest, source, sizeof(dest));
 return dest;
}

inline void
KeyFromUint64(uint64 fp) {
 uint64 norder;
 norder = ghtonll (fp);
 foo((char*)(&norder));
}

void
KeyFromDouble(double x) {
 uint64 n = double_2_uint64 (&x);
 if (n >= 42) {
   n += 1;
 }

 KeyFromUint64(n);
}
---

Here is what gcc -O2 -fdump-tree-all (version 4.2) in at the end of the tree passes.
Please take note of the assignment to dest after that of norder.

;; Function void KeyFromDouble(double) (_Z13KeyFromDoubled)

void KeyFromDouble(double) (x)
{
 uint64 n.50;
 char * norder.2;
 uint64 norder;
 register unsigned int __v;
 register unsigned int __v;
 union ._0 __r;
 union ._0 __w;
 uint64 dest;
 uint64 n;

<bb 2>:
 n.50 = VIEW_CONVERT_EXPR<uint64>(x);
 if (n.50 > 41) goto <L0>; else goto <L5>;

<L5>:;
 n = n.50;
 goto <bb 5> (<L1>);

<L0>:;
 n = n.50 + 1;

<L1>:;
 __w.__ll = n;
 __asm__ __volatile__("bswap %0":"=r" __v:"0" __w.__l[1]);
 __r.__l[0] = __v;
 __asm__ __volatile__("bswap %0":"=r" __v:"0" __w.__l[0]);
 __r.__l[1] = __v;
 norder = __r.__ll;
 norder.2 = (char *) &norder;
 dest = n.50;
 foo (norder.2);
 return;

}
----
Here is part of the RTL expansion:

(insn 45 43 47 9 (set (mem/c/i:DI (plus:SI (reg/f:SI 54 virtual-stack-vars)
               (const_int -8 [0xfffffff8])) [3 norder+0 S8 A32])
       (reg/v:DI 64 [ __r ])) -1 (nil)
   (nil))

(insn 47 45 49 9 (parallel [
           (set (reg:SI 59 [ norder.2 ])
               (plus:SI (reg/f:SI 54 virtual-stack-vars)
                   (const_int -8 [0xfffffff8])))
           (clobber (reg:CC 17 flags))
       ]) -1 (nil)
   (nil))

(insn 49 47 51 9 (set (mem/c/i:DI (plus:SI (reg/f:SI 54 virtual-stack-vars)
               (const_int -8 [0xfffffff8])) [3 dest+0 S8 A32])
       (reg/v:DI 58 [ n.50 ])) -1 (nil)
   (nil))

Both dest & norder are assigned the same memory location
(virtual-stack-vars - 8). So later lifeness analysis thinks that the
first assignment is dead and it removes it. norder contains results of
bswap but gcc cannot remove the asm statement. However, since the
output of the asms are dead so they both got eax as the output
register.
Comment 1 Andrew Pinski 2007-06-13 21:20:40 UTC
Why is the store to dest not being removed?
Comment 2 Doug Kwan 2007-06-13 21:25:40 UTC
The address of dest has been passed to memcpy() and the alias analysis considers the varaible to escape. So potentially foo() can see the value of dest if memcpy() stash the pointer somewhere. This is impossible but the compiler cannot prove this so it has to be conservative and treat foo() as a potential reader of dest.
Comment 3 Andrew Pinski 2007-06-13 21:32:33 UTC
No, it should be marked as non escaping later on.
Comment 4 Andrew Pinski 2007-06-13 21:37:05 UTC
On the trunk, dest is changed to be non addressable in alias5, why is it not doing it in 4.2?
Comment 5 Andrew Pinski 2007-06-13 21:42:00 UTC
(In reply to comment #2)
> The address of dest has been passed to memcpy() and the alias analysis
> considers the varaible to escape. So potentially foo() can see the value of
> dest if memcpy() stash the pointer somewhere. This is impossible but the
> compiler cannot prove this so it has to be conservative and treat foo() as a
> potential reader of dest.
We remove the call to memcpy (in 4.2) in .064t.fab and then alias5 should have changed it (dest) to be non ADDRESSABLE but it is not for some reason.

I think may_alias is messed up in 4.2.0.  Also sink maybe should not be sinking stores.
Comment 6 Doug Kwan 2007-06-13 21:50:09 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

Fixing alias analysis in 4.2.0 will make this problem go away but it
does not change the underlying issue in the stack local sharing code.

13 Jun 2007 21:42:01 -0000, pinskia at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org>:
>
>
> ------- Comment #5 from pinskia at gcc dot gnu dot org  2007-06-13 21:42 -------
> (In reply to comment #2)
> > The address of dest has been passed to memcpy() and the alias analysis
> > considers the varaible to escape. So potentially foo() can see the value of
> > dest if memcpy() stash the pointer somewhere. This is impossible but the
> > compiler cannot prove this so it has to be conservative and treat foo() as a
> > potential reader of dest.
> We remove the call to memcpy (in 4.2) in .064t.fab and then alias5 should have
> changed it (dest) to be non ADDRESSABLE but it is not for some reason.
>
> I think may_alias is messed up in 4.2.0.  Also sink maybe should not be sinking
> stores.
>
>
> --
>
> pinskia at gcc dot gnu dot org changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>       Known to work|4.3.0                       |4.3.0 4.1.1
>             Summary|Incorrect stack sharing     |[4.2 Regression] Incorrect
>                    |causing removal of live code|stack sharing causing
>                    |                            |removal of live code
>    Target Milestone|---                         |4.2.1
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 7 Andrew Pinski 2007-06-13 21:53:14 UTC
(In reply to comment #6)
> Fixing alias analysis in 4.2.0 will make this problem go away but it
> does not change the underlying issue in the stack local sharing code.

Is that really true?  The underlying issue is sinking the store is causing the variables to in the same "scope".  I think that is the real underlying issue and nothing to do with local stack sharing.
Comment 8 Doug Kwan 2007-06-14 00:14:40 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

I thought like you do initially. But then I was told that in gimple
everything is promoted to function scope. So dest is not out of scope.

-Doug

13 Jun 2007 21:53:15 -0000, pinskia at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org>:
>
>
> ------- Comment #7 from pinskia at gcc dot gnu dot org  2007-06-13 21:53 -------

> Is that really true?  The underlying issue is sinking the store is causing the
> variables to in the same "scope".  I think that is the real underlying issue
> and nothing to do with local stack sharing.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 9 Andrew Pinski 2007-06-14 00:28:26 UTC
(In reply to comment #8)
> Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live
> code
> 
> I thought like you do initially. But then I was told that in gimple
> everything is promoted to function scope. So dest is not out of scope.

Who said that?  We keep around the scopes, how else do we get stack sharing in the first place?  Also you cannot turn off stack sharing because that would cause some programs no longer to be useful (LLVM is one, the stack would just blow up).
Comment 10 Doug Kwan 2007-06-14 00:35:11 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

Talked to Dan Berlin and Diego Novillo here at Google. They told me
that all locals are promoted to function scope.

In generally, the tree passes can do abitrary transformation like
hoisting or loop invariant motion that can move a variable out of its
original block scope.

If we keep the scopes around. There must be code to fix up all block
scoping information before the tree passes end. It may be just a
matter to find why the scope of dest is not fixed up properly.

14 Jun 2007 00:28:27 -0000, pinskia at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org>:

>
> Who said that?  We keep around the scopes, how else do we get stack sharing in
> the first place?  Also you cannot turn off stack sharing because that would
> cause some programs no longer to be useful (LLVM is one, the stack would just
> blow up).
>
Comment 11 Andrew Pinski 2007-06-14 00:42:22 UTC
Look at the code in cfgexpand.c, expand_used_vars, this looks at the source blocks.
Comment 12 Andrew Pinski 2007-06-14 00:42:46 UTC
This code has been there since 4.0.0 so this has been true since then.
Comment 13 Doug Kwan 2007-06-14 00:46:27 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

Yes, I saw the code there yesterday and I was wondering if the block
scope got fixed up somehow.

14 Jun 2007 00:42:23 -0000, pinskia at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org>:
>
>
> ------- Comment #11 from pinskia at gcc dot gnu dot org  2007-06-14 00:42 -------
> Look at the code in cfgexpand.c, expand_used_vars, this looks at the source
> blocks.
Comment 14 pinskia@gmail.com 2007-06-14 00:52:44 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

On 14 Jun 2007 00:46:28 -0000, dougkwan at google dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
> Yes, I saw the code there yesterday and I was wondering if the block
> scope got fixed up somehow.

It can't because the debugging information will be messed up.

-- Pinski
Comment 15 Doug Kwan 2007-06-14 00:59:12 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

Then the stack local sharing code is very broken. It should do a real
live-range analysis instead of using block scoping information form
the source tree.

-Doug

14 Jun 2007 00:52:45 -0000, pinskia at gmail dot com <gcc-bugzilla@gcc.gnu.org>:

> It can't because the debugging information will be messed up.
>
Comment 16 Andrew Pinski 2007-06-14 01:02:47 UTC
The problem is that it needs also source style scoping also:
take:
int f(int *a);
int g(int b)
{
  {
    int c;
    f(&c);
  }
  {
    int c1;
    f(&c1);
  }
}

Without source based ones, we don't know if c/c1 can ever be shared.
I have code here at Sony where we actually depend on this behavior with large structs (and arrays).
Comment 17 Doug Kwan 2007-06-14 01:09:16 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

Unless the compiler can prove that f() does not save the pointer to c
and use it later, it cannot share a stack slot for c & c1. This is
true regardless of any block scoping in the source. Yeah, it looks
like accessing c outside of the first block was undefined but I was
told me that GIMPLE promote c & c1 all the function scope.

-Doug

14 Jun 2007 01:02:48 -0000, pinskia at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org>:
>
>
> ------- Comment #16 from pinskia at gcc dot gnu dot org  2007-06-14 01:02 -------
> The problem is that it needs also source style scoping also:
> take:
> int f(int *a);
> int g(int b)
> {
>   {
>     int c;
>     f(&c);
>   }
>   {
>     int c1;
>     f(&c1);
>   }
> }
>
> Without source based ones, we don't know if c/c1 can ever be shared.
> I have code here at Sony where we actually depend on this behavior with large
> structs (and arrays).
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32327
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 18 pinskia@gmail.com 2007-06-14 01:14:46 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

On 14 Jun 2007 01:09:16 -0000, dougkwan at google dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
> Unless the compiler can prove that f() does not save the pointer to c
> and use it later, it cannot share a stack slot for c & c1. This is
> true regardless of any block scoping in the source. Yeah, it looks
> like accessing c outside of the first block was undefined but I was
> told me that GIMPLE promote c & c1 all the function scope.

Yes but I don't think GIMPLE does that, it might promote register
based ones but not addressable ones (as mentioned before).

Really if we change the stack sharing, we are going to cause a large
number of regressions, I already have some regressions between 4.0 and
4.1 due to stack sharing changes with two large structs not being in
the same slot even though we can put them there with one having an
offset.
And yes I still need to file that bug.

-- Pinski
Comment 19 Ian Lance Taylor 2007-06-14 17:57:59 UTC
Adding Richard in case he has any comment.

I don't agree with comment #17 from Doug.  In the code in comment #16, if f saves &c, there is no way that it could validly use it after the block scope exits.

In general if we expand the scope of a variable outside of its block scope, then the stack sharing code in cfgexpand.c will do the wrong thing.  There is normally no reason for the tree code to expand the scope of a variable.  But there is also nothing specifically to stop it from doing so.

In this particular case store sinking is extending the life of a variable.  This is a meaningless move, as there is no valid way for the stored value to be accessed.  It is happening because alias analysis thinks that the function call can refer to the stored variable, although this actually can not happen.

Part of the reason alias analysis is wrong is that we retain the list of addressable variables in a statement when we shouldn't.  Interestingly, if I apply the tree-ssa-operands.c patch from http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01807.html then the problem goes away.

But there are a few complex issues here:

1) Alias analysis thinks that a function can refer to a variable in a different block scope.

2) Nothing systematically prevents the tree code from referencing or setting variables outside of their block scope.

3) The stack sharing code in cfgexpand assumes that variables are only referenced or set within their block scope.

This is going to require more investigation.  I think we are mainly lucky because the tree optimizers tend to not move variables outside of their block scope.  This test case just happens to show a way in which that can happen.
Comment 20 Doug Kwan 2007-06-14 18:05:39 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing causing removal of live code

That was my initial opinion too but Diego and Danny told me there is
only one scope in the tree SSA form. So it is okay.

About your comment of us being lucky, I believe our luck is running
out. As we do more and more things in the tree passes, we will be
likely to see this problem more often.

14 Jun 2007 17:58:01 -0000, ian at airs dot com <gcc-bugzilla@gcc.gnu.org>:
>
>
> ------- Comment #19 from ian at airs dot com  2007-06-14 17:57 -------
> I don't agree with comment #17 from Doug.  In the code in comment #16, if f
> saves &c, there is no way that it could validly use it after the block scope
> exits.
Comment 21 rguenther@suse.de 2007-06-14 18:12:54 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing
 causing removal of live code

On Thu, 14 Jun 2007, ian at airs dot com wrote:

> ------- Comment #19 from ian at airs dot com  2007-06-14 17:57 -------
> This is going to require more investigation.  I think we are mainly lucky
> because the tree optimizers tend to not move variables outside of their block
> scope.  This test case just happens to show a way in which that can happen.

We may finally need to compute life for on-stack variables instead of
relying on something not happening that is not explicitly forbidden
in our IL (placement new anyone? ;))

(the other) Richard.
Comment 22 Michael Elizabeth Chastain 2007-06-15 13:15:36 UTC
With the test program, gcc 4.1.2 generates correct code and gcc 4.2.0 generates wrong code.  Can someone increase the priority and severity of this PR?


Comment 23 Diego Novillo 2007-06-15 13:27:21 UTC
Working on this today.
Comment 24 Diego Novillo 2007-06-15 22:10:21 UTC
Subject: Bug 32327

Author: dnovillo
Date: Fri Jun 15 22:10:09 2007
New Revision: 125748

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125748
Log:

	PR 32327
	* tree-ssa-operands.c (build_ssa_operands): Initially assume
	that the statement does not take any addresses.
 

testsuite/ChangeLog

	PR 32327
	* g++.dg/tree-ssa/pr32327-1.C: New test.
	* g++.dg/tree-ssa/pr32327.C: New test.


Added:
    branches/gcc-4_2-branch/gcc/testsuite/g++.dg/tree-ssa/pr32327-1.C
    branches/gcc-4_2-branch/gcc/testsuite/g++.dg/tree-ssa/pr32327.C
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_2-branch/gcc/tree-ssa-operands.c

Comment 25 Diego Novillo 2007-06-18 12:30:47 UTC
Fixed symptoms with http://gcc.gnu.org/ml/gcc-patches/2007-06/msg01081.html.  Real fix still being discussed.
Comment 26 Richard Henderson 2007-06-19 17:26:01 UTC
(In reply to comment #10)
> Talked to Dan Berlin and Diego Novillo here at Google. They told me
> that all locals are promoted to function scope.

That *only* applies to register variables, not stack variables.

We very very much want to preserve scope of stack variables, because
we very very much want to share stack space between stack variables
of different scopes.  Failure to do so causes bad interactions with
inlining, and causes stack space consumtion to grow to unacceptable
levels.  I.e. you can't build kernels anymore.
Comment 27 dnovillo@google.com 2007-06-19 17:39:56 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing
 causing removal of live code

On 6/19/07 1:26 PM, rth at gcc dot gnu dot org wrote:
> ------- Comment #26 from rth at gcc dot gnu dot org  2007-06-19 17:26 -------
> (In reply to comment #10)
>> Talked to Dan Berlin and Diego Novillo here at Google. They told me
>> that all locals are promoted to function scope.
> 
> That *only* applies to register variables, not stack variables.

Yes, but our GIMPLE optimizers don't know that and we do not have a way
of enforcing that in GIMPLE.  There just are no scope boundaries in the
SSA web.  So, the end result is that in SSA we only have function scope.


> We very very much want to preserve scope of stack variables, because
> we very very much want to share stack space between stack variables
> of different scopes.  Failure to do so causes bad interactions with
> inlining, and causes stack space consumtion to grow to unacceptable
> levels.  I.e. you can't build kernels anymore.

Agreed.  We have to find a way of either tying the hands of code motion
transformations by making them use the SSA web *and* the TREE_BLOCKs, or
make stack slot sharing aware of live ranges.

Right now we have the unfortunate situation that an optimizer is making
a valid transformation which breaks the scope assumption that stack
sharing uses.

I am not sure if it would be practical to force code motion
optimizations to use anything other than the SSA web to do their
analysis.  Perhaps we could force scopes by building barriers on the SSA
web itself (say, by putting PHI nodes at scope boundaries, or something).

Alternatively, we could incorporate live-range analysis to stack slot
sharing.  That way, if code motion creates undue stack slot pressure, we
would merely emit suboptimal code, instead of wrong code.
Comment 28 Andrew Macleod 2007-06-19 18:57:58 UTC
Won't solve the problem currently, but I think the long term solution is to do stack analysis when out-of-ssa and expand have been merged into a single entity. The live range info out-of-ssa calculates can be used to do a decent job of stack sharing regardless of scoping info.

Short term, we might be able to check to see if a stack variable is used outside its scope when going out of ssa, and flag it for no sharing, or promote it to file level, or the next appropriate scope level.  That would work as long as we don't end up promoting too many things in the wrong routines :-). 

I'm not fond of trying to make the optimization passes aware of scopes on top of everything else they are aware of, Id rather see it taken care of in one place, and not dependent on scoping info.
Comment 29 Mark Mitchell 2007-07-04 03:28:50 UTC
Diego, does this problem still exist on the 4.2 branch?  I see that you checked in a patch, which you suggest may not be a complete fix, but does this test case still fail?
Comment 30 dnovillo@google.com 2007-07-04 11:22:50 UTC
Subject: Re:  [4.2 Regression] Incorrect stack sharing
 causing removal of live code

On 7/3/07 11:28 PM, mmitchel at gcc dot gnu dot org wrote:
> ------- Comment #29 from mmitchel at gcc dot gnu dot org  2007-07-04 03:28 -------
> Diego, does this problem still exist on the 4.2 branch?  I see that you checked
> in a patch, which you suggest may not be a complete fix, but does this test
> case still fail?

Yes, the problem still exists on 4.2 and mainline.  The patch is valid
in that it fixes a codegen deficiency, but it only works around this
bug.  The test case does not fail anymore, and getting another one to
fail may be tricky (it's a fairly rare bug, unfortunately).

A real fix is in the works, but it's not clear when it might be ready.
Comment 31 Mark Mitchell 2007-10-09 19:20:26 UTC
Change target milestone to 4.2.3, as 4.2.2 has been released.
Comment 32 Joseph S. Myers 2008-02-01 16:54:14 UTC
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.
Comment 33 Joseph S. Myers 2008-05-19 20:23:17 UTC
4.2.4 is being released, changing milestones to 4.2.5.
Comment 34 Joseph S. Myers 2009-03-30 22:06:16 UTC
Closing 4.2 branch, fixed in 4.3.  (There's a lot of discussion here of a
possible underlying problem that might or might not be fixed but isn't
4.2-only; if still present, please open a separate bug report for it.)