Bug 51528 - [4.6 Regression] SRA should not create BOOLEAN_TYPE replacements
Summary: [4.6 Regression] SRA should not create BOOLEAN_TYPE replacements
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.7.0
: P2 normal
Target Milestone: 4.7.0
Assignee: Richard Biener
URL:
Keywords: wrong-code
: 52244 (view as bug list)
Depends on: 52115 52244
Blocks:
  Show dependency treegraph
 
Reported: 2011-12-13 15:24 UTC by Martin Jambor
Modified: 2013-04-12 16:18 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.7.0
Known to fail:
Last reconfirmed: 2011-12-13 00:00:00


Attachments
Testcase (268 bytes, text/plain)
2011-12-13 15:26 UTC, Martin Jambor
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Jambor 2011-12-13 15:24:14 UTC
We already do not create ENUMERAL_TYPE replacements in SRA and neither
we should be creating BOOLEAN_TYPE ones because that can lead to
miscompilations, as the attached testcase shows.  Like in the enumeral
case, we should only create replacements of types with precision that
matches their mode size.

I'll take care of that.
Comment 1 Martin Jambor 2011-12-13 15:26:14 UTC
Created attachment 26071 [details]
Testcase

This is the aforementioned test case.
Comment 2 Andrew Pinski 2012-01-04 00:40:53 UTC
I saw this on mips64-linux-gnu with one of the fortran testcases.
Comment 3 Richard Biener 2012-01-27 13:05:40 UTC
SRA creates, for

struct S { int i; int j; };

struct S bar (struct S);

struct S foo1 (int i)
{
  struct S x;
  x.i = i;
  x = bar (x);
  /* LHS disqualified, RHS partially scalarized.  */
  return x;
}

for the aggregate copy to the return slot

  x$i_8 = x.i;
  D.1720 = x;
  D.1720.i = x$i_8;

but that's needlessly complicated as it should just have created

  x.i = x$i_8;
  D.1720 = x;

or for the similar case

struct S g;
struct S foo3 (int i)
{
  struct S x;
  /* RHS disqualified, LHS partially scalarized.  */
  x = g;
  x.i = i;
  x = bar (x);
  return x;
}

it creates

  x = g;
  x$i_4 = g.i;
  x$i_8 = i_1(D);
  x.i = x$i_8;

which is just a complicated from of

  x = g;
  x$i_4 = x.i;

a sub-case which it handles fine when handling calls!

  x = bar (x);
  x$i_10 = x.i;


That copy-in-out game to the unscalarized parts causes this regression.
Comment 4 Richard Biener 2012-01-30 13:26:48 UTC
Author: rguenth
Date: Mon Jan 30 13:26:45 2012
New Revision: 183720

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183720
Log:
2012-01-30  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/51528
	* tree-sra.c (sra_modify_assign): Re-factor in preparation
	for PR51528 fix.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-sra.c
Comment 5 Richard Biener 2012-01-31 09:46:36 UTC
Author: rguenth
Date: Tue Jan 31 09:46:29 2012
New Revision: 183752

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183752
Log:
2012-01-31  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/51528
	* tree-sra.c (sra_modify_assign): Avoid copy-in/out for aggregate
	assigns.

	* gcc.dg/torture/pr51528.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr51528.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c
Comment 6 Richard Biener 2012-01-31 10:43:09 UTC
The testcase is worked around.  Maybe reliably enough, maybe not ...

At least the patches seem to be backportable eventually.
Comment 7 Dominique d'Humieres 2012-02-04 19:14:34 UTC
On powerpc-apple-darwin9 testing revision 183874 I get with -m32:

FAIL: gcc.dg/torture/pr51528.c  -O2  execution test
FAIL: gcc.dg/torture/pr51528.c  -O3 -fomit-frame-pointer  execution test
FAIL: gcc.dg/torture/pr51528.c  -O3 -g  execution test
FAIL: gcc.dg/torture/pr51528.c  -Os  execution test
FAIL: gcc.dg/torture/pr51528.c  -O2 -flto -flto-partition=none  execution test
FAIL: gcc.dg/torture/pr51528.c  -O2 -flto  execution test

These execution failures are due to a.i==0.
Comment 8 Richard Biener 2012-02-06 14:13:48 UTC
On ppc the main SRA pass does nothing to foo () because of differences in
how the return value is returned.  Still the IL at expansion time looks
great:

foo ()
{
<bb 2>:
  <retval>.b = 1;
  use_bool (<retval>);
  <retval>.i = 65534;
  return <retval>;

}

compared to x86_64 where we have the uglier

foo ()
{
  union U u;
  union U D.1733;

<bb 2>:
  u.b = 1;
  use_bool (u);
  u = VIEW_CONVERT_EXPR<union U>(65534);
  D.1733 = u;
  u ={v} {CLOBBER};
  return D.1733;

Compared to -fno-tree-sra on PPC we have

foo ()
{
  union U u;

<bb 2>:
  <retval>.b = 1;
  use_bool (<retval>);
  u.i = 65534;
  <retval> = u;
  u ={v} {CLOBBER};
  return <retval>;

so SRA managed to remove the aggregate temporary and its copy.

So, I don't see what's wrong on PPC (with SRA, that is).  This must be
a target bug if at all (note I'm only inspecting dumps and did not
reproduce the execute fail).
Comment 9 Iain Sandoe 2012-02-13 17:46:42 UTC
the target has a different sized bool if that is relevant (int 32 sized)...
Comment 10 Iain Sandoe 2012-02-13 20:40:43 UTC
(In reply to comment #8)
 
> So, I don't see what's wrong on PPC (with SRA, that is).  This must be
> a target bug if at all (note I'm only inspecting dumps and did not
> reproduce the execute fail).

yes, seems to be .. the tree-ssa dump is unchanged between -O1 (OK) and -O2 (fails).

this seems to be a 4.5, 4.6 and 4.7 regression [4.4. is OK] - the call to bar () is being dropped from foo ().
will move to a different PR.  Thanks for the test-case ;)
Comment 11 Richard Biener 2012-02-14 15:33:03 UTC
*** Bug 52244 has been marked as a duplicate of this bug. ***
Comment 12 Richard Biener 2012-02-14 15:34:02 UTC
Author: rguenth
Date: Tue Feb 14 15:33:56 2012
New Revision: 184214

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184214
Log:
2012-02-14  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/52244
	PR tree-optimization/51528
	* tree-sra.c (analyze_access_subtree): Only create INTEGER_TYPE
	replacements for integral types.

	* gcc.dg/torture/pr52244.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr52244.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c
Comment 13 Richard Biener 2012-07-02 12:16:53 UTC
The 4.5 branch is being closed, adjusting target milestone.
Comment 14 Jakub Jelinek 2013-04-12 16:18:04 UTC
The 4.6 branch has been closed, fixed in GCC 4.7.0.