52244 – [4.5/4.6/4.7 Regression] wrong code for function returning union between int and _Bool at O > 2, with no-early-inlining

Bug 52244 - [4.5/4.6/4.7 Regression] wrong code for function returning union between int and _Bool at O > 2, with no-early-inlining

Summary: [4.5/4.6/4.7 Regression] wrong code for function returning union between int ...

Status:	RESOLVED DUPLICATE of bug 51528

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	4.7.0

Importance:	P1 normal
Target Milestone:	4.5.4
Assignee:	Richard Biener

URL:
Keywords:	wrong-code

Depends on:
Blocks:	51528
	Show dependency tree / graph

Reported:	2012-02-14 10:10 UTC by Iain Sandoe
Modified:	2020-03-20 17:58 UTC (History)
CC List:	3 users (show)

See Also:	80635
Host:
Target:	--darwin9 x86_64-unknown-linux-gnu
Build:
Known to work:	4.4.7
Known to fail:	4.5.4, 4.6.3, 4.7.0
Last reconfirmed:	2012-02-14 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Iain Sandoe 2012-02-14 10:10:04 UTC

This was revealed by the test-case attached to PR51528.

in the following:

extern void abort (void);

typedef union u_r 
{
  _Bool b;
  int c;
} u_t;

u_t
bar (void)
{
  u_t u;
  u.c = 0x1234;
  return u;
}

u_t  __attribute__ ((noinline))
foo (void)
{
  u_t u;

  u.b = 1;
  u = bar ();

  return u;
}

int main (int argc, char **argv)
{
  u_t u = foo ();
  if (u.c != 0x1234)
    abort ();
  return 0;
}

====

for  -O2 -fno-early-inlining (at m32)

foo is compiled as (wrong):

        .globl _foo
_foo:
        li r9,0
        stw r9,0(r3)
        blr

instead of (correct):

        .globl _foo
_foo:
        li r2,4660
        stw r2,0(r3)
        blr

======

It only appears to happen for the case where the union is with an item of the same size as _Bool (which is 32 bits on powerpc-darwin).  Replacing 'c' with a short or a long long or even char[4] will cause the problem to vanish.

======

Comment 1 Iain Sandoe 2012-02-14 10:24:50 UTC

hm. this might be more serious/wide-ranging...
... it also fails on x86 when  the union is  {char, _Bool}

Comment 2 Iain Sandoe 2012-02-14 10:31:10 UTC

also fails on a cross from darwin9 to x86_64-unknown-linux-gnu.
at m64 on this target it fails even without the -fno-early-inlining.

Comment 3 Iain Sandoe 2012-02-14 10:46:35 UTC

inline-union-ret-val.c.064t.retslot has...

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
  _Bool u$b;
  union u_t u;

<bb 2>:
  u.c = 4660;
  u$b_6 = MEM[(union u_r *)&u].b;
  MEM[(union u_r *)&<retval>].b = u$b_6;
  return <retval>;

}


inline-union-ret-val.c.068t.mergephi2 contains....

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
  _Bool u$b;
  union u_t u;

<bb 2>:
  u.c = 4660;
  <retval>.b = 0;
  return <retval>;

}

inline-union-ret-val.c.149t.optimized ...

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
<bb 2>:
  <retval>.b = 0;
  return <retval>;

}

Comment 4 Jakub Jelinek 2012-02-14 10:51:53 UTC

Confirmed, testcase for x86_64-linux:
extern void abort (void);

union U { _Bool b; unsigned char c; };

union U
bar (void)
{
  union U u;
  u.c = 0xaa;
  return u;
}

union U __attribute__ ((noinline))
foo (void)
{
  union U u;
  u.b = 1;
  u = bar ();
  return u;
}

int
main ()
{
  union U u = foo ();
  if (u.c != 0xaa)
    abort ();
  return 0;
}

Started with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147980

Comment 5 Richard Biener 2012-02-14 10:56:57 UTC

Basically PR51528 again.  I'll take a look.

Comment 6 Richard Biener 2012-02-14 12:10:10 UTC

The issue is that SRA thinks

  u.b = 1;

is

access { base = (1716)'u', offset = 0, size = 8, expr = u.b, type = _Bool, ...

note 'size = 8'.  That is what get_ref_base_and_extent says, but it in the
end boils down to the question of how we handle type precision vs. mode
precision.  For example we happily fold VIEW_CONVERT_EXPR<_Bool>(18)
to 0.  So, as SRA will create value replacements it cannot consider
u.b an access of size 8.  OTOH aliasing has to assume that a store to u.b
conflicts with any other QImode store at the same location, so it cannot
simply say "well, it's only one bit".

With a patch SRA to make SRA not assume size 8 but size 1 in this case we
generate

<bb 2>:
  u.b = 1;
  u$b_6 = MEM[(_Bool *)&1].b;
  SR.3_8 = 18;
  u = VIEW_CONVERT_EXPR<union u_t>(SR.3_8);
  u$b_9 = MEM[(union u_r *)&u].b;
  MEM[(union u_r *)&u].b = u$b_9;
  D.1729 = u;
  u ={v} {CLOBBER};
  return D.1729;

instead which happens to work (double-ugh for the MEM[(_Bool *)&1].b though,
fortunately it's unused).

With a patch that makes SRA generate replacements that cover the whole
size with their replacements we generate

foo ()
{
  unsigned char SR.3;
  <unnamed-unsigned:8> u;
  union u_t D.1741;
  union u_t u;
  union u_t D.1729;

<bb 2>:
  u_7 = 1;
  SR.3_6 = 18;
  u_2 = SR.3_6;
  MEM[(union u_r *)&D.1729] = u_2;
  return D.1729;

Another alternative would be to somehow disqualify the whole aggregate
when the situation (scalar field with size != precision and parent that
does not have all fields scalarized) happens.

I'm testing the replacement change.

Comment 7 Richard Biener 2012-02-14 13:25:10 UTC

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c      (revision 184203)
+++ gcc/tree-sra.c      (working copy)
@@ -2172,11 +2172,16 @@ analyze_access_subtree (struct access *r
              && (root->grp_scalar_write || root->grp_assignment_write))))
     {
       bool new_integer_type;
-      if (TREE_CODE (root->type) == ENUMERAL_TYPE)
+      if (INTEGRAL_TYPE_P (root->type)
+         && (TREE_CODE (root->type) != INTEGER_TYPE
+             || TYPE_PRECISION (root->type) != root->size))
        {
          tree rt = root->type;
-         root->type = build_nonstandard_integer_type (TYPE_PRECISION (rt),
+         root->type = build_nonstandard_integer_type (root->size,
                                                       TYPE_UNSIGNED (rt));
+         root->expr = build_ref_for_offset (UNKNOWN_LOCATION,
+                                            root->base, root->offset,
+                                            root->type, NULL, false);
          new_integer_type = true;
        }
       else

that is.

Comment 8 Richard Biener 2012-02-14 15:33:03 UTC

So it's really a dup.  Marking as such to avoid confusion with backports.

*** This bug has been marked as a duplicate of bug 51528 ***

Comment 9 Richard Biener 2012-02-14 15:34:02 UTC

Author: rguenth
Date: Tue Feb 14 15:33:56 2012
New Revision: 184214

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184214
Log:
2012-02-14  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/52244
	PR tree-optimization/51528
	* tree-sra.c (analyze_access_subtree): Only create INTEGER_TYPE
	replacements for integral types.

	* gcc.dg/torture/pr52244.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr52244.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c