This was revealed by the test-case attached to PR51528. In the following:

extern void abort (void);

typedef union u_r
{
  _Bool b;
  int c;
} u_t;

u_t
bar (void)
{
  u_t u;
  u.c = 0x1234;
  return u;
}

u_t __attribute__ ((noinline))
foo (void)
{
  u_t u;

  u.b = 1;
  u = bar ();

  return u;
}

int main (int argc, char **argv)
{
  u_t u = foo ();
  if (u.c != 0x1234)
    abort ();
  return 0;
}

====

for -O2 -fno-early-inlining (with -m32) foo is compiled as (wrong):

	.globl _foo
_foo:
	li r9,0
	stw r9,0(r3)
	blr

instead of (correct):

	.globl _foo
_foo:
	li r2,4660
	stw r2,0(r3)
	blr

======

It only appears to happen when the union contains a member of the same size as _Bool (which is 32 bits on powerpc-darwin). Replacing 'c' with a short or a long long or even char[4] makes the problem vanish.

======
Hm, this might be more serious/wide-ranging: it also fails on x86 when the union is {char, _Bool}.
It also fails on a cross from darwin9 to x86_64-unknown-linux-gnu. With -m64 on this target it fails even without -fno-early-inlining.
inline-union-ret-val.c.064t.retslot has...

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
  _Bool u$b;
  union u_t u;

<bb 2>:
  u.c = 4660;
  u$b_6 = MEM[(union u_r *)&u].b;
  MEM[(union u_r *)&<retval>].b = u$b_6;
  return <retval>;

}

inline-union-ret-val.c.068t.mergephi2 contains....

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
  _Bool u$b;
  union u_t u;

<bb 2>:
  u.c = 4660;
  <retval>.b = 0;
  return <retval>;

}

inline-union-ret-val.c.149t.optimized ...

;; Function foo (foo, funcdef_no=1, decl_uid=2009, cgraph_uid=1)

foo ()
{
<bb 2>:
  <retval>.b = 0;
  return <retval>;

}
Confirmed, testcase for x86_64-linux:

extern void abort (void);

union U { _Bool b; unsigned char c; };

union U
bar (void)
{
  union U u;
  u.c = 0xaa;
  return u;
}

union U __attribute__ ((noinline))
foo (void)
{
  union U u;
  u.b = 1;
  u = bar ();
  return u;
}

int
main ()
{
  union U u = foo ();
  if (u.c != 0xaa)
    abort ();
  return 0;
}

Started with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147980
Basically PR51528 again. I'll take a look.
The issue is that SRA thinks

  u.b = 1;

is

access { base = (1716)'u', offset = 0, size = 8, expr = u.b, type = _Bool, ...

Note 'size = 8'. That is what get_ref_base_and_extent says, but in the end it boils down to the question of how we handle type precision vs. mode precision. For example, we happily fold VIEW_CONVERT_EXPR<_Bool>(18) to 0.

So, as SRA will create value replacements, it cannot consider u.b an access of size 8. OTOH aliasing has to assume that a store to u.b conflicts with any other QImode store at the same location, so it cannot simply say "well, it's only one bit".

With a patch to make SRA not assume size 8 but size 1 in this case we generate

<bb 2>:
  u.b = 1;
  u$b_6 = MEM[(_Bool *)&1].b;
  SR.3_8 = 18;
  u = VIEW_CONVERT_EXPR<union u_t>(SR.3_8);
  u$b_9 = MEM[(union u_r *)&u].b;
  MEM[(union u_r *)&u].b = u$b_9;
  D.1729 = u;
  u ={v} {CLOBBER};
  return D.1729;

instead, which happens to work (double-ugh for the MEM[(_Bool *)&1].b though, fortunately it's unused).

With a patch that makes SRA generate replacements that cover the whole size of the access we generate

foo ()
{
  unsigned char SR.3;
  <unnamed-unsigned:8> u;
  union u_t D.1741;
  union u_t u;
  union u_t D.1729;

<bb 2>:
  u_7 = 1;
  SR.3_6 = 18;
  u_2 = SR.3_6;
  MEM[(union u_r *)&D.1729] = u_2;
  return D.1729;

Another alternative would be to somehow disqualify the whole aggregate when this situation (a scalar field with size != precision and a parent that does not have all fields scalarized) happens.

I'm testing the replacement change.
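To make the size vs. precision mismatch concrete, here is a small plain-C sketch. It is only a model of the arithmetic, not GCC's folding code: truncate_to_precision and the other function names are hypothetical helpers invented for illustration. It shows that a _Bool occupies at least 8 bits of storage, while keeping only 1 bit of precision turns 18 (and also 0xaa and 0x1234, the values used in the testcases) into 0, which is the value the miscompiled foo stores.

```c
#include <assert.h>
#include <limits.h>

/* Same shape as the x86_64 testcase: a _Bool overlaid on a byte.  */
union U { _Bool b; unsigned char c; };

/* Storage size of a _Bool in bits; this corresponds to the 'size'
   that the access dump reports (8 where sizeof (_Bool) == 1).  */
unsigned
bool_storage_bits (void)
{
  return (unsigned) (sizeof (_Bool) * CHAR_BIT);
}

/* Hypothetical helper, not a GCC function: keep only the low PREC
   bits of VALUE, modelling a value replacement whose type has
   precision PREC.  */
unsigned
truncate_to_precision (unsigned value, unsigned prec)
{
  if (prec >= sizeof (unsigned) * CHAR_BIT)
    return value;
  return value & ((1u << prec) - 1u);
}

/* Store through b, then overwrite the byte through c and read c back.
   The source-level semantics require the value written to c to
   survive.  */
unsigned char
store_b_then_c (unsigned char v)
{
  union U u;
  u.b = 1;
  u.c = v;
  return u.c;
}

/* Sanity checks for the model above.  */
void
run_checks (void)
{
  assert (bool_storage_bits () >= 8);
  /* Precision 1 loses the stored value...  */
  assert (truncate_to_precision (18, 1) == 0);
  assert (truncate_to_precision (0xaa, 1) == 0);
  /* ...while a replacement covering all 8 bits preserves it.  */
  assert (truncate_to_precision (18, 8) == 18);
  assert (store_b_then_c (0xaa) == 0xaa);
}
```

Calling run_checks () from a main function should pass silently. The point of the sketch is only that a replacement with precision 1 cannot carry the bits written through the other union member, whereas one covering the full size can.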
Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c	(revision 184203)
+++ gcc/tree-sra.c	(working copy)
@@ -2172,11 +2172,16 @@ analyze_access_subtree (struct access *r
 	      && (root->grp_scalar_write || root->grp_assignment_write))))
 	{
 	  bool new_integer_type;
-	  if (TREE_CODE (root->type) == ENUMERAL_TYPE)
+	  if (INTEGRAL_TYPE_P (root->type)
+	      && (TREE_CODE (root->type) != INTEGER_TYPE
+		  || TYPE_PRECISION (root->type) != root->size))
 	    {
 	      tree rt = root->type;
-	      root->type = build_nonstandard_integer_type (TYPE_PRECISION (rt),
+	      root->type = build_nonstandard_integer_type (root->size,
 							   TYPE_UNSIGNED (rt));
+	      root->expr = build_ref_for_offset (UNKNOWN_LOCATION,
+						 root->base, root->offset,
+						 root->type, NULL, false);
 	      new_integer_type = true;
 	    }
 	  else

that is.
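For readers who do not have the GCC tree macros in their head, the changed condition can be modelled in standalone C roughly as follows. This is a sketch under assumptions: the enum, the struct and all function names are hypothetical stand-ins, not GCC's data structures. The only facts taken from GCC are that INTEGRAL_TYPE_P accepts integer, enumeral and boolean types and that the patch switches the replacement width from TYPE_PRECISION to root->size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes mentioned in the patch.  */
enum type_code { K_INTEGER, K_ENUMERAL, K_BOOLEAN, K_REAL };

/* Hypothetical model of the fields of 'struct access' that the
   condition looks at; sizes and precisions are in bits.  */
struct access_model
{
  enum type_code code;
  unsigned precision;
  unsigned size;
};

/* Models INTEGRAL_TYPE_P.  */
bool
is_integral (const struct access_model *a)
{
  return a->code == K_INTEGER || a->code == K_ENUMERAL
	 || a->code == K_BOOLEAN;
}

/* Condition before the patch: only enums get a new integer type.  */
bool
old_needs_new_type (const struct access_model *a)
{
  return a->code == K_ENUMERAL;
}

/* Condition after the patch: any integral type that is not a plain
   integer type whose precision already equals the access size.  */
bool
new_needs_new_type (const struct access_model *a)
{
  return is_integral (a)
	 && (a->code != K_INTEGER || a->precision != a->size);
}

/* Width of the replacement type before and after the patch.  */
unsigned
old_replacement_bits (const struct access_model *a)
{
  return a->precision;
}

unsigned
new_replacement_bits (const struct access_model *a)
{
  return a->size;
}
```

With a _Bool access modelled as { K_BOOLEAN, 1, 8 }, old_needs_new_type is false and new_needs_new_type is true, with an 8-bit replacement. A plain 32-bit int modelled as { K_INTEGER, 32, 32 } is left alone by both versions.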
So it's really a dup. Marking as such to avoid confusion with backports. *** This bug has been marked as a duplicate of bug 51528 ***
Author: rguenth
Date: Tue Feb 14 15:33:56 2012
New Revision: 184214

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184214
Log:
2012-02-14  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/52244
	PR tree-optimization/51528
	* tree-sra.c (analyze_access_subtree): Only create INTEGER_TYPE
	replacements for integral types.

	* gcc.dg/torture/pr52244.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr52244.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c