Bug 27799 - adding unused char field inhibits optimization
Summary: adding unused char field inhibits optimization
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.2.0
: P3 enhancement
Target Milestone: 4.4.0
Assignee: Richard Biener
URL:
Keywords: alias, missed-optimization
Depends on: 2480 35472
Blocks:
  Show dependency treegraph
 
Reported: 2006-05-29 17:49 UTC by Dan Nicolaescu
Modified: 2008-04-23 14:09 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2008-03-04 21:00:03


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dan Nicolaescu 2006-05-29 17:49:04 UTC
For this code:

struct X {double m; int x;};
struct Y {int y; short d;};
struct YY {int y; short d; char c;};

int foo(struct X *x,  struct Y *y)
{
  x->x =  0;
  y->y =  1;
  if (x->x != 0)
    abort ();
}

int foo_no(struct X *x,  struct YY *y)
{
  x->x =  0;
  y->y =  1;
  if (x->x != 0)
    abort ();
}

the "if" does not get optimized away (by the dom1 pass) for the "foo_no" function, but it is optimized for "foo" 
The only difference between the 2 functions is that foo_no takes as a parameter a pointer to a struct that has a "char" field that is not accessed in this function.

It would be nice if both functions were optimized in the same way.
Comment 1 Andrew Pinski 2006-05-29 18:50:17 UTC
Confirmed, the problem is that char is recognized as something which can alias anything which is why it is not optimized.
Comment 2 Richard Biener 2006-05-30 12:41:36 UTC
Variable: x, UID 1535, struct X *, symbol memory tag: SMT.4, default def: x_1
Variable: y, UID 1536, struct YY *, symbol memory tag: SMT.5, default def: y_2
Variable: SMT.4, UID 1549, struct X, is addressable, is global, call clobbered (, is global var, is incoming pointer ), may aliases: { SMT.5 }

but this aliasing cannot really happen.

Flow-insensitive alias information for foo_no

Aliased symbols

SMT.5, UID 1550, struct YY, is aliased, is addressable, is global, call clobbered (, is global var, is incoming pointer )
SMT.4, UID 1549, struct X, is addressable, is global, call clobbered (, is global var, is incoming pointer ), may aliases: { SMT.5 }

record_component_aliases is likely too conservative here.
Comment 3 Richard Biener 2008-03-04 20:35:50 UTC
Actually alias_sets_conflict_p is the cuplrit, as we have in the alias subset
of struct YY alias set zero (entered for the char member, accesses conflict
with accesses through char *):

  /* See if the first alias set is a subset of the second.  */
  ase = get_alias_set_entry (set1);
  if (ase != 0
      && (ase->has_zero_child
          || splay_tree_lookup (ase->children,
                                (splay_tree_key) set2)))
    return 1;

which IMHO should do

  if (ase != 0
      && ((ase->has_zero_child && set2 == 0)
          || splay_tree_lookup (ase->children, (splay_tree_key) set2))
    return 1;

where then both cases are optimized?
Comment 5 Dan Nicolaescu 2008-03-04 21:19:49 UTC
(In reply to comment #4)
> http://gcc.gnu.org/ml/gcc-patches/2008-03/msg00243.html

Thanks for working on this!
Have you looked at the impact?
Probably the generated code won't too different because the RTL alias analysis probably catches this.
But it would be interesting to see what is the difference for the tree dumps before and after this patch.
Comment 6 Richard Biener 2008-03-04 21:22:28 UTC
Actually RTL alias is just using the same routines. The IL difference is

  # SMT.4_6 = VDEF <SMT.4_4(D)>
  # SMT.5_7 = VDEF <SMT.5_5(D)>
  x_1(D)->x = 0;
  # SMT.5_8 = VDEF <SMT.5_7>
  y_2(D)->y = 1;

vs.

  # SMT.18_5 = VDEF <SMT.18_4(D)>
  x_1(D)->x = 0;
  # SMT.19_7 = VDEF <SMT.19_6(D)>
  y_2(D)->y = 1;
Comment 7 Dan Nicolaescu 2008-03-04 21:32:29 UTC
(In reply to comment #6)
> Actually RTL alias is just using the same routines.
Might be, but the RTL level code that optimizes away the abort() in both testcases (if I remember well nonoverlapping_component_refs_p). 

> 
>   # SMT.4_6 = VDEF <SMT.4_4(D)>
>   # SMT.5_7 = VDEF <SMT.5_5(D)>
>   x_1(D)->x = 0;
>   # SMT.5_8 = VDEF <SMT.5_7>
>   y_2(D)->y = 1;
> 
> vs.
> 
>   # SMT.18_5 = VDEF <SMT.18_4(D)>
>   x_1(D)->x = 0;
>   # SMT.19_7 = VDEF <SMT.19_6(D)>
>   y_2(D)->y = 1;

That is for this testcase, but what about the impact on .final_cleanup for something big like combine.c? 
Comment 8 rguenther@suse.de 2008-03-04 21:35:11 UTC
Subject: Re:  adding unused char field inhibits
 optimization

On Tue, 4 Mar 2008, dann at godzilla dot ics dot uci dot edu wrote:

> ------- Comment #7 from dann at godzilla dot ics dot uci dot edu  2008-03-04 21:32 -------
> (In reply to comment #6)
> > Actually RTL alias is just using the same routines.
> Might be, but the RTL level code that optimizes away the abort() in both
> testcases (if I remember well nonoverlapping_component_refs_p). 

I still have the abort () with -O2.

> > 
> >   # SMT.4_6 = VDEF <SMT.4_4(D)>
> >   # SMT.5_7 = VDEF <SMT.5_5(D)>
> >   x_1(D)->x = 0;
> >   # SMT.5_8 = VDEF <SMT.5_7>
> >   y_2(D)->y = 1;
> > 
> > vs.
> > 
> >   # SMT.18_5 = VDEF <SMT.18_4(D)>
> >   x_1(D)->x = 0;
> >   # SMT.19_7 = VDEF <SMT.19_6(D)>
> >   y_2(D)->y = 1;
> 
> That is for this testcase, but what about the impact on .final_cleanup for
> something big like combine.c? 

No idea, but feel free to check.

Richard.
Comment 9 Dan Nicolaescu 2008-03-04 21:43:08 UTC
(In reply to comment #8)
> Subject: Re:  adding unused char field inhibits
>  optimization
> 
> On Tue, 4 Mar 2008, dann at godzilla dot ics dot uci dot edu wrote:
> 
> > ------- Comment #7 from dann at godzilla dot ics dot uci dot edu  2008-03-04 21:32 -------
> > (In reply to comment #6)
> > > Actually RTL alias is just using the same routines.
> > Might be, but the RTL level code that optimizes away the abort() in both
> > testcases (if I remember well nonoverlapping_component_refs_p). 
> 
> I still have the abort () with -O2.

Argghh, sorry, my bad: typo in the "grep abort file.s" command ...


> > That is for this testcase, but what about the impact on .final_cleanup for
> > something big like combine.c? 
> 
> No idea, but feel free to check.

I don't have a recent build... 
Comment 10 Richard Biener 2008-04-23 14:09:14 UTC
Subject: Bug 27799

Author: rguenth
Date: Wed Apr 23 14:08:25 2008
New Revision: 134598

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134598
Log:
2008-04-23  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/27799
	PR tree-optimization/32921
	PR tree-optimization/32624
	* tree-ssa-structalias.c (merge_smts_into): Only merge the
	SMTs aliases and the tag itself into the solution.
	* tree-ssa-alias.c (compute_flow_sensitive_aliasing): Do not
	merge the points-to solution back into the SMT aliases.
	(may_alias_p): Use alias_set_subset_of instead of
	aliases_conflict_p.  A pointer which points to
	memory with alias set zero may access any variable.

	* gcc.dg/tree-ssa/pr27799.c: New testcase.
	* gcc.dg/tree-ssa/20030807-7.c: Remove xfail, scan vrp dump.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr27799.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c
    trunk/gcc/tree-ssa-alias.c
    trunk/gcc/tree-ssa-structalias.c

Comment 11 Richard Biener 2008-04-23 14:09:14 UTC
Fixed.