This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk at 207239
- From: "thomas.preudhomme at arm dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 15 May 2014 09:51:51 +0000
- Subject: [Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk at 207239
- Auto-submitted: auto-generated
- References: <bug-60172-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
--- Comment #20 from Thomas Preud'homme <thomas.preudhomme at arm dot com> ---
(In reply to rguenther@suse.de from comment #19)
> On Thu, 15 May 2014, thomas.preudhomme at arm dot com wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
> >
> > --- Comment #18 from Thomas Preud'homme <thomas.preudhomme at arm dot com> ---
> > (In reply to Richard Biener from comment #17)
> > >
> > > Citing myself:
> > >
> > > On the GIMPLE level before expansion we have
> > >
> > > _40 = Arr_2_Par_Ref_22(D) + (_41 + pretmp_20);
> > >
> > > _51 = Arr_2_Par_Ref_22(D) + (_41 + (pretmp_20 + 1000));
> > >
> > > so if _51 were Arr_2_Par_Ref_22(D) + ((_41 + pretmp_20) + 1000);
> > >
> > > then _41 + pretmp_20 would be fully redundant with the expression needed
> > > by _40.
> >
> > Yes, I saw that, but I was wondering why reassoc would try this association
> > rather than another, since the header of the file doesn't mention any special
> > treatment of explicit integer constants.
> >
> > Besides, wouldn't it still miss the fact that _51 = _40 + 1000?
>
> Yes. But reassoc doesn't associate across POINTER_PLUS_EXPRs.
Is there a reason for that?
>
> RTL CSE could catch it, but for it the association would have to
> be the same for both. If we start from the proposed form
> then at RTL expansion time we could associate
> pointer + (X + CST) to (pointer + X) + CST.
Right.
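
To make the association discussed above concrete, here is a minimal C sketch
of the two forms of _51. It is only an illustration: the function names are
hypothetical, the offsets are plain size_t values standing in for _41 and
pretmp_20, and nothing here models TER or the actual RTL expander.

```c
#include <assert.h>
#include <stddef.h>

/* Address computed for _40: base + (x + y). */
int *addr_40(int *base, size_t x, size_t y)
{
    return base + (x + y);
}

/* _51 as currently associated: base + (x + (y + 1000)).
   The subexpression (x + y) never appears, so nothing is shared with _40. */
int *addr_51_current(int *base, size_t x, size_t y)
{
    return base + (x + (y + 1000));
}

/* _51 after the proposed steps: first base + ((x + y) + 1000), then
   pointer + (X + CST) rewritten as (pointer + X) + CST.
   The whole of _40 is now reused and only a constant add remains. */
int *addr_51_reassociated(int *base, size_t x, size_t y)
{
    int *p40 = addr_40(base, x, y);
    return p40 + 1000;
}

/* Sanity check that both associations name the same element. */
void check_forms(int *base, size_t x, size_t y)
{
    assert(addr_51_current(base, x, y) == addr_51_reassociated(base, x, y));
}
```

Calling check_forms(arr, 3, 7) on an array of at least 1011 ints exercises
both forms. The point of the second form is that _51 becomes _40 + 1000.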
>
> Feels all somewhat hacky, of course (and relies on TER). There
> may be cases where doing the opposite is better (for example
> if you have ptr1 + (X + 1000) and ptr2 + (X + 1000)). Association
> to make CSE possible is always hard if CSE itself cannot associate
> to maximize the number of CSE opportunities. So at the moment
> any choice is just canonicalization.
Exactly my thought. I'm not sure if that's what you have in mind when you write
association for CSE, but I was thinking of a scheme that resembles what
tree_to_aff_combination_expand does and organizes all expanded expressions so
that they can be compared easily (read: efficiently). With such a capability it
would no longer be necessary to do the first replacement with
forwprop+reassoc+dom, as everything could be done in CSE.
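
A rough sketch of what such a scheme could look like, in C. This is not GCC's
tree-affine API: struct aff and the aff_* helpers are hypothetical names made
up for illustration, variables are identified by small integer ids, and the
term limit is arbitrary. Each expression is kept as a constant plus a sorted
list of coefficient * variable terms, so two expressions can be compared term
by term regardless of how they were originally associated.

```c
#include <assert.h>
#include <string.h>

#define MAX_TERMS 8

/* Hypothetical canonical form: cst + sum (coef[i] * var[i]),
   with terms kept sorted by variable id.  */
struct aff {
    long cst;
    int n;
    int var[MAX_TERMS];
    long coef[MAX_TERMS];
};

void aff_init(struct aff *a)
{
    memset(a, 0, sizeof *a);
}

void aff_add_const(struct aff *a, long c)
{
    a->cst += c;
}

/* Add coef * var, merging with an existing term for the same variable
   and keeping the term list sorted.  */
void aff_add_term(struct aff *a, int var, long coef)
{
    int i = 0, j;

    while (i < a->n && a->var[i] < var)
        i++;
    if (i < a->n && a->var[i] == var) {
        a->coef[i] += coef;
        if (a->coef[i] == 0) {          /* term cancelled: drop it */
            for (j = i; j + 1 < a->n; j++) {
                a->var[j] = a->var[j + 1];
                a->coef[j] = a->coef[j + 1];
            }
            a->n--;
        }
        return;
    }
    if (coef == 0)
        return;
    assert(a->n < MAX_TERMS);
    for (j = a->n; j > i; j--) {        /* shift to make room at index i */
        a->var[j] = a->var[j - 1];
        a->coef[j] = a->coef[j - 1];
    }
    a->var[i] = var;
    a->coef[i] = coef;
    a->n++;
}

/* Return 1 and store (b - a) in *delta if a and b have identical
   variable terms, i.e. differ only by a constant; return 0 otherwise.  */
int aff_const_diff(const struct aff *a, const struct aff *b, long *delta)
{
    int i;

    if (a->n != b->n)
        return 0;
    for (i = 0; i < a->n; i++)
        if (a->var[i] != b->var[i] || a->coef[i] != b->coef[i])
            return 0;
    *delta = b->cst - a->cst;
    return 1;
}
```

With ids 0, 1 and 2 standing for Arr_2_Par_Ref_22(D), _41 and pretmp_20,
building _40 and _51 in any order of additions yields the same term list, and
aff_const_diff reports a difference of 1000, which is the _51 = _40 + 1000
relation mentioned earlier in the thread.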