This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

[Bug ipa/65478] [5 regression] crafty performance regression

From: "jamborm at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Tue, 24 Mar 2015 13:38:59 +0000
Subject: [Bug ipa/65478] [5 regression] crafty performance regression
Auto-submitted: auto-generated
References: <bug-65478-4 at http dot gcc dot gnu dot org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I can confirm a fairly consistent 4% run time increase caused by
r219863 on my desktop (from ~22.74s to ~23.64s).  However, when I
disable cloning of the Search function, for example with the noclone
attribute, I only get down to ~23.31s, which is still 2.5% slower
than before r219863.  (All the times are of course subject to noise,
but I have measured them repeatedly and, as I said, they are fairly
consistent.)  This suggests that cloning of function Search and not
inlining NextMove is only part of the story.
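
For reference, the percentages above follow directly from the quoted
timings.  A minimal sketch of the arithmetic; the helper name
slowdown_percent is made up for illustration, and the "share" figure
is only a rough split given the measurement noise mentioned above:

```python
def slowdown_percent(baseline_s, measured_s):
    """Relative run-time increase over the baseline, in percent."""
    return (measured_s - baseline_s) / baseline_s * 100.0


# Timings quoted in the comment (seconds, approximate).
BASELINE = 22.74   # before r219863
REGRESSED = 23.64  # after r219863
NOCLONE = 23.31    # after r219863, with cloning of Search disabled

total = slowdown_percent(BASELINE, REGRESSED)
remaining = slowdown_percent(BASELINE, NOCLONE)
# Fraction of the regression that goes away when Search is not cloned.
cloning_share = (total - remaining) / total

print(f"total slowdown:          {total:.1f}%")
print(f"slowdown with noclone:   {remaining:.1f}%")
print(f"share removed by noclone: {cloning_share:.0%}")
```

So disabling the clone recovers only a bit more than a third of the
regression; the rest must come from something else.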


> I would suggest we may disable/add negative hint for cloning in the
> case where the specialized function will end up calling
> unspecialized version of itself with non-cold edge.

Recursion is handled by iterating over SCCs in the call graph in
IPA-CP, and the redirection of the final call to "close" the SCC is
done in a different iteration than the first cloning.  This
unfortunately means that when function decide_about_value decides
whether to clone, it does not know which recursive calls are going to
be redirected and which are not.  Making it aware of this would
require a hack in the cgraph_edge_brings_value_p functions.  I may try
writing it, but I wonder whether it is really easier than undoing all
cloning in an SCC, which is the right way to implement this as it
would also work for recursion involving two or more functions.
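
To illustrate what "SCC in the call graph" means here, below is a
small sketch using Tarjan's algorithm on a toy graph.  This is not
IPA-CP code: the graph is invented, the edge from Search to NextMove
is an assumption, and EvalA/EvalB are hypothetical functions added
only to show a recursion spanning two functions.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm.  Returns SCCs with callees before callers."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in graph.get(node, ()):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            sccs.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return sccs


def is_recursive(component, graph):
    """True for multi-function cycles and for self-recursive functions."""
    first = component[0]
    return len(component) > 1 or first in graph.get(first, ())


# Toy call graph (caller -> callees); entirely illustrative.
CALL_GRAPH = {
    "main": ["Search", "EvalA"],
    "Search": ["Search", "NextMove"],  # self-recursive
    "NextMove": [],
    "EvalA": ["EvalB"],                # hypothetical mutual recursion
    "EvalB": ["EvalA"],
}

sccs = strongly_connected_components(CALL_GRAPH)
recursive_sccs = [sorted(c) for c in sccs if is_recursive(c, CALL_GRAPH)]
print(recursive_sccs)
```

A per-SCC decision (such as undoing all cloning in an SCC) treats the
self-recursive Search and the two-function cycle uniformly, whereas a
check on individual recursive edges would need special handling for
each shape.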

> We also may consider adding bit of negative hints for cases where
> cloning would turn function called once (by noncold edge) to a
> function called twice.

This would be much easier, although the penalty would have to be quite
big because the goodness number calculated by
good_cloning_opportunity_p is 830 and the threshold is 500.

But given the above, perhaps, for gcc 5 at least, we might want to
introduce a 0.7 factor penalty for this and another 0.7 factor penalty
just for being within an SCC?
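
A quick check of those numbers, as a sketch only: it assumes the
penalties are applied multiplicatively to the goodness value and that
a candidate passes when its goodness is at least the threshold
(neither detail is confirmed above; the function names are made up).

```python
THRESHOLD = 500   # threshold quoted above
GOODNESS = 830    # goodness computed for this cloning candidate
PENALTY = 0.7     # proposed per-condition factor


def apply_penalties(goodness, count, factor=PENALTY):
    """Scale goodness by the penalty factor once per matching condition."""
    return goodness * factor ** count


def passes(goodness, threshold=THRESHOLD):
    # Assumption: ">=" is the comparison used; the comment does not say.
    return goodness >= threshold


for count in (0, 1, 2):
    value = apply_penalties(GOODNESS, count)
    print(f"{count} penalties: goodness {value:.1f}, clone: {passes(value)}")

# Largest single factor that would by itself reject this candidate.
break_even = THRESHOLD / GOODNESS
print(f"a single penalty must be below {break_even:.3f}")
```

Under these assumptions one 0.7 penalty leaves the candidate above
the threshold (about 581), while both together bring it down to about
407, so only the combination prevents the clone.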

References:
- [Bug tree-optimization/65478] New: crafty performance regression
  - From: hubicka at gcc dot gnu.org
