EON regression due to pass ordering problem (PR tree-optimize/24653)

Jan Hubicka jh@suse.cz
Thu Nov 3 13:31:00 GMT 2005


Hi,
we have an over 10% regression in Eon that I think is typical enough for
C++ that we should solve it for 4.1.  The problem is caused by a missed SRA
opportunity on the startPoint variable appearing in many of its internal
functions.  The missed SRA (we did SRA it in 4.0) is caused by the fact that
the code originally looks like this after inlining the functions that
manipulate it:

this = &startPoint;
this->e[0] = code;

and so on.  Until the dom1 pass we still maintain the pointer "this", so
startPoint must live in memory.  DOM properly transforms it into:

this = &startPoint;
startPoint.e[0] = code;

and because we never run DCE after the first DOM and before the next SRA,
the dead assignment to "this" survives, so startPoint is still considered
address-taken by the alias analysis pass we run before SRA.
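
For illustration, source of roughly the following shape produces this
pattern once the accessor is inlined.  This is only a sketch: the type Vec3
and the functions set_x, x and make_point are hypothetical stand-ins, not
eon's actual code; only the member array e and the local startPoint mirror
the names in the dump above.

```cpp
// Hypothetical stand-in for a small vector class; not taken from eon.
struct Vec3 {
    double e[3];

    // After inlining into a caller, this body shows up there as
    //   this = &startPoint;
    //   this->e[0] = code;
    void set_x(double code) { e[0] = code; }

    double x() const { return e[0]; }
};

// startPoint never escapes this function, so once the pointer "this" is
// gone, SRA should be able to split it into independent scalars.
double make_point(double code) {
    Vec3 startPoint = {{0.0, 0.0, 0.0}};
    startPoint.set_x(code);
    return startPoint.x() + startPoint.e[1];
}
```

Whether SRA actually fires here depends on the pass order discussed in this
mail; the sketch only shows where the "this = &startPoint" statement comes
from.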

Even with the new aliasing tricks this ends up in dead stores and
intermixed uses, like

e0 = code;
startPoint.e[0] = e0;

and later on e0 and startPoint both being used.

This scheme repeats a couple of hundred times in eon, and I think in just
about every C++ program, so I think we should solve it.
I am testing the attached patch that adds one extra DCE pass just before
the alias analysis.  Of course it would be better to trade one of the
earlier DCEs than to introduce a new pass, so I would like to ask what the
reasons are for DCE being done earlier than previously (i.e. 4.0 does DCE
just before and after the first DOM).  Alternatively, we could teach some
of our earlier passes to do this transformation (I am rather surprised that
it survives up to DOM).
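
To make the ordering argument concrete, here is a toy model of it.  It is
not GCC code: Stmt, address_taken, toy_dce and after_dom are hypothetical
names invented for this sketch, and the "alias analysis" is reduced to
asking whether any remaining statement takes the variable's address.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy IR, purely illustrative: one statement of the form "lhs = rhs".
struct Stmt {
    std::string lhs;                // name being defined
    bool lhs_is_temp;               // true for temporaries such as "this"
    std::string addr_of;            // non-empty if rhs is "&addr_of"
    std::vector<std::string> uses;  // names read by the statement
};

// Simplified alias-analysis view: var must live in memory if any
// remaining statement takes its address.
bool address_taken(const std::vector<Stmt>& body, const std::string& var) {
    return std::any_of(body.begin(), body.end(),
                       [&](const Stmt& s) { return s.addr_of == var; });
}

// Toy DCE: repeatedly drop definitions of temporaries that nobody reads.
std::vector<Stmt> toy_dce(std::vector<Stmt> body) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < body.size(); ++i) {
            if (!body[i].lhs_is_temp) continue;
            bool used = false;
            for (const Stmt& s : body) {
                if (std::find(s.uses.begin(), s.uses.end(), body[i].lhs) !=
                    s.uses.end())
                    used = true;
            }
            if (!used) {
                body.erase(body.begin() + static_cast<std::ptrdiff_t>(i));
                changed = true;
                break;  // restart the scan on the shrunken body
            }
        }
    }
    return body;
}

// The two statements as they look after the first DOM: the store no
// longer goes through "this", but the assignment to "this" is still there.
std::vector<Stmt> after_dom() {
    std::vector<Stmt> body;
    body.push_back(Stmt{"this", true, "startPoint", {}});
    body.push_back(Stmt{"startPoint.e[0]", false, "", {"code"}});
    return body;
}
```

In this model address_taken(after_dom(), "startPoint") is true, while the
same query on toy_dce(after_dom()) is false, which is the effect the extra
DCE pass is meant to have before may_alias runs.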

The patch solves the regression.  I get an eon SPEC score of 2081 from the
3.3 hammer branch, 1993 from 4.0 and 2108 from 4.1 with the patch
attached.  I noticed one extra problem: our inlining decisions changed, so
we no longer inline the constructor of an iterator used in the same
function.  This is because the constructor calls further functions and the
inliner now thinks it is better to inline these (as they are called very
many times from a single place), so the constructor becomes too large for
further inlining.  Adding the always_inline attribute makes us score 2242,
so perhaps I can add some kind of heuristic that treats inlining of
functions returning structures as more important.
I will do full benchmarking tonight.
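
For reference, forcing a constructor inline looks roughly like the sketch
below.  The Iterator class, its start_of helper and count_steps are
hypothetical and not eon's code; the only real piece is GCC's always_inline
function attribute.

```cpp
// Hypothetical iterator; all names here are illustrative only.
struct Iterator {
    int pos;
    int end;

    // Helper called from the constructor, standing in for the "further
    // functions" that make the real constructor grow after inlining.
    static int start_of(int first, int last) {
        return first < last ? first : last;
    }

    // always_inline asks GCC to inline this constructor regardless of
    // the inliner's size limits.
    __attribute__((always_inline)) inline Iterator(int first, int last)
        : pos(start_of(first, last)), end(last) {}
};

// The iterator is constructed and used in the same function, as in the
// case described above.
int count_steps(int first, int last) {
    int n = 0;
    for (Iterator it(first, last); it.pos < it.end; ++it.pos)
        ++n;
    return n;
}
```

This only forces the decision by hand for one function; a heuristic in the
inliner would have to reach the same result without the annotation.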

Does the attached patch look applicable for 4.1, assuming it passes
testing, or shall we try removing some of the earlier DCEs?

2005-11-03  Jan Hubicka  <jh@suse.cz>

	PR tree-optimize/24653
	* passes.c: Schedule pass_dce before may_alias done before SRA.

Index: passes.c
===================================================================
--- passes.c	(revision 106422)
+++ passes.c	(working copy)
@@ -501,6 +501,7 @@
   NEXT_PASS (pass_phi_only_copy_prop);
 
   NEXT_PASS (pass_phiopt);
+  NEXT_PASS (pass_dce);
   NEXT_PASS (pass_may_alias);
   NEXT_PASS (pass_tail_recursion);
   NEXT_PASS (pass_profile);


