[Bug ipa/65076] [5 Regression] 16% tramp3d-v4.cpp compile time regression
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Mar 24 14:56:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #11)
> Sorry, the number of clobbers drops at DSE1, not during ehcleanup2, I just
> messed up my grep.
>
> I tried the following patch:
>
> Index: passes.def
> ===================================================================
> --- passes.def (revision 221541)
> +++ passes.def (working copy)
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.
> NEXT_PASS (pass_build_ealias);
> NEXT_PASS (pass_fre);
> NEXT_PASS (pass_merge_phi);
> + NEXT_PASS (pass_dse);
> NEXT_PASS (pass_cd_dce);
> NEXT_PASS (pass_early_ipa_sra);
> NEXT_PASS (pass_tail_recursion);
>
> This brings the number of CLOBBER statements at release_ssa time down to 7392
> (50% reduction). A nice effect of this patch is that it tends to simplify
> destructors, often to empty ones, which makes them more inlinable:
>
> ObserverEvent::~ObserverEvent() (struct ObserverEvent * const this)
> {
> <bb 2>:
> - this_2(D)->_vptr.ObserverEvent = &MEM[(void *)&_ZTV13ObserverEvent + 16B];
> MEM[(struct &)this_2(D)] ={v} {CLOBBER};
> return;
>
> saves a lot of the clobbers:
> Engine<3, double, ExpressionTag<UnaryNode<FnNorm, BinaryNode<OpSubtract,
> Reference<Field<NoMesh<3>, Vector<3, double, Full>, ViewEngine<3,
> IndexFunction<GenericURM<MeshTraits<3, double, UniformRectilinearTag,
> CartesianTag, 3> >::PositionsFunctor> > > >, Scalar<Vector<3, double, Full>
> > > > > >::~Engine() (struct Engine * const this)
> {
> <bb 2>:
> - MEM[(struct &)this_2(D) + 32] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D) + 32] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D) + 8] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D) + 8] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D) + 8] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D)] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D)] ={v} {CLOBBER};
> - MEM[(struct &)this_2(D)] ={v} {CLOBBER};
> + MEM[(struct &)this_1(D)] ={v} {CLOBBER};
> return;
>
> which is especially nice for LTO streaming.
>
> and saves about 7% of code apparently after inlining:
>
> $ wc -l *copyprop2
> 200189 tramp3d-v4.ii.085t.copyprop2
> $ wc -l ../5/*copyprop2
> 215060 ../5/tramp3d-v4.ii.084t.copyprop2
>
> Even though the inline decisions do not seem to be changed considerably
> (at least on tramp3d).
Yeah, clobbers don't account for anything for size/inline estimates
(well, I hope so!).
And yes, doing DSE early is quite an old idea... we should revisit it
next stage1.
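
To illustrate why running DSE early collapses the destructor dumps quoted above, here is a toy model of dead store elimination on straight-line code. It is only a sketch and not GCC's pass_dse: the names Stmt, Kind and eliminate_dead_stores are made up for this example, memory is modelled as the bytes of a single object, and the byte offsets and sizes a caller passes in are assumptions. A store or clobber is dropped when every byte it writes is overwritten or clobbered later with no load in between.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical statement kinds for the toy model (not GCC's IR).
enum class Kind { Store, Clobber, Load };

struct Stmt {
    Kind kind;
    std::size_t offset;  // byte offset into the object
    std::size_t size;    // number of bytes touched
};

// Walk the statements backwards and drop every store or clobber whose
// bytes are all rewritten later without being read first.
std::vector<Stmt> eliminate_dead_stores(const std::vector<Stmt>& stmts,
                                        std::size_t object_size) {
    // killed_later[i] is true when byte i is overwritten or clobbered by a
    // later statement before anything reads it. At function exit nothing
    // is known to be killed, so the last write to each byte is kept.
    std::vector<bool> killed_later(object_size, false);
    std::vector<Stmt> kept_reversed;

    for (auto it = stmts.rbegin(); it != stmts.rend(); ++it) {
        const std::size_t begin = std::min(it->offset, object_size);
        const std::size_t end = std::min(it->offset + it->size, object_size);

        if (it->kind == Kind::Load) {
            // A load makes the bytes live again for earlier statements.
            for (std::size_t i = begin; i < end; ++i) killed_later[i] = false;
            kept_reversed.push_back(*it);
            continue;
        }

        // Store or clobber: dead if every byte it writes is killed later.
        bool dead = true;
        for (std::size_t i = begin; i < end; ++i) {
            if (!killed_later[i]) { dead = false; break; }
        }
        if (dead) continue;

        for (std::size_t i = begin; i < end; ++i) killed_later[i] = true;
        kept_reversed.push_back(*it);
    }

    // Restore the original statement order.
    return std::vector<Stmt>(kept_reversed.rbegin(), kept_reversed.rend());
}
```

Fed a vptr store followed by a clobber of the same bytes, the model keeps only the clobber, as in the ~ObserverEvent dump. Fed a run of sub-object clobbers followed by one clobber covering the whole object, it keeps only the last one, as in the ~Engine dump.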
> On an unrelated note, I noticed PR65502
>
> Still I guess this does not really explain the origin of regression in
> statement count relative to 4.9...
No idea. I'll have to look myself - the &X + 4 vs. &MEM[&X, 4] is very
recent so it can't be blamed for the regression. But it might be blamed
for the number of stmt differences - but only from the very beginning.
That is, I can't see how the difference shows in .ssa but not in .cfg.