[Bug tree-optimization/80155] [7/8/9 regression] Performance regression with code hoisting enabled

amker at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue May 22 10:14:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #37 from bin cheng <amker at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #36)
> On Tue, 22 May 2018, amker at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155
> > 
> > bin cheng <amker at gcc dot gnu.org> changed:
> > 
> >            What    |Removed                     |Added
> > ----------------------------------------------------------------------------
> >                  CC|                            |amker at gcc dot gnu.org
> > 
> > --- Comment #35 from bin cheng <amker at gcc dot gnu.org> ---
> > (In reply to prathamesh3492 from comment #33)
> > > Created attachment 42341 [details]
> > > Test-case to reproduce regression with cortex-m7
> > > 
> > > I have attached an artificial test-case that is fairly representative of the
> > > regression we are seeing in a benchmark. The test-case mimics a
> > > deterministic finite automaton. With code-hoisting there's an additional
> > > spill of r5 near beginning of the function.
> > > 
> > ...
> > > 
> > > Without code-hoisting it is reusing r3 to store a + 1, while due to code
> > > hoisting it uses the extra register 'r2' to store the value of hoisted
> > > expression a + 1.
> > > 
> > > Would it be a good idea to somehow "limit" the distance (in terms of number
> > > of basic blocks maybe?) between the definition of hoisted variable and its
> > > furthest use during PRE ? If that exceeds a certain threshold then PRE
> > > should choose not to hoist that expression. The threshold could be a param
> > > that can be set by backends.
> > > Does this analysis look reasonable ?
> > 
> > It might be more accurate to calculate register pressure and use that to guide
> > code hoisting.  I introduced register-pressure-guided hoisting for RTL under
> > the option -fira-hoist-pressure; basically a similar thing needs to be done here.
> > 
> > The proposed Tree-SSA register pressure patch set is still under review, but
> > please note that for now it does only the minimum: it computes register
> > pressure.  To make it useful in this case, it may need to be improved by
> > calculating/recording live ranges for statements (I did that in a previous
> > version of the patch).  We would also need interfaces for updating live range
> > information in line with code motion.
> 
> One important thing on GIMPLE is that stmt order (inside a BB at least)
> is quite arbitrary and thus LIVE should consider the stmts ordered
> in a LIVE-optimal way so as not to introduce too much noise.  It might be that
> we should only consider live-through [loops] for heuristics in some 
> places plus the obvious changes in liveness that transforms induce.

Yes, I actually experimented with counting only the live ranges at bb in/out
when computing max pressure in the current implementation.  It doesn't make
much difference for the only current user, predcom, but it could be important
for cases like this one.
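
To illustrate what "counting live ranges at bb in/out" could mean, here is a
rough sketch in Python.  It is not the code from the patch set; the function
name, the CFG encoding, and the two toy CFGs are all made up for illustration.
It runs a standard backward liveness dataflow and takes the max pressure as
the largest live-in or live-out set over all blocks, ignoring statement order
inside a block apart from deciding which uses are upward-exposed.

```python
def block_boundary_pressure(blocks, succs):
    """Hypothetical helper: max number of names live at any bb entry/exit.

    blocks: dict mapping block name -> ordered list of (defs, uses) statements
    succs:  dict mapping block name -> list of successor block names
    """
    # Summarize each block: upward-exposed uses and the set of defined names.
    use, defs = {}, {}
    for b, stmts in blocks.items():
        u, d = set(), set()
        for stmt_defs, stmt_uses in stmts:
            u |= set(stmt_uses) - d   # used before any definition in this bb
            d |= set(stmt_defs)
        use[b], defs[b] = u, d

    # Iterate the backward liveness equations to a fixed point.
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            out = set()
            for s in succs.get(b, []):
                out |= live_in[s]
            inn = use[b] | (out - defs[b])
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True

    # Pressure is sampled only at block boundaries.
    return max(max(len(live_in[b]), len(live_out[b])) for b in blocks)


# Toy diamond CFG; "a + 1" is computed in both arms and "a" is used again.
SUCCS = {"entry": ["left", "right"], "left": ["exit"], "right": ["exit"]}

UNHOISTED = {
    "entry": [(["a", "b"], [])],
    "left":  [(["t1"], ["a"]), (["r"], ["t1", "a"])],
    "right": [(["t2"], ["a"]), (["r"], ["t2", "b"])],
    "exit":  [([], ["r"])],
}

# Same CFG with "t = a + 1" hoisted into the entry block.
HOISTED = {
    "entry": [(["a", "b"], []), (["t"], ["a"])],
    "left":  [(["r"], ["t", "a"])],
    "right": [(["r"], ["t", "b"])],
    "exit":  [([], ["r"])],
}

print(block_boundary_pressure(UNHOISTED, SUCCS))
print(block_boundary_pressure(HOISTED, SUCCS))
```

In this toy case the hoisted temporary is live across the branch together with
a and b, so the boundary pressure is higher for the hoisted variant than for
the unhoisted one.  That is the kind of change a pressure-guided hoisting
heuristic would need to see.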

