This is the mail archive of the
mailing list for the GCC project.
Re: Speedup CSE by 5%
On Mon, 17 Jan 2005, Jeffrey A Law wrote:
> > > > This patch integrates approx_reg_cost() and approx_reg_cost_1() into one
> > > > function by not using for_each_rtx(): The overhead of the additional
> > > > function calls and some additional branches of the for_each_rtx()
> > > > construction turn out to be significant performance-wise. I don't think
> > > > the resulting code is less clear.
> > >
> > > Why is this not optimized by gcc itself? Does marking approx_reg_cost_1
> > > inline help?
> > Apart from the fact that this would need intermodule optimization, the
> > problem is:
> > GCC would first need to inline for_each_rtx, a recursive function, into
> > approx_reg_cost, and change the recursive calls to for_each_rtx into
> > recursive calls to approx_reg_cost. I would be highly surprised if you
> > told me that GCC is able to do that.
> Presumably the real gain here is the inlining of for_each_rtx, not
> the inlining of approx_reg_cost_1 into approx_reg_cost. Right?
If by "inlining for_each_rtx" you include the constant propagation that avoids
the indirect function call to approx_reg_cost_1, then probably yes.
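
To make the distinction concrete, here is a minimal sketch of the two shapes being discussed. It is not the code from the patch: struct node, walk_each(), count_reg_cb(), count_regs_callback() and count_regs_direct() are hypothetical stand-ins for rtx, for_each_rtx(), approx_reg_cost_1() and approx_reg_cost(). The cost computation is simplified to counting register references, which is an assumption made only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for an RTL expression; not GCC's real rtx. */
struct node {
    int is_reg;            /* nonzero if this node is a register reference */
    size_t n_ops;          /* number of operands actually used in ops[] */
    struct node *ops[2];   /* operand subexpressions */
};

typedef int (*walk_fn)(struct node *, void *);

/* Generic walker in the style of for_each_rtx(): it recurses into itself
   and makes one indirect call through f for every node it visits. */
static int walk_each(struct node *x, walk_fn f, void *data)
{
    if (x == NULL)
        return 0;
    int r = f(x, data);
    if (r != 0)
        return r;
    for (size_t i = 0; i < x->n_ops; i++) {
        r = walk_each(x->ops[i], f, data);
        if (r != 0)
            return r;
    }
    return 0;
}

/* Callback in the role of approx_reg_cost_1(). */
static int count_reg_cb(struct node *x, void *data)
{
    if (x->is_reg)
        ++*(int *)data;
    return 0;
}

/* Callback-based version: the shape of the code before the change. */
int count_regs_callback(struct node *x)
{
    int n = 0;
    walk_each(x, count_reg_cb, &n);
    return n;
}

/* Fused version: the walker and the callback are merged by hand, so the
   recursion targets this function directly and no indirect call remains. */
int count_regs_direct(struct node *x)
{
    if (x == NULL)
        return 0;
    int n = x->is_reg ? 1 : 0;
    for (size_t i = 0; i < x->n_ops; i++)
        n += count_regs_direct(x->ops[i]);
    return n;
}
```

For a compiler to get from the first version to the second on its own, it would have to specialize walk_each() for the constant argument count_reg_cb, redirect the recursive calls to that specialized copy, and only then inline the callback. That is the combination of inlining and constant propagation referred to above.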