This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [csl-asm?] PR middle-end/11821: Tweak arm_rtx_costs_1


Roger,

> > But before you check it in, why did you use a cost of two insns?  A div
> > call is typically going to be about four insns long (two to set up the
> > arguments, one for the call and one to move the result somewhere useful).
> > In addition to this, we have to consider the fact that making a call will
> > clobber the six call-clobbered registers and potentially turn a leaf
> > function into a non-leaf one.  I would expect (though it would need
> > measuring on real code) that any division by a power of two, whether
> > signed or unsigned, will continue to be best done inline.
> 
> Initially, I had no way of determining the code size of a function except
> to assume that each line in the assembly output was the same size.  The
> PR stated that "a % 5" was an example that shouldn't be inlined.  I first
> tried a value of COSTS_N_INSNS(3), but GCC still inlined the modulus.
> I then tried COSTS_N_INSNS(2), and we inlined "a % 4" but not "a % 5"
> which seemed reasonable.  The evaluation against CSiBE (once I'd figured
> out how to run it with an uninstalled cross tool chain) confirmed that
> this was an overall improvement: there were a few tests that did increase
> in size by four bytes, but there were far more tests that shrank, and
> these improvements were also more significant.
> 

If you've examined the code to check that what is happening is reasonable, 
and have benchmarks that show that on average a cost of 2 insns is better 
than a cost of 3, I'm completely happy for your patch to go in on the 
trunk.
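
For illustration only, here is a rough C sketch of why a signed remainder by
a power of two is cheap to do inline, whereas "a % 5" has no comparable
mask-based form. The helper names mod4_inline and mod4_selfcheck are
hypothetical, and this is not claimed to be the sequence GCC emits for ARM;
it only assumes 32-bit two's-complement integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes a % 4 with C's truncate-toward-zero
   semantics using only a shift, a mask, an add and a subtract.
   No division instruction or library call is needed. */
static int32_t mod4_inline(int32_t a)
{
    uint32_t ua   = (uint32_t)a;        /* two's-complement bit pattern */
    uint32_t sign = ua >> 31;           /* 1 if a is negative, else 0 */
    uint32_t bias = (0u - sign) & 3u;   /* 3 for negative a, 0 otherwise */

    /* Bias negative values before masking, then remove the bias so the
       result carries the sign of the dividend, as a % 4 does. */
    return (int32_t)((ua + bias) & 3u) - (int32_t)bias;
}

/* Compares the inline form against the built-in operator on a few
   positive, negative and boundary values. */
static void mod4_selfcheck(void)
{
    static const int32_t samples[] = { 0, 1, 3, 4, 5, -1, -4, -5, -7,
                                       INT32_MAX, INT32_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(mod4_inline(samples[i]) == samples[i] % 4);
}
```

A handful of straight-line operations like these is comparable in size to
setting up and making a division call, which is in line with the observation
above that "a % 4" is worth inlining while "a % 5" is not.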

Thanks,

R.



