Another look at the ARM division routine
Nicolas Pitre
nico@cam.org
Wed Nov 12 19:29:00 GMT 2003
On Wed, 12 Nov 2003, Mark Mitchell wrote:
> On Tue, 2003-11-11 at 13:09, Nicolas Pitre wrote:
> > On 11 Nov 2003, Ian Lance Taylor wrote:
> >
> > > Nicolas's code tests every four bits for a zero dividend, and then
> > > loops. The test adds one instruction, and the loop adds three
> > > instructions. Is it better to add four instructions for each four
> > > bits, with the chance of leaving the loop, or is it better to simply
> > > unroll the loop completely as Steve's code does?
> >
> > Actually I just reused the same loop that was there before. I mainly
> > optimized the code surrounding that loop which is now pretty optimal, but the
> > loop itself isn't that impressive.
> >
> > > Another way to ask
> > > the question is: how frequently does the divisor end with four or more
> > > zero bits?
> >
> > Right. And that might not be as frequent as I thought.
>
> I suspect that the cases where the divisor ends with four zero bits are
> largely constant power-of-two cases,
Not necessarily. Consider 900000 / 3 for example.
> which should be implemented as
> shifts anyhow.
... which my original patch does already for power of 2 divisors.
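To make the trade-off above concrete, here is a minimal C sketch of a shift-and-subtract division that handles four quotient bits per loop iteration and leaves the loop as soon as the remaining dividend reaches zero. It is only an illustration under stated assumptions, not the ARM assembly from either patch: the function name udiv_sketch, the groups counter and the 64-bit temporaries are all hypothetical and exist only to keep the example short and portable.

```c
#include <stdint.h>

/*
 * Hypothetical illustration only: unsigned division, four quotient bits
 * per loop iteration, with an early exit when the dividend hits zero.
 * *groups receives the number of four-bit groups actually processed.
 * Division by zero is not handled meaningfully; it simply returns 0.
 */
static uint32_t udiv_sketch(uint32_t n, uint32_t d, unsigned *groups)
{
    uint32_t q = 0;
    uint64_t dd = d;    /* shifted divisor (64-bit so it cannot overflow) */
    uint64_t bit = 1;   /* quotient bit that corresponds to dd */

    *groups = 0;
    if (d == 0)
        return 0;

    /* Align the divisor with the dividend, four bits at a time. */
    while (dd < n) {
        dd <<= 4;
        bit <<= 4;
    }

    while (bit != 0) {
        ++*groups;

        /* Four compare-and-subtract steps (unrolled in real assembly). */
        for (int k = 0; k < 4 && (bit >> k) != 0; k++) {
            if (n >= (dd >> k)) {
                n -= (uint32_t)(dd >> k);
                q |= (uint32_t)(bit >> k);
            }
        }

        /* The extra test under discussion: stop once nothing is left. */
        if (n == 0)
            break;

        dd >>= 4;
        bit >>= 4;
    }
    return q;
}
```

With this sketch, udiv_sketch(900000, 3, &g) returns 300000 and leaves the loop after four of the six groups it would otherwise run, because the quotient 0x493E0 ends in five zero bits. A fully unrolled version would drop both the zero test and the loop overhead, at the cost of always executing every step.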
> Given Ian's measurements, I'd say we should go with Ian's patch, and you
> seem to concur.
>
> Ian, this patch is not appropriate for stage 3, but would you please
> apply it to the csl-arm-branch? (CodeSourcery will merge that branch
> into GCC 3.5.)
BTW: Who has approval authority for that branch? I have some patches I'd
like to have merged at some point.
Nicolas