This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: Another look at the ARM division routine

From: Richard Earnshaw <rearnsha at arm dot com>
To: mark at codesourcery dot com
Cc: Nicolas Pitre <nico at cam dot org>, Ian Lance Taylor <ian at wasabisystems dot com>, gcc-patches at gcc dot gnu dot org, Richard dot Earnshaw at arm dot com
Date: Wed, 12 Nov 2003 19:55:32 +0000
Subject: Re: Another look at the ARM division routine
Organization: ARM Ltd.
Reply-to: Richard dot Earnshaw at arm dot com

> On Tue, 2003-11-11 at 13:09, Nicolas Pitre wrote:
> > On 11 Nov 2003, Ian Lance Taylor wrote:
> > 
> > > Nicolas's code tests every four bits for a zero dividend, and then
> > > loops.  The test adds one instruction, and the loop adds three
> > > instructions.  Is it better to add four instructions for each four
> > > bits, with the chance of leaving the loop, or is it better to simply
> > > unroll the loop completely as Steve's code does?
> > 
> > Actually I just reused the same loop that was there before.  I mainly
> > optimized the code surrounding that loop, which is now pretty optimal, but the 
> > loop itself isn't that impressive.
> > 
> > > Another way to ask
> > > the question is: how frequently does the divisor end with four or more
> > > zero bits?
> > 
> > Right.  And that might not be as frequent as I thought.
> 
> I suspect that the cases where the divisor ends with four zero bits are
> largely constant power-of-two cases, which should be implemented as
> shifts anyhow.
> 
> Given Ian's measurements, I'd say we should go with Ian's patch, and you
> seem to concur.
> 
> Ian, this patch is not appropriate for stage 3, but would you please
> apply it to the csl-arm-branch?  (CodeSourcery will merge that branch
> into GCC 3.5.)
> 

My only concern with this patch is that it is substantially larger than 
what we had before.   That's not too bad if you are calling the function 
often, but can make it slower if you are only doing the occasional 
division, since there's more code to pull into the cache.  It makes me 
wonder whether we should have a size-based version as well...

R.
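
For readers following the thread, the trade-off under discussion can be sketched in C. This is only an illustration of a shift-and-subtract divide that handles four quotient bits per iteration and leaves the loop early once the running dividend reaches zero. It is not the code from either patch, and the name udiv4_sketch and the choice to return 0 for a zero divisor are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical sketch: unsigned 32-bit division, quotient only.
   Returns 0 when den == 0 (an assumption; a real routine would
   call a divide-by-zero handler instead). */
uint32_t udiv4_sketch(uint32_t num, uint32_t den)
{
    uint32_t quot = 0;
    uint32_t bit = 1;

    if (den == 0)
        return 0;

    /* Align the divisor with the dividend: shift left until it is
       at least as large as num or its top bit is set. */
    while (den < num && !(den & 0x80000000u)) {
        den <<= 1;
        bit <<= 1;
    }

    for (;;) {
        /* Four compare/subtract steps, one per quotient bit. */
        if (num >= den)        { num -= den;      quot |= bit; }
        if (num >= (den >> 1)) { num -= den >> 1; quot |= bit >> 1; }
        if (num >= (den >> 2)) { num -= den >> 2; quot |= bit >> 2; }
        if (num >= (den >> 3)) { num -= den >> 3; quot |= bit >> 3; }

        /* The extra test: nothing left to divide, so stop early. */
        if (num == 0)
            break;

        /* Loop overhead: advance four bit positions. */
        bit >>= 4;
        if (bit == 0)
            break;
        den >>= 4;
    }

    /* num is not a valid remainder here: in the last group, steps
       whose quotient bit has shifted out may still subtract. */
    return quot;
}
```

A fully unrolled version would drop the zero test and the loop overhead at the cost of repeating the compare/subtract step for every bit position, which is where the code-size concern above comes from.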

Follow-Ups:
- Re: Another look at the ARM division routine
  - From: Ian Lance Taylor
- Re: Another look at the ARM division routine
  - From: Mark Mitchell

References:
- Re: Another look at the ARM division routine
  - From: Mark Mitchell