[PATCH] expansion: Improve double-word modulo by certain constant divisors [PR97459]

Richard Biener rguenther@suse.de
Mon Nov 30 09:21:46 GMT 2020


On Sat, 28 Nov 2020, Thomas Koenig wrote:

> Hello Jakub,
> 
> thanks a lot for taking this on!
> 
> As far as I can tell, the patch works very well, and really speeds up
> things.
> 
> As (somewhat confusingly) discussed in the PR, there are a couple of
> things that could still be done incrementally with this method.
> 
> Fist, it can be combined with, or even used for, the calulation
> of the quotient.  Let a be a number for which your patch works,
> for example 5.
> 
> If you want to calculate n / 5, you can do
> 
>   rem = n % 5;  /* Fast, with your patch.  */
>   quot = (n - rem) * magic;
> 
> in a fast way, where magic is the multiplicative inverse of 5
> modulo 2^128 (for a 128-bit number) or 2^64 (for a 64-bit number),
> which can be calculated using gmp_invert. The multiplicative inverse
> for division works because n - rem is divisible by 5.
> 
> Second, you can also use this for the quotient and/or remainder
> by 2*a (for example 10), with a slight adjustment:
> 
>   rem5 = n % 5;
>   quot5 = (n - rem5) * magic;
>   rem10 = (quot5 % 2) * 5 + rem5;
>   quot10 = quot5 / 2;
> 
> This would cover the important use case of getting the quotient and
> remainder for division by 10.
> 
> However, a benchmark (source attached) indicates that this method
> is much faster even when only one of quotient and remainder
> of division by 10 is needed.  Numbers I got indicate that this
> method is faster by about a factor of five than calling the
> library version.

Hmm, the benchmark measures throughput of integer division/modulo
which is _much_ worse than just latency since division/modulo is
usually not pipelined so there can be only one division/modulo op
in-flight.

So not sure how relevant the benchmark is - a benchmark measuring
only the latency difference would be more useful, but that's of
course harder to get at.  Maybe add a data dependence on each of
the loop iteration computations.

Richard.

> I hope this clears up the somewhat confusing string of comments
> in the PR.
> 
> Best regards
> 
> 	Thomas
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


More information about the Gcc-patches mailing list