I believe tests of divisibility by a constant (ie. n % d == 0, where d
is a compile-time constant) can be improved by the technique described
in section 9 of "Division by Invariant Integers using Multiplication"
by Granlund and Montgomery.
(This section of this paper is what's currently used in gcc for exact
division by a constant arising from pointer subtraction, but the
divisibility tests managed to escape implementation. :-)
System: Linux blah 2.2.15 #1 Tue Apr 25 17:13:48 EST 2000 i586 unknown unknown GNU/Linux
<machine, os, target, libraries (multiple lines)>
configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
A little function like,
bar (unsigned n)
if (n % 3 == 0)
gcc -march=athlon -mcpu=athlon -O3 -fomit-frame-pointer -S dive.c
currently comes out with
movl 4(%esp), %ecx
movl $1639179445, %edx
movl %ecx, %eax
shrl $16, %edx
imull $171717, %edx, %edx
cmpl %edx, %ecx
I believe instead this could be something like
movl 4(%esp), %ecx
imull $0x4B45300D, %ecx
cmpl $0x61B3, %ecx
Notice there's only one low-half multiply, as opposed to a high-half
plus a low-half above.
An even divisor will require an extra rotl (or a separate test of some
low bits), but still ought to be smaller and faster in all cases.
If the actual value of the remainder is wanted then this technique
wouldn't help, it's only for a test for == 0 (or particular values).
When the input does prove to be divisible, then the mul shown gives
the quotient. Maybe that could be noted on the corresponding side of
the branch, in case it allowed for some CSE.
If anybody wants to look into this, the material at http://www.hackersdelight.org/
might be useful.
Subject: Re: testing divisibility by constant
> if (n % 3 == 0)
Actually, the asm code I included was for a divisor of 171717. Small
values get strength reduction on the multiplies, obscuring what gets
A simple mathematical proof that the algorithm works is found here:
See also https://stackoverflow.com/a/49264279/1137388.