This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Improved signed integer remainder by power of two


On Sun, 2004-06-27 at 19:23, Roger Sayle wrote:
> The following patch improves the code that GCC generates for signed
> integer remainder, modulo a constant which is a power of two.  Many
> thanks for Steven Bosscher for bringing this to my attention, pointing
> out that GCC generates inferior code to other x86 compilers, including
> the IBM compiler for OS/2 and the OpenWatcom compiler.  Thanks also
> to Andi Kleen who apparently pointed out this failing to Steven.
> 
> Consider the simple test program below:
> 
> int smod16(int x)
> {
>   return x % 16;
> }
> 
> On x86, using -O2 -fomit-frame-pointer mainline CVS currently generates
> the rather poor:
> 
> 
> smod16: movl    4(%esp), %eax
>         testl   %eax, %eax
>         movl    %eax, %edx
>         js      .L4
>         andl    $-16, %edx
>         subl    %edx, %eax
>         ret
> .L4:
>         leal    15(%eax), %edx
>         andl    $-16, %edx
>         subl    %edx, %eax
>         ret
> 
> 
> Examining the middle-end code in expmed.c's expand_divmod, I'm actually
> impressed that we do this well, as there's currently no special case code
> for handling this form of TRUNC_MOD_EXPR.  Instead, the initial RTL that
> we generate is calculated as "return x - 16*(x/16);", where the divisions
> and multiplications are handled by arithmetic shifts.
> 
> The patch below introduces a new expand_smod_pow2 function, as a helper
> function to expand_divmod, to generate improved RTL for TRUNC_MOD_EXPR.
> 
> 
> With this patch, we now generate the following code with the above
> options.
> 
> smod16: movl    4(%esp), %eax
>         andl    $-2147483633, %eax
>         js      .L4
>         ret
> .L4:
>         decl    %eax
>         orl     $-16, %eax
>         incl    %eax
>         ret
> 
> The first "andl" is actually 0x8000000f, i.e. the signbit of the mode of
> this operation, and the significant low-order bits of the modulus.  By
> performing this bitwise AND early, many targets (including x86) can avoid
> an explicit compare/test instruction when checking for the signedness of
> the operand.  This happens to be the same idiom as used by Microsoft's
> Visual C/C++ version 6 compilers.
> 
> 
> And for targets with a significant conditional branch penalty, when
> BRANCH_COST >= 2, the patch below can also generate a straight-line
> sequence without conditional jumps.
> 
> For example, using "-O2 -fomit-frame-pointer -mtune=pentium4", we now
> generate the following code:
> 
> smod16: movl    4(%esp), %eax
>         cltd
>         xorl    %edx, %eax
>         subl    %edx, %eax
>         andl    $15, %eax
>         xorl    %edx, %eax
>         subl    %edx, %eax
>         ret
> 
> which as luck would have it is identical to the code generated by the
> Intel version 7 compiler (and the IBM OS/2 and OpenWatcom compilers
> mentioned above).
> 
> 
> For the curious, the above assembly code is approximately equal to the
> following function:
> 
> int smod16a(int x)
> {
>   int signmask = x >> 31;
>   x = (x ^ signmask) - signmask;  // x = fabs(x)
>   x = x & 15;
>   x = (x ^ signmask) - signmask;
>   return x;
> }
> 
> This is relying on the idiom that "(x ^ -1) - -1" is equal to "-x",
> but that "(x ^ 0) - 0" is the identity operation.  So the second
> and fourth lines above are effectively "conditional negation"
> instructions, controlled by the original sign of x.
> 
> Or to make things clearer by removing this conditional execution:
> 
> int smod16b(int x)
> {
>   if (x < 0)
>   {
>     x = -x;
>     x = x & 15;
>     x = -x;
>   }
>   else
>     x = x & 15;
>   return x;
> }
> 
> 
> 
> The following patch has been tested on i686-pc-linux-gnu, with a full
> "make bootstrap", all default languages, and regression tested with a
> top-level "make -k check" with no new failures.
> 
> Ok for mainline?
> 
> 
> 2004-06-27  Roger Sayle  <roger@eyesopen.com>
> 
> 	* expmed.c (expand_smod_pow2): New function to expand signed
> 	remainder by a constant power of 2, such as "x % 16".
> 	(expand_divmod): Call new expand_smod_pow2 when appropriate.
> 	Minor corrections to comments, e.g. splitting long lines.
Looks good.  Please go ahead and install this patch.

jeff



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]