This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH] Alternate expansion of signed remainder by power of two


On Tue, 29 Jun 2004, Falk Hueffner wrote:
> Interestingly, it makes x % 16 be one cycle longer and slower on Alpha:

The following patch should address the performance regression observed
by Falk on the alpha.  I've enhanced the new expand_smod_pow2 function
to use the target's rtx_cost function to determine which of two
implementations to use.

On machines where shifts are expensive, we implement "x % 16" as before:

int mod16(int x)
{
  int mask = x >> 31;  // mask = (x < 0) ? -1 : 0;

  x = (x^mask) - mask;  // x = abs(x)
  x &= 15;
  x = (x^mask) - mask;  // restore the sign of the original x
  return x;
}

but on machines where shifts are cheap, we now use:

int mod16(int x)
{
  int mask = x >> 31; // mask = (x < 0) ? -1 : 0;

  mask = (unsigned) mask >> 28;  // mask = (x < 0) ? 15 : 0;
  x += mask;
  x &= 15;
  x -= mask;
  return x;
}
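
As a rough sanity check, the two sequences can be compared against C's
own % operator.  The sketch below is not part of the patch: the names
mod16_xor, mod16_shift and count_mismatches are made up for this
illustration, a 32-bit int is assumed (as in the examples above), and
the arithmetic is done in unsigned so that values near INT_MIN and
INT_MAX wrap instead of overflowing.

```c
#include <limits.h>

/* Assumption: 32-bit int, matching the ">> 31" and ">> 28" above.  */
_Static_assert (sizeof (int) * CHAR_BIT == 32, "sketch assumes 32-bit int");

/* Sequence for targets with expensive shifts: negate, mask, negate back.  */
static int mod16_xor (int x)
{
  unsigned m = x < 0 ? ~0u : 0u;   /* all ones if x < 0, else 0 */
  unsigned u = (unsigned) x;

  u = (u ^ m) - m;                 /* u = |x| (wraps for INT_MIN) */
  u &= 15u;
  u = (u ^ m) - m;                 /* restore the sign */
  return (int) u;                  /* assumes modular conversion, as GCC does */
}

/* Sequence for targets with cheap shifts: bias, mask, remove bias.  */
static int mod16_shift (int x)
{
  unsigned m = x < 0 ? ~0u : 0u;
  unsigned u = (unsigned) x;

  m >>= 28;                        /* 15 if x < 0, else 0 */
  u += m;
  u &= 15u;
  u -= m;
  return (int) u;                  /* assumes modular conversion, as GCC does */
}

/* Count inputs in [lo, hi] where either sequence disagrees with x % 16.  */
static long count_mismatches (int lo, int hi)
{
  long bad = 0;
  long long i;

  for (i = lo; i <= hi; i++)
    {
      int x = (int) i;
      if (mod16_xor (x) != x % 16 || mod16_shift (x) != x % 16)
        bad++;
    }
  return bad;
}
```

Calling count_mismatches over any range of interest gives the number
of disagreements; both sequences should truncate toward zero exactly
as C99's % does.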


After obtaining the signmask, the first sequence uses four cheap
instructions around the AND, whilst the second uses two cheap
instructions and a shift around the AND, so the second is
preferred when a shift costs less than two additions.
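
The selection rule can be sketched as a small predicate.  This is an
illustration only: choose_smod_seq, its parameters and the enum are
hypothetical names, and the costs are plain integers rather than
values returned by GCC's rtx_cost.  The threshold mirrors the test in
the patch, which keeps the XOR/SUB sequence when the target has no
logical right shift pattern or when the shift costs more than two
instructions.

```c
#include <stdbool.h>

/* Hypothetical labels for the two instruction sequences.  */
enum smod_seq
{
  SMOD_XOR_SUB,    /* 2 XORs, 2 SUBs and an AND */
  SMOD_SHIFT_ADD   /* a logical right shift, 1 ADD, 1 SUB and an AND */
};

/* Pick a sequence.  HAVE_LSHR says whether the target has a logical
   right shift; SHIFT_COST and INSN_COST are in the same arbitrary
   units, with INSN_COST being the cost of one cheap instruction.  */
static enum smod_seq
choose_smod_seq (bool have_lshr, int shift_cost, int insn_cost)
{
  if (!have_lshr || shift_cost > 2 * insn_cost)
    return SMOD_XOR_SUB;
  return SMOD_SHIFT_ADD;
}
```

Note that with this comparison a shift costing exactly two cheap
instructions still selects the shift-based sequence.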

Falk, could you test this patch on your alpha?  alphaev67-dec-osf5.1
isn't bootstrapping for me at the moment.


The following patch has been tested on i686-pc-linux-gnu, with a full
"make bootstrap", all default languages, and regression tested with a
top-level "make -k check" with no new failures.  I also tested a simple
modification of this patch that forced GCC to always use this new code
sequence on x86, also bootstrapped/reg-tested without any problems.

Ok for mainline?


2004-06-29  Roger Sayle  <roger@eyesopen.com>

	* expmed.c (expand_smod_pow2): Provide alternate implementations
	that avoid conditional jumps, and choose between them based upon
	the target's rtx_costs.


Index: expmed.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expmed.c,v
retrieving revision 1.170
diff -c -3 -p -r1.170 expmed.c
*** expmed.c	28 Jun 2004 20:49:37 -0000	1.170
--- expmed.c	29 Jun 2004 02:38:00 -0000
*************** static rtx
*** 3064,3070 ****
  expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
  {
    unsigned HOST_WIDE_INT mask;
!   rtx result, temp, label;
    int logd;

    logd = floor_log2 (d);
--- 3064,3070 ----
  expand_smod_pow2 (enum machine_mode mode, rtx op0, HOST_WIDE_INT d)
  {
    unsigned HOST_WIDE_INT mask;
!   rtx result, temp, shift, label;
    int logd;

    logd = floor_log2 (d);
*************** expand_smod_pow2 (enum machine_mode mode
*** 3079,3095 ****
        if (signmask)
  	{
  	  signmask = force_reg (mode, signmask);
- 	  temp = expand_binop (mode, xor_optab, op0, signmask,
- 			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
- 	  temp = expand_binop (mode, sub_optab, temp, signmask,
- 			       NULL_RTX, 0, OPTAB_LIB_WIDEN);
  	  mask = ((HOST_WIDE_INT) 1 << logd) - 1;
! 	  temp = expand_binop (mode, and_optab, temp, GEN_INT (mask),
! 			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	  temp = expand_binop (mode, xor_optab, temp, signmask,
! 			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	  temp = expand_binop (mode, sub_optab, temp, signmask,
! 			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
  	  return temp;
  	}
      }
--- 3079,3120 ----
        if (signmask)
  	{
  	  signmask = force_reg (mode, signmask);
  	  mask = ((HOST_WIDE_INT) 1 << logd) - 1;
! 	  shift = GEN_INT (GET_MODE_BITSIZE (mode) - logd);
!
! 	  /* Use the rtx_cost of a LSHIFTRT instruction to determine
! 	     which instruction sequence to use.  If logical right shifts
! 	     are expensive the use 2 XORs, 2 SUBs and an AND, otherwise
! 	     use a LSHIFTRT, 1 ADD, 1 SUB and an AND.  */
!
! 	  temp = gen_rtx_LSHIFTRT (mode, result, shift);
! 	  if (lshr_optab->handlers[mode].insn_code == CODE_FOR_nothing
! 	      || rtx_cost (temp, SET) > COSTS_N_INSNS (2))
! 	    {
! 	      temp = expand_binop (mode, xor_optab, op0, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, sub_optab, temp, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, and_optab, temp, GEN_INT (mask),
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, xor_optab, temp, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, sub_optab, temp, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	    }
! 	  else
! 	    {
! 	      signmask = expand_binop (mode, lshr_optab, signmask, shift,
! 				       NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      signmask = force_reg (mode, signmask);
!
! 	      temp = expand_binop (mode, add_optab, op0, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, and_optab, temp, GEN_INT (mask),
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	      temp = expand_binop (mode, sub_optab, temp, signmask,
! 				   NULL_RTX, 1, OPTAB_LIB_WIDEN);
! 	    }
  	  return temp;
  	}
      }
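
For readers who prefer plain C to RTL, the shift-based branch of the
patch can be approximated for an arbitrary power of two.  This is a
sketch under assumptions, not the expander itself: smod_pow2 is a
made-up name, logd is assumed to lie between 1 and 30 on a 32-bit
int, and unsigned arithmetic stands in for the wrapping behaviour of
the RTL operations.  As in the patch, the mask is (1 << logd) - 1 and
the shift count is the mode width minus logd.

```c
#include <limits.h>

/* Signed remainder of X by 2**LOGD using the shift/add/and/sub sequence.
   Assumption: 1 <= LOGD <= (width of int) - 2.  */
static int smod_pow2 (int x, int logd)
{
  const int bits = (int) (sizeof (unsigned) * CHAR_BIT);
  unsigned mask = (1u << logd) - 1u;
  unsigned signmask = x < 0 ? ~0u : 0u;  /* all ones if x < 0, else 0 */
  unsigned u = (unsigned) x;

  signmask >>= bits - logd;              /* mask if x < 0, else 0 */
  u += signmask;
  u &= mask;
  u -= signmask;
  return (int) u;                        /* assumes modular conversion, as GCC does */
}
```

For example, smod_pow2 (x, 4) corresponds to the "x % 16" case
discussed above.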

Roger
--


