This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFA: Tweak optabs.c's use of constant rtx_costs


Ping^2

Richard Sandiford <richard@codesourcery.com> writes:
> expand_binop has code like the following:
>
>   /* If we are inside an appropriately-short loop and we are optimizing,
>      force expensive constants into a register.  */
>   if (CONSTANT_P (op0) && optimize
>       && rtx_cost (op0, binoptab->code) > COSTS_N_INSNS (1))
>     {
>       if (GET_MODE (op0) != VOIDmode)
> 	op0 = convert_modes (mode, VOIDmode, op0, unsignedp);
>       op0 = force_reg (mode, op0);
>     }
>
> Outdated comment aside, what this code is doing seems perfectly
> reasonable in itself.  However, I think it's in the wrong place.
> The code is one of the first things that expand_binop does,
> so we run it even if we end up splitting the binary operation
> into smaller word-sized pieces or using a sequence of completely
> different optabs altogether.  There are two direct problems with this:
>
>   - The target has no real way of knowing whether a CONST_INT passed
>     to TARGET_RTX_COSTS is for a word or double-word operation.
>
>   - If we force a constant into a register before reducing the operation
>     into smaller pieces (such as word-mode pieces), we will end up using
>     SUBREGs of a multiword REG for those smaller pieces, instead of using
>     individual constants.  That's harder to optimise, and effectively
>     hides the individual constants until after the lower-subreg pass.
>
> This patch therefore moves the force-to-reg checks to two places:
>
>   - expand_binop_directly, before using an optab.
>
>   - The part of expand_binop that widens operands from mode M1 to
>     mode M2, if we don't care about the high bits of the widened
>     operand or result.  In this case we force the M1-mode constant
>     into a register and then generate an M2-mode paradoxical subreg.
>     (This is what we do now too, and is important for targets like ARM,
>     where we can load more HImode constants than we can their
>     sign-extended SImode equivalents.)
>
> It also moves the commutative operand canonicalisation so that
> it continues to happen after forcing constants to registers.
>
> I have some mips_rtx_costs tweaks that make things better with
> this patch but often makes things worse without it.
>
> Bootstrapped & regression-tested on x86_64-linux-gnu.  I ran CSiBE
> for arm-elf, sh-elf and x86_64-linux-gnu, all with -Os.  There were
> no changes for x86, a slight improvement for ARM:
>
> jpeg:jquant1                                      3704     3700 :   99.89%
> teem:src/nrrd/tmfKernel                         202825   202781 :   99.98%
> teem:src/echo/test/trend                         13609    12845 :   94.39%
> ----------------------------------------------------------------
> Total                                          3633455  3632643 :   99.98%
>
> and another slight improvement for SH:
>
> linux:fs/ext3/balloc                              4108     4100 :   99.81%
> linux:arch/testplatform/kernel/signal             4008     3980 :   99.30%
> linux:arch/testplatform/kernel/traps              7988     7928 :   99.25%
> teem:src/nrrd/kernel                             27716    27704 :   99.96%
> teem:src/nrrd/endianNrrd                           688      648 :   94.19%
> teem:src/nrrd/tmfKernel                         199876   199936 :  100.03%
> teem:src/limn/test/tcamanim                       3624     3620 :   99.89%
> teem:src/air/754                                  1964     1960 :   99.80%
> teem:src/echo/test/trend                         10300     8056 :   78.21%
> unrarlib:unrarlib/unrarlib                       12124    12140 :  100.13%
> ----------------------------------------------------------------
> Total                                          3183600  3181276 :   99.93%
>
> OK to install?
>
> Richard
>
>
> gcc/
> 	* optabs.c (shift_optab_p, commutative_optab_p): New functions,
> 	split out from expand_binop.
> 	(avoid_expensive_constant): New function.
> 	(expand_binop_directly): Remove commutative_op argument and
> 	call cummutative_optab_p instead.  Do not change op0 or op1
> 	when swapping xop0 and xop1.  Apply avoid_expensive_constant
> 	to each argument after potential swapping.  Enforce the
> 	canonical order of commutative operands.
> 	(expand_binop): Use shift_optab_p and commutative_optab_p.
> 	Update the calls to expand_binop_directly.  Only force constants
> 	into registers when widening an operation.  Only swap operands
> 	once a direct expansion has been rejected.
> 	(expand_twoval_binop): Only force constants into registers when
> 	using a direct expansion.
>
> Index: gcc/optabs.c
> ===================================================================
> --- gcc/optabs.c	2007-08-17 14:05:14.000000000 +0100
> +++ gcc/optabs.c	2007-08-17 14:24:49.000000000 +0100
> @@ -1246,6 +1246,56 @@ swap_commutative_operands_with_target (r
>      return rtx_equal_p (op1, target);
>  }
>  
> +/* Return true if BINOPTAB implements a shift operation.  */
> +
> +static bool
> +shift_optab_p (optab binoptab)
> +{
> +  switch (binoptab->code)
> +    {
> +    case ASHIFT:
> +    case ASHIFTRT:
> +    case LSHIFTRT:
> +    case ROTATE:
> +    case ROTATERT:
> +      return true;
> +
> +    default:
> +      return false;
> +    }
> +}
> +
> +/* Return true if BINOPTAB implements a commutatative binary operation.  */
> +
> +static bool
> +commutative_optab_p (optab binoptab)
> +{
> +  return (GET_RTX_CLASS (binoptab->code) == RTX_COMM_ARITH
> +	  || binoptab == smul_widen_optab
> +	  || binoptab == umul_widen_optab
> +	  || binoptab == smul_highpart_optab
> +	  || binoptab == umul_highpart_optab);
> +}
> +
> +/* X is to be used in mode MODE as an operand to BINOPTAB.  If we're
> +   optimizing, and if the operand is a constant that costs more than
> +   1 instruction, force the constant into a register and return that
> +   register.  Return X otherwise.  UNSIGNEDP says whether X is unsigned.  */
> +
> +static rtx
> +avoid_expensive_constant (enum machine_mode mode, optab binoptab,
> +			  rtx x, bool unsignedp)
> +{
> +  if (optimize
> +      && CONSTANT_P (x)
> +      && rtx_cost (x, binoptab->code) > COSTS_N_INSNS (1))
> +    {
> +      if (GET_MODE (x) != VOIDmode)
> +	x = convert_modes (mode, VOIDmode, x, unsignedp);
> +      x = force_reg (mode, x);
> +    }
> +  return x;
> +}
>  
>  /* Helper function for expand_binop: handle the case where there
>     is an insn that directly implements the indicated operation.
> @@ -1254,55 +1304,72 @@ swap_commutative_operands_with_target (r
>  expand_binop_directly (enum machine_mode mode, optab binoptab,
>  		       rtx op0, rtx op1,
>  		       rtx target, int unsignedp, enum optab_methods methods,
> -		       int commutative_op, rtx last)
> +		       rtx last)
>  {
>    int icode = (int) optab_handler (binoptab, mode)->insn_code;
>    enum machine_mode mode0 = insn_data[icode].operand[1].mode;
>    enum machine_mode mode1 = insn_data[icode].operand[2].mode;
>    enum machine_mode tmp_mode;
> +  bool commutative_p;
>    rtx pat;
>    rtx xop0 = op0, xop1 = op1;
>    rtx temp;
> +  rtx swap;
>    
>    if (target)
>      temp = target;
>    else
>      temp = gen_reg_rtx (mode);
> -  
> +
>    /* If it is a commutative operator and the modes would match
>       if we would swap the operands, we can save the conversions.  */
> -  if (commutative_op)
> -    {
> -      if (GET_MODE (op0) != mode0 && GET_MODE (op1) != mode1
> -	  && GET_MODE (op0) == mode1 && GET_MODE (op1) == mode0)
> -	{
> -	  rtx tmp;
> -	  
> -	  tmp = op0; op0 = op1; op1 = tmp;
> -	  tmp = xop0; xop0 = xop1; xop1 = tmp;
> -	}
> +  commutative_p = commutative_optab_p (binoptab);
> +  if (commutative_p
> +      && GET_MODE (xop0) != mode0 && GET_MODE (xop1) != mode1
> +      && GET_MODE (xop0) == mode1 && GET_MODE (xop1) == mode1)
> +    {
> +      swap = xop0;
> +      xop0 = xop1;
> +      xop1 = swap;
>      }
>    
> +  /* If we are optimizing, force expensive constants into a register.  */
> +  xop0 = avoid_expensive_constant (mode0, binoptab, xop0, unsignedp);
> +  if (!shift_optab_p (binoptab))
> +    xop1 = avoid_expensive_constant (mode1, binoptab, xop1, unsignedp);
> +
>    /* In case the insn wants input operands in modes different from
>       those of the actual operands, convert the operands.  It would
>       seem that we don't need to convert CONST_INTs, but we do, so
>       that they're properly zero-extended, sign-extended or truncated
>       for their mode.  */
>    
> -  if (GET_MODE (op0) != mode0 && mode0 != VOIDmode)
> +  if (GET_MODE (xop0) != mode0 && mode0 != VOIDmode)
>      xop0 = convert_modes (mode0,
> -			  GET_MODE (op0) != VOIDmode
> -			  ? GET_MODE (op0)
> +			  GET_MODE (xop0) != VOIDmode
> +			  ? GET_MODE (xop0)
>  			  : mode,
>  			  xop0, unsignedp);
>    
> -  if (GET_MODE (op1) != mode1 && mode1 != VOIDmode)
> +  if (GET_MODE (xop1) != mode1 && mode1 != VOIDmode)
>      xop1 = convert_modes (mode1,
> -			  GET_MODE (op1) != VOIDmode
> -			  ? GET_MODE (op1)
> +			  GET_MODE (xop1) != VOIDmode
> +			  ? GET_MODE (xop1)
>  			  : mode,
>  			  xop1, unsignedp);
>    
> +  /* If operation is commutative,
> +     try to make the first operand a register.
> +     Even better, try to make it the same as the target.
> +     Also try to make the last operand a constant.  */
> +  if (commutative_p
> +      && swap_commutative_operands_with_target (target, xop0, xop1))
> +    {
> +      swap = xop1;
> +      xop1 = xop0;
> +      xop0 = swap;
> +    }
> +
>    /* Now, if insn's predicates don't allow our operands, put them into
>       pseudo regs.  */
>    
> @@ -1375,12 +1442,6 @@ expand_binop (enum machine_mode mode, op
>    enum mode_class class;
>    enum machine_mode wider_mode;
>    rtx temp;
> -  int commutative_op = 0;
> -  int shift_op = (binoptab->code == ASHIFT
> -		  || binoptab->code == ASHIFTRT
> -		  || binoptab->code == LSHIFTRT
> -		  || binoptab->code == ROTATE
> -		  || binoptab->code == ROTATERT);
>    rtx entry_last = get_last_insn ();
>    rtx last;
>  
> @@ -1395,54 +1456,16 @@ expand_binop (enum machine_mode mode, op
>        binoptab = add_optab;
>      }
>  
> -  /* If we are inside an appropriately-short loop and we are optimizing,
> -     force expensive constants into a register.  */
> -  if (CONSTANT_P (op0) && optimize
> -      && rtx_cost (op0, binoptab->code) > COSTS_N_INSNS (1))
> -    {
> -      if (GET_MODE (op0) != VOIDmode)
> -	op0 = convert_modes (mode, VOIDmode, op0, unsignedp);
> -      op0 = force_reg (mode, op0);
> -    }
> -
> -  if (CONSTANT_P (op1) && optimize
> -      && ! shift_op && rtx_cost (op1, binoptab->code) > COSTS_N_INSNS (1))
> -    {
> -      if (GET_MODE (op1) != VOIDmode)
> -	op1 = convert_modes (mode, VOIDmode, op1, unsignedp);
> -      op1 = force_reg (mode, op1);
> -    }
> -
>    /* Record where to delete back to if we backtrack.  */
>    last = get_last_insn ();
>  
> -  /* If operation is commutative,
> -     try to make the first operand a register.
> -     Even better, try to make it the same as the target.
> -     Also try to make the last operand a constant.  */
> -  if (GET_RTX_CLASS (binoptab->code) == RTX_COMM_ARITH
> -      || binoptab == smul_widen_optab
> -      || binoptab == umul_widen_optab
> -      || binoptab == smul_highpart_optab
> -      || binoptab == umul_highpart_optab)
> -    {
> -      commutative_op = 1;
> -
> -      if (swap_commutative_operands_with_target (target, op0, op1))
> -	{
> -	  temp = op1;
> -	  op1 = op0;
> -	  op0 = temp;
> -	}
> -    }
> -
>    /* If we can do it with a three-operand insn, do so.  */
>  
>    if (methods != OPTAB_MUST_WIDEN
>        && optab_handler (binoptab, mode)->insn_code != CODE_FOR_nothing)
>      {
>        temp = expand_binop_directly (mode, binoptab, op0, op1, target,
> -				    unsignedp, methods, commutative_op, last);
> +				    unsignedp, methods, last);
>        if (temp)
>  	return temp;
>      }
> @@ -1469,8 +1492,7 @@ expand_binop (enum machine_mode mode, op
>  			       NULL_RTX, unsignedp, OPTAB_DIRECT);
>  				   
>        temp = expand_binop_directly (mode, otheroptab, op0, newop1,
> -				    target, unsignedp, methods,
> -				    commutative_op, last);
> +				    target, unsignedp, methods, last);
>        if (temp)
>  	return temp;
>      }
> @@ -1529,7 +1551,14 @@ expand_binop (enum machine_mode mode, op
>  		 || binoptab == add_optab || binoptab == sub_optab
>  		 || binoptab == smul_optab || binoptab == ashl_optab)
>  		&& class == MODE_INT)
> -	      no_extend = 1;
> +	      {
> +		no_extend = 1;
> +		xop0 = avoid_expensive_constant (mode, binoptab,
> +						 xop0, unsignedp);
> +		if (binoptab != ashl_optab)
> +		  xop1 = avoid_expensive_constant (mode, binoptab,
> +						   xop1, unsignedp);
> +	      }
>  
>  	    xop0 = widen_operand (xop0, wider_mode, mode, unsignedp, no_extend);
>  
> @@ -1558,6 +1587,18 @@ expand_binop (enum machine_mode mode, op
>  	  }
>        }
>  
> +  /* If operation is commutative,
> +     try to make the first operand a register.
> +     Even better, try to make it the same as the target.
> +     Also try to make the last operand a constant.  */
> +  if (commutative_optab_p (binoptab)
> +      && swap_commutative_operands_with_target (target, op0, op1))
> +    {
> +      temp = op1;
> +      op1 = op0;
> +      op0 = temp;
> +    }
> +
>    /* These can be done a word at a time.  */
>    if ((binoptab == and_optab || binoptab == ior_optab || binoptab == xor_optab)
>        && class == MODE_INT
> @@ -1980,7 +2021,7 @@ expand_binop (enum machine_mode mode, op
>  
>        start_sequence ();
>  
> -      if (shift_op)
> +      if (shift_optab_p (binoptab))
>  	{
>  	  op1_mode = targetm.libgcc_shift_count_mode ();
>  	  /* Specify unsigned here,
> @@ -2256,16 +2297,6 @@ expand_twoval_binop (optab binoptab, rtx
>  
>    class = GET_MODE_CLASS (mode);
>  
> -  /* If we are inside an appropriately-short loop and we are optimizing,
> -     force expensive constants into a register.  */
> -  if (CONSTANT_P (op0) && optimize
> -      && rtx_cost (op0, binoptab->code) > COSTS_N_INSNS (1))
> -    op0 = force_reg (mode, op0);
> -
> -  if (CONSTANT_P (op1) && optimize
> -      && rtx_cost (op1, binoptab->code) > COSTS_N_INSNS (1))
> -    op1 = force_reg (mode, op1);
> -
>    if (!targ0)
>      targ0 = gen_reg_rtx (mode);
>    if (!targ1)
> @@ -2282,6 +2313,10 @@ expand_twoval_binop (optab binoptab, rtx
>        rtx pat;
>        rtx xop0 = op0, xop1 = op1;
>  
> +      /* If we are optimizing, force expensive constants into a register.  */
> +      xop0 = avoid_expensive_constant (mode0, binoptab, xop0, unsignedp);
> +      xop1 = avoid_expensive_constant (mode1, binoptab, xop1, unsignedp);
> +
>        /* In case the insn wants input operands in modes different from
>  	 those of the actual operands, convert the operands.  It would
>  	 seem that we don't need to convert CONST_INTs, but we do, so


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]