This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH]: Machine independent patch, was: Update SSE5 vector multiplication, shift, rotate

From: Richard Guenther <rguenther at suse dot de>
To: Michael Meissner <michael dot meissner at amd dot com>
Cc: Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org, dwarak dot rajagopal at amd dot com, christophe dot harle at amd dot com, hongjiu dot lu at intel dot com
Date: Fri, 18 Apr 2008 22:52:55 +0200 (CEST)
Subject: Re: [PATCH]: Machine independent patch, was: Update SSE5 vector multiplication, shift, rotate
References: <20080417185036.GA15776@mmeissner-gold.amd.com> <5787cf470804172258t7caedb73m3b499abf6e57cc67@mail.gmail.com> <20080418154455.GA17904@mmeissner-gold.amd.com>
On Fri, 18 Apr 2008, Michael Meissner wrote:

> On Fri, Apr 18, 2008 at 07:58:38AM +0200, Uros Bizjak wrote:
> > On Thu, Apr 17, 2008 at 8:50 PM, Michael Meissner
> > <michael.meissner@amd.com> wrote:
> > 
> > > The following patch updates some of the current SSE5 code patterns to add the
> > >  following:
> > >
> > >  1) Update vector 64-bit integer multiply
> > >  2) Update vector 32x32->64-bit integer widening multiply
> > >  3) Add support for SSE5 vector/vector shift patterns
> > >  4) Add support for vectorizing rotate patterns
> > 
> > Is it possible to split this patch into machine-independent and
> > machine-dependent parts? The machine-independent part should be reviewed by
> > a middle-end (vectorizer) maintainer, and I will look at the
> > machine-dependent/testsuite part. It is recommended to mark each part
> > of the patch with [PATCH, middle-end] or [PATCH, i386].
> > 
> > BTW: If I can choose, I would prefer the latter part in a unidiff format.
> 
> Fair enough.  Here is the machine independent portion of the patch:

Can't you do without the new tree codes?  There are already too many
shift/rotate ones and the naming doesn't really distinguish them in
an obvious way.  Simply overloading the existing scalar shift/rotate
codes by using the appropriate operand types works for me.

Thanks,
Richard.
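
To make the suggestion above concrete, here is a minimal stand-alone sketch of
choosing the vector-by-vector optab from the existing shift/rotate codes by
inspecting the operand types, with no new tree codes. It is not GCC code: the
toy_* enums and toy_optab_for_shift are hypothetical stand-ins, and the two
boolean parameters are an assumed way of passing in what the real compiler
would read from the operand types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the four existing shift/rotate tree codes.  */
enum toy_code { TOY_LSHIFT, TOY_RSHIFT, TOY_LROTATE, TOY_RROTATE };

/* Hypothetical stand-ins for the scalar-amount and vector-amount optabs.  */
enum toy_optab
{
  TOY_ashl, TOY_ashr, TOY_lshr, TOY_rotl, TOY_rotr,
  TOY_vashl, TOY_vashr, TOY_vlshr, TOY_vrotl, TOY_vrotr
};

/* Select an optab from an existing code plus two properties of the
   operand types: signedness of the shifted value, and whether the
   shift amount is itself a vector.  */
enum toy_optab
toy_optab_for_shift (enum toy_code code, bool is_unsigned,
                     bool amount_is_vector)
{
  switch (code)
    {
    case TOY_LSHIFT:
      return amount_is_vector ? TOY_vashl : TOY_ashl;

    case TOY_RSHIFT:
      /* Unsigned types shift logically, signed types arithmetically.  */
      if (is_unsigned)
        return amount_is_vector ? TOY_vlshr : TOY_lshr;
      return amount_is_vector ? TOY_vashr : TOY_ashr;

    case TOY_LROTATE:
      return amount_is_vector ? TOY_vrotl : TOY_rotl;

    case TOY_RROTATE:
      return amount_is_vector ? TOY_vrotr : TOY_rotr;
    }

  return TOY_ashl;  /* Not reached for valid codes.  */
}
```

With this shape, a caller such as the expander or the vectorizer would keep
passing the existing codes and the optab choice would follow from the types.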

> 2008-04-18  Michael Meissner  <michael.meissner@amd.com>
> 	    Dwarakanath Rajagopal  <dwarak.rajagopal@amd.com>
> 	
> 	* optabs.h (OTI_vashl): New optab index for vector shift/rotate by
> 	vector support.
> 	(OTI_vlshr): Ditto.
> 	(OTI_vashr): Ditto.
> 	(OTI_vrotl): Ditto.
> 	(OTI_vrotr): Ditto.
> 	(vashl_optab): New optab for vector shift/rotate by vector
> 	support.
> 	(vlshr_optab): Ditto.
> 	(vashr_optab): Ditto.
> 	(vrotl_optab): Ditto.
> 	(vrotr_optab): Ditto.
> 
> 	* optabs.c (optab_for_tree_code): Add support for vector
> 	shift/rotate by vector.
> 
> 	* genopinit.c (optabs): Add vashl, vlshr, vashr, vrotl, vrotr
> 	optabs.
> 
> 	* expmed.c (expand_shift): If a machine description has vashl,
> 	vlshr, vashr, vrotl, or vrotr optabs, use them for vector shift
> 	and rotate by a vector instruction.
> 
> 	* tree-vect-transform.c (vectorizable_operation): If a machine has
> 	vashl, vlshr, vashr optabs, use them for vector shift by a vector
> 	operation.  Fall back to looking at ashl, lshr, ashr's second
> 	operand mode if vashl/vlshr/vashr aren't present to determine if
> 	the machine has a vector shift by scalar or vector shift by
> 	vector operation.  Add vector rotate support.
> 
> 	* tree.def (VLSHIFT_EXPR): New tree code for vector shift/rotate
> 	by vector.
> 	(VRSHIFT_EXPR): Ditto.
> 	(VLROTATE_EXPR): Ditto.
> 	(VRROTATE_EXPR): Ditto.
> 
> 	* expr.c (expand_expr_real_1): Support vectorized rotates.
> 
> 	* doc/c-tree.texi (VLSHIFT_EXPR): New tree code for vector
> 	shift/rotate by vector.
> 	(VRSHIFT_EXPR): Ditto.
> 	(VLROTATE_EXPR): Ditto.
> 	(VRROTATE_EXPR): Ditto.
> 	(LROTATE_EXPR): Document missing tree code.
> 	(RROTATE_EXPR): Ditto.
> 
> 	* doc/md.texi (vashl<mode>3): Document new standard name for shift
> 	and rotate of a vector by a vector.
> 	(vashr<mode>3): Ditto.
> 	(vlshr<mode>3): Ditto.
> 	(vrotl<mode>3): Ditto.
> 	(vrotr<mode>3): Ditto.
> 
> --- gcc/optabs.h.~0~	2008-04-17 12:28:06.643070000 -0400
> +++ gcc/optabs.h	2008-04-15 16:40:00.462084000 -0400
> @@ -167,6 +167,18 @@ enum optab_index
>    OTI_rotl,
>    /* Rotate right */
>    OTI_rotr,
> +
> +  /* Arithmetic shift left of vector by vector */
> +  OTI_vashl,
> +  /* Logical shift right of vector by vector */
> +  OTI_vlshr,
> +  /* Arithmetic shift right of vector by vector */
> +  OTI_vashr,
> +  /* Rotate left of vector by vector */
> +  OTI_vrotl,
> +  /* Rotate right of vector by vector */
> +  OTI_vrotr,
> +
>    /* Signed and floating-point minimum value */
>    OTI_smin,
>    /* Signed and floating-point maximum value */
> @@ -412,6 +424,11 @@ extern struct optab optab_table[OTI_MAX]
>  #define ashr_optab (&optab_table[OTI_ashr])
>  #define rotl_optab (&optab_table[OTI_rotl])
>  #define rotr_optab (&optab_table[OTI_rotr])
> +#define vashl_optab (&optab_table[OTI_vashl])
> +#define vlshr_optab (&optab_table[OTI_vlshr])
> +#define vashr_optab (&optab_table[OTI_vashr])
> +#define vrotl_optab (&optab_table[OTI_vrotl])
> +#define vrotr_optab (&optab_table[OTI_vrotr])
>  #define smin_optab (&optab_table[OTI_smin])
>  #define smax_optab (&optab_table[OTI_smax])
>  #define umin_optab (&optab_table[OTI_umin])
> --- gcc/optabs.c.~0~	2008-04-17 12:28:06.594117000 -0400
> +++ gcc/optabs.c	2008-04-15 16:40:00.489112000 -0400
> @@ -387,6 +387,18 @@ optab_for_tree_code (enum tree_code code
>      case RROTATE_EXPR:
>        return rotr_optab;
>  
> +    case VLSHIFT_EXPR:
> +      return vashl_optab;
> +
> +    case VRSHIFT_EXPR:
> +      return TYPE_UNSIGNED (type) ? vlshr_optab : vashr_optab;
> +
> +    case VLROTATE_EXPR:
> +      return vrotl_optab;
> +
> +    case VRROTATE_EXPR:
> +      return vrotr_optab;
> +
>      case MAX_EXPR:
>        return TYPE_UNSIGNED (type) ? umax_optab : smax_optab;
>  
> --- gcc/genopinit.c.~0~	2008-04-17 12:28:06.667044000 -0400
> +++ gcc/genopinit.c	2008-04-15 16:40:00.510502000 -0400
> @@ -130,6 +130,11 @@ static const char * const optabs[] =
>    "optab_handler (lshr_optab, $A)->insn_code = CODE_FOR_$(lshr$a3$)",
>    "optab_handler (rotl_optab, $A)->insn_code = CODE_FOR_$(rotl$a3$)",
>    "optab_handler (rotr_optab, $A)->insn_code = CODE_FOR_$(rotr$a3$)",
> +  "optab_handler (vashr_optab, $A)->insn_code = CODE_FOR_$(vashr$a3$)",
> +  "optab_handler (vlshr_optab, $A)->insn_code = CODE_FOR_$(vlshr$a3$)",
> +  "optab_handler (vashl_optab, $A)->insn_code = CODE_FOR_$(vashl$a3$)",
> +  "optab_handler (vrotl_optab, $A)->insn_code = CODE_FOR_$(vrotl$a3$)",
> +  "optab_handler (vrotr_optab, $A)->insn_code = CODE_FOR_$(vrotr$a3$)",
>    "optab_handler (smin_optab, $A)->insn_code = CODE_FOR_$(smin$a3$)",
>    "optab_handler (smax_optab, $A)->insn_code = CODE_FOR_$(smax$a3$)",
>    "optab_handler (umin_optab, $A)->insn_code = CODE_FOR_$(umin$I$a3$)",
> --- gcc/expmed.c.~0~	2008-04-17 12:28:06.416295000 -0400
> +++ gcc/expmed.c	2008-04-15 16:40:00.531902000 -0400
> @@ -2027,6 +2027,9 @@ expand_dec (rtx target, rtx dec)
>      emit_move_insn (target, value);
>  }
>  
> +#define optab_handler_valid_p(o, m) \
> +  (optab_handler ((o), (m))->insn_code != CODE_FOR_nothing)
> +
>  /* Output a shift instruction for expression code CODE,
>     with SHIFTED being the rtx for the value to shift,
>     and AMOUNT the tree for the amount to shift by.
> @@ -2041,14 +2044,69 @@ expand_shift (enum tree_code code, enum 
>    rtx op1, temp = 0;
>    int left = (code == LSHIFT_EXPR || code == LROTATE_EXPR);
>    int rotate = (code == LROTATE_EXPR || code == RROTATE_EXPR);
> +  optab lshift_optab = ashl_optab;
> +  optab rshift_arith_optab = ashr_optab;
> +  optab rshift_uns_optab = lshr_optab;
> +  optab lrotate_optab = rotl_optab;
> +  optab rrotate_optab = rotr_optab;
> +  enum machine_mode op1_mode;
>    int try;
>  
> +  op1 = expand_normal (amount);
> +  op1_mode = GET_MODE (op1);
> +
> +  /* Determine whether the shift/rotate amount is a vector, or scalar.  If the
> +     shift amount is a vector, see if the machine has a separate set of optabs
> +     for vector by vector shifts.  Historically, GCC looked at the 2nd
> +     operand's type in the shift optab to see what type of shift was
> +     supported.  */
> +  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (op1_mode))
> +    {
> +      enum tree_code new_code = code;
> +      optab shift_optab;
> +
> +      switch (code)
> +	{
> +	default:
> +	  break;
> +
> +	case LSHIFT_EXPR:
> +	  if (optab_handler_valid_p (vashl_optab, op1_mode))
> +	    new_code = VLSHIFT_EXPR;
> +	  break;
> +
> +	case RSHIFT_EXPR:
> +	  shift_optab = (unsignedp) ? vlshr_optab : vashr_optab;
> +	  if (optab_handler_valid_p (shift_optab, op1_mode))
> +	    new_code = VRSHIFT_EXPR;
> +	  break;
> +
> +	case LROTATE_EXPR:
> +	  if (optab_handler_valid_p (vrotl_optab, op1_mode))
> +	    new_code = VLROTATE_EXPR;
> +	  break;
> +
> +	case RROTATE_EXPR:
> +	  if (optab_handler_valid_p (vrotr_optab, op1_mode))
> +	    new_code = VRROTATE_EXPR;
> +	  break;
> +	}
> +
> +      if (code != new_code)
> +	{
> +	  code = new_code;
> +	  lshift_optab = vashl_optab;
> +	  rshift_arith_optab = vashr_optab;
> +	  rshift_uns_optab = vlshr_optab;
> +	  lrotate_optab = vrotl_optab;
> +	  rrotate_optab = vrotr_optab;
> +	}
> +    }
> +
>    /* Previously detected shift-counts computed by NEGATE_EXPR
>       and shifted in the other direction; but that does not work
>       on all machines.  */
>  
> -  op1 = expand_normal (amount);
> -
>    if (SHIFT_COUNT_TRUNCATED)
>      {
>        if (GET_CODE (op1) == CONST_INT
> @@ -2138,12 +2196,12 @@ expand_shift (enum tree_code code, enum 
>  	    }
>  
>  	  temp = expand_binop (mode,
> -			       left ? rotl_optab : rotr_optab,
> +			       left ? lrotate_optab : rrotate_optab,
>  			       shifted, op1, target, unsignedp, methods);
>  	}
>        else if (unsignedp)
>  	temp = expand_binop (mode,
> -			     left ? ashl_optab : lshr_optab,
> +			     left ? lshift_optab : rshift_uns_optab,
>  			     shifted, op1, target, unsignedp, methods);
>  
>        /* Do arithmetic shifts.
> @@ -2162,7 +2220,7 @@ expand_shift (enum tree_code code, enum 
>  	  /* Arithmetic shift */
>  
>  	  temp = expand_binop (mode,
> -			       left ? ashl_optab : ashr_optab,
> +			       left ? lshift_optab : rshift_arith_optab,
>  			       shifted, op1, target, unsignedp, methods1);
>  	}
>  
> --- gcc/tree-vect-transform.c.~0~	2008-04-17 12:28:06.451265000 -0400
> +++ gcc/tree-vect-transform.c	2008-04-15 16:40:01.005950000 -0400
> @@ -3830,7 +3830,7 @@ vectorizable_operation (tree stmt, block
>    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
>    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> -  enum tree_code code;
> +  enum tree_code code, alt_code;
>    enum machine_mode vec_mode;
>    tree new_temp;
>    int op_type;
> @@ -3850,6 +3850,7 @@ vectorizable_operation (tree stmt, block
>    tree vop0, vop1;
>    unsigned int k;
>    bool scalar_shift_arg = false;
> +  bool shift_rotate_p = false;
>  
>    /* FORNOW: SLP with multiple types is not supported. The SLP analysis verifies
>       this, so we can safely override NCOPIES with 1 here.  */
> @@ -3923,6 +3924,59 @@ vectorizable_operation (tree stmt, block
>  	}
>      }
>  
> +  /* If this is a shift/rotate, determine whether the shift amount is a vector,
> +     or scalar.  If the shift/rotate amount is a vector, see if the machine has
> +     a separate set of optabs for vector by vector shifts.  Historically, GCC
> +     looked at the 2nd operand's type in the shift optab to see what type of
> +     shift was supported.  */
> +  alt_code = code;
> +  switch (code)
> +    {
> +    default:
> +      break;
> +
> +    case LSHIFT_EXPR:
> +      alt_code = VLSHIFT_EXPR;
> +      shift_rotate_p = true;
> +      break;
> +
> +    case RSHIFT_EXPR:
> +      alt_code = VRSHIFT_EXPR;
> +      shift_rotate_p = true;
> +      break;
> +
> +    case LROTATE_EXPR:
> +      alt_code = VLROTATE_EXPR;
> +      shift_rotate_p = true;
> +      break;
> +
> +    case RROTATE_EXPR:
> +      alt_code = VRROTATE_EXPR;
> +      shift_rotate_p = true;
> +      break;
> +    }
> +
> +  if (shift_rotate_p)
> +    {
> +      if (dt[1] == vect_loop_def
> +	  || (!optab && (dt[1] == vect_constant_def
> +			 || dt[1] == vect_invariant_def)))
> +	{
> +	  struct optab *voptab = optab_for_tree_code (alt_code, vectype);
> +
> +	  if (voptab
> +	      && (optab_handler (voptab, TYPE_MODE (vectype))->insn_code
> +		  != CODE_FOR_nothing))
> +	    {
> +	      if (vect_print_dump_info (REPORT_DETAILS))
> +		fprintf (vect_dump, "vector shift/rotate by vector found, mode %s",
> +			 GET_MODE_NAME (TYPE_MODE (vectype)));
> +
> +	      optab = voptab;
> +	    }
> +	}
> +    }
> +
>    /* Supportable by target?  */
>    if (!optab)
>      {
> @@ -3957,11 +4011,15 @@ vectorizable_operation (tree stmt, block
>        return false;
>      }
>  
> -  if (code == LSHIFT_EXPR || code == RSHIFT_EXPR)
> +  if (shift_rotate_p)
>      {
>        /* FORNOW: not yet supported.  */
>        if (!VECTOR_MODE_P (vec_mode))
> -	return false;
> +	{
> +	  if (vect_print_dump_info (REPORT_DETAILS))
> +	    fprintf (vect_dump, "vec_mode is not a vector type");
> +	  return false;
> +	}
>  
>        /* Invariant argument is needed for a vector shift
>  	 by a scalar shift operand.  */
> @@ -4072,8 +4130,7 @@ vectorizable_operation (tree stmt, block
>        /* Handle uses.  */
>        if (j == 0)
>  	{
> -	  if (op_type == binary_op
> -	      && (code == LSHIFT_EXPR || code == RSHIFT_EXPR))
> +	  if (op_type == binary_op && scalar_shift_arg)
>  	    {
>  	      /* Vector shl and shr insn patterns can be defined with scalar 
>  		 operand 2 (shift operand). In this case, use constant or loop 
> --- gcc/tree.def.~0~	2008-04-17 12:28:06.393319000 -0400
> +++ gcc/tree.def	2008-04-15 16:40:01.521653000 -0400
> @@ -683,6 +683,13 @@ DEFTREECODE (RSHIFT_EXPR, "rshift_expr",
>  DEFTREECODE (LROTATE_EXPR, "lrotate_expr", tcc_binary, 2)
>  DEFTREECODE (RROTATE_EXPR, "rrotate_expr", tcc_binary, 2)
>  
> +/* Vector/vector shifts and rotates, where both arguments are vector types.
> +   This is only used during the expansion of shifts and rotates.  */
> +DEFTREECODE (VLSHIFT_EXPR, "vlshift_expr", tcc_binary, 2)
> +DEFTREECODE (VRSHIFT_EXPR, "vrshift_expr", tcc_binary, 2)
> +DEFTREECODE (VLROTATE_EXPR, "vlrotate_expr", tcc_binary, 2)
> +DEFTREECODE (VRROTATE_EXPR, "vrrotate_expr", tcc_binary, 2)
> +
>  /* Bitwise operations.  Operands have same mode as result.  */
>  DEFTREECODE (BIT_IOR_EXPR, "bit_ior_expr", tcc_binary, 2)
>  DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
> --- gcc/expr.c.~0~	2008-04-17 12:28:06.373344000 -0400
> +++ gcc/expr.c	2008-04-15 16:47:24.587040000 -0400
> @@ -8868,12 +8868,6 @@ expand_expr_real_1 (tree exp, rtx target
>  
>      case LROTATE_EXPR:
>      case RROTATE_EXPR:
> -      /* The expansion code only handles expansion of mode precision
> -	 rotates.  */
> -      gcc_assert (GET_MODE_PRECISION (TYPE_MODE (type))
> -		  == TYPE_PRECISION (type));
> -
> -      /* Falltrough.  */
>      case LSHIFT_EXPR:
>      case RSHIFT_EXPR:
>        /* If this is a fixed-point operation, then we cannot use the code
> --- gcc/doc/c-tree.texi.~0~	2008-04-17 12:28:07.309401000 -0400
> +++ gcc/doc/c-tree.texi	2008-04-15 16:40:01.666905000 -0400
> @@ -1926,6 +1926,12 @@ This macro returns the attributes on the
>  @tindex THROW_EXPR
>  @tindex LSHIFT_EXPR
>  @tindex RSHIFT_EXPR
> +@tindex VLSHIFT_EXPR
> +@tindex VRSHIFT_EXPR
> +@tindex LROTATE_EXPR
> +@tindex RROTATE_EXPR
> +@tindex VLROTATE_EXPR
> +@tindex VRROTATE_EXPR
>  @tindex BIT_IOR_EXPR
>  @tindex BIT_XOR_EXPR
>  @tindex BIT_AND_EXPR
> @@ -2300,6 +2306,22 @@ Note that the result is undefined if the
>  than or equal to the first operand's type size.
>  
>  
> +@item VLSHIFT_EXPR
> +@itemx VRSHIFT_EXPR
> +These nodes represent left and right shifts, respectively.
> +@code{VLSHIFT_EXPR} and @code{VRSHIFT_EXPR} are used when expanding
> +shifts of vector types by the same size vector type to distinguish
> +them from shifts of vector types by scalar amounts.
> +
> +@item LROTATE_EXPR
> +@itemx RROTATE_EXPR
> +These nodes represent left and right rotates, respectively.
> +
> +@item VLROTATE_EXPR
> +@itemx VRROTATE_EXPR
> +These nodes represent left and right rotates of vector types by the
> +same size vector type, respectively.
> +
>  @item BIT_IOR_EXPR
>  @itemx BIT_XOR_EXPR
>  @itemx BIT_AND_EXPR
> --- gcc/doc/md.texi.~0~	2008-04-17 14:44:30.526922000 -0400
> +++ gcc/doc/md.texi	2008-04-17 14:43:55.044816000 -0400
> @@ -3858,6 +3858,20 @@ counts can optionally be specified by @c
>  Other shift and rotate instructions, analogous to the
>  @code{ashl@var{m}3} instructions.
>  
> +@cindex @code{vashl@var{m}3} instruction pattern
> +@cindex @code{vashr@var{m}3} instruction pattern
> +@cindex @code{vlshr@var{m}3} instruction pattern
> +@cindex @code{vrotl@var{m}3} instruction pattern
> +@cindex @code{vrotr@var{m}3} instruction pattern
> +@item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3}
> +Vector shift and rotate instructions that take vectors as operand 2 to
> +allow a machine that has both a vector shift/rotate by a scalar
> +instruction and a separate vector shift/rotate by a vector instruction
> +to support both instructions.  If these vector shift instructions are
> +not present, the compiler will look at the mode of operand 2 of the
> +normal shift instruction to determine which type of vector shift is
> +supported.
> +
>  @cindex @code{neg@var{m}2} instruction pattern
>  @cindex @code{ssneg@var{m}2} instruction pattern
>  @cindex @code{usneg@var{m}2} instruction pattern
> 
> 
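
For readers unfamiliar with the distinction the quoted patch draws, the
following stand-alone C sketch models a vector shift by a scalar, a vector
shift by a vector, and a vector rotate by a vector, one element at a time.
It is illustrative only: the toy_* names and the four-element width are
hypothetical, and masking each count to 0..31 is an assumption made to keep
the C well defined. It says nothing about how SSE5 or any other target treats
out-of-range counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NELTS 4  /* Hypothetical vector width: four 32-bit elements.  */

/* Vector shift left by a scalar: every element uses the same count.  */
void
toy_shl_by_scalar (uint32_t dst[TOY_NELTS], const uint32_t src[TOY_NELTS],
                   unsigned count)
{
  for (size_t i = 0; i < TOY_NELTS; i++)
    dst[i] = src[i] << (count & 31);
}

/* Vector shift left by a vector: element i is shifted by counts[i].  */
void
toy_shl_by_vector (uint32_t dst[TOY_NELTS], const uint32_t src[TOY_NELTS],
                   const uint32_t counts[TOY_NELTS])
{
  for (size_t i = 0; i < TOY_NELTS; i++)
    dst[i] = src[i] << (counts[i] & 31);
}

/* Vector rotate left by a vector: element i is rotated by counts[i].  */
void
toy_rotl_by_vector (uint32_t dst[TOY_NELTS], const uint32_t src[TOY_NELTS],
                    const uint32_t counts[TOY_NELTS])
{
  for (size_t i = 0; i < TOY_NELTS; i++)
    {
      unsigned n = counts[i] & 31;
      /* A rotate by zero must not shift right by 32, which is undefined.  */
      dst[i] = n ? (src[i] << n) | (src[i] >> (32 - n)) : src[i];
    }
}
```

The by-vector forms are what the proposed vashl, vrotl and related patterns
describe; the by-scalar form corresponds to the existing patterns whose
operand 2 is a scalar.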

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Follow-Ups:
- Re: [PATCH]: Machine independent patch, was: Update SSE5 vector multiplication, shift, rotate
  - From: Michael Meissner
References:
- [PATCH]: Update SSE5 vector multiplication, shift, rotate
  - From: Michael Meissner
- Re: [PATCH]: Update SSE5 vector multiplication, shift, rotate
  - From: Uros Bizjak
- Re: [PATCH]: Machine independent patch, was: Update SSE5 vector multiplication, shift, rotate
  - From: Michael Meissner