This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH]: Machine independent patch, was: Update SSE5 vector multiplication, shift, rotate
On Fri, 18 Apr 2008, Michael Meissner wrote:
> On Fri, Apr 18, 2008 at 07:58:38AM +0200, Uros Bizjak wrote:
> > On Thu, Apr 17, 2008 at 8:50 PM, Michael Meissner
> > <michael.meissner@amd.com> wrote:
> >
> > > The following patch updates some of the current SSE5 code patterns to add the
> > > following:
> > >
> > > 1) Update vector 64-bit integer multiply
> > > 2) Update vector 32x32->64-bit integer widening multiply
> > > 3) Add support for SSE5 vector/vector shift patterns
> > > 4) Add support for vectorizing rotate patterns
> >
> > Is it possible to split this patch into machine-independent and
> > machine-dependent part? The machine-independent part should be reviewed by
> > a middle-end (vectorizer) maintainer, and I will look at the
> > machine-dependent/testsuite part. It is recommended to mark each part
> > of the patch with [PATCH, middle-end] or [PATCH, i386].
> >
> > BTW: If I can choose, I would prefer the latter part in a unidiff format.
>
> Fair enough. Here is the machine independent portion of the patch:
Can't you do without the new tree codes? There are already too many
shift/rotate ones and the naming doesn't really distinguish them in
an obvious way. Simply overloading the existing scalar shift/rotate
codes by using the appropriate operand types works for me.
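
A minimal sketch of what such overloading could look like, assuming the
choice of optab is driven by the existing code plus two operand properties.
The enums and the function pick_shift_optab below are hypothetical stand-ins
for illustration, not GCC's real types or API:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the existing tree codes; no V* variants are added.  */
enum shift_code { LSHIFT, RSHIFT, LROTATE, RROTATE };

/* Stand-ins for the scalar-amount and vector-amount optabs in the patch.  */
enum shift_optab
{
  OPTAB_ASHL, OPTAB_ASHR, OPTAB_LSHR, OPTAB_ROTL, OPTAB_ROTR,
  OPTAB_VASHL, OPTAB_VASHR, OPTAB_VLSHR, OPTAB_VROTL, OPTAB_VROTR
};

/* Choose an optab from the existing code, the signedness of the shifted
   type, and whether the shift amount is itself a vector.  */
enum shift_optab
pick_shift_optab (enum shift_code code, bool is_unsigned,
                  bool amount_is_vector)
{
  switch (code)
    {
    case LSHIFT:
      return amount_is_vector ? OPTAB_VASHL : OPTAB_ASHL;
    case RSHIFT:
      if (is_unsigned)
        return amount_is_vector ? OPTAB_VLSHR : OPTAB_LSHR;
      return amount_is_vector ? OPTAB_VASHR : OPTAB_ASHR;
    case LROTATE:
      return amount_is_vector ? OPTAB_VROTL : OPTAB_ROTL;
    case RROTATE:
      return amount_is_vector ? OPTAB_VROTR : OPTAB_ROTR;
    }
  assert (0 && "unknown shift code");
  return OPTAB_ASHL;
}
```

With this shape the caller keeps passing the plain shift/rotate code, and
only the operand type decides between the scalar-amount and vector-amount
optabs.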
Thanks,
Richard.
> 2008-04-18 Michael Meissner <michael.meissner@amd.com>
> Dwarakanath Rajagopal <dwarak.rajagopal@amd.com>
>
> * optabs.h (OTI_vashl): New optab index for vector shift/rotate by
> vector support.
> (OTI_vlshr): Ditto.
> (OTI_vashr): Ditto.
> (OTI_vrotl): Ditto.
> (OTI_vrotr): Ditto.
> (vashl_optab): New optab for vector shift/rotate by vector
> support.
> (vlshr_optab): Ditto.
> (vashr_optab): Ditto.
> (vrotl_optab): Ditto.
> (vrotr_optab): Ditto.
>
> * optabs.c (optab_for_tree_code): Add support for vector
> shift/rotate by vector.
>
> * genopinit.c (optabs): Add vashl, vlshr, vashr, vrotl, vrotr
> optabs.
>
> * expmed.c (expand_shift): If a machine description has vashl,
> vlshr, vashr, vrotl, or vrotr optabs, use them for vector shift
> and rotate by a vector instruction.
>
> * tree-vect-transform.c (vectorizable_operation): If a machine has
> vashl, vlshr, vashr optabs, use them for vector shift by a vector
> operation. Fall back to looking at ashl, lshr, ashr's second
> operand mode if vashl/vlshr/vashr aren't present to determine if
> the machine has a vector shift by scalar or vector shift by
> vector operation. Add vector rotate support.
>
> * tree.def (VLSHIFT_EXPR): New tree code for vector shift/rotate
> by vector.
> (VRSHIFT_EXPR): Ditto.
> (VLROTATE_EXPR): Ditto.
> (VRROTATE_EXPR): Ditto.
>
> * expr.c (expand_expr_real_1): Support vectorized rotates.
>
> * doc/c-tree.texi (VLSHIFT_EXPR): New tree code for vector
> shift/rotate by vector.
> (VRSHIFT_EXPR): Ditto.
> (VLROTATE_EXPR): Ditto.
> (VRROTATE_EXPR): Ditto.
> (LROTATE_EXPR): Document missing tree code.
> (RROTATE_EXPR): Ditto.
>
> * doc/md.texi (vashl<mode>3): Document new standard name for shift
> and rotate of a vector by a vector.
> (vashr<mode>3): Ditto.
> (vlshr<mode>3): Ditto.
> (vrotl<mode>3): Ditto.
> (vrotr<mode>3): Ditto.
>
> --- gcc/optabs.h.~0~ 2008-04-17 12:28:06.643070000 -0400
> +++ gcc/optabs.h 2008-04-15 16:40:00.462084000 -0400
> @@ -167,6 +167,18 @@ enum optab_index
> OTI_rotl,
> /* Rotate right */
> OTI_rotr,
> +
> + /* Arithmetic shift left of vector by vector */
> + OTI_vashl,
> + /* Logical shift right of vector by vector */
> + OTI_vlshr,
> + /* Arithmetic shift right of vector by vector */
> + OTI_vashr,
> + /* Rotate left of vector by vector */
> + OTI_vrotl,
> + /* Rotate right of vector by vector */
> + OTI_vrotr,
> +
> /* Signed and floating-point minimum value */
> OTI_smin,
> /* Signed and floating-point maximum value */
> @@ -412,6 +424,11 @@ extern struct optab optab_table[OTI_MAX]
> #define ashr_optab (&optab_table[OTI_ashr])
> #define rotl_optab (&optab_table[OTI_rotl])
> #define rotr_optab (&optab_table[OTI_rotr])
> +#define vashl_optab (&optab_table[OTI_vashl])
> +#define vlshr_optab (&optab_table[OTI_vlshr])
> +#define vashr_optab (&optab_table[OTI_vashr])
> +#define vrotl_optab (&optab_table[OTI_vrotl])
> +#define vrotr_optab (&optab_table[OTI_vrotr])
> #define smin_optab (&optab_table[OTI_smin])
> #define smax_optab (&optab_table[OTI_smax])
> #define umin_optab (&optab_table[OTI_umin])
> --- gcc/optabs.c.~0~ 2008-04-17 12:28:06.594117000 -0400
> +++ gcc/optabs.c 2008-04-15 16:40:00.489112000 -0400
> @@ -387,6 +387,18 @@ optab_for_tree_code (enum tree_code code
> case RROTATE_EXPR:
> return rotr_optab;
>
> + case VLSHIFT_EXPR:
> + return vashl_optab;
> +
> + case VRSHIFT_EXPR:
> + return TYPE_UNSIGNED (type) ? vlshr_optab : vashr_optab;
> +
> + case VLROTATE_EXPR:
> + return vrotl_optab;
> +
> + case VRROTATE_EXPR:
> + return vrotr_optab;
> +
> case MAX_EXPR:
> return TYPE_UNSIGNED (type) ? umax_optab : smax_optab;
>
> --- gcc/genopinit.c.~0~ 2008-04-17 12:28:06.667044000 -0400
> +++ gcc/genopinit.c 2008-04-15 16:40:00.510502000 -0400
> @@ -130,6 +130,11 @@ static const char * const optabs[] =
> "optab_handler (lshr_optab, $A)->insn_code = CODE_FOR_$(lshr$a3$)",
> "optab_handler (rotl_optab, $A)->insn_code = CODE_FOR_$(rotl$a3$)",
> "optab_handler (rotr_optab, $A)->insn_code = CODE_FOR_$(rotr$a3$)",
> + "optab_handler (vashr_optab, $A)->insn_code = CODE_FOR_$(vashr$a3$)",
> + "optab_handler (vlshr_optab, $A)->insn_code = CODE_FOR_$(vlshr$a3$)",
> + "optab_handler (vashl_optab, $A)->insn_code = CODE_FOR_$(vashl$a3$)",
> + "optab_handler (vrotl_optab, $A)->insn_code = CODE_FOR_$(vrotl$a3$)",
> + "optab_handler (vrotr_optab, $A)->insn_code = CODE_FOR_$(vrotr$a3$)",
> "optab_handler (smin_optab, $A)->insn_code = CODE_FOR_$(smin$a3$)",
> "optab_handler (smax_optab, $A)->insn_code = CODE_FOR_$(smax$a3$)",
> "optab_handler (umin_optab, $A)->insn_code = CODE_FOR_$(umin$I$a3$)",
> --- gcc/expmed.c.~0~ 2008-04-17 12:28:06.416295000 -0400
> +++ gcc/expmed.c 2008-04-15 16:40:00.531902000 -0400
> @@ -2027,6 +2027,9 @@ expand_dec (rtx target, rtx dec)
> emit_move_insn (target, value);
> }
>
> +#define optab_handler_valid_p(o, m) \
> + (optab_handler (o, m)->insn_code != CODE_FOR_nothing)
> +
> /* Output a shift instruction for expression code CODE,
> with SHIFTED being the rtx for the value to shift,
> and AMOUNT the tree for the amount to shift by.
> @@ -2041,14 +2044,69 @@ expand_shift (enum tree_code code, enum
> rtx op1, temp = 0;
> int left = (code == LSHIFT_EXPR || code == LROTATE_EXPR);
> int rotate = (code == LROTATE_EXPR || code == RROTATE_EXPR);
> + optab lshift_optab = ashl_optab;
> + optab rshift_arith_optab = ashr_optab;
> + optab rshift_uns_optab = lshr_optab;
> + optab lrotate_optab = rotl_optab;
> + optab rrotate_optab = rotr_optab;
> + enum machine_mode op1_mode;
> int try;
>
> + op1 = expand_normal (amount);
> + op1_mode = GET_MODE (op1);
> +
> + /* Determine whether the shift/rotate amount is a vector, or scalar. If the
> + shift amount is a vector, see if the machine has a separate set of optabs
> + for vector by vector shifts. Historically, GCC looked at the 2nd
> + operand's type in the shift optab to see what type of shift was
> + supported. */
> + if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (op1_mode))
> + {
> + enum tree_code new_code = code;
> + optab shift_optab;
> +
> + switch (code)
> + {
> + default:
> + break;
> +
> + case LSHIFT_EXPR:
> + if (optab_handler_valid_p (vashl_optab, op1_mode))
> + new_code = VLSHIFT_EXPR;
> + break;
> +
> + case RSHIFT_EXPR:
> + shift_optab = (unsignedp) ? vlshr_optab : vashr_optab;
> + if (optab_handler_valid_p (shift_optab, op1_mode))
> + new_code = VRSHIFT_EXPR;
> + break;
> +
> + case LROTATE_EXPR:
> + if (optab_handler_valid_p (vrotl_optab, op1_mode))
> + new_code = VLROTATE_EXPR;
> + break;
> +
> + case RROTATE_EXPR:
> + if (optab_handler_valid_p (vrotr_optab, op1_mode))
> + new_code = VRROTATE_EXPR;
> + break;
> + }
> +
> + if (code != new_code)
> + {
> + code = new_code;
> + lshift_optab = vashl_optab;
> + rshift_arith_optab = vashr_optab;
> + rshift_uns_optab = vlshr_optab;
> + lrotate_optab = vrotl_optab;
> + rrotate_optab = vrotr_optab;
> + }
> + }
> +
> /* Previously detected shift-counts computed by NEGATE_EXPR
> and shifted in the other direction; but that does not work
> on all machines. */
>
> - op1 = expand_normal (amount);
> -
> if (SHIFT_COUNT_TRUNCATED)
> {
> if (GET_CODE (op1) == CONST_INT
> @@ -2138,12 +2196,12 @@ expand_shift (enum tree_code code, enum
> }
>
> temp = expand_binop (mode,
> - left ? rotl_optab : rotr_optab,
> + left ? lrotate_optab : rrotate_optab,
> shifted, op1, target, unsignedp, methods);
> }
> else if (unsignedp)
> temp = expand_binop (mode,
> - left ? ashl_optab : lshr_optab,
> + left ? lshift_optab : rshift_uns_optab,
> shifted, op1, target, unsignedp, methods);
>
> /* Do arithmetic shifts.
> @@ -2162,7 +2220,7 @@ expand_shift (enum tree_code code, enum
> /* Arithmetic shift */
>
> temp = expand_binop (mode,
> - left ? ashl_optab : ashr_optab,
> + left ? lshift_optab : rshift_arith_optab,
> shifted, op1, target, unsignedp, methods1);
> }
>
> --- gcc/tree-vect-transform.c.~0~ 2008-04-17 12:28:06.451265000 -0400
> +++ gcc/tree-vect-transform.c 2008-04-15 16:40:01.005950000 -0400
> @@ -3830,7 +3830,7 @@ vectorizable_operation (tree stmt, block
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> - enum tree_code code;
> + enum tree_code code, alt_code;
> enum machine_mode vec_mode;
> tree new_temp;
> int op_type;
> @@ -3850,6 +3850,7 @@ vectorizable_operation (tree stmt, block
> tree vop0, vop1;
> unsigned int k;
> bool scalar_shift_arg = false;
> + bool shift_rotate_p = false;
>
> /* FORNOW: SLP with multiple types is not supported. The SLP analysis verifies
> this, so we can safely override NCOPIES with 1 here. */
> @@ -3923,6 +3924,59 @@ vectorizable_operation (tree stmt, block
> }
> }
>
> + /* If this is a shift/rotate, determine whether the shift amount is a vector,
> + or scalar. If the shift/rotate amount is a vector, see if the machine has
> + a separate set of optabs for vector by vector shifts. Historically, GCC
> + looked at the 2nd operand's type in the shift optab to see what type of
> + shift was supported. */
> + alt_code = code;
> + switch (code)
> + {
> + default:
> + break;
> +
> + case LSHIFT_EXPR:
> + alt_code = VLSHIFT_EXPR;
> + shift_rotate_p = true;
> + break;
> +
> + case RSHIFT_EXPR:
> + alt_code = VRSHIFT_EXPR;
> + shift_rotate_p = true;
> + break;
> +
> + case LROTATE_EXPR:
> + alt_code = VLROTATE_EXPR;
> + shift_rotate_p = true;
> + break;
> +
> + case RROTATE_EXPR:
> + alt_code = VRROTATE_EXPR;
> + shift_rotate_p = true;
> + break;
> + }
> +
> + if (shift_rotate_p)
> + {
> + if (dt[1] == vect_loop_def
> + || (!optab && (dt[1] == vect_constant_def
> + || dt[1] == vect_invariant_def)))
> + {
> + struct optab *voptab = optab_for_tree_code (alt_code, vectype);
> +
> + if (voptab
> + && (optab_handler (voptab, TYPE_MODE (vectype))->insn_code
> + != CODE_FOR_nothing))
> + {
> + if (vect_print_dump_info (REPORT_DETAILS))
> + fprintf (vect_dump, "vector shift/rotate by vector found, mode %s",
> + GET_MODE_NAME (TYPE_MODE (vectype)));
> +
> + optab = voptab;
> + }
> + }
> + }
> +
> /* Supportable by target? */
> if (!optab)
> {
> @@ -3957,11 +4011,15 @@ vectorizable_operation (tree stmt, block
> return false;
> }
>
> - if (code == LSHIFT_EXPR || code == RSHIFT_EXPR)
> + if (shift_rotate_p)
> {
> /* FORNOW: not yet supported. */
> if (!VECTOR_MODE_P (vec_mode))
> - return false;
> + {
> + if (vect_print_dump_info (REPORT_DETAILS))
> + fprintf (vect_dump, "vec_mode is not a vector type");
> + return false;
> + }
>
> /* Invariant argument is needed for a vector shift
> by a scalar shift operand. */
> @@ -4072,8 +4130,7 @@ vectorizable_operation (tree stmt, block
> /* Handle uses. */
> if (j == 0)
> {
> - if (op_type == binary_op
> - && (code == LSHIFT_EXPR || code == RSHIFT_EXPR))
> + if (op_type == binary_op && scalar_shift_arg)
> {
> /* Vector shl and shr insn patterns can be defined with scalar
> operand 2 (shift operand). In this case, use constant or loop
> --- gcc/tree.def.~0~ 2008-04-17 12:28:06.393319000 -0400
> +++ gcc/tree.def 2008-04-15 16:40:01.521653000 -0400
> @@ -683,6 +683,13 @@ DEFTREECODE (RSHIFT_EXPR, "rshift_expr",
> DEFTREECODE (LROTATE_EXPR, "lrotate_expr", tcc_binary, 2)
> DEFTREECODE (RROTATE_EXPR, "rrotate_expr", tcc_binary, 2)
>
> +/* Vector/vector shifts and rotates, where both arguments are vector types.
> + This is only used during the expansion of shifts and rotates. */
> +DEFTREECODE (VLSHIFT_EXPR, "vlshift_expr", tcc_binary, 2)
> +DEFTREECODE (VRSHIFT_EXPR, "vrshift_expr", tcc_binary, 2)
> +DEFTREECODE (VLROTATE_EXPR, "vlrotate_expr", tcc_binary, 2)
> +DEFTREECODE (VRROTATE_EXPR, "vrrotate_expr", tcc_binary, 2)
> +
> /* Bitwise operations. Operands have same mode as result. */
> DEFTREECODE (BIT_IOR_EXPR, "bit_ior_expr", tcc_binary, 2)
> DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
> --- gcc/expr.c.~0~ 2008-04-17 12:28:06.373344000 -0400
> +++ gcc/expr.c 2008-04-15 16:47:24.587040000 -0400
> @@ -8868,12 +8868,6 @@ expand_expr_real_1 (tree exp, rtx target
>
> case LROTATE_EXPR:
> case RROTATE_EXPR:
> - /* The expansion code only handles expansion of mode precision
> - rotates. */
> - gcc_assert (GET_MODE_PRECISION (TYPE_MODE (type))
> - == TYPE_PRECISION (type));
> -
> - /* Falltrough. */
> case LSHIFT_EXPR:
> case RSHIFT_EXPR:
> /* If this is a fixed-point operation, then we cannot use the code
> --- gcc/doc/c-tree.texi.~0~ 2008-04-17 12:28:07.309401000 -0400
> +++ gcc/doc/c-tree.texi 2008-04-15 16:40:01.666905000 -0400
> @@ -1926,6 +1926,12 @@ This macro returns the attributes on the
> @tindex THROW_EXPR
> @tindex LSHIFT_EXPR
> @tindex RSHIFT_EXPR
> +@tindex VLSHIFT_EXPR
> +@tindex VRSHIFT_EXPR
> +@tindex LROTATE_EXPR
> +@tindex RROTATE_EXPR
> +@tindex VLROTATE_EXPR
> +@tindex VRROTATE_EXPR
> @tindex BIT_IOR_EXPR
> @tindex BIT_XOR_EXPR
> @tindex BIT_AND_EXPR
> @@ -2300,6 +2306,22 @@ Note that the result is undefined if the
> than or equal to the first operand's type size.
>
>
> +@item VLSHIFT_EXPR
> +@itemx VRSHIFT_EXPR
> +These nodes represent left and right shifts, respectively.
> +@code{VLSHIFT_EXPR} and @code{VRSHIFT_EXPR} are used when expanding
> +shifts of vector types by the same size vector type to distinguish
> +them from shifts of vector types by scalar amounts.
> +
> +@item LROTATE_EXPR
> +@itemx RROTATE_EXPR
> +These nodes represent left and right rotates, respectively.
> +
> +@item VLROTATE_EXPR
> +@itemx VRROTATE_EXPR
> +These nodes represent left and right rotates of vector types by the
> +same size vector type, respectively.
> +
> @item BIT_IOR_EXPR
> @itemx BIT_XOR_EXPR
> @itemx BIT_AND_EXPR
> --- gcc/doc/md.texi.~0~ 2008-04-17 14:44:30.526922000 -0400
> +++ gcc/doc/md.texi 2008-04-17 14:43:55.044816000 -0400
> @@ -3858,6 +3858,20 @@ counts can optionally be specified by @c
> Other shift and rotate instructions, analogous to the
> @code{ashl@var{m}3} instructions.
>
> +@cindex @code{vashl@var{m}3} instruction pattern
> +@cindex @code{vashr@var{m}3} instruction pattern
> +@cindex @code{vlshr@var{m}3} instruction pattern
> +@cindex @code{vrotl@var{m}3} instruction pattern
> +@cindex @code{vrotr@var{m}3} instruction pattern
> +@item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3}
> +Vector shift and rotate instructions that take vectors as operand 2 to
> +allow a machine that has both a vector shift/rotate by a scalar
> +instruction and a separate vector shift/rotate by a vector instruction
> +to support both instructions. If these vector shift instructions are
> +not present, the compiler will look at the mode of operand 2 of the
> +normal shift instruction to determine which type of vector shift is
> +supported.
> +
> @cindex @code{neg@var{m}2} instruction pattern
> @cindex @code{ssneg@var{m}2} instruction pattern
> @cindex @code{usneg@var{m}2} instruction pattern
>
>
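
For reference, the distinction that the quoted md.texi text draws between a
vector shift by a scalar and a vector shift by a vector can be sketched in
plain C. This is only an illustration of the element-wise semantics under the
assumption of 32-bit unsigned elements; the function names are hypothetical
and nothing here is taken from the patch or from SSE5 itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Vector shift by scalar: every element is shifted by the same count.  */
void
shl_by_scalar (uint32_t *v, size_t n, unsigned count)
{
  assert (count < 32);
  for (size_t i = 0; i < n; i++)
    v[i] <<= count;
}

/* Vector shift by vector: element i is shifted by counts[i].  */
void
shl_by_vector (uint32_t *v, const uint32_t *counts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (counts[i] < 32);
      v[i] <<= counts[i];
    }
}

/* Vector rotate left by vector: element i is rotated by counts[i],
   with the count taken modulo the element width.  */
void
rotl_by_vector (uint32_t *v, const uint32_t *counts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t c = counts[i] & 31;
      v[i] = (v[i] << c) | (v[i] >> ((32 - c) & 31));
    }
}
```

A target that only provides the first form needs an invariant shift amount,
while the second and third forms let each element use its own count.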
--
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex