[autovect] [patch] vectorize reduction without specialized target support
Dorit Naishlos
DORIT@il.ibm.com
Wed Jun 15 20:16:00 GMT 2005
The epilog code of a vectorized reduction computation reduces a vector of
partial results into a single scalar result. Currently we do that using a
"REDUC_op" stmt followed by a BIT_FIELD expression to extract the scalar
result from the vector register:
    va_4 = reduc_op <va_3>
    a_5 = bit_field_ref_expr <va_4, bitpos>
This scheme is only applicable to targets that support these REDUC_op
operations.
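To make the role of the epilog concrete, here is a minimal scalar C sketch of what a vectorized sum reduction computes. It is an illustration only, not code from the patch: the function name, the vectorization factor VF = 4, and the restriction that n is a multiple of VF are all assumptions made for the example.

```c
#include <stddef.h>

#define VF 4   /* assumed vectorization factor: four ints per vector */

/* Hypothetical illustration; n is assumed to be a multiple of VF. */
static int sum_with_partial_results(const int *a, size_t n)
{
    int va[VF] = { 0 };   /* the "vector of partial results" */

    /* Vectorized loop body: lane j accumulates a[j], a[j+VF], a[j+2*VF], ... */
    for (size_t i = 0; i + VF <= n; i += VF)
        for (size_t j = 0; j < VF; j++)
            va[j] += a[i + j];

    /* Epilog: reduce the VF partial results into a single scalar. */
    int result = 0;
    for (size_t j = 0; j < VF; j++)
        result += va[j];
    return result;
}
```

The final loop is the part that the epilog schemes discussed in this message implement in different ways.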
This patch adds two further ways to generate the reduction epilog code,
neither of which requires a specialized REDUC_op:
(alternative 1) sequentially compute the reduction on the scalar elements
(the partial results) of the vector.
(alternative 2) For targets that support whole vector shifts, we generate
the following:
    for (offset = VS/2; offset >= element_size; offset /= 2) {
        va' = vec_shift_left <va, offset>
        va = vop <va, va'>
    }
    a = bit_field_ref_expr <va, bitpos>
For example, when VS=16 bytes and the data type operated upon is ints, this
will be generated:
    va_4 = vec_shift_left <va_3, 8>
    va_5 = vop <va_3, va_4>
    va_6 = vec_shift_left <va_5, 4>
    va_7 = vop <va_5, va_6>
    a_5 = bit_field_ref_expr <va_7, bitpos>
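The two alternatives can be simulated in plain C to check that they agree. This is a rough sketch under stated assumptions, not the code the vectorizer emits: vec_t, vec_shift_bytes, vec_add, reduce_by_shifts and reduce_sequentially are hypothetical names, the operation is fixed to integer addition, and the shift is modelled as moving bytes toward element 0 with zero fill. On a real target, which element ends up holding the result depends on the shift direction and on endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VS 16                               /* vector size in bytes */
#define NELEMS (VS / sizeof(int32_t))       /* four ints per vector */

/* Hypothetical stand-in for a 16-byte vector register. */
typedef struct { int32_t e[NELEMS]; } vec_t;

/* Whole-vector shift by a constant byte count. Element i receives
   element i + nbytes/sizeof(int32_t); vacated bytes are zeroed. */
static vec_t vec_shift_bytes(vec_t v, size_t nbytes)
{
    vec_t r;
    memset(&r, 0, sizeof r);
    memcpy(&r, (const unsigned char *)&v + nbytes, VS - nbytes);
    return r;
}

/* Element-wise vector operation ("vop"); addition in this sketch. */
static vec_t vec_add(vec_t a, vec_t b)
{
    vec_t r;
    for (size_t i = 0; i < NELEMS; i++)
        r.e[i] = a.e[i] + b.e[i];
    return r;
}

/* Alternative 2: log2(NELEMS) shift-and-operate steps, then extract. */
static int32_t reduce_by_shifts(vec_t va)
{
    for (size_t offset = VS / 2; offset >= sizeof(int32_t); offset /= 2) {
        vec_t shifted = vec_shift_bytes(va, offset);
        va = vec_add(va, shifted);
    }
    return va.e[0];   /* plays the role of bit_field_ref_expr */
}

/* Alternative 1: combine the elements one at a time. */
static int32_t reduce_sequentially(vec_t va)
{
    int32_t acc = va.e[0];
    for (size_t i = 1; i < NELEMS; i++)
        acc += va.e[i];
    return acc;
}
```

With VS = 16 and 4-byte elements the loop in reduce_by_shifts runs with offsets 8 and 4, matching the example above. The zero fill never reaches element 0, so the same structure also works for min and max as long as the result is read from that element.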
This scheme requires adding new tree-codes for the vector-shifts. I
introduced the simplest vector-shift possible - the shift amount is in
bytes, and is a constant (immediate value). This has the potential to be
easily expanded to the most efficient code, and be applicable to as many
targets as possible.
The overall scheme is:
    if (HAVE_REDUC_op)
        generate current scheme using REDUC_op
    else if (HAVE_whole-vector-shift)
        generate alternative 2
    else
        generate alternative 1
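The selection order above can be written as a small C function. This is only a sketch of the decision logic: the enum, the function name, and the two boolean parameters are hypothetical and stand in for whatever target queries the vectorizer actually performs.

```c
#include <stdbool.h>

/* Hypothetical labels for the three epilog schemes. */
enum epilog_scheme {
    EPILOG_REDUC_OP,          /* current scheme: REDUC_op + bit-field extract */
    EPILOG_VECTOR_SHIFTS,     /* alternative 2: whole-vector shifts */
    EPILOG_SCALAR_ELEMENTS    /* alternative 1: sequential scalar reduction */
};

/* Pick the most specialized scheme the target supports. */
static enum epilog_scheme
choose_epilog_scheme(bool have_reduc_op, bool have_whole_vector_shift)
{
    if (have_reduc_op)
        return EPILOG_REDUC_OP;
    if (have_whole_vector_shift)
        return EPILOG_VECTOR_SHIFTS;
    return EPILOG_SCALAR_ELEMENTS;
}
```

Because the last branch needs no target support at all, every target gets some epilog.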
This allows vectorizing reductions without special target support - the
testcases are changed appropriately (enabled for all relevant targets). I also
removed the now unnecessary reduc_op patterns from altivec.md (these were
implemented using vector shifts).
Bootstrapped and tested on powerpc-darwin and i686-pc-linux-gnu, committed
to autovect branch.
dorit
Changelog:
* genopinit.c (vec_shli_optab, vec_shri_optab): Initialize new
optabs.
(reduc_plus_optab): Removed. Replaced with...
(reduc_splus_optab, reduc_uplus_optab): Initialize new optabs.
* optabs.c (optab_for_tree_code): Return reduc_splus_optab or
reduc_uplus_optab instead of reduc_plus_optab.
(expand_vec_shift_expr): New function.
(init_optabs): Initialize new optabs. Remove initialization of
reduc_plus_optab.
* optabs.h (OTI_reduc_plus): Removed. Replaced with...
(OTI_reduc_splus, OTI_reduc_uplus): New.
(reduc_plus_optab): Removed. Replaced with...
(reduc_splus_optab, reduc_uplus_optab): New optabs.
(vec_shli_optab, vec_shri_optab): New optabs.
(expand_vec_shift_expr): New function declaration.
* tree.def (VEC_LSHIFT_EXPR, VEC_RSHIFT_EXPR): New tree-codes.
* tree-inline.c (estimate_num_insns_1): Handle new tree-codes.
* expr.c (expand_expr_real_1): Handle new tree-codes.
* tree-pretty-print.c (dump_generic_node, op_symbol, op_prio):
Likewise.
* tree-vect-transform.c (vect_create_epilog_for_reduction): Add two
alternatives for generating reduction epilog code.
(vectorizable_reduction): Don't fail if direct reduction support is
not available.
(vectorizable_target_reduction_pattern): Likewise.
* config/rs6000/altivec.md (reduc_smax_v4si, reduc_smax_v4sf,
reduc_umax_v4si, reduc_smin_v4si, reduc_smin_v4sf, reduc_umin_v4si,
reduc_plus_v4si, reduc_plus_v4sf): Removed.
(vec_shli_<mode>, vec_shri_<mode>, altivec_vsumsws_nomode,
reduc_splus_<mode>, reduc_uplus_v16qi): New.
testsuite/Changelog:
* gcc.dg/vect/no_version/vect-reduc-1.c: Vectorizable on all
relevant platforms - remove "target powerpc*-*-*" restriction.
* gcc.dg/vect/no_version/vect-reduc-2.c: Likewise.
* gcc.dg/vect/no_version/vect-reduc-3.c: Likewise.
* gcc.dg/vect/no_version/vect-reduc-2short.c: New test.
* gcc.dg/vect/no_version/vect-reduc-2char.c: New test.
* gcc.dg/vect/no_version/vect-reduc-3short.c: New test.
* gcc.dg/vect/no_version/vect-reduc-3char.c: New test.
patch:
(See attached file: autovect.june15)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: autovect.june15
Type: application/octet-stream
Size: 45679 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050615/78fcb1b8/attachment.obj>