[RFC] [patch] Support vectorization of min/max location pattern
Ira Rosen
IRAR@il.ibm.com
Tue Jul 6 07:15:00 GMT 2010
gcc-patches-owner@gcc.gnu.org wrote on 01/07/2010 11:00:50 AM:
> Hi,
>
> This patch adds vectorization support of min/max location pattern:
>
> for (i = 0; i < N; i++)
> if (arr[i] < limit)
> {
> pos = i + 1;
> limit = arr[i];
> }
>
> The recognized pattern is compound of two statements (and is called
> compound pattern):
>
> # pos_22 = PHI <pos_1(4), 1(2)>
> # limit_24 = PHI <limit_4(4), 0(2)>
> ...
> pos_1 = [cond_expr] limit_9 < limit_24 ? pos_10 : pos_22;
> limit_4 = [cond_expr] limit_9 < limit_24 ? limit_9 : limit_24;
>
> both statements should be reductions with cond_expr and have the same
> condition part. The min/max statement is expected to be of the form "x op
> y ? x : y" (where op can be >, <, >= or <=), and the location is expected
> to be an induction.
>
> To vectorize min/max location pattern we use a technique described in
> "Multimedia vectorization of floating-point MIN/MAX reductions" by
> A.J.C.Bik, X.Tian and M.B.Girkar,
> http://portal.acm.org/citation.cfm?id=1145765.
>
> Vectorized loop (maxloc, first index):
> vcx[0:vl-1:1] = | x |..| x |; - vector of max values
> vck[0:vl-1:1] = | k |..| k |; - vector of positions
> ind[0:vl-1:1] = |vl-1|..| 0 |;
> inc[0:vl-1:1] = | vl |..| vl |;
> for (i = 0; i < N; i += vl) {
> msk[0:vl-1:1] = (a[i:i+vl-1:1] > vcx[0:vl-1:1]);
> vck[0:vl-1:1] = (ind[0:vl-1:1] & msk[0:vl-1:1]) |
> (vck[0:vl-1:1] & !msk[0:vl-1:1]);
> vcx[0:vl-1:1] = VMAX(vcx[0:vl-1:1], a[i:i+vl-1:1]);
> ind[0:vl-1:1] += inc[0:vl-1:1];
> }
> x = HMAX(vcx[0:vl-1:1]); - scalar maximum extraction
> msk[0:vl-1:1] = (vcx[0:vl-1:1] == |x|..|x|);
> vck[0:vl-1:1] = (vck[0:vl-1:1] & msk[0:vl-1:1]) |
> (|MaxInt|..|MaxInt| & !msk[0:vl-1:1]);
> k = HMIN(vck[0:vl-1:1]); - first position extraction
>
>
> Vectorization of minloc is supposed to help gas_dyn from Polyhedron as
> discussed in PR 31067.
>
> PRs 44710 and 44711 currently prevent the vectorization. PR 44711 can be
> bypassed by using -fno-tree-pre. I'll wait for a fix of PR 44710 before I
> commit this patch (after I regtest it again).
> Also the case of pos = i; instead of pos = i+1; is not supported since in
> this case the operands are switched, i.e., we get "x op y ? y : x".
>
>
> My main question is the implementation of vector comparisons. I
understand
> that different targets can return different types of results. So instead
of
> defining new tree codes, I used target builtin which also returns the
type
> of the result.
>
> Other comments are welcome too.
>
> Bootstrapped and tested on powerpc64-suse-linux.
Since it looks like nobody objects the use of target builtins for vector
comparison, I am resubmitting an updated patch (the code) for review of
non-vectorizer parts.
Thanks,
Ira
ChangeLog:
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_VECT_COMPARE):
Document.
* target.def (builtin_vect_compare): Add new builtin.
* tree-vectorizer.h (enum vect_compound_pattern): New.
(struct _stmt_vec_info): Add new fields compound_pattern and
reduc_scalar_result_stmt. Add macros to access them.
(is_pattern_stmt_p): Return true for compound pattern.
(vectorizable_condition): Add arguments.
(vect_recog_compound_func_ptr): New function-pointer type.
(NUM_COMPOUND_PATTERNS): New.
(vect_compound_pattern_recog): Declare.
* tree-vect-loop.c (vect_determine_vectorization_factor): Fix assert
for compound patterns.
(vect_analyze_scalar_cycles_1): Fix typo. Detect compound reduction
patterns. Update comment.
(vect_analyze_scalar_cycles): Update comment.
(destroy_loop_vec_info): Update def stmt for the original pattern
statement.
(vect_is_simple_reduction_1): Skip compound pattern statements in
uses check. Add spaces. Skip commutativity and type checks for
minimum location statement. Fix printings.
(vect_model_reduction_cost): Add min/max location pattern cost
computation.
(vect_create_epilogue_for_compound_pattern): New function.
(vect_create_epilog_for_reduction): Don't retrieve the original
statement for compound pattern. Fix comment accordingly. Store the
result of vector reduction computation in a variable and use it. Call
vect_create_epilogue_for_compound_pattern (). Check if optab exists
before using it. Keep the scalar result computation statement. Use
either exit phi node result or compound pattern result in scalar
extraction. Don't expect to find an exit phi node for min/max
statement.
(vectorizable_reduction): Skip check for uses in loop for compound
patterns. Don't retrieve the original statement for compound pattern.
Call vectorizable_condition () with additional parameters. Skip
reduction code check for compound patterns. Prepare operands for
min/max location statement vectorization and pass them to
vectorizable_condition ().
(vectorizable_live_operation): Return TRUE for compound patterns.
* tree-vect-patterns.c (vect_recog_min_max_loc_pattern): Declare.
(vect_recog_compound_func_ptrs): Likewise.
(vect_recog_min_max_loc_pattern): New function.
(vect_compound_pattern_recog): Likewise.
* tree-vect-stmts.c (process_use): Mark compound pattern statements
as used by reduction.
(vect_mark_stmts_to_be_vectorized): Allow compound pattern statements
to be used by reduction.
(vectorize_minmax_location_pattern): New function.
(vectorizable_condition): Update comment, add arguments. Skip checks
irrelevant for compound pattern. Check that vector comparisons are
supported by the target. Prepare operands using new arguments. Call
vectorize_minmax_location_pattern().
(vect_analyze_stmt): Allow nested cycle statements to be used by
reduction. Call vectorizable_condition () with additional arguments.
(vect_transform_stmt): Call vectorizable_condition () with additional
arguments.
(new_stmt_vec_info): Initialize new fields.
* config/rs6000/rs6000-builtin.def (ALTIVEC_BUILTIN_VCMPLTFP): New.
(ALTIVEC_BUILTIN_VCMPLEFP): New.
* config/rs6000/rs6000.c (rs6000_builtin_vect_compare): New.
(TARGET_VECTORIZE_BUILTIN_VEC_CMP): Redefine.
(struct builtin_description bdesc_2arg): Add altivec_vcmpltfp and
altivec_vcmplefp.
* config/rs6000/altivec.md (altivec_vcmpltfp): New pattern.
(altivec_vcmplefp): Likewise.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Fail for compound
patterns.
(See attached file: minloc-new.txt)
>
> testsuite/ChangeLog:
>
> * gcc.dg/vect/vect.exp: Define how to run tests named fast-math*.c
> * lib/target-supports.exp (check_effective_target_vect_cmp): New.
> * gcc.dg/vect/fast-math-no-pre-minmax-loc-1.c: New test.
> * gcc.dg/vect/fast-math-no-pre-minmax-loc-2.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-3.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-4.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-5.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-6.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-7.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-8.c,
> gcc.dg/vect/fast-math-no-pre-minmax-loc-9.c: Likewise.
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: minloc-new.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20100706/7d71a795/attachment.txt>
More information about the Gcc-patches
mailing list