This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[patch] [4.2 projects] vectorize type conversions - 6/6


Last part of the support for vectorization in the presence of data-types of
different sizes.
This part adds support for vectorization of type promotion.

The patch includes:

1) detect a widening-multiplication pattern.

2) vectorize type promotion operations. Because the widened results do not
fit in one vector, a pair of vector stmts is usually required to vectorize
a type-promotion operation. As before, we "chain" them together via the
STMT_VINFO_RELATED_STMT field.
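
To illustrate why one input vector yields a pair of result vectors, here is
a minimal scalar sketch in C. It is not code from the patch: the model_*
names and the lane count of 8 are assumptions made for illustration, and
"low" simply means the first array indices (on real hardware the mapping of
hi/lo to lanes depends on the target). Only the signed case is modelled; the
unsigned variants would zero-extend instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NARROW_LANES 8                 /* assumed: eight 16-bit lanes per vector */
#define WIDE_LANES (NARROW_LANES / 2)  /* four 32-bit lanes fit in the same width */

/* Scalar model of a signed "unpack low": widen the first half of the lanes. */
static void model_unpack_lo(const int16_t in[NARROW_LANES],
                            int32_t out[WIDE_LANES])
{
    for (int i = 0; i < WIDE_LANES; i++)
        out[i] = (int32_t)in[i];
}

/* Scalar model of a signed "unpack high": widen the second half of the lanes. */
static void model_unpack_hi(const int16_t in[NARROW_LANES],
                            int32_t out[WIDE_LANES])
{
    for (int i = 0; i < WIDE_LANES; i++)
        out[i] = (int32_t)in[i + WIDE_LANES];
}

/* Promote a whole array: each narrow input vector produces two wide
   result vectors, one from the low half and one from the high half. */
static void model_promote(const int16_t *in, int32_t *out, size_t n)
{
    assert(n % NARROW_LANES == 0);
    for (size_t i = 0; i < n; i += NARROW_LANES) {
        model_unpack_lo(in + i, out + i);
        model_unpack_hi(in + i, out + i + WIDE_LANES);
    }
}
```

Calling model_promote on a 16-element array performs two iterations, and each
iteration writes two 4-element results, which mirrors the pair of chained
vector stmts described above.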

Two type-promotion operations are supported: cast (NOP_EXPR) and
widening-multiplication (WIDEN_MULT_EXPR). The cast is vectorized using the
vec_unpack_hi/lo idioms. The widening-multiplication is vectorized in one
of two ways: (a) if the vectorizer can prove that the order of the products
does not have to be preserved (e.g. the result of the multiplication is
used only to feed a reduction computation), then the target hooks
builtin_mul_widen_even/odd are used, if available (Altivec uses this in
vect-widen-mult-sum.c for example); (b) otherwise, it is vectorized using
the vec_widen_mult_hi/lo idioms, which preserve the order of the products.
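
The difference between the two strategies can be sketched with a scalar
model in C. This is an illustration only: the model_* names, the lane count
and the lane numbering are assumptions, not the patch's code. The hi/lo pair
keeps the products in element order, the even/odd pair interleaves them, and
a sum reduction over either pair gives the same total.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8           /* assumed: eight 16-bit lanes per vector */
#define HALF (LANES / 2)  /* four 32-bit products per result vector */

/* Ordered: products of lanes 0..HALF-1. */
static void model_widen_mult_lo(const int16_t a[LANES], const int16_t b[LANES],
                                int32_t out[HALF])
{
    for (int i = 0; i < HALF; i++)
        out[i] = (int32_t)a[i] * (int32_t)b[i];
}

/* Ordered: products of lanes HALF..LANES-1. */
static void model_widen_mult_hi(const int16_t a[LANES], const int16_t b[LANES],
                                int32_t out[HALF])
{
    for (int i = 0; i < HALF; i++)
        out[i] = (int32_t)a[i + HALF] * (int32_t)b[i + HALF];
}

/* Unordered: products of the even-numbered lanes 0, 2, 4, 6. */
static void model_mul_widen_even(const int16_t a[LANES], const int16_t b[LANES],
                                 int32_t out[HALF])
{
    for (int i = 0; i < HALF; i++)
        out[i] = (int32_t)a[2 * i] * (int32_t)b[2 * i];
}

/* Unordered: products of the odd-numbered lanes 1, 3, 5, 7. */
static void model_mul_widen_odd(const int16_t a[LANES], const int16_t b[LANES],
                                int32_t out[HALF])
{
    for (int i = 0; i < HALF; i++)
        out[i] = (int32_t)a[2 * i + 1] * (int32_t)b[2 * i + 1];
}

/* Sum reduction over a pair of result vectors. */
static int32_t model_sum_pair(const int32_t x[HALF], const int32_t y[HALF])
{
    int32_t sum = 0;
    for (int i = 0; i < HALF; i++)
        sum += x[i] + y[i];
    return sum;
}
```

Storing the even/odd results back to memory as if they were consecutive
would scramble the element order, which is why that form is only safe when
the products feed a reduction.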

Currently type-promotion is supported only for the case in which the wider
type (the type of the result) is exactly twice as wide as the smaller type
(the type of the arguments). This restriction will be relaxed in the future.
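
As an illustration of the restriction, the following C loops show the
shapes of scalar code involved. The function names are hypothetical and the
loops are not taken from the testsuite; whether a particular compiler
build actually vectorizes them depends on the target and the options used.
Fixed-width types are used so that the 2x and 4x width ratios are explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cast from 16 to 32 bits: the result is exactly twice as wide,
   so this is the kind of promotion the restriction allows. */
void promote_s16_to_s32(const int16_t *in, int32_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)in[i];
}

/* Widening multiplication: 16-bit operands, 32-bit product.
   Again the result type is exactly twice as wide as the argument type. */
void widen_mult_s16(const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)a[i] * (int32_t)b[i];
}

/* Cast from 8 to 32 bits: the result is four times as wide,
   which falls outside the "exactly twice as wide" restriction. */
void promote_s8_to_s32(const int8_t *in, int32_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)in[i];
}
```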

3) The following new tree-codes/optabs/target-hooks were added:

vec_unpacku_hi_optab, vec_unpacks_hi_optab and VEC_UNPACK_HI_EXPR (widening
of the high elements of a vector).

vec_unpacku_lo_optab, vec_unpacks_lo_optab and VEC_UNPACK_LO_EXPR (widening
of the low elements of a vector).

vec_widen_umult_hi_optab, vec_widen_smult_hi_optab and
VEC_WIDEN_MULT_HI_EXPR (widening multiplication of the high elements of two
vectors).

vec_widen_umult_lo_optab, vec_widen_smult_lo_optab and
VEC_WIDEN_MULT_LO_EXPR (widening multiplication of the low elements of two
vectors).

TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD,
TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN
(widening multiplication of the odd/even elements of two vectors).
These hooks are used only for the "unordered multiplication" case, i.e.
when the multiplication is used only to feed a reduction computation. We
detect this case during the vect_mark_stmts_to_be_vectorized scan; for this
purpose the "relevant" field was changed from bool to enum, and can take
one of the following values: "not used in loop", "used for reduction
only", "used in loop".
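
A minimal sketch in C of how such a three-valued field can drive the choice
between the two multiplication forms. All identifiers below are hypothetical
(the patch names the type enum vect_relevant; its enumerator names and the
real selection logic are not reproduced here), so treat this as a model of
the idea rather than the implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three values described above. */
enum relevance_sketch {
    SKETCH_UNUSED_IN_LOOP = 0,   /* "not used in loop" */
    SKETCH_USED_BY_REDUCTION,    /* "used for reduction only" */
    SKETCH_USED_IN_LOOP          /* "used in loop" */
};

/* The old boolean question, expressed in terms of the enum. */
static bool sketch_relevant_p(enum relevance_sketch r)
{
    return r != SKETCH_UNUSED_IN_LOOP;
}

enum mult_idiom_sketch {
    SKETCH_MULT_EVEN_ODD,  /* unordered: target even/odd hooks */
    SKETCH_MULT_HI_LO      /* ordered: widen_mult hi/lo idioms */
};

/* Pick the unordered form only when the products feed nothing but a
   reduction and the target provides the even/odd hooks. */
static enum mult_idiom_sketch
sketch_choose_mult_idiom(enum relevance_sketch r, bool target_has_even_odd)
{
    assert(sketch_relevant_p(r));  /* irrelevant stmts are not vectorized */
    if (r == SKETCH_USED_BY_REDUCTION && target_has_even_odd)
        return SKETCH_MULT_EVEN_ODD;
    return SKETCH_MULT_HI_LO;
}
```

A plain bool cannot distinguish "used for reduction only" from "used in
loop", which is the distinction the even/odd hooks depend on.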

4) modelling for the optabs and target-hooks for Altivec, and the SSE
modelling that rth contributed to autovect-branch. I did not include the
ia64 modelling for these optabs (also available on autovect-branch) because
I couldn't test it, so some of the new testcases will fail on ia64 for now
(unless we remove ia64 from the target keywords in target-support.exp).

5) New testcases and new target keywords defined in target-support.exp

Bootstrapped with vectorization enabled and tested on the vectorizer
testcases on powerpc-linux.
Also bootstrapped and tested on the vectorizer testcases on
i686-pc-linux-gnu.
Also tested with 'make info' and 'make dvi'.

ok for mainline?

thanks,
dorit

:ADDPATCH SSA (vectorizer):

2006-02-14  Dorit Nuzman  <dorit@il.ibm.com>

        * tree-vect-analyze.c (vect_analyze_operations): Call
        vectorizable_type_promotion.
        * tree-vectorizer.h (type_promotion_vec_info_type): New enum
        stmt_vec_info_type value.
        (supportable_widening_operation, vectorizable_type_promotion): New
        function declarations.
        * tree-vect-transform.c (vect_gen_widened_results_half): New
        function.
        (vectorizable_type_promotion): New function.
        (vect_transform_stmt): Call vectorizable_type_promotion.
        * tree-vect-analyze.c (supportable_widening_operation): New
        function.
        * tree-vect-patterns.c (vect_recog_dot_prod_pattern):
        Add implementation.
        * tree-vect-generic.c (expand_vector_operations_1): Consider
        correct mode.

        * tree.def (VEC_WIDEN_MULT_HI_EXPR, VEC_WIDEN_MULT_LO_EXPR):
        (VEC_UNPACK_HI_EXPR, VEC_UNPACK_LO_EXPR): New tree-codes.
        * tree-inline.c (estimate_num_insns_1): Add cases for above new
        tree-codes.
        * tree-pretty-print.c (dump_generic_node, op_prio): Likewise.
        * expr.c (expand_expr_real_1): Likewise.
        * optabs.c (optab_for_tree_code): Likewise.
        (init_optabs): Initialize new optabs.
        * genopinit.c (vec_widen_umult_hi_optab, vec_widen_umult_lo_optab,
        vec_widen_smult_hi_optab, vec_widen_smult_lo_optab,
        vec_unpacks_hi_optab, vec_unpacks_lo_optab, vec_unpacku_hi_optab,
        vec_unpacku_lo_optab): Initialize new optabs.
        * optabs.h (OTI_vec_widen_umult_hi, OTI_vec_widen_umult_lo):
        (OTI_vec_widen_smult_hi, OTI_vec_widen_smult_lo, OTI_vec_unpacks_hi,
        OTI_vec_unpacks_lo, OTI_vec_unpacku_hi, OTI_vec_unpacku_lo): New
        optab indices.
        (vec_widen_umult_hi_optab, vec_widen_umult_lo_optab):
        (vec_widen_smult_hi_optab, vec_widen_smult_lo_optab):
        (vec_unpacks_hi_optab, vec_unpacku_hi_optab, vec_unpacks_lo_optab):
        (vec_unpacku_lo_optab): New optabs.
        * doc/md.texi (vec_unpacks_hi, vec_unpacks_lo, vec_unpacku_hi):
        (vec_unpacku_lo, vec_widen_umult_hi, vec_widen_umult_lo):
        (vec_widen_smult_hi, vec_widen_smult_lo): New.

        * config/rs6000/altivec.md (UNSPEC_VMULWHUB, UNSPEC_VMULWLUB):
        (UNSPEC_VMULWHSB, UNSPEC_VMULWLSB, UNSPEC_VMULWHUH):
        (UNSPEC_VMULWLUH, UNSPEC_VMULWHSH, UNSPEC_VMULWLSH): New.
        (UNSPEC_VPERMSI, UNSPEC_VPERMHI): New.
        (vec_vperm_v8hiv4si, vec_vperm_v16qiv8hi): New patterns used to
        implement the unsigned unpacking patterns.
        (vec_unpacks_hi_v16qi, vec_unpacks_hi_v8hi, vec_unpacks_lo_v16qi):
        (vec_unpacks_lo_v8hi): New signed unpacking patterns.
        (vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi):
        (vec_unpacku_lo_v8hi): New unsigned unpacking patterns.
        (vec_widen_umult_hi_v16qi, vec_widen_umult_lo_v16qi):
        (vec_widen_smult_hi_v16qi, vec_widen_smult_lo_v16qi):
        (vec_widen_umult_hi_v8hi, vec_widen_umult_lo_v8hi):
        (vec_widen_smult_hi_v8hi, vec_widen_smult_lo_v8hi): New widening
        multiplication patterns.

        * target.h (builtin_mul_widen_even, builtin_mul_widen_odd): New.
        * target-def.h (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN):
        (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): New.
        * config/rs6000/rs6000.c (rs6000_builtin_mul_widen_even): New.
        (rs6000_builtin_mul_widen_odd): New.
        (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN): Defined.
        (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): Defined.
        * tree-vectorizer.h (enum vect_relevant): New enum type.
        (_stmt_vec_info): Field relevant changed from bool to enum
        vect_relevant.
        (STMT_VINFO_RELEVANT_P): Updated.
        (STMT_VINFO_RELEVANT): New.
        * tree-vectorizer.c (new_stmt_vec_info): Use STMT_VINFO_RELEVANT
        instead of STMT_VINFO_RELEVANT_P.
        * tree-vect-analyze.c (vect_mark_relevant, vect_stmt_relevant_p):
        Take enum argument instead of bool. Replace STMT_VINFO_RELEVANT_P
        with STMT_VINFO_RELEVANT.
        (vect_mark_stmts_to_be_vectorized): Likewise + update
        documentation.
        * doc/tm.texi (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN): New.
        (TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD): New.

2006-02-14  Richard Henderson  <rth@redhat.com>

        * config/i386/sse.md (vec_widen_smult_hi_v8hi,
        vec_widen_smult_lo_v8hi, vec_widen_umult_hi_v8hi,
        vec_widen_smult_hi_v4si, vec_widen_smult_lo_v4si,
        vec_widen_umult_hi_v4si, vec_widen_umult_lo_v4si): New.

        * config/i386/i386.c (ix86_expand_sse_unpack): New.
        * config/i386/i386-protos.h (ix86_expand_sse_unpack): New.
        * config/i386/sse.md (vec_unpacku_hi_v16qi, vec_unpacks_hi_v16qi,
        vec_unpacku_lo_v16qi, vec_unpacks_lo_v16qi, vec_unpacku_hi_v8hi,
        vec_unpacks_hi_v8hi, vec_unpacku_lo_v8hi, vec_unpacks_lo_v8hi,
        vec_unpacku_hi_v4si, vec_unpacks_hi_v4si, vec_unpacku_lo_v4si,
        vec_unpacks_lo_v4si): New.

        * config/i386/sse.md (vec_interleave_highv4sf): Rename
        sse_unpckhps.
        (vec_interleave_lowv4sf): Rename sse_unpcklps.
        (vec_interleave_highv2df, vec_interleave_lowv2df): New.
        (sse2_unpckhpd, sse2_unpcklpd): Add leading * to name.
        (vec_interleave_highv16qi): Rename sse2_punpckhbw.
        (vec_interleave_lowv16qi): Rename sse2_punpcklbw.
        (vec_interleave_highv8hi): Rename sse2_punpckhwd.
        (vec_interleave_lowv8hi): Rename sse2_punpcklwd.
        (vec_interleave_highv4si): Rename sse2_punpckhdq.
        (vec_interleave_lowv4si): Rename sse2_punpckldq.
        (vec_interleave_highv2di): Rename sse2_punpckhqdq.
        (vec_interleave_lowv2di): Rename sse2_punpcklqdq.
        * config/i386/i386.c: Update to match.

testsuite/ChangeLog:

        * gcc.dg/vect/vect-1.c: Loop with multiple types removed (appears
        in vect-9.c).
        * gcc.dg/vect/vect-106.c: Removed (duplicate of vect-9.c).
        * gcc.dg/vect/vect-9.c: Now vectorizable.
        * gcc.dg/vect/vect-reduc-dot-s16a.c: Now vectorizable also on
        targets that support vect_widen_mult.
        * gcc.dg/vect/vect-reduc-dot-u16.c: Removed (split into two new
        tests).
        * gcc.dg/vect/vect-reduc-dot-u16a.c: New test (split from
        vect-reduc-dot-u16.c).
        * gcc.dg/vect/vect-reduc-dot-u16b.c: New test (split from
        vect-reduc-dot-u16.c).
        * gcc.dg/vect/vect-reduc-dot-s8.c: Removed (split into three new
        tests).
        * gcc.dg/vect/vect-reduc-dot-s8a.c: New test (split from
        vect-reduc-dot-s8.c).
        * gcc.dg/vect/vect-reduc-dot-s8b.c: New test (split from
        vect-reduc-dot-s8.c).
        * gcc.dg/vect/vect-reduc-dot-s8c.c: New test (split from
        vect-reduc-dot-s8.c).
        * gcc.dg/vect/vect-reduc-dot-u8.c: Removed (split into two new
        tests).
        * gcc.dg/vect/vect-reduc-dot-u8a.c: New test (split from
        vect-reduc-dot-u8.c).
        * gcc.dg/vect/vect-reduc-dot-u8b.c: New test (split from
        vect-reduc-dot-u8.c).
        * gcc.dg/vect/vect-widen-mult-sum.c: New test.
        * gcc.dg/vect/vect-multitypes-9.c: New test.
        * gcc.dg/vect/vect-multitypes-10.c: New test.
        * gcc.dg/vect/vect-widen-mult-s16.c: New test.
        * gcc.dg/vect/vect-widen-mult-u16.c: New test.
        * gcc.dg/vect/vect-widen-mult-u8.c: New test.
        * gcc.dg/vect/vect-widen-mult-s8.c: New test.
        * gcc.dg/vect/wrapv-vect-reduc-dot-s8.c: Removed.
        * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: New reduced version of
        wrapv-vect-reduc-dot-s8.c.
        * lib/target-support.exp (check_effective_target_vect_unpack): New.
        (check_effective_target_vect_widen_sum_hi_to_si): Now also includes
        targets that support vec_unpack.
        (check_effective_target_vect_widen_sum_qi_to_hi): Likewise.
        (check_effective_target_vect_widen_mult_qi_to_hi): New.
        (check_effective_target_vect_widen_mult_hi_to_si): New.

(See attached file: multitypes.patch6.txt)(See attached file: sse.patch6)

Attachment: multitypes.patch6.txt
Description: Text document

Attachment: sse.patch6
Description: Binary data

