[PATCH, vectorizer]: Vectorize int->double and double->int conversions

Uros Bizjak ubizjak@gmail.com
Tue Apr 24 12:48:00 GMT 2007


Hello!

This patch vectorizes int->double and double->int conversions. As
Dorit noted, the core change of the patch is in
vectorizable_conversion(), which now includes code to widen and narrow
operands (the same as in vectorizable_type_promotion/demotion).
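
As a rough illustration of the two new cases (a sketch only, not the code
from the patch): for a fixed vector size, the number of elements per vector
follows from the element size, and comparing the input and output counts
tells whether a conversion needs widening or narrowing. The enum and
function names below are hypothetical; only the nunits comparisons follow
the ChangeLog entry for vectorizable_conversion, and power-of-two element
counts are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, used for illustration only.  */
enum conv_kind { CONV_NONE, CONV_WIDEN, CONV_NARROW, CONV_UNSUPPORTED };

/* Classify a conversion between element types of IN_SIZE and OUT_SIZE
   bytes when both are held in vectors of VEC_BYTES bytes.  */
enum conv_kind
classify_conversion (size_t vec_bytes, size_t in_size, size_t out_size)
{
  assert (in_size != 0 && out_size != 0);

  size_t nunits_in = vec_bytes / in_size;
  size_t nunits_out = vec_bytes / out_size;

  if (nunits_in == nunits_out)
    return CONV_NONE;
  /* Same as nunits_out == nunits_in / 2 for even counts:
     one input vector produces two output vectors.  */
  if (nunits_in == 2 * nunits_out)
    return CONV_WIDEN;
  /* Same as nunits_in == nunits_out / 2 for even counts:
     two input vectors are packed into one output vector.  */
  if (nunits_out == 2 * nunits_in)
    return CONV_NARROW;
  return CONV_UNSUPPORTED;
}
```

With 16-byte SSE vectors, int->double (4 -> 8 bytes per element) falls into
the widening case and double->int into the narrowing case.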

Additionally, this patch implements supportable_narrowing_operation(),
a helper function that checks whether the target supports certain
narrowing optabs. While looking into this, I noticed that
supportable_widening_operation() should handle CONVERT_EXPR in the
same way as NOP_EXPR.

I didn't remove the vectorizable_type_promotion/demotion() functions, as
the patch is already big enough, but their removal should be trivial.
However, vectorizable_type_demotion() was changed to use
supportable_narrowing_operation().

Following that change, it should be relatively easy to add additional
widening/narrowing operations (e.g. float from DImode to SFmode should
be added as a narrowing op using a vec_pack_float optab, because we pack
2x2 elements into 4 elements - in contrast to float from SImode to
DFmode, where the operation is implemented as a widening op using the
vec_unpacks_float_[hi|lo] optabs). However, no target currently
implements DImode->SFmode vector conversion, so that optab is omitted
from the patch.
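
To make the element counts concrete, here is a scalar model of the two
directions handled by the patch (a sketch under stated assumptions, not the
expander code): a widening conversion splits one 4 x int vector into two
2 x double vectors, while a narrowing conversion packs two 2 x double
vectors into one 4 x int vector. The function names are hypothetical, and
treating elements 0-1 as the "lo" half is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Widening: low half of a 4 x int vector becomes a 2 x double vector.  */
void
unpack_float_lo (const int in[4], double out[2])
{
  out[0] = (double) in[0];
  out[1] = (double) in[1];
}

/* Widening: high half of a 4 x int vector becomes a 2 x double vector.  */
void
unpack_float_hi (const int in[4], double out[2])
{
  out[0] = (double) in[2];
  out[1] = (double) in[3];
}

/* Narrowing: two 2 x double vectors are truncated toward zero and
   packed into one 4 x int vector.  */
void
pack_sfix_trunc (const double a[2], const double b[2], int out[4])
{
  out[0] = (int) a[0];
  out[1] = (int) a[1];
  out[2] = (int) b[0];
  out[3] = (int) b[1];
}

/* One input vector per iteration, two output vectors.  */
void
int_to_double (const int *x, double *y, size_t n)
{
  assert (n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    {
      unpack_float_lo (x + i, y + i);
      unpack_float_hi (x + i, y + i + 2);
    }
}

/* Two input vectors per iteration, one output vector.  */
void
double_to_int (const double *x, int *y, size_t n)
{
  assert (n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    pack_sfix_trunc (x + i, x + i + 2, y + i);
}
```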

The following testcases illustrate the generated code:

int->double:

--cut here--
int x[256];
double y[256];

void foo(void)
{
  int i;
  for (i=0; i<256; ++i)
    y[i] =  (double) x[i];
}
--cut here--

compiles to:

.L2:
        movdqa  x(%eax), %xmm0
        cvtdq2pd        %xmm0, %xmm1
        pshufd  $238, %xmm0, %xmm0
        movapd  %xmm1, y(%eax,%eax)
        cvtdq2pd        %xmm0, %xmm0
        movapd  %xmm0, y+16(%eax,%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L2
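
For readers who prefer intrinsics to assembly, the loop body above
corresponds roughly to the following hand-written SSE2 code. This is only a
sketch of the same instruction sequence, assuming an x86 target with SSE2
enabled; the function name is hypothetical, and unaligned loads/stores are
used so that the sketch does not depend on array alignment (the generated
code uses the aligned movdqa/movapd forms).

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Convert N ints to doubles, four elements per iteration.  */
void
int_to_double_sse2 (const int *x, double *y, size_t n)
{
  assert (n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    {
      __m128i v = _mm_loadu_si128 ((const __m128i *) (x + i));
      /* cvtdq2pd: convert the two low ints to doubles.  */
      __m128d lo = _mm_cvtepi32_pd (v);
      /* pshufd $238: move elements 2 and 3 into the low half.  */
      __m128i high_ints = _mm_shuffle_epi32 (v, 238);
      /* cvtdq2pd: convert the former high ints.  */
      __m128d hi = _mm_cvtepi32_pd (high_ints);
      _mm_storeu_pd (y + i, lo);
      _mm_storeu_pd (y + i + 2, hi);
    }
}
```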

double->int:

--cut here--
double x[256];
int y[256];

void foo(void)
{
  int i;
  for (i=0; i<256; ++i)
    y[i] =  (int) x[i];
}
--cut here--

compiles to:

.L2:
        cvttpd2dq       x(%eax,%eax), %xmm0
        cvttpd2dq       x+16(%eax,%eax), %xmm1
        punpcklqdq      %xmm1, %xmm0
        movdqa  %xmm0, y(%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L2
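
The narrowing direction can be written with intrinsics in the same way.
Again this is a sketch of the instruction sequence above rather than
anything taken from the patch, assuming an x86 target with SSE2; the
function name is hypothetical and unaligned accesses are used for
simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Convert N doubles to ints with truncation, four elements per iteration.  */
void
double_to_int_sse2 (const double *x, int *y, size_t n)
{
  assert (n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    {
      /* cvttpd2dq: each call yields two ints in the low 64 bits.  */
      __m128i lo = _mm_cvttpd_epi32 (_mm_loadu_pd (x + i));
      __m128i hi = _mm_cvttpd_epi32 (_mm_loadu_pd (x + i + 2));
      /* punpcklqdq: combine the two low halves into one 4 x int vector.  */
      __m128i packed = _mm_unpacklo_epi64 (lo, hi);
      _mm_storeu_si128 ((__m128i *) (y + i), packed);
    }
}
```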

And finally, this case can now also be vectorized:

--cut here--
int x[256];
float y[256];
double z[256];

void foo(void)
{
  int i;
  for (i=0; i<256; ++i)
    z[i] =  (double) (x[i] * y[i]);
}
--cut here--

compiles to:

.L2:
        cvtdq2ps        x(%eax), %xmm0
        mulps   y(%eax), %xmm0
        movhlps %xmm0, %xmm2
        cvtps2pd        %xmm0, %xmm1
        movapd  %xmm1, z(%eax,%eax)
        cvtps2pd        %xmm2, %xmm0
        movapd  %xmm0, z+16(%eax,%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L2
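
An intrinsics rendering of this last loop, as a sketch under the same
assumptions as before (x86 with SSE2, hypothetical function name, unaligned
accesses): the int operand is first converted to float, the multiplication
is done in single precision, and the product is then widened to double in
two halves.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics (also provides the SSE ones) */

/* z[i] = (double) (x[i] * y[i]), four elements per iteration.  */
void
mul_to_double_sse2 (const int *x, const float *y, double *z, size_t n)
{
  assert (n % 4 == 0);
  for (size_t i = 0; i < n; i += 4)
    {
      /* cvtdq2ps: 4 x int -> 4 x float, same element count.  */
      __m128 xf = _mm_cvtepi32_ps (_mm_loadu_si128 ((const __m128i *) (x + i)));
      /* mulps: single-precision multiply.  */
      __m128 prod = _mm_mul_ps (xf, _mm_loadu_ps (y + i));
      /* movhlps: bring elements 2 and 3 into the low half.  */
      __m128 high = _mm_movehl_ps (prod, prod);
      /* cvtps2pd: widen the two low floats of each register.  */
      _mm_storeu_pd (z + i, _mm_cvtps_pd (prod));
      _mm_storeu_pd (z + i + 2, _mm_cvtps_pd (high));
    }
}
```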

The patch includes all documentation and testcases. This patch was
bootstrapped on i686-pc-linux-gnu and regression tested for all
default languages.

OK for mainline (patch needs middle-end approval)?

2007-04-24  Uros Bizjak  <ubizjak@gmail.com>

        PR tree-optimization/24659
        * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
        OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
        OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
        OTI_vec_pack_ufix_trunc.
        (vec_unpacks_float_hi_optab): Define new macro.
        (vec_unpacks_float_lo_optab): Ditto.
        (vec_unpacku_float_hi_optab): Ditto.
        (vec_unpacku_float_lo_optab): Ditto.
        (vec_pack_sfix_trunc_optab): Ditto.
        (vec_pack_ufix_trunc_optab): Ditto.
        * genopinit.c (optabs): Implement vec_unpack[s|u]_[hi|lo]_optab
        and vec_pack_[s|u]fix_trunc_optab using
        vec_unpack[s|u]_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
        * tree-vectorizer.c (supportable_widening_operation): Handle
        FLOAT_EXPR and CONVERT_EXPR.  Update comment.
        (supportable_narrowing_operation): New function.
        * tree-vectorizer.h (supportable_narrowing_operation): Prototype.
        * tree-vect-transform.c (vectorizable_conversion): Handle
        (nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
        (vect_gen_widened_results_half): Move before vectorizable_conversion.
        (vectorizable_type_demotion): Call supportable_narrowing_operation()
        to check for target support.
        * optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
        for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
        for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
        for VEC_PACK_FIX_TRUNC_EXPR.
        (expand_binop): Special case mode of the result for
        vec_pack_[u|s]fix_trunc_optab.
        (init_optabs): Initialize vec_unpack[s|u]_[hi|lo]_optab and
        vec_pack_[u|s]fix_trunc_optab.

        * tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
        VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
        * tree-pretty-print.c (dump_generic_node): Handle
        VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
        VEC_PACK_FIX_TRUNC_EXPR.
        (op_prio): Ditto.
        * expr.c (expand_expr_real_1): Ditto.
        * tree-inline.c (estimate_num_insns_1): Ditto.
        * tree-vect-generic.c (expand_vector_operations_1): Ditto.

        * config/i386/sse.md (vec_unpacks_float_hi_v4si): New expander.
        (vec_unpacks_float_lo_v4si): Ditto.
        (vec_pack_sfix_trunc_v2df): Ditto.

        * doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
        Document.
        [VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
        [VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
        * doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
        [vec_pack_ufix_trunc]: Ditto.
        [vec_unpacks_float_hi]: Ditto.
        [vec_unpacks_float_lo]: Ditto.
        [vec_unpacku_float_hi]: Ditto.
        [vec_unpacku_float_lo]: Ditto.

testsuite/ChangeLog:

2007-04-24  Uros Bizjak  <ubizjak@gmail.com>

        PR tree-optimization/24659
        * gcc.dg/vect/vect-floatint-conversion-2.c: New test.
        * gcc.dg/vect/vect-intfloat-conversion-3.c: New test.

Uros.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vect-intfloat.diff
Type: application/octet-stream
Size: 38728 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070424/87df7091/attachment.obj>

