[PATCH, vectorizer]: Vectorize int->double and double->int conversions
Uros Bizjak
ubizjak@gmail.com
Tue Apr 24 12:48:00 GMT 2007
Hello!
This patch vectorizes int->double and double->int conversions. As
noticed by Dorit, the core change of the patch is in
vectorizable_conversion() that now includes code to widen and narrow
operands (the same as in vectorizable_type_promotion/demotion).
Additionally, this patch implements the supportable_narrowing_operation()
helper function, which checks whether the target supports certain
narrowing optabs. While looking into this, I noticed that
supportable_widening_operation() should handle CONVERT_EXPR in the
same way as NOP_EXPR.
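To make the widen/narrow distinction concrete, here is a minimal sketch of how a conversion can be classified from the element counts of the input and output vector types (the nunits_in/nunits_out relations named in the ChangeLog below). The enum and function names are hypothetical and this is not the code from the patch.

```c
/* Hypothetical names; a sketch of the idea, not the patch itself.  */
enum conv_kind { CONV_NONE, CONV_WIDEN, CONV_NARROW, CONV_UNSUPPORTED };

/* nunits_in and nunits_out are the number of elements in the input
   and output vector types (assumed to be powers of two).  */
enum conv_kind
classify_conversion (unsigned nunits_in, unsigned nunits_out)
{
  if (nunits_out == nunits_in)
    return CONV_NONE;         /* e.g. V4SI -> V4SF: one input, one output */
  if (nunits_out * 2 == nunits_in)
    return CONV_WIDEN;        /* e.g. V4SI -> V2DF: one input, two outputs */
  if (nunits_in * 2 == nunits_out)
    return CONV_NARROW;       /* e.g. V2DF -> V4SI: two inputs, one output */
  return CONV_UNSUPPORTED;    /* larger size ratios are not covered here */
}
```

For int->double on SSE2, the input vector holds 4 elements and the output vector holds 2, so the conversion is a widening one; double->int is the mirror image.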
I didn't remove the vectorizable_type_promotion/demotion() functions, as
the patch is already big enough, but this removal should be trivial.
However, vectorizable_type_demotion() was changed to use
supportable_narrowing_operation().
Following that change, it should be relatively easy to add additional
widening/narrowing operations (e.g. float from DImode to SFmode should
be added as a narrowing op using a vec_pack_float optab because we pack
2x2 elements into 4 elements - in contrast to float from SImode to
DFmode, where the operation is implemented as a widening op using the
vec_unpacks_float_[hi|lo] optab). However, no target currently
implements DImode->SFmode vector conversion, so the optab is omitted
from the patch.
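As an illustration of the two directions, here is a scalar model of what the new operations compute on 128-bit vectors. The function names are hypothetical, and the lo/hi element numbering assumes the little-endian layout used on x86.

```c
/* Scalar model only; hypothetical names, little-endian lane order assumed.  */

/* Widening: one 4 x int vector yields two 2 x double vectors.  */
void
model_unpacks_float_lo (const int in[4], double out[2])
{
  out[0] = (double) in[0];
  out[1] = (double) in[1];
}

void
model_unpacks_float_hi (const int in[4], double out[2])
{
  out[0] = (double) in[2];
  out[1] = (double) in[3];
}

/* Narrowing: two 2 x double vectors are packed into one 4 x int vector.
   The conversion truncates toward zero, as a C cast does.  */
void
model_pack_sfix_trunc (const double a[2], const double b[2], int out[4])
{
  out[0] = (int) a[0];
  out[1] = (int) a[1];
  out[2] = (int) b[0];
  out[3] = (int) b[1];
}
```

The widening case needs two operations per input vector (one for each half), while the narrowing case consumes two input vectors per output vector.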
The following testcases illustrate the generated code:
int->double:
--cut here--
int x[256];
double y[256];
void foo(void)
{
int i;
for (i=0; i<256; ++i)
y[i] = (double) x[i];
}
--cut here--
compiles to:
.L2:
movdqa x(%eax), %xmm0
cvtdq2pd %xmm0, %xmm1
pshufd $238, %xmm0, %xmm0
movapd %xmm1, y(%eax,%eax)
cvtdq2pd %xmm0, %xmm0
movapd %xmm0, y+16(%eax,%eax)
addl $16, %eax
cmpl $1024, %eax
jne .L2
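For readers who prefer intrinsics to assembly, the loop body above corresponds roughly to the following sketch. The function name is hypothetical; unaligned loads and stores are used so the sketch does not depend on array alignment (the generated code uses aligned moves), and a scalar loop handles the tail and targets without SSE2.

```c
#if defined (__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper mirroring the int->double loop.  */
void
int_to_double (const int *x, double *y, int n)
{
  int i = 0;
#if defined (__SSE2__)
  for (; i + 4 <= n; i += 4)
    {
      __m128i v = _mm_loadu_si128 ((const __m128i *) (x + i));
      __m128d lo = _mm_cvtepi32_pd (v);        /* cvtdq2pd: elements 0, 1 */
      __m128i vh = _mm_shuffle_epi32 (v, 238); /* pshufd $238: move 2, 3 down */
      __m128d hi = _mm_cvtepi32_pd (vh);       /* cvtdq2pd: elements 2, 3 */
      _mm_storeu_pd (y + i, lo);
      _mm_storeu_pd (y + i + 2, hi);
    }
#endif
  for (; i < n; ++i)   /* scalar tail, and fallback without SSE2 */
    y[i] = (double) x[i];
}
```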
double->int:
--cut here--
double x[256];
int y[256];
void foo(void)
{
int i;
for (i=0; i<256; ++i)
y[i] = (int) x[i];
}
--cut here--
compiles to:
.L2:
cvttpd2dq x(%eax,%eax), %xmm0
cvttpd2dq x+16(%eax,%eax), %xmm1
punpcklqdq %xmm1, %xmm0
movdqa %xmm0, y(%eax)
addl $16, %eax
cmpl $1024, %eax
jne .L2
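The narrowing direction can be sketched with intrinsics in the same way. Again the function name is hypothetical, unaligned accesses replace the aligned moves of the generated code, and the scalar loop covers the tail and non-SSE2 targets. Inputs are assumed to fit in an int.

```c
#if defined (__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper mirroring the double->int loop.  */
void
double_to_int (const double *x, int *y, int n)
{
  int i = 0;
#if defined (__SSE2__)
  for (; i + 4 <= n; i += 4)
    {
      /* cvttpd2dq: two doubles -> two ints in the low half, truncating.  */
      __m128i lo = _mm_cvttpd_epi32 (_mm_loadu_pd (x + i));
      __m128i hi = _mm_cvttpd_epi32 (_mm_loadu_pd (x + i + 2));
      /* punpcklqdq: combine the two low halves into one 4 x int vector.  */
      __m128i v = _mm_unpacklo_epi64 (lo, hi);
      _mm_storeu_si128 ((__m128i *) (y + i), v);
    }
#endif
  for (; i < n; ++i)   /* scalar tail, and fallback without SSE2 */
    y[i] = (int) x[i];
}
```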
And finally, this case could also be vectorized:
--cut here--
int x[256];
float y[256];
double z[256];
void foo(void)
{
int i;
for (i=0; i<256; ++i)
z[i] = (double) (x[i] * y[i]);
}
--cut here--
compiles to:
.L2:
cvtdq2ps x(%eax), %xmm0
mulps y(%eax), %xmm0
movhlps %xmm0, %xmm2
cvtps2pd %xmm0, %xmm1
movapd %xmm1, z(%eax,%eax)
cvtps2pd %xmm2, %xmm0
movapd %xmm0, z+16(%eax,%eax)
addl $16, %eax
cmpl $1024, %eax
jne .L2
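The mixed case combines a same-size conversion (int->float), a multiply, and a widening conversion (float->double). A rough intrinsics equivalent, with a hypothetical function name, unaligned accesses instead of the aligned moves above, and a scalar loop for the tail and non-SSE2 targets:

```c
#if defined (__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics (also provides the SSE ones) */
#endif

/* Hypothetical helper mirroring the z[i] = (double) (x[i] * y[i]) loop.  */
void
int_float_to_double (const int *x, const float *y, double *z, int n)
{
  int i = 0;
#if defined (__SSE2__)
  for (; i + 4 <= n; i += 4)
    {
      __m128i xi = _mm_loadu_si128 ((const __m128i *) (x + i));
      __m128 xf = _mm_cvtepi32_ps (xi);                 /* cvtdq2ps */
      __m128 p = _mm_mul_ps (xf, _mm_loadu_ps (y + i)); /* mulps */
      __m128 ph = _mm_movehl_ps (p, p);  /* movhlps: elements 2, 3 down */
      _mm_storeu_pd (z + i, _mm_cvtps_pd (p));          /* cvtps2pd, low */
      _mm_storeu_pd (z + i + 2, _mm_cvtps_pd (ph));     /* cvtps2pd, high */
    }
#endif
  for (; i < n; ++i)   /* scalar tail, and fallback without SSE2 */
    z[i] = (double) (x[i] * y[i]);
}
```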
The patch includes all documentation and testcases. This patch was
bootstrapped on i686-pc-linux-gnu and regression tested for all
default languages.
OK for mainline (patch needs middle-end approval)?
2007-04-24 Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/24659
* optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
OTI_vec_pack_ufix_trunc.
(vec_unpacks_float_hi_optab): Define new macro.
(vec_unpacks_float_lo_optab): Ditto.
(vec_unpacku_float_hi_optab): Ditto.
(vec_unpacku_float_lo_optab): Ditto.
(vec_pack_sfix_trunc_optab): Ditto.
(vec_pack_ufix_trunc_optab): Ditto.
* genopinit.c (optabs): Implement vec_unpack[s|u]_float_[hi|lo]_optab
and vec_pack_[s|u]fix_trunc_optab using
vec_unpack[s|u]_float_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
* tree-vectorizer.c (supportable_widening_operation): Handle
FLOAT_EXPR and CONVERT_EXPR. Update comment.
(supportable_narrowing_operation): New function.
* tree-vectorizer.h (supportable_narrowing_operation): Prototype.
* tree-vect-transform.c (vectorizable_conversion): Handle
(nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
(vect_gen_widened_results_half): Move before vectorizable_conversion.
(vectorizable_type_demotion): Call supportable_narrowing_operation()
to check for target support.
* optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
for VEC_PACK_FIX_TRUNC_EXPR.
(expand_binop): Special case mode of the result for
vec_pack_[u|s]fix_trunc_optab.
(init_optabs): Initialize vec_unpack[s|u]_float_[hi|lo]_optab and
vec_pack_[u|s]fix_trunc_optab.
* tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
* tree-pretty-print.c (dump_generic_node): Handle
VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
VEC_PACK_FIX_TRUNC_EXPR.
(op_prio): Ditto.
* expr.c (expand_expr_real_1): Ditto.
* tree-inline.c (estimate_num_insns_1): Ditto.
* tree-vect-generic.c (expand_vector_operations_1): Ditto.
* config/i386/sse.md (vec_unpacks_float_hi_v4si): New expander.
(vec_unpacks_float_lo_v4si): Ditto.
(vec_pack_sfix_trunc_v2df): Ditto.
* doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
Document.
[VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
[VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
* doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
[vec_pack_ufix_trunc]: Ditto.
[vec_unpacks_float_hi]: Ditto.
[vec_unpacks_float_lo]: Ditto.
[vec_unpacku_float_hi]: Ditto.
[vec_unpacku_float_lo]: Ditto.
testsuite/ChangeLog:
2007-04-24 Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/24659
* gcc.dg/vect/vect-floatint-conversion-2.c: New test.
* gcc.dg/vect/vect-intfloat-conversion-3.c: New test.
Uros.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vect-intfloat.diff
Type: application/octet-stream
Size: 38728 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070424/87df7091/attachment.obj>