This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC PATCH, vectorizer]: Vectorize int -> double conversions
- From: Dorit Nuzman <DORIT at il dot ibm dot com>
- To: "Uros Bizjak" <ubizjak at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 23 Apr 2007 22:08:20 +0300
- Subject: Re: [RFC PATCH, vectorizer]: Vectorize int -> double conversions
"Uros Bizjak" <ubizjak@gmail.com> wrote on 23/04/2007 15:03:49:
> Hello!
>
> Attached (RFC!) patch implements vectorization of int -> double
> conversion. The testcase:
>
> /* int -> double */
> for (i = 0; i < N; i++)
>   {
>     da[i] = (double) ib[i];
>   }
>
> compiles on i686 -msse2 target into:
>
> .L3:
>         movdqa   (%eax,%ecx), %xmm0
>         cvtdq2pd %xmm0, %xmm1
>         pshufd   $238, %xmm0, %xmm0
>         movapd   %xmm1, (%edx,%eax,2)
>         cvtdq2pd %xmm0, %xmm0
>         movapd   %xmm0, 16(%edx,%eax,2)
>         addl     $16, %eax
>         cmpl     $128, %eax
>         jne      .L3
>
> (pshufd is there to shuffle the SImode vector from x0x1x2x3 into x2x3x2x3.)
>
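To make the instruction sequence above easier to follow, here is a minimal scalar model of it in portable C. It is only a sketch: the types `v4si` and `v2df` and the helpers `unpack_float_lo`, `unpack_float_hi`, `shuffle` and `int_to_double` are hypothetical names invented for illustration, not GCC internals or intrinsics. It assumes the x86 convention that the "lo" half is lanes 0 and 1.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a 128-bit vector of four 32-bit ints (V4SI).  */
typedef struct { int32_t lane[4]; } v4si;
/* Scalar model of a 128-bit vector of two doubles (V2DF).  */
typedef struct { double lane[2]; } v2df;

/* Models cvtdq2pd: convert the two low int lanes to doubles.  */
static v2df
unpack_float_lo (v4si v)
{
  v2df r = { { (double) v.lane[0], (double) v.lane[1] } };
  return r;
}

/* Models pshufd: result lane i takes source lane ((imm >> 2*i) & 3).  */
static v4si
shuffle (v4si v, unsigned imm)
{
  v4si r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = v.lane[(imm >> (2 * i)) & 3];
  return r;
}

/* Selector 238 (binary 11 10 11 10) moves lanes 2 and 3 into the low
   half, so the same low-half conversion can be reused.  */
static v2df
unpack_float_hi (v4si v)
{
  return unpack_float_lo (shuffle (v, 238));
}

/* One iteration consumes one int vector and produces two double
   vectors.  Assumes n is a multiple of 4.  */
static void
int_to_double (double *da, const int32_t *ib, int n)
{
  assert (n % 4 == 0);
  for (int i = 0; i < n; i += 4)
    {
      v4si in = { { ib[i], ib[i + 1], ib[i + 2], ib[i + 3] } };
      v2df lo = unpack_float_lo (in);
      v2df hi = unpack_float_hi (in);
      da[i] = lo.lane[0];
      da[i + 1] = lo.lane[1];
      da[i + 2] = hi.lane[0];
      da[i + 3] = hi.lane[1];
    }
}
```

Calling `int_to_double (da, ib, N)` with N a multiple of 4 fills `da` with the same values as the scalar loop in the testcase. The point of the model is that each input vector yields two output vectors, which is why separate "lo" and "hi" operations are needed.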
> Regarding the patch: in vectorizable_conversion() we detect
> (nunits_out == nunits_in / 2) as EXPAND case and handle conversion in
> the same way as vectorizable_type_promotion(). Unfortunately, we need
> a couple of new tree codes and 4 new optabs to handle signed and
> unsigned conversion.
>
> Unfortunately, if the correct optabs are not defined, compilation aborts
> in vect_transform_stmt():
>
>     case type_conversion_vec_info_type:
>       done = vectorizable_conversion (stmt, bsi, &vec_stmt);
>       gcc_assert (done);
>       break;
>
> vectorizable_conversion() returns false when it can't handle the EXPAND
> case due to missing optabs. Sure, we will abort then, but
> vectorizable_conversion() has many other exits with "false". I'm kind
> of out of ideas here on how the transformation should be skipped in this
> case. This is why the patch is RFC only.
>
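For readers following the abort described above, one possible way to structure such a check is to run the same support test in a separate analysis step, so that the transform step is only reached for statements that were already accepted. The sketch below is a simplified model under that assumption. It is not the GCC code and not necessarily how the patch resolved the issue; `stmt_model`, `conversion_model`, `transform_stmt` and `vectorize` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a statement and its target support.  */
typedef struct
{
  bool target_has_unpack_float; /* models "the needed optabs exist" */
  bool transformed;             /* set once the transform has run */
} stmt_model;

/* Models a vectorizable_* style function: with do_transform == false
   it only answers "can this be vectorized?"; with do_transform == true
   it performs the transformation.  Both modes share the same checks.  */
static bool
conversion_model (stmt_model *s, bool do_transform)
{
  if (!s->target_has_unpack_float)
    return false;               /* missing optabs: not vectorizable */
  if (do_transform)
    s->transformed = true;
  return true;
}

/* Transform step: only called for statements that analysis accepted,
   so the assertion documents an invariant rather than a user error.  */
static void
transform_stmt (stmt_model *s)
{
  bool done = conversion_model (s, true);
  assert (done);
}

/* Driver: analysis rejects unsupported statements before the transform
   step can ever see them.  */
static bool
vectorize (stmt_model *s)
{
  if (!conversion_model (s, false))
    return false;
  transform_stmt (s);
  return true;
}
```

With this shape, a target that lacks the patterns makes `vectorize` return false and leaves the statement untouched, instead of reaching the assertion.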
Hi Uros,
So, judging from your more recent email, the problem you describe above is
now solved, right?
Just a few quick comments:
- It would help if you could generate your future patches with function
context (-c3p).
- According to the coding conventions, forward declarations should be
avoided where possible:
> +static tree vect_gen_widened_results_half (enum tree_code, tree, tree,
> + tree, tree, int,
> + tree, block_stmt_iterator *, tree);
- Your patch combines the logic of
vectorizable_type_promotion/demotion/conversion together in one function,
or rather duplicates code from vectorizable_type_promotion/demotion into
vectorizable_conversion, which brings up the question whether we need these
3 separate functions at all (I think it's not the first time this question
came up - I did want to look at creating one template for all the
vectorizable_* functions sometime). Anyhow - I don't think this necessarily
has to be part of your patch, but looks like it may be a nice follow-up
patch.
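To illustrate the forward-declaration comment above: a static helper that is defined before its first use needs no separate prototype. The following is a minimal sketch with hypothetical functions (`widen_half` and `sum_halves` are invented for illustration and have nothing to do with the patch's vect_gen_widened_results_half).

```c
/* Hypothetical helper, defined before its caller, so no separate
   "static unsigned widen_half (unsigned, int);" prototype is needed.
   Returns the low (half == 0) or high (half != 0) 16 bits of VALUE.  */
static unsigned
widen_half (unsigned value, int half)
{
  return half == 0 ? (value & 0xffffu) : ((value >> 16) & 0xffffu);
}

/* The caller comes after the helper's definition.  */
static unsigned
sum_halves (unsigned value)
{
  return widen_half (value, 0) + widen_half (value, 1);
}
```

Ordering definitions this way keeps the signature in one place, so a later change to the parameter list cannot leave a stale prototype behind.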
There's also the question of tree-codes that Danny brought up that we
should consider.
Other than that - looks ok to me.
dorit
> The code is also prepared for SHRINK case, but it is not handled ATM.
>
> 2007-04-23 Uros Bizjak <ubizjak@gmail.com>
>
> PR tree-optimization/24659
> * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
> OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi and
> OTI_vec_unpacku_float_lo.
> (vec_unpacks_float_hi_optab): Define new macro.
> (vec_unpacks_float_lo_optab): Ditto.
> (vec_unpacku_float_hi_optab): Ditto.
> (vec_unpacku_float_lo_optab): Ditto.
> * genopinit.c (optabs): Implement vec_unpack[s|u]_[hi|lo]_optab
> using vec_unpack[s|u]_[hi|lo]_* patterns.
> * tree-vect-transform.c (vect_gen_widened_results_half): Prototype.
> (vectorizable_conversion): Handle (nunits_in == nunits_out / 2)
> and (nunits_out == nunits_in / 2) cases.
> * optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
> for VEC_UNPACK_FLOAT_HI_EXPR and vec_unpack[s|u]_float_lo_optab
> for VEC_UNPACK_FLOAT_LO_EXPR.
> (init_optabs): Initialize vec_unpack[s|u]_[hi|lo]_optab.
>
> * tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR):
> New tree codes.
> * tree-pretty-print.c (dump_generic_node): Handle
> VEC_UNPACK_FLOAT_HI_EXPR and VEC_UNPACK_FLOAT_LO_EXPR.
> (op_prio): Ditto.
> * expr.c (expand_expr_real_1): Ditto.
> * tree-inline.c (estimate_num_insns_1): Ditto.
> * tree-vect-generic.c (expand_vector_operations_1): Ditto.
>
> * config/i386/sse.md (vec_unpacks_float_hi_v4si): New expander.
> (vec_unpacks_float_lo_v4si): Ditto.
>
> testsuite/ChangeLog:
>
> 2007-04-23 Uros Bizjak <ubizjak@gmail.com>
>
> PR tree-optimization/24659
> * gcc.dg/vect/vect-float-intfloat-3.c: New test.
>
> Uros.
> [attachment "vect-intfloat.diff" deleted by Dorit Nuzman/Haifa/IBM]