This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC PATCH, vectorizer]: Vectorize int -> double conversions


"Uros Bizjak" <ubizjak@gmail.com> wrote on 23/04/2007 15:03:49:

> Hello!
>
> Attached (RFC!) patch implements vectorization of int -> double
> conversion. The testcase:
>
>   /* int -> double */
>   for (i = 0; i < N; i++)
>     {
>       da[i] = (double) ib[i];
>     }
>
> compiles on i686 -msse2 target into:
>
> .L3:
>         movdqa  (%eax,%ecx), %xmm0
>         cvtdq2pd        %xmm0, %xmm1
>         pshufd  $238, %xmm0, %xmm0
>         movapd  %xmm1, (%edx,%eax,2)
>         cvtdq2pd        %xmm0, %xmm0
>         movapd  %xmm0, 16(%edx,%eax,2)
>         addl    $16, %eax
>         cmpl    $128, %eax
>         jne     .L3
>
> (pshufd is there to shuffle the SImode vector from x0x1x2x3 into x2x3x2x3.)
>
> Regarding the patch: in vectorizable_conversion() we detect
> (nunits_out == nunits_in / 2) as EXPAND case and handle conversion in
> the same way as vectorizable_type_promotion(). Unfortunately, we need
> a couple of new tree codes and 4 new optabs to handle signed and
> unsigned conversion.
>
> Unfortunately, if the correct optabs are not defined, compilation aborts
> in vect_transform_stmt():
>
> 4351    case type_conversion_vec_info_type:
> 4352      done = vectorizable_conversion (stmt, bsi, &vec_stmt);
> 4353      gcc_assert (done);
> 4354      break;
>
> vectorizable_conversion() returns false when it can't handle EXPAND
> case due to missing optabs. Sure, we will abort then, but
> vectorizable_conversion() has many other exits with "false". I'm kind
> of out of ideas here how transformation should be skipped in this
> case. This is why the patch is RFC only.
>
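
The quoted loop can be approximated in C with SSE2 intrinsics. This is only a sketch under stated assumptions, not the code the patch generates: the helper names convert4 and int_to_double are hypothetical, unaligned loads and stores are used for simplicity where the quoted assembly uses aligned moves, and a scalar fallback is included so the block also builds on targets without SSE2.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: convert four ints to four doubles.  One V4SI
   input vector yields two V2DF result vectors (the EXPAND case).  */
static void
convert4 (const int *src, double *dst)
{
#if defined(__SSE2__)
  __m128i v = _mm_loadu_si128 ((const __m128i *) src);

  /* cvtdq2pd converts only the two low ints: x0, x1.  */
  __m128d lo = _mm_cvtepi32_pd (v);

  /* pshufd $238 (binary 11 10 11 10) moves x2, x3 into the low half,
     giving x2x3x2x3, so the same conversion can process them.  */
  __m128i upper = _mm_shuffle_epi32 (v, 238);
  __m128d hi = _mm_cvtepi32_pd (upper);

  _mm_storeu_pd (dst, lo);
  _mm_storeu_pd (dst + 2, hi);
#else
  /* Scalar fallback for targets without SSE2.  */
  for (int k = 0; k < 4; k++)
    dst[k] = (double) src[k];
#endif
}

/* Hypothetical driver: da[i] = (double) ib[i] for i in [0, n).  */
void
int_to_double (const int *ib, double *da, size_t n)
{
  size_t i = 0;

  for (; i + 4 <= n; i += 4)
    convert4 (ib + i, da + i);

  /* Scalar tail for the remaining 0..3 elements.  */
  for (; i < n; i++)
    da[i] = (double) ib[i];
}
```

Every int is exactly representable as a double, so the vector and scalar paths give identical results.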

Hi Uros,

So judging from your more recent email the problem you describe above is
now solved, right?
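
For readers following the thread, here is a toy C model of why the assert can fire and one way it can be avoided. It is not GCC code and not necessarily the fix that was adopted: all toy_* names and the have_unpack_optabs field are hypothetical. The assumption it illustrates is that the same check function is called once for analysis (with a NULL vec_stmt) and once for transformation, so a missing-optab rejection must happen before the analysis call returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the information the vectorizer has
   about one conversion statement.  */
struct toy_stmt
{
  int nunits_in;            /* elements per input vector */
  int nunits_out;           /* elements per output vector */
  bool have_unpack_optabs;  /* target provides the unpack patterns */
};

/* Returns false when the statement cannot be vectorized.  With
   vec_stmt == NULL this is the analysis call; otherwise it is the
   transform call and "emits" the vector statement.  */
bool
toy_vectorizable_conversion (const struct toy_stmt *stmt, int *vec_stmt)
{
  bool same = stmt->nunits_out == stmt->nunits_in;
  bool expand = stmt->nunits_out == stmt->nunits_in / 2;

  /* SHRINK (nunits_in == nunits_out / 2) is not handled here.  */
  if (!same && !expand)
    return false;

  /* All support checks come before the analysis return, so the
     answer is the same in both phases.  */
  if (expand && !stmt->have_unpack_optabs)
    return false;

  if (!vec_stmt)
    return true;            /* analysis only */

  *vec_stmt = 1;            /* pretend to emit the vector statement */
  return true;
}

/* Returns true if the statement was vectorized, false if it was
   left scalar.  */
bool
toy_vectorize (const struct toy_stmt *stmt)
{
  if (!toy_vectorizable_conversion (stmt, NULL))
    return false;           /* rejected during analysis */

  int vec_stmt = 0;
  bool done = toy_vectorizable_conversion (stmt, &vec_stmt);
  assert (done);            /* holds because analysis already passed */
  return vec_stmt == 1;
}
```

In this model a V4SI -> V2DF conversion without the unpack patterns is simply left scalar rather than reaching the assert.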

Just a few quick comments:

- It would help if you could generate your patches in the future with the
function context (-c3p).

- According to coding conventions, forward declarations should be avoided if
possible:
> +static tree vect_gen_widened_results_half (enum tree_code, tree, tree,
> +                              tree, tree, int,
> +                              tree, block_stmt_iterator *, tree);

- Your patch combines the logic of
vectorizable_type_promotion/demotion/conversion together in one function,
or rather duplicates code from vectorizable_type_promotion/demotion into
vectorizable_conversion, which brings up the question whether we need these
3 separate functions at all (I think it's not the first time this question
came up - I did want to look at creating one template for all the
vectorizable_* functions sometime). Anyhow - I don't think this necessarily
has to be part of your patch, but looks like it may be a nice follow-up
patch.

There's also the question of tree-codes that Danny brought up that we
should consider.

Other than that - looks ok to me.

dorit

> The code is also prepared for the SHRINK case, but it is not handled ATM.
>
> 2007-04-23  Uros Bizjak  <ubizjak@gmail.com>
>
>    PR tree-optimization/24659
>    * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
>    OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi and
>    OTI_vec_unpacku_float_lo.
>    (vec_unpacks_float_hi_optab): Define new macro.
>    (vec_unpacks_float_lo_optab): Ditto.
>    (vec_unpacku_float_hi_optab): Ditto.
>    (vec_unpacku_float_lo_optab): Ditto.
>    * genopinit.c (optabs): Implement vec_unpack[s|u]_[hi|lo]_optab
>    using vec_unpack[s|u]_[hi|lo]_* patterns.
>    * tree-vect-transform.c (vect_gen_widened_results_half): Prototype.
>    (vectorizable_conversion): Handle (nunits_in == nunits_out / 2)
>    and (nunits_out == nunits_in / 2) cases.
>    * optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
>    for VEC_UNPACK_FLOAT_HI_EXPR and vec_unpack[s|u]_float_lo_optab
>    for VEC_UNPACK_FLOAT_LO_EXPR.
>    (init_optabs): Initialize vec_unpack[s|u]_[hi|lo]_optab.
>
>    * tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR):
>    New tree codes.
>    * tree-pretty-print.c (dump_generic_node): Handle
>    VEC_UNPACK_FLOAT_HI_EXPR and VEC_UNPACK_FLOAT_LO_EXPR.
>    (op_prio): Ditto.
>    * expr.c (expand_expr_real_1): Ditto.
>    * tree-inline.c (estimate_num_insns_1): Ditto.
>    * tree-vect-generic.c (expand_vector_operations_1): Ditto.
>
>    * config/i386/sse.md (vec_unpacks_float_hi_v4si): New expander.
>    (vec_unpacks_float_lo_v4si): Ditto.
>
> testsuite/ChangeLog:
>
> 2007-04-23  Uros Bizjak  <ubizjak@gmail.com>
>
>    PR tree-optimization/24659
>    * gcc.dg/vect/vect-float-intfloat-3.c: New test.
>
> Uros.
> [attachment "vect-intfloat.diff" deleted by Dorit Nuzman/Haifa/IBM]

