This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix MMX/SSE/AVX* shifts by non-immediate scalar (PR target/80286)


On Thu, Apr 6, 2017 at 10:40 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Apr 06, 2017 at 09:33:58AM +0200, Uros Bizjak wrote:
>> Newly introduced alternatives (x/x) and (v/v) are valid also for
>> 32-bit targets, so we have to adjust insn constraint of
>> *vec_extractv4si_0_zext and enable alternatives accordingly. After the
>
> That is true.  But if we provide just the x/x and v/v alternatives in
> *vec_extractv4si_0_zext, then it will be forced to always do the zero
> extraction on the SSE registers in 32-bit mode.  Is that what we want?

Yes, for SSE4 targets. We are sure that we have SSE source register
here, and there is no direct zero-extension to a general reg in
32-bit case.

> As for the define_insn_and_split that would transform sign extensions
> used solely by the vector shifts by scalar shift count, did you mean
> something like following (for every shift pattern)?
>
> --- sse.md.jj1  2017-04-04 19:51:01.000000000 +0200
> +++ sse.md      2017-04-06 10:26:26.877545109 +0200
> @@ -10696,6 +10696,22 @@
>     (set_attr "prefix" "orig,vex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> +(define_insn_and_split "*<shift_insn><mode>3<mask_name>_1"
> +  [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand")
> +       (any_lshift:VI2_AVX2_AVX512BW
> +         (match_operand:VI2_AVX2_AVX512BW 1 "register_operand")
> +         (sign_extend:DI (match_operand:SI 2 "nonmemory_operand"))))]
> +  "TARGET_SSE2 && <mask_mode512bit_condition> && <mask_avx512bw_condition>
> +   && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 3) (zero_extend:DI (match_dup 2)))
> +   (set (match_dup 0) (any_lshift:VI2_AVX2_AVX512BW
> +                       (match_dup 1) (match_dup 3)))]
> +{
> +  operands[3] = gen_reg_rtx (DImode);
> +})
>
Yes, something like this. You ca use any_extend instead of
sign_extend, so the pattern will also remove possible zero_extend of
count operand.

>  (define_insn "<shift_insn><mode>3<mask_name>"
>    [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
>         (any_lshift:VI48_AVX2
>
> The problem with that is that apparently our infrastructure doesn't support
> define_subst for define_insn_and_split (and define_split), so either we'd
> need to have separate define_insn_and_split for masked and for non-masked,
> or we'd need to extend the define_subst infrastructure for
> define_insn_and_split somehow.  Looking say at
> (define_subst "mask"
>   [(set (match_operand:SUBST_V 0)
>         (match_operand:SUBST_V 1))]
>   "TARGET_AVX512F"
>   [(set (match_dup 0)
>         (vec_merge:SUBST_V
>           (match_dup 1)
>           (match_operand:SUBST_V 2 "vector_move_operand" "0C")
>           (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))])
> that is a transformation we want to do on the define_insn part of
> define_insn_and_split, but not exactly what we want to do on the split
> part of the insn - there we want literaly match_dup 0, match_dup 1,
> and instead of the 2 other match_operand match_dup 2 and match_dup 3.

Hm, I'm not that versed in define_subst, but that looks quite a
drawback of define_subst to me.

Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]