This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.


On 11/01/2013 10:19 PM, Kirill Yukhin wrote:
> Hello Richard,
> 
> On 21 Oct 16:01, Richard Henderson wrote:
>> Error on V16SF.  Probably better to fill this out.
> Thanks, fixed.
> 
>> Better to just use <sseinsnmode> here, as it's a compile-time constant.
> Fixed.
> 
>>> +(define_insn "avx512f_store<mode>_mask"
>>
>> Likewise.
> Fixed.
> 
>> Nested vec_merge?  That seems... odd to say the least.
>> How in the world does this get matched?
> Moved to separate patch.
> 
>>> +(define_insn "*avx512f_loads<mode>_mask"
>>
>> Likewise.
> Moved to separate patch.
> 
>>> +(define_insn "avx512f_stores<mode>_mask"
>> This seems similar, though of course it's an extract.
>> I still can't imagine how it could be used.
> Separate patch.
> 
>>> -(define_insn "rcp14<mode>"
>>> +(define_insn "<mask_codefor>rcp14<mode><mask_name>"
>>
>> What, this name isn't used for non-masked anymore?
> Bogus original pattern. Introduced by me here:
>       svn+ssh://gcc.gnu.org/svn/gcc/trunk@203609
> This (and subseqent patterns you mention) are not used
> by name. In general case for every built-in whether it is
> masked or not we use masked version of built-in. E.g.:
> extern __inline __m512d
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> _mm512_rcp14_pd (__m512d __A)
> {
>   return (__m512d) __builtin_ia32_rcp14pd512_mask ((__v8df) __A,
>                                                    (__v8df)
>                                                    _mm512_setzero_pd (),
>                                                    (__mmask8) -1);
> }
> Then, while expanding, if we can prove that mask is const -1 we remove
> redundant vec_merge getting non-masked variant of the built-in.
> Same holds for all inssn you mentioned
> 
> 
>>> -(define_insn "srcp14<mode>"
>> Likewise.  These changes don't belong in this patch.
> Ditto.
> 
>>> -(define_insn "rsqrt14<mode>"
>> Likewise.
> Ditto.
> 
>>> -	  (match_operand:FMAMODE 3 "nonimmediate_operand" " v,vm, 0,xm,x")))]
>>> +	  (match_operand:FMAMODE 1 "nonimmediate_operand" "%0,0,v,x,x")
>>> +	  (match_operand:FMAMODE 2 "nonimmediate_operand" "vm,v,vm,x,m")
>>> +	  (match_operand:FMAMODE 3 "nonimmediate_operand" "v,vm,0,xm,x")))]
>>
>> Unrelated changes.  Repeated throughout the fma patterns.
> Reverted.
> 
>>> +(define_insn "*fmai_fmadd_<mode>_maskz"
>>> +  [(set (match_operand:VF_128 0 "register_operand" "=v,v")
>>> +	(vec_merge:VF_128
>>> +	  (vec_merge:VF_128
>>> +	    (fma:VF_128
>>> +	      (match_operand:VF_128 1 "nonimmediate_operand" "0,0")
>>> +	      (match_operand:VF_128 2 "nonimmediate_operand" "vm,v")
>>> +	      (match_operand:VF_128 3 "nonimmediate_operand" "v,vm"))
>>> +	    (match_operand:VF_128 4 "const0_operand")
>>> +	    (match_operand:QI 5 "register_operand" "k,k"))
>>> +	  (match_dup 1)
>>> +	  (const_int 1)))]
>>> +  "TARGET_AVX512F"
>>> +  "@
>>> +   vfmadd132<ssescalarmodesuffix>\t{%2, %3, %0%{%5%}%N4|%0%{%5%}%N4, %3, %2}
>>> +   vfmadd213<ssescalarmodesuffix>\t{%3, %2, %0%{%5%}%N4|%0%{%5%}%N4, %2, %3}"
>>> +  [(set_attr "type" "ssemuladd")
>>> +   (set_attr "mode" "<MODE>")])
>>
>> These seem like useless patterns.  If they're for builtins,
>> then they seem like useless builtins.  See above.
> Moved to dedicated patch.
> 
>>> @@ -3686,8 +4328,8 @@
>>>     (set_attr "athlon_decode" "vector,double,*")
>>>     (set_attr "amdfam10_decode" "vector,double,*")
>>>     (set_attr "bdver1_decode" "direct,direct,*")
>>> -   (set_attr "btver2_decode" "double,double,double")
>>>     (set_attr "prefix" "orig,orig,vex")
>>> +   (set_attr "btver2_decode" "double,double,double")
>>>     (set_attr "mode" "SF")])
>>
>> Unrelated changes.
> Ugh, reverted.
> 
>>> +(define_expand "vec_unpacku_float_hi_v16si"
>>> +  [(match_operand:V8DF 0 "register_operand")
>>> +   (match_operand:V16SI 1 "register_operand")]
>>> +  "TARGET_AVX512F"
>>> +{
>>> +  REAL_VALUE_TYPE TWO32r;
>>> +  rtx k, x, tmp[4];
>>> +
>>> +  real_ldexp (&TWO32r, &dconst1, 32);
>>> +  x = const_double_from_real_value (TWO32r, DFmode);
>>> +
>>> +  tmp[0] = force_reg (V8DFmode, CONST0_RTX (V8DFmode));
>>> +  tmp[1] = force_reg (V8DFmode, ix86_build_const_vector (V8DFmode, 1, x));
>>> +  tmp[2] = gen_reg_rtx (V8DFmode);
>>> +  tmp[3] = gen_reg_rtx (V8SImode);
>>> +  k = gen_reg_rtx (QImode);
>>> +
>>> +  emit_insn (gen_vec_extract_hi_v16si (tmp[3], operands[1]));
>>> +  emit_insn (gen_floatv8siv8df2 (tmp[2], tmp[3]));
>>> +  emit_insn (gen_rtx_SET (VOIDmode, k,
>>> +			  gen_rtx_LT (QImode, tmp[2], tmp[0])));
>>> +  emit_insn (gen_addv8df3_mask (tmp[2], tmp[2], tmp[1], tmp[2], k));
>>> +  emit_move_insn (operands[0], tmp[2]);
>>> +  DONE;
>>> +})
>>
>> Separate patch.  And this is too complicated, since vcvtudq2pd exists.
> Moved to [5/8] Extend hooks.
> 
>> Non-masked name change again.
> See above.
>>> +(define_insn "<mask_codefor>avx512f_unpcklps512<mask_name>"
>>
>> Ditto.
> Above.
> 
>>> +(define_insn "<mask_codefor>avx512f_movshdup512<mask_name>"
>>
>> Ditto.
> Above.
> 
>>> +(define_insn "<mask_codefor>avx512f_movsldup512<mask_name>"
>>
>> Ditto.
> Above.
> 
> Updated patch in the bottom.
> 
> Bootstrapped.
> 
> Coould you pls take a look?

This is ok now.


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]