This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.


On Thu, Oct 17, 2013 at 12:47 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>
>> > I suspect gen_lowpart is bad turn when reload is completed, as
>> > far as it can create new pseudo. gen_lowpart () may call
>> > gen_reg_rtx (), which contain corresponging gcc_assert ().
>>
>> False.  gen_lowpart is perfectly safe post-reload.
>> Indeed, taking the subreg of a hard register should arrive
>>
>>           x = gen_rtx_REG_offset (op, outermode, final_regno, final_offset);
>>
>> in simplify_subreg.
>>
>> Have you encountered some specific problem with gen_lowpart?
> Yes.
>
> Patch [8/8] contains testsuite for AVX-512.
> This pattern is covered as well.
>
> When trying to do so:
>
> (define_insn_and_split "vec_extract_lo_v32hi"
>   [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,m")
>         (vec_select:V16HI
>           (match_operand:V32HI 1 "nonimmediate_operand" "vm,v")
>           (parallel [(const_int 0) (const_int 1)
>                      (const_int 2) (const_int 3)
>                      (const_int 4) (const_int 5)
>                      (const_int 6) (const_int 7)
>                      (const_int 8) (const_int 9)
>                      (const_int 10) (const_int 11)
>                      (const_int 12) (const_int 13)
>                      (const_int 14) (const_int 15)])))]
>   "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>   "#"
>   "&& reload_completed"
>   [(const_int 0)]
> {
>   rtx op1 = operands[1];
>   op1 = gen_lowpart (V16HImode, op1);
>   emit_move_insn (operands[0], op1);
>   DONE;
> })
>
> I've got ICE, with following bt:
>
> #1  0x00000000006f28d6 in gen_reg_rtx (mode=V32HImode) at /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:866
> #2  0x000000000070243a in copy_to_reg (x=(reg:V32HI 21 xmm0 [163])) at /export/users/kyukhin/gcc/git/gcc/gcc/explow.c\
> :606
> #3  0x000000000091dfb8 in gen_lowpart_general (mode=V16HImode, x=<optimized out>)
>     at /export/users/kyukhin/gcc/git/gcc/gcc/rtlhooks.c:50
> #4  0x0000000000ce16e8 in gen_split_4943 (curr_insn=<optimized out>, operands=0x16f6320)
>     at /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md:6329
> #5  0x00000000006f7865 in try_split (pat=(set (reg:V16HI 23 xmm2 [164])
>     (vec_select:V16HI (reg:V32HI 21 xmm0 [163])
>         (parallel [
>                 (const_int 0 [0])
>                 (const_int 1 [0x1])
>                 (const_int 2 [0x2])
>                 (const_int 3 [0x3])
>                 (const_int 4 [0x4])
>                 (const_int 5 [0x5])
>                 (const_int 6 [0x6])
>                 (const_int 7 [0x7])
>                 (const_int 8 [0x8])
>                 (const_int 9 [0x9])
>                 (const_int 10 [0xa])
>                 (const_int 11 [0xb])
>                 (const_int 12 [0xc])
>                 (const_int 13 [0xd])
>                 (const_int 14 [0xe])
>                 (const_int 15 [0xf])
>             ]))), trial=(insn 48 46 49 6 (set (reg:V16HI 23 xmm2 [164])
>         (vec_select:V16HI (reg:V32HI 21 xmm0 [163])
>             (parallel [
>                     (const_int 0 [0])
>                     (const_int 1 [0x1])
>                     (const_int 2 [0x2])
>                     (const_int 3 [0x3])
>                     (const_int 4 [0x4])
>                     (const_int 5 [0x5])
>                     (const_int 6 [0x6])
>                     (const_int 7 [0x7])
>                     (const_int 8 [0x8])
>                     (const_int 9 [0x9])
>                     (const_int 10 [0xa])
>                     (const_int 11 [0xb])
>                     (const_int 12 [0xc])
>                     (const_int 13 [0xd])
>                     (const_int 14 [0xe])
>                     (const_int 15 [0xf])
>                 ]))) /export/users/kyukhin/gcc/git/gcc/gcc/testsuite/gcc.target/i386/avx512f-vec-unpack.c:24 2151 {ve\
> c_extract_lo_v32hi}
>      (nil)), last=<optimized out>)
>     at /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:3467
>
> So, we have:
>     [rtlhooks.c:50]gen_lowpart_general () ->
>     [explow.c:606]copy_to_reg () ->
>     [emit-rtl.c:866]gen_reg_rtx (): gcc_assert (can_create_pseudo_p ());
>
> Maybe the code in the pattern is buggy? Or is it a gen_lowpart?

I think that original approach with gen_rtx_REG is correct and follows
established practice in sse.md (please grep for gen_reg_RTX in
sse.md). If this approach is necessary due to the deficiency of
gen_lowpart, then the fix to gen_lowpart should be proposed in a
follow-up patch.

Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]