This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Cc: Richard Henderson <rth at redhat dot com>, Vladimir Makarov <vmakarov at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 17 Oct 2013 13:14:28 +0200
- Subject: Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.
- Authentication-results: sourceware.org; auth=none
- References: <20130808112524 dot GA40277 at msticlxl57 dot ims dot intel dot com> <20130814072638 dot GD52726 at msticlxl57 dot ims dot intel dot com> <52129604 dot 6040305 at redhat dot com> <20131009103129 dot GR52466 at msticlxl57 dot ims dot intel dot com> <525D6367 dot 7040804 at redhat dot com> <20131016160707 dot GC22220 at msticlxl57 dot ims dot intel dot com> <525EC60B dot 2080108 at redhat dot com> <20131017104743 dot GA18369 at msticlxl57 dot ims dot intel dot com>
On Thu, Oct 17, 2013 at 12:47 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>
>> > I suspect gen_lowpart is bad turn when reload is completed, as
>> > far as it can create new pseudo. gen_lowpart () may call
>> > gen_reg_rtx (), which contain corresponging gcc_assert ().
>>
>> False. gen_lowpart is perfectly safe post-reload.
>> Indeed, taking the subreg of a hard register should arrive
>>
>> x = gen_rtx_REG_offset (op, outermode, final_regno, final_offset);
>>
>> in simplify_subreg.
>>
>> Have you encountered some specific problem with gen_lowpart?
> Yes.
>
> Patch [8/8] contains testsuite for AVX-512.
> This pattern is covered as well.
>
> When trying to do so:
>
> (define_insn_and_split "vec_extract_lo_v32hi"
> [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,m")
> (vec_select:V16HI
> (match_operand:V32HI 1 "nonimmediate_operand" "vm,v")
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)
> (const_int 4) (const_int 5)
> (const_int 6) (const_int 7)
> (const_int 8) (const_int 9)
> (const_int 10) (const_int 11)
> (const_int 12) (const_int 13)
> (const_int 14) (const_int 15)])))]
> "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
> "#"
> "&& reload_completed"
> [(const_int 0)]
> {
> rtx op1 = operands[1];
> op1 = gen_lowpart (V16HImode, op1);
> emit_move_insn (operands[0], op1);
> DONE;
> })
>
> I've got ICE, with following bt:
>
> #1 0x00000000006f28d6 in gen_reg_rtx (mode=V32HImode) at /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:866
> #2 0x000000000070243a in copy_to_reg (x=(reg:V32HI 21 xmm0 [163])) at /export/users/kyukhin/gcc/git/gcc/gcc/explow.c\
> :606
> #3 0x000000000091dfb8 in gen_lowpart_general (mode=V16HImode, x=<optimized out>)
> at /export/users/kyukhin/gcc/git/gcc/gcc/rtlhooks.c:50
> #4 0x0000000000ce16e8 in gen_split_4943 (curr_insn=<optimized out>, operands=0x16f6320)
> at /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md:6329
> #5 0x00000000006f7865 in try_split (pat=(set (reg:V16HI 23 xmm2 [164])
> (vec_select:V16HI (reg:V32HI 21 xmm0 [163])
> (parallel [
> (const_int 0 [0])
> (const_int 1 [0x1])
> (const_int 2 [0x2])
> (const_int 3 [0x3])
> (const_int 4 [0x4])
> (const_int 5 [0x5])
> (const_int 6 [0x6])
> (const_int 7 [0x7])
> (const_int 8 [0x8])
> (const_int 9 [0x9])
> (const_int 10 [0xa])
> (const_int 11 [0xb])
> (const_int 12 [0xc])
> (const_int 13 [0xd])
> (const_int 14 [0xe])
> (const_int 15 [0xf])
> ]))), trial=(insn 48 46 49 6 (set (reg:V16HI 23 xmm2 [164])
> (vec_select:V16HI (reg:V32HI 21 xmm0 [163])
> (parallel [
> (const_int 0 [0])
> (const_int 1 [0x1])
> (const_int 2 [0x2])
> (const_int 3 [0x3])
> (const_int 4 [0x4])
> (const_int 5 [0x5])
> (const_int 6 [0x6])
> (const_int 7 [0x7])
> (const_int 8 [0x8])
> (const_int 9 [0x9])
> (const_int 10 [0xa])
> (const_int 11 [0xb])
> (const_int 12 [0xc])
> (const_int 13 [0xd])
> (const_int 14 [0xe])
> (const_int 15 [0xf])
> ]))) /export/users/kyukhin/gcc/git/gcc/gcc/testsuite/gcc.target/i386/avx512f-vec-unpack.c:24 2151 {ve\
> c_extract_lo_v32hi}
> (nil)), last=<optimized out>)
> at /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:3467
>
> So, we have:
> [rtlhooks.c:50]gen_lowpart_general () ->
> [explow.c:606]copy_to_reg () ->
> [emit-rtl.c:866]gen_reg_rtx (): gcc_assert (can_create_pseudo_p ());
>
> Maybe the code in the pattern is buggy? Or is it a gen_lowpart?
I think that original approach with gen_rtx_REG is correct and follows
established practice in sse.md (please grep for gen_reg_RTX in
sse.md). If this approach is necessary due to the deficiency of
gen_lowpart, then the fix to gen_lowpart should be proposed in a
follow-up patch.
Uros.