This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, i386, AVX512] PR target/67849: Avoid upper-bank registers when splitting vec_extract_lo instruction.
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Alexander Fomin <afomin dot mailbox at gmail dot com>
- Cc: kyukhin at gcc dot gnu dot org, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 5 Oct 2015 18:01:08 +0200
- Subject: Re: [PATCH, i386, AVX512] PR target/67849: Avoid upper-bank registers when splitting vec_extract_lo instruction.
- Authentication-results: sourceware.org; auth=none
- References: <20151005155446 dot GA55082 at msticlxl57 dot ims dot intel dot com>
On Mon, Oct 5, 2015 at 5:54 PM, Alexander Fomin
<afomin.mailbox@gmail.com> wrote:
> This patch addresses PR target/67849. Given a machine that does not
> support AVX512VL, following "else" branch for vec_exract_lo insn
> may result in a split using YMMs from upper-bank, hence invalid
> assembly. Tuning define_insn pattern and define_split constraints
> eliminates this problem.
>
> Please take a look at ChangeLog entries - not sure how to reference
> to corresponding split in md.
Just describe it as "<what_it_does> splitter". There are some examples
in ChangeLog.
> Regards,
> Alexander
> ---
> gcc/
>
> PR target/67849
> * config/i386/sse.md (define_split): Restrict splitting for
> upper-bank registers when target does not support AVX512VL.
> (define_insn "vec_extract_lo_<mode><mask_name>"): Restrict
> instruction splitting when target does not support AVX512VL.
> ---
> gcc/config/i386/sse.md | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 9b7a338..b413726 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -6924,7 +6924,8 @@
> (parallel [(const_int 0) (const_int 1)
> (const_int 2) (const_int 3)])))]
> "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))
> - && reload_completed"
> + && reload_completed
> + && (TARGET_AVX512VL || (REG_P (operands[0]) && !(REG_P (operands[1]) && EXT_REX_SSE_REGNO_P (REGNO (operands[1])))))"
EXT_REX_SSE_REG_P (operands[1]) can be used here.
> [(const_int 0)]
> {
> rtx op1 = operands[1];
> @@ -6962,7 +6963,7 @@
> (const_int 2) (const_int 3)])))]
> "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
> {
> - if (<mask_applied>)
> + if (<mask_applied> || !TARGET_AVX512VL)
> return "vextract<shuffletype>64x4\t{$0x0, %1, %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
> else
> return "#";
> --
> 1.8.3.1
>