This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Patch ping
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Richard Biener <rguenther at suse dot de>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 13 Jan 2014 19:40:16 +0100
- Subject: Re: Patch ping
- Authentication-results: sourceware.org; auth=none
- References: <20140113080711 dot GS892 at tucnak dot redhat dot com> <CAFULd4b-2=TVNSEdMDa6xmgh5QmLOOcYwM8_8HQnafvXbt53LQ at mail dot gmail dot com> <20140113083501 dot GT892 at tucnak dot redhat dot com> <20140113182613 dot GD24431 at msticlxl57 dot ims dot intel dot com>
On Mon, Jan 13, 2014 at 7:26 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>> > Kirill, is it possible for you to test the patch in the simulator? Do
>> > we have a testcase in gcc's testsuite that can be used to check this
>> > patch?
>>
>> E.g. gcc.target/i386/avx2-gather* and avx512f-gather*.
> These tests are for built-in generation. The issue is connected to
> automatic code generation.
>
> It seems to be working; for hss2a.fppized.f we have:
> .L402:
> vmovdqu64 (%rdi,%rax), %zmm1
> kmovw %k1, %k3
> kmovw %k1, %k2
> kmovw %k1, %k4
> kmovw %k1, %k5
> addl $1, %esi
> vpgatherdd npwrx.4971-4(,%zmm1,4), %zmm0{%k3}
> vpgatherdd (%r10,%zmm1,4), %zmm2{%k2}
> vpmulld %zmm3, %zmm0, %zmm0
> vpaddd %zmm7, %zmm0, %zmm0
> vmovdqu32 %zmm0, (%r11,%rax)
> vpgatherdd npwry.4973-4(,%zmm1,4), %zmm0{%k4}
> vpmulld %zmm3, %zmm0, %zmm0
> vpaddd %zmm6, %zmm0, %zmm0
> vmovdqu32 %zmm0, (%r9,%rax)
> vpgatherdd npwrz.4975-4(,%zmm1,4), %zmm0{%k5}
> vpmulld %zmm3, %zmm0, %zmm0
> vpaddd %zmm5, %zmm0, %zmm0
> vmovdqu32 %zmm0, (%r14,%rax)
> vpaddd %zmm2, %zmm4, %zmm0
> vmovdqa64 %zmm0, (%r15,%rax)
> addq $64, %rax
> cmpl %esi, %edx
> ja .L402
An unrelated observation: gcc should figure out that the %k1 mask
register can be used in all gather insns and avoid the unnecessary
copies at the beginning of the loop.
Uros.
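
For readers without access to hss2a.fppized.f, the kind of loop discussed in this thread (indexed loads that the auto-vectorizer may turn into gather instructions, as opposed to gathers written with built-ins) can be sketched in C roughly as follows. This is only an illustrative stand-in, not the actual Fortran source: the function name, parameter names, and the 1-based index convention are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for an indexed-load loop; all names are
 * illustrative and do not come from hss2a.fppized.f.
 *
 * Each iteration loads table[idx[i] - 1], i.e. an indirect (gathered)
 * load with Fortran-style 1-based indices, then applies a multiply and
 * an add, similar in shape to the vpgatherdd/vpmulld/vpaddd sequence
 * quoted above. */
void scale_indexed(int *restrict out, const int *restrict table,
                   const int *restrict idx, int scale, int bias, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = table[idx[i] - 1] * scale + bias;
}
```

One way to inspect the result is to compile with something like `gcc -O3 -mavx512f -S` and look for `vpgatherdd` in the output; whether a gather is actually emitted depends on the GCC version and its cost model.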