This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC PATCH] AVX2 32-byte integer min/max reductions


On Fri, Sep 16, 2011 at 8:52 PM, Jakub Jelinek <jakub@redhat.com> wrote:

>> So, either we can fix this by adding reduc_{smin,smax,umin,umax}_v{32q,16h,8s,4d}i
>> patterns (at that point I guess I should just macroize them together with
>> the reduc_{smin,smax,umin,umax}_v{4sf,8sf,4df}) and handle the 4 32-byte
>> integer modes also in ix86_expand_reduc, or come up with some new optab
>
> Here is a patch that does it this way and also moves the umaxmin expanders
> one insn down to the right spot.
>
> I've noticed <sse2_avx2>_lshr<mode>3 insn was modelled incorrectly
> for the 256-bit shift, because, as the documentation says, it
> shifts each 128-bit lane separately, while it was modelled as V4DImode
> shift (i.e. shifting each 64-bit chunk), and sse2_lshrv1ti3 was there
> just for the 128-bit variant, not the 256-bit one.
>
> Regtested on x86_64-linux and i686-linux on SandyBridge, unfortunately
> I don't have AVX2 emulator and thus AVX2 assembly was just eyeballed.
> E.g. for the V16HImode reduction the difference with this patch is:
> -       vmovdqa %xmm0, %xmm1
> -       vextracti128    $0x1, %ymm0, %xmm0
> -       vpextrw $0, %xmm1, %eax
> -       vpextrw $1, %xmm1, %edx
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $2, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $3, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $4, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $5, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $6, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $7, %xmm1, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $0, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $1, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $2, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $3, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $4, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $5, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $6, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovl   %eax, %edx
> -       vpextrw $7, %xmm0, %eax
> -       cmpw    %ax, %dx
> -       cmovge  %edx, %eax
> +       vperm2i128      $1, %ymm0, %ymm0, %ymm1
> +       vpmaxsw %ymm1, %ymm0, %ymm0
> +       vpsrldq $8, %ymm0, %ymm1
> +       vpmaxsw %ymm1, %ymm0, %ymm0
> +       vpsrldq $4, %ymm0, %ymm1
> +       vpmaxsw %ymm1, %ymm0, %ymm0
> +       vpsrldq $2, %ymm0, %ymm1
> +       vpmaxsw %ymm1, %ymm0, %ymm0
> +       vpextrw $0, %xmm0, %eax
>
> 2011-09-16  Jakub Jelinek  <jakub@redhat.com>
>
>        * config/i386/sse.md (VIMAX_AVX2): Change V4DI to V2TI.
>        (sse2_avx, sseinsnmode): Add V2TI.
>        (REDUC_SMINMAX_MODE): New mode iterator.
>        (reduc_smax_v4sf, reduc_smin_v4sf, reduc_smax_v8sf,
>        reduc_smin_v8sf, reduc_smax_v4df, reduc_smin_v4df): Remove.
>        (reduc_<code>_<mode>): New smaxmin and umaxmin expanders.
>        (sse2_lshrv1ti3): Rename to...
>        (<sse2_avx2>_lshr<mode>3): ... this.  Use VIMAX_AVX2 mode
>        iterator.  Move before umaxmin expanders.
>        * config/i386/i386.h (VALID_AVX256_REG_MODE,
>        SSE_REG_MODE_P): Accept V2TImode.
>        * config/i386/i386.c (ix86_expand_reduc): Handle V32QImode,
>        V16HImode, V8SImode and V4DImode.

OK for mainline SVN.

Thanks,
Uros.
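
For readers following the quoted V16HImode sequence, here is a rough
scalar model in portable C of what the new instructions appear to compute.
It is only a sketch under stated assumptions: the v16hi struct and the
helper names (swap_lanes, lane_shift_right_bytes, max_s16,
reduc_smax_v16hi_model) are hypothetical and are not part of the patch or
of GCC. The helpers model vperm2i128 $1, the per-128-bit-lane vpsrldq and
vpmaxsw respectively.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit register holding 16 signed 16-bit
   elements; e[0..7] is the low 128-bit lane, e[8..15] the high lane.  */
typedef struct { int16_t e[16]; } v16hi;

/* Models vperm2i128 $1, x, x: swap the two 128-bit lanes.  */
static v16hi swap_lanes(v16hi x)
{
    v16hi r;
    memcpy(&r.e[0], &x.e[8], 8 * sizeof(int16_t));
    memcpy(&r.e[8], &x.e[0], 8 * sizeof(int16_t));
    return r;
}

/* Models the 256-bit vpsrldq: each 128-bit lane is shifted right by
   `bytes` separately, with zeros shifted in at the top of each lane.
   Assumes `bytes` is even, as in the quoted sequence (8, 4, 2).  */
static v16hi lane_shift_right_bytes(v16hi x, int bytes)
{
    v16hi r;
    int n = bytes / 2;  /* shift distance in 16-bit elements */
    for (int lane = 0; lane < 2; lane++)
        for (int i = 0; i < 8; i++)
            r.e[lane * 8 + i] = (i + n < 8) ? x.e[lane * 8 + i + n] : 0;
    return r;
}

/* Models vpmaxsw: element-wise signed maximum.  */
static v16hi max_s16(v16hi a, v16hi b)
{
    v16hi r;
    for (int i = 0; i < 16; i++)
        r.e[i] = a.e[i] > b.e[i] ? a.e[i] : b.e[i];
    return r;
}

/* Four shuffle+max steps, then read element 0 (the vpextrw $0).  */
int16_t reduc_smax_v16hi_model(v16hi x)
{
    x = max_s16(x, swap_lanes(x));                 /* fold high lane into low */
    x = max_s16(x, lane_shift_right_bytes(x, 8));  /* 8 candidates -> 4 */
    x = max_s16(x, lane_shift_right_bytes(x, 4));  /* 4 candidates -> 2 */
    x = max_s16(x, lane_shift_right_bytes(x, 2));  /* 2 candidates -> 1 */
    return x.e[0];
}
```

In this model the zeros shifted in by the lane-wise shift only ever reach
the upper elements of each lane, never element 0, so the result is still
correct when all inputs are negative. The lane swap is needed first
because the byte shift never moves data across the 128-bit boundary.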

