This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH i386 AVX512] [22/n] Extend unaligned loads & stores.
- From: Uros Bizjak <ubizjak@gmail.com>
- To: Kirill Yukhin <kirill.yukhin@gmail.com>
- Cc: Jakub Jelinek <jakub@redhat.com>, Richard Henderson <rth@redhat.com>, GCC Patches <gcc-patches@gcc.gnu.org>
- Date: Sat, 23 Aug 2014 09:44:46 +0200
- Subject: Re: [PATCH i386 AVX512] [22/n] Extend unaligned loads & stores.
- References: <20140822115114.GB47539@msticlxl57.ims.intel.com>
On Fri, Aug 22, 2014 at 1:51 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> This patch extends the unaligned load and store patterns.
>
> I've refactored the original patch (stored on the SVN branch)
> to reduce the complexity of the conditions in
> define_insn "<avx512>_storedqu<mode>_mask"
>
> It seems like such a trick won't work for:
> <sse2_avx_avx512f>_loaddqu<mode><mask_name>
> The problem is the V[32|16]QI modes, which are enabled for SSE/AVX
> without masking, and for AVX-512BW & AVX-512VL when masking is
> on.
>
> Of course, I can split the define_insn & define_expand
> into 3 patterns w/ mode iterators of:
> 1. V16QI, V32QI - baseline is SSE2, masks enabled for AVX-512BW&VL
> 2. V64QI, V8HI, V16HI, V32HI - baseline is AVX-512BW, masks enabled
> for AVX-512VL
> 3. V8DI, V4DI, V2DI, V16SI, V8SI, V4SI - baseline is AVX-512F, masks
> enabled for AVX-512VL.
>
> But such an approach will lead to 6 patterns instead of 2 (with non-trivial
> asm emission). I have doubts about whether it is useful...
At this stage, I'd still prefer simple constraints (the solution
proposed above), even at the price of additional patterns. Looking at
the patterns, it is quite hard to work out the final condition for a
particular mode/target combination, even without the enable attribute and
conditional operand constraints/predicates. With the solution above,
the complexity is conveniently pushed to the mask define_subst attribute.
Uros.
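
[Editorial note: for readers unfamiliar with GCC's machine-description format, the three-way split proposed in the quoted message might look roughly like the sketch below. The iterator names, insn template, and attributes are illustrative assumptions, not the patterns actually committed.]

```lisp
;; Sketch only: iterator names follow the split described above and are
;; hypothetical, as is the insn body.

;; 1. Baseline SSE2; masking requires AVX-512BW & AVX-512VL.
(define_mode_iterator VI1_UNALIGNED_SSE2 [V16QI V32QI])

;; 2. Baseline AVX-512BW; masking of the narrower modes additionally
;;    requires AVX-512VL.
(define_mode_iterator VI12_UNALIGNED_AVX512BW [V64QI V8HI V16HI V32HI])

;; 3. Baseline AVX-512F; masking of the 128/256-bit modes requires
;;    AVX-512VL.
(define_mode_iterator VI48_UNALIGNED_AVX512F
  [V16SI V8SI V4SI V8DI V4DI V2DI])

;; One of the three resulting unaligned-load patterns could then have a
;; simple baseline condition, with the mask-specific target checks
;; supplied by the mask define_subst machinery:
(define_insn "<avx512>_loaddqu<mode><mask_name>"
  [(set (match_operand:VI1_UNALIGNED_SSE2 0 "register_operand" "=v")
        (unspec:VI1_UNALIGNED_SSE2
          [(match_operand:VI1_UNALIGNED_SSE2 1 "memory_operand" "m")]
          UNSPEC_LOADU))]
  "TARGET_SSE2 && <mask_mode512bit_condition>"
  "%vmovdqu\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"
  [(set_attr "type" "ssemov")
   (set_attr "mode" "<sseinsnmode>")])
```

The point of the split is visible here: each pattern's C condition stays a simple baseline test, while the masking-related target requirements are attached via the mask subst attributes rather than hand-written per-mode conditions.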