This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH i386 AVX512] [22/n] Extend unaligned loads & stores.


On Fri, Aug 22, 2014 at 1:51 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:

> This patch extends unaligned loads and stores patterns.
>
> I've refactored the original patch (stored on the SVN branch)
> to reduce the complexity of the conditions in
>    define_insn "<avx512>_storedqu<mode>_mask"
>
> It seems like such a trick won't work for:
>    <sse2_avx_avx512f>_loaddqu<mode><mask_name>
> The problem is the V[32|16]QI modes, which are enabled for SSE/AVX
> w/o masking and for AVX-512BW & AVX-512VL when masking is
> on.
>
> Of course, I can split the define_insn & define_expand
> into 3 patterns w/ mode iterators of:
>   1. V16QI, V32QI - baseline is SSE2, masks enabled for AVX-512BW&VL
>   2. V64QI, V8HI, V16HI, V32HI - baseline is AVX-512BW, masks enabled
>      for AVX-512VL
>   3. V8DI, V4DI, V2DI, V16SI, V8SI, V4SI - baseline is AVX-512F, masks
>      enabled for AVX-512VL.
>
> But such an approach will lead to 6 patterns instead of 2 (with non-trivial
> asm emit). I doubt that it is useful...

At this stage, I'd still prefer simple constraints (the solution
proposed above), even at the price of additional patterns. Looking at
the patterns, it is quite hard to work out the final condition for a
particular mode/target combo, even without the enable attribute and
conditional operand constraints/predicates. With the solution above,
the complexity is conveniently pushed into the mask define_subst attribute.
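
To make the three-way split concrete, here is a rough sketch in plain C
(not GCC code and not part of the patch). The enum names, the rules table
and pattern_enabled() are all made up for illustration, and the sketch
assumes one reading of the list quoted above: the unmasked pattern needs
only the baseline ISA, and the masked variant needs the baseline plus the
listed mask ISAs. The real per-mode conditions may well be finer-grained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ISA flags; these are NOT GCC's TARGET_* macros. */
enum isa {
  ISA_SSE2     = 1u << 0,
  ISA_AVX512F  = 1u << 1,
  ISA_AVX512BW = 1u << 2,
  ISA_AVX512VL = 1u << 3
};

/* One entry per mode-iterator group from the quoted list. */
enum mode_group {
  GROUP_QI_128_256, /* 1. V16QI, V32QI */
  GROUP_BW,         /* 2. V64QI, V8HI, V16HI, V32HI */
  GROUP_F,          /* 3. V8DI, V4DI, V2DI, V16SI, V8SI, V4SI */
  GROUP_COUNT
};

struct group_rule {
  unsigned baseline;   /* ISA bits needed by the unmasked pattern */
  unsigned mask_extra; /* additional ISA bits needed when masking */
};

/* Assumption: values transcribed from the quoted three-way split. */
static const struct group_rule rules[GROUP_COUNT] = {
  [GROUP_QI_128_256] = { ISA_SSE2,     ISA_AVX512BW | ISA_AVX512VL },
  [GROUP_BW]         = { ISA_AVX512BW, ISA_AVX512VL },
  [GROUP_F]          = { ISA_AVX512F,  ISA_AVX512VL }
};

/* Return true if the pattern for group G is enabled on TARGET (a bitwise
   OR of enum isa values), with or without masking.  */
static bool
pattern_enabled (enum mode_group g, unsigned target, bool masked)
{
  assert ((unsigned) g < GROUP_COUNT);
  unsigned need = rules[g].baseline;
  if (masked)
    need |= rules[g].mask_extra;
  return (target & need) == need;
}
```

With one rule per group, the final condition for a given mode/target
combo is a single table lookup: the base pattern carries only the
baseline check, and the extra requirement lives in one place, which is
the role the mask substitution would play in the real patterns.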

Uros.

