This is the mail archive of the mailing list for the GCC project.


Re: [RFC PATCH] Enable V32HI/V64QI const permutations

On Fri, Oct 03, 2014 at 04:39:08PM +0200, Jakub Jelinek wrote:
> Just to stress the new testcases some more, I've enabled the
> vec_perm_const{32hi,64qi} patterns.
> Got several ICEs in expand_vec_perm_broadcast_1,
> on the final gcc_unreachable () in the function.  That function
> is only called if the broadcast couldn't be done in a single insn,
> which I believe must always be possible for TARGET_AVX512BW.
> Shall I look at this, or do you plan to address this in the near future?

Speaking of -mavx512{bw,vl,f}, there apparently is a single-instruction
full 2-operand shuffle for V32HI, V16S[IF] and V8D[IF], so the only mode
still missing a single-instruction full 2-operand shuffle is V64QI, right?

What would be the best worst-case sequence for that?

I'd think 2x vpermi2w, 2x vpshufb and one vpor could achieve that:
the first vpermi2w would put the even bytes into the right word positions
(i.e. at the right position or one above it), the second vpermi2w would put
the odd bytes into the right word positions (i.e. at the right position
or one below it), each vpshufb would swap the byte pairs where necessary
and zero out the other (odd or even) byte, and the vpor would merge the
results.  Do you have something better?
What about an arbitrary one-operand V64QI const permutation?
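
To make the 2x vpermi2w + 2x vpshufb + vpor idea concrete, here is a
minimal scalar sketch in plain C.  It only models the semantics (a
two-source word permute, and a byte shuffle confined to 16-byte lanes that
zeroes a byte when bit 7 of its control is set); it is not the real
intrinsics and not the i386.c expander.  All function names
(model_permi2w, model_pshufb, shuffle_v64qi_sketch,
sketch_matches_reference) are hypothetical, and the selector encoding
(0..63 picks from the first operand, 64..127 from the second) is an
assumption made for the illustration.

```c
#include <stdint.h>

enum { NBYTES = 64, NWORDS = 32 };

/* Scalar model of a two-source word permute (vpermi2w-like, assumed
   semantics): result word w is word widx[w] (0..63) of the 64-word
   concatenation a:b.  Words are modelled as byte pairs (2k, 2k+1).  */
static void
model_permi2w (uint8_t *dst, const uint8_t *a, const uint8_t *b,
               const uint8_t *widx)
{
  for (int w = 0; w < NWORDS; w++)
    {
      const uint8_t *src = widx[w] < NWORDS ? a : b;
      int k = widx[w] & (NWORDS - 1);
      dst[2 * w] = src[2 * k];
      dst[2 * w + 1] = src[2 * k + 1];
    }
}

/* Scalar model of an in-lane byte shuffle (vpshufb-like): bit 7 of the
   control zeroes the byte, otherwise the low 4 bits index into the same
   16-byte lane.  dst and src must not alias.  */
static void
model_pshufb (uint8_t *dst, const uint8_t *src, const uint8_t *ctrl)
{
  for (int i = 0; i < NBYTES; i++)
    dst[i] = (ctrl[i] & 0x80) ? 0 : src[(i & ~15) + (ctrl[i] & 15)];
}

/* out[i] = (a:b)[sel[i]] for sel[i] in 0..127, using two word permutes,
   two byte shuffles and one OR.  */
void
shuffle_v64qi_sketch (uint8_t *out, const uint8_t *a, const uint8_t *b,
                      const uint8_t *sel)
{
  uint8_t wsel_even[NWORDS], wsel_odd[NWORDS];
  uint8_t ctl_even[NBYTES], ctl_odd[NBYTES];
  uint8_t t_even[NBYTES], t_odd[NBYTES], s_even[NBYTES], s_odd[NBYTES];

  for (int w = 0; w < NWORDS; w++)
    {
      int e = sel[2 * w] & 127, o = sel[2 * w + 1] & 127;
      int lane_pos = (2 * w) & 15;   /* in-lane index of the even byte */

      wsel_even[w] = e >> 1;         /* source word holding the even byte */
      wsel_odd[w] = o >> 1;          /* source word holding the odd byte */
      ctl_even[2 * w] = lane_pos + (e & 1);   /* in place or one above */
      ctl_even[2 * w + 1] = 0x80;             /* zero the odd byte */
      ctl_odd[2 * w] = 0x80;                  /* zero the even byte */
      ctl_odd[2 * w + 1] = lane_pos + (o & 1); /* in place or one below */
    }

  model_permi2w (t_even, a, b, wsel_even);
  model_permi2w (t_odd, a, b, wsel_odd);
  model_pshufb (s_even, t_even, ctl_even);
  model_pshufb (s_odd, t_odd, ctl_odd);
  for (int i = 0; i < NBYTES; i++)
    out[i] = s_even[i] | s_odd[i];   /* the vpor step */
}

/* Compare the sketch against a direct byte-wise reference for a
   pseudo-random selector derived from SEED; return 1 on a match.  */
int
sketch_matches_reference (unsigned seed)
{
  uint8_t a[NBYTES], b[NBYTES], sel[NBYTES], out[NBYTES];

  for (int i = 0; i < NBYTES; i++)
    {
      a[i] = (uint8_t) i;
      b[i] = (uint8_t) (0x80 + i);
      seed = seed * 1103515245u + 12345u;
      sel[i] = (seed >> 16) & 127;
    }
  shuffle_v64qi_sketch (out, a, b, sel);
  for (int i = 0; i < NBYTES; i++)
    if (out[i] != (sel[i] < NBYTES ? a[sel[i]] : b[sel[i] - NBYTES]))
      return 0;
  return 1;
}
```

Because each word permute already lands the wanted byte inside the
destination word, the byte shuffle never has to reach outside its own byte
pair, so the 128-bit lane restriction of the byte shuffle does not get in
the way.  For a constant permutation all four control vectors would be
compile-time constants.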

