This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Fix V64QImode multiplication with AVX512BW (PR target/70329)


Hi Jakub!
On 21 Mar 21:16, Jakub Jelinek wrote:
> The ix86_expand_vecop_qihi function has been adjusted for AVX512* just
> by changing i < 32 to i < 64 (where both were sometimes wasteful), but
> for !full_interleave that is even wrong, swapping the second and third
> quarter is something that works to undo AVX256 unpacks only,
> where we want
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,
> permutation.  But, for AVX512 we want
> 0,2,4,6,8,10,12,14,64,66,68,70,72,74,76,78,16,18,20,22,24,26,28,30,80,82,84,86,88,90,92,94,32,34,36,38,40,42,44,46,96,98,100,102,104,106,108,110,48,50,52,54,56,58,60,62,112,114,116,118,120,122,124,126
> where the current trunk code has been producing
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,96,98,100,102,104,106,108,110,80,82,84,86,88,90,92,94,112,114,116,118,120,122,124,126
> instead.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Your patch is OK.
I'd only suggest adding a comment to this calculation:
+	d.perm[i] = ((i * 2) & 14) + ((i & 8) ? d.nelt : 0) + (i & ~15);
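
To illustrate what such a comment could say, here is a minimal standalone sketch of the index calculation. It is not the code from the patch: the helper names fill_even_extract_perm and perm_matches_quoted_v64qi are hypothetical, and the meaning of nelt (64 byte elements for V64QImode) is an assumption based on the permutations quoted above. The expected table is copied from the AVX512 permutation quoted in this mail.

```c
#include <string.h>

/* Hypothetical helper, not part of GCC: fill perm[0..nelt-1] with the
   index formula quoted from the patch.  nelt is assumed to be the number
   of byte elements in the vector (64 for V64QImode).  */
static void
fill_even_extract_perm (unsigned char *perm, unsigned nelt)
{
  for (unsigned i = 0; i < nelt; i++)
    perm[i] = ((i * 2) & 14)          /* even element 0,2,...,14 within a lane */
	      + ((i & 8) ? nelt : 0)  /* upper 8 bytes of each lane come from the second operand */
	      + (i & ~15u);           /* base of the 128-bit lane: 0,16,32,48 */
}

/* Hypothetical self-check: returns 1 if the formula reproduces the
   V64QImode permutation quoted in the mail, 0 otherwise.  */
static int
perm_matches_quoted_v64qi (void)
{
  static const unsigned char expected[64] = {
    0, 2, 4, 6, 8, 10, 12, 14, 64, 66, 68, 70, 72, 74, 76, 78,
    16, 18, 20, 22, 24, 26, 28, 30, 80, 82, 84, 86, 88, 90, 92, 94,
    32, 34, 36, 38, 40, 42, 44, 46, 96, 98, 100, 102, 104, 106, 108, 110,
    48, 50, 52, 54, 56, 58, 60, 62, 112, 114, 116, 118, 120, 122, 124, 126
  };
  unsigned char perm[64];

  fill_even_extract_perm (perm, 64);
  return memcmp (perm, expected, sizeof expected) == 0;
}
```

Calling fill_even_extract_perm with nelt set to 32 gives the 32-entry permutation quoted above for the 256-bit case, so the same expression appears to cover both vector widths.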

--
Thanks, K

