This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][rs6000] avoid using unaligned vsx or lxvd2x/stxvd2x for memcpy/memmove inline expansion


On 12/20/18 5:44 PM, Segher Boessenkool wrote:
> On Thu, Dec 20, 2018 at 05:34:54PM -0600, Aaron Sawdey wrote:
>> On 12/20/18 3:51 AM, Segher Boessenkool wrote:
>>> On Wed, Dec 19, 2018 at 01:53:05PM -0600, Aaron Sawdey wrote:
>>>> Because of POWER9 DD2.1 issues with certain unaligned VSX instructions
>>>> to cache-inhibited memory, here is a patch that keeps memmove (and memcpy)
>>>> inline expansion from doing unaligned vector accesses or using vector
>>>> load/store instructions other than lvx/stvx. More description of the issue is here:
>>>>
>>>> https://patchwork.ozlabs.org/patch/814059/
>>>>
>>>> OK for trunk if bootstrap/regtest ok?
>>>
>>> Okay, but see below.
>>>
>> [snip]
>>>
>>> This is extraordinarily clumsy :-)  Maybe something like:
>>>
>>> static rtx
>>> gen_lvx_v4si_move (rtx dest, rtx src)
>>> {
>>>   gcc_assert (!(MEM_P (dest) && MEM_P (src)));
>>>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
>>>   if (MEM_P (dest))
>>>     return gen_altivec_stvx_v4si_internal (dest, src);
>>>   else if (MEM_P (src))
>>>     return gen_altivec_lvx_v4si_internal (dest, src);
>>>   else
>>>     gcc_unreachable ();
>>> }
>>>
>>> (Or do you allow VOIDmode for src as well?)  Anyway, at least get rid of
>>> the useless extra variable.
>>
>> I think this should be better:
> 
> The gcc_unreachable at the end catches the non-mem to non-mem case.
> 
>> static rtx
>> gen_lvx_v4si_move (rtx dest, rtx src)
>> {
>>   gcc_assert ((MEM_P (dest) && !MEM_P (src)) || (MEM_P (src) && !MEM_P (dest)));
> 
> But if you prefer this, how about
> 
> {
>   gcc_assert (MEM_P (dest) ^ MEM_P (src));
>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
> 
>   if (MEM_P (dest))
>     return gen_altivec_stvx_v4si_internal (dest, src);
>   else
>     return gen_altivec_lvx_v4si_internal (dest, src);
> }
> 
> :-)
> 
> 
> Segher
> 

I like that even better, thanks!
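
For anyone reading along, here is a small standalone sketch of why the XOR
assert can replace the longer two-clause one. It does not use GCC's rtx or
MEM_P; `struct fake_rtx` and `FAKE_MEM_P` are hypothetical stand-ins, and the
sketch assumes (as with the real MEM_P, which is a comparison) that the
predicate only ever yields 0 or 1. Under that assumption the two forms agree
for all four operand combinations:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rtx: it only records whether the
   operand is a memory reference.  Not GCC's real data structure.  */
struct fake_rtx
{
  bool is_mem;
};

/* Hypothetical stand-in for MEM_P.  Assumed to yield exactly 0 or 1,
   which is what makes the bitwise XOR below safe.  */
#define FAKE_MEM_P(x) ((x)->is_mem ? 1 : 0)

/* The longer form: exactly one of dest/src is a MEM.  */
static int
exactly_one_mem_long (const struct fake_rtx *dest, const struct fake_rtx *src)
{
  return (FAKE_MEM_P (dest) && !FAKE_MEM_P (src))
	 || (FAKE_MEM_P (src) && !FAKE_MEM_P (dest));
}

/* The XOR form: nonzero only when the two predicates differ.  */
static int
exactly_one_mem_xor (const struct fake_rtx *dest, const struct fake_rtx *src)
{
  return FAKE_MEM_P (dest) ^ FAKE_MEM_P (src);
}

/* Return 1 if both forms agree on every dest/src combination,
   0 otherwise.  */
static int
forms_agree (void)
{
  for (int d = 0; d <= 1; d++)
    for (int s = 0; s <= 1; s++)
      {
	struct fake_rtx dest = { d != 0 };
	struct fake_rtx src = { s != 0 };
	if (exactly_one_mem_long (&dest, &src)
	    != exactly_one_mem_xor (&dest, &src))
	  return 0;
      }
  return 1;
}
```

With that assert in place, the mem-to-mem and reg-to-reg cases are both
rejected up front, so a plain if/else on MEM_P (dest) is enough to pick
between the store and the load pattern.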

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

