[PATCH] Make mempcpy more optimal (PR middle-end/70140).

Wed Aug 2 07:26:00 GMT 2017

On 08/02/2017 09:16 AM, Jakub Jelinek wrote:
> On Wed, Aug 02, 2017 at 09:13:40AM +0200, Martin Liška wrote:
>> On 08/01/2017 09:50 PM, Jakub Jelinek wrote:
>>> On Thu, Jul 20, 2017 at 08:59:29AM +0200, Martin Liška wrote:
>>>> Hello.
>>>>
>>>> Following patch does sharing of expansion for mem{p,}cpy and also stpcpy (with a known constant as source)
>>>> so that we use same type of expansion (direct insns emission, direct emission with a loop instruction and
>>>> library call). As mentioned in the PR, glibc does not provide an optimized version for majority of targets.
>>>>
>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> This broke e.g.
>>> FAIL: gcc.dg/20050503-1.c scan-assembler-not call
>>> on i686-linux, the result is significantly worse.
>>> Also, while perhaps majority of targets don't provide optimized version,
>>> some targets do, including i?86/x86_64, and if the memcpy would be expanded
>>> as a call, it is much better to just emit mempcpy call instead.
>>> Just look at the testcase, because of this misoptimization we suddenly can't
>>> use a tail call.
>>>
>>> 	Jakub
>>>
>>
>> I see. That said, should I introduce some target hook that will tell whether to expand to
>> 'return memcpy(dst, src, l) + l;' or call library mempcpy routine?
> 
> If some targets aren't willing to provide fast mempcpy in libc, then yes I
> guess.  And, for -Os you should never do the former, that isn't going to be
> shorter (at least unless the memcpy is expanded inline and is shorter than
> the call + addition).

Good, I will work on that.
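
For reference, a minimal sketch of the "memcpy plus length" form discussed
above. The helper name mempcpy_via_memcpy is hypothetical and is not part of
the patch; it only illustrates what the expansion computes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: the "memcpy plus length"
   form of mempcpy.  Copies N bytes from SRC to DST and returns a pointer
   one past the last byte written, which is what mempcpy returns.  */
static void *
mempcpy_via_memcpy (void *dst, const void *src, size_t n)
{
  /* memcpy returns DST, so the addition has to happen after the call.  */
  return (char *) memcpy (dst, src, n) + n;
}
```

Because the addition comes after the memcpy call, a function that ends in this
form cannot tail-call memcpy, whereas one ending in 'return mempcpy (dst, src, n);'
can tail-call mempcpy. That is the regression Jakub points at above.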

> 
> BTW, do we have folding of mempcpy to memcpy if the result is ignored (no
> lhs)?

Yes, we do it, I've just verified that.

Martin

> 
> 	Jakub
>