This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Use of vector instructions in memmov/memset expanding


This expansion only applies to relatively small sizes (up to 4k), where
the overhead of a library call can be quite significant. In some cases
the new implementation gives a 5x speedup (especially on small sizes,
less than ~256 bytes). For almost all sizes from 16 to 4096 bytes there
is some gain; on average it's 20-30% on 64-bit and 40-50% on 32-bit (on
Atom).
This inline implementation isn't intended to replace glibc; it's
intended to replace the old implementation, which is sometimes quite slow.

If glibc calls turn out to be faster than this expansion, a libcall is
generated (special experiments were carried out to find the threshold
values in the cost models).

If the size is not known at all, this inlining isn't used (i.e. glibc is called).
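
To make the decision above concrete, here is a rough sketch in plain C.
It is not the code from the patch: the names choose_copy_strategy,
copy_strategy, INLINE_MAX_BYTES and the libcall_threshold parameter are
all made up for illustration, and the real threshold values live in the
per-target cost models.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names and values, for illustration only; these are not
   GCC internals.  */
enum copy_strategy { STRATEGY_LIBCALL, STRATEGY_INLINE };

/* Upper bound for inline expansion ("up to 4k" in the message).  */
#define INLINE_MAX_BYTES 4096

/* Pick a strategy for one memcpy/memset call site.
   libcall_threshold stands in for a tuned cost-model value: the size
   at which the library call was measured to become faster.  */
enum copy_strategy
choose_copy_strategy (bool size_known, size_t size, size_t libcall_threshold)
{
  /* Size not known at compile time: always call the library.  */
  if (!size_known)
    return STRATEGY_LIBCALL;

  /* Too large for inline expansion, or the library is known to win.  */
  if (size > INLINE_MAX_BYTES || size >= libcall_threshold)
    return STRATEGY_LIBCALL;

  return STRATEGY_INLINE;
}
```

With an assumed threshold of 1024, a 64-byte copy of known size would be
expanded inline, while a 2048-byte copy or a copy of unknown size would
go to the library.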

On 28 September 2011 15:55, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Sep 28, 2011 at 04:41:47AM -0700, Andi Kleen wrote:
>> Michael Zolotukhin <michael.v.zolotukhin@gmail.com> writes:
>> >
>> > Build and 'make check' was tested.
>>
>> Could you expand a bit on the performance benefits?  Where does it help?
>
> Especially when glibc these days has very well optimized implementations
> tuned for various CPUs and it is very unlikely beneficial to inline
> memcpy/memset if they aren't really short or have unknown number of
> iterations.
>
>         Jakub
>



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

