This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Use of vector instructions in memmov/memset expanding
- From: Michael Zolotukhin <michael dot v dot zolotukhin at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Andi Kleen <andi at firstfloor dot org>, gcc-patches at gcc dot gnu dot org, Jan Hubicka <hubicka at ucw dot cz>, Richard Guenther <richard dot guenther at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, izamyatin at gmail dot com, areg dot melikadamyan at gmail dot com
- Date: Wed, 28 Sep 2011 16:38:35 +0400
- Subject: Re: Use of vector instructions in memmov/memset expanding
- References: <CANtU07-DAOMe9Nk4oYj3FJnkZqgkHvSnobsugeSfcRUzDChrrg@mail.gmail.com> <CANtU07_ZoRrLjWBGv=r6MCeBVTh-z13Cab0frjQdg2e7VAyzGg@mail.gmail.com> <20110715232425.GA24793@atrey.karlin.mff.cuni.cz> <CANtU07-HyO0gAZPz-XLknngAviRSyQjyk3DBfb-PjfPyt0KO_g@mail.gmail.com> <CANtU07-eCpAZ=VgvkdBCORq8bR0UZCgryofBXU_4FcRDJ7hWoQ@mail.gmail.com> <CANtU079YjpFwdy1kXoebsLyKe08KTwjwXDGJ4kY+gEJcZUTdMg@mail.gmail.com> <CANtU078YRvnQm8PMQ1GSUxczfTzwup1NLFJXmTDJfGPqVx_Nqg@mail.gmail.com> <m2hb3wj3lw.fsf@firstfloor.org> <20110928115546.GL2687@tyan-ft48-01.lab.bos.redhat.com>
This expansion only applies to relatively small sizes (up to 4k), where
the overhead of a library call can be quite significant. In some cases the
new implementation gives a 5x speedup (especially on small sizes, less
than ~256 bytes). For almost all sizes from 16 to 4096 bytes there is
some gain; on average it's 20-30% in 64-bit mode and 40-50% in 32-bit
mode (on Atom).
This inline implementation isn't intended to replace glibc; it's
intended to replace the old implementation, which is sometimes quite slow.
If glibc calls turn out to be faster than this expansion, a libcall is
generated (special experiments were carried out to find the threshold
values for the cost models).
If the size isn't known at all, this inlining isn't used (i.e. glibc is called).
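
As a rough sketch of the decision described above (this is not the code
from the patch), the choice between inline expansion and a libcall could
be modelled as below. The names choose_strategy, mem_strategy and
INLINE_MAX_BYTES are hypothetical, and the single 4096-byte cutoff is only
the "up to 4k" figure mentioned above; the real cost models have
per-target thresholds and may pick a libcall for smaller sizes too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not GCC's cost-model code. */
enum mem_strategy { STRATEGY_LIBCALL, STRATEGY_INLINE };

/* Assumed cutoff, taken from the "up to 4k" figure in the text. */
#define INLINE_MAX_BYTES 4096

/* Pick a strategy for a memcpy/memset of `size` bytes.
   `size_known` says whether the size is known at compile time. */
enum mem_strategy choose_strategy(bool size_known, size_t size)
{
    if (!size_known)
        return STRATEGY_LIBCALL;   /* unknown size: call glibc */
    if (size > INLINE_MAX_BYTES)
        return STRATEGY_LIBCALL;   /* large block: glibc is expected to win */
    return STRATEGY_INLINE;        /* small known size: expand inline */
}

/* Small self-check of the cases discussed in the text. */
void check_examples(void)
{
    assert(choose_strategy(true, 64) == STRATEGY_INLINE);
    assert(choose_strategy(true, 8192) == STRATEGY_LIBCALL);
    assert(choose_strategy(false, 64) == STRATEGY_LIBCALL);
}
```

In this sketch a 64-byte copy with a known size is expanded inline, while
an 8192-byte copy or a copy of unknown size falls back to the library call.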
On 28 September 2011 15:55, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Sep 28, 2011 at 04:41:47AM -0700, Andi Kleen wrote:
>> Michael Zolotukhin <michael.v.zolotukhin@gmail.com> writes:
>> >
>> > Build and 'make check' was tested.
>>
>> Could you expand a bit on the performance benefits?  Where does it help?
>
> Especially when glibc these days has very well optimized implementations
> tuned for various CPUs and it is very unlikely beneficial to inline
> memcpy/memset if they aren't really short or have unknown number of
> iterations.
>
>        Jakub
>
--
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.