This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: Jan Hubicka <hubicka at ucw dot cz>
Cc: Jakub Jelinek <jakub at redhat dot com>, Xinliang David Li <davidxl at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Teresa Johnson <tejohnson at google dot com>
Date: Thu, 13 Dec 2012 12:27:51 -0800
Subject: Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
References: <CAAkRFZLMofkNZs9NUkfUDnMzVd5YsVhbx0xsb8jZuXy_eqEj6w@mail.gmail.com> <20121212163722.GA21037@atrey.karlin.mff.cuni.cz> <CAAkRFZKBa3GtEh=mmWiAmy-oGffYFxrmWetpaz+pKYSG1zSvSw@mail.gmail.com> <20121212183036.GB5303@atrey.karlin.mff.cuni.cz> <CAAkRFZLAe0CuO+-sBps9pCBDVi5k2ti8cBgL9Ukw4fBmrnpUeg@mail.gmail.com> <20121213011933.GB21037@atrey.karlin.mff.cuni.cz> <CAAkRFZ+gf-733+7djNrEyi6t_Sx6UzH3xSXsAGAMH+oWwCjN1Q@mail.gmail.com> <20121213062128.GK2315@tucnak.redhat.com> <CAMe9rOri3djrv29rQKMLS8jdYvJ8xxs+7xDaL6U-iKa=ojOjrw@mail.gmail.com> <20121213202601.GD26009@kam.mff.cuni.cz>

On Thu, Dec 13, 2012 at 12:26 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Wed, Dec 12, 2012 at 10:21 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Wed, Dec 12, 2012 at 10:09:14PM -0800, Xinliang David Li wrote:
>> >> On Wed, Dec 12, 2012 at 5:19 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >> > libcall is not faster than the rep sequence up to 8KB, and the rep sequence is
>> >> >> > better for regalloc/code cache than a fully blown function call.
>> >> >>
>> >> >> Be careful with this. My recollection is that REP sequence is not good for
>> >> >> any size -- for smaller size, the REP initial set up cost is too high
>> >> >> (10s of cycles), while for large size copy, it is less efficient
>> >> >> compared with library version.
>> >> >
>> >> > Well this is based on the data from the memtest script.
>> >> > Core has good REP implementation - it is a win from rather small blocks (16
>> >> > bytes if I recall) and it does not need alignment.
>> >> > Library version starts to be interesting with caching hints, but I think till 80KB
>> >> > it is still not a win for my setup (glibc-2.15)
>> >>
>> >> A simple test shows that -mstringop-strategy=libcall always beats
>> >> -mstringop-strategy=rep_8byte (on core2 and corei7) except for size
>> >> smaller than 8 where the rep_8byte strategy simply bypasses REP movs.
>> >> Can you share your memtest ?
>> >
>> > I can't believe that, say, a 16 byte or 32 byte memcpy can ever be faster using a
>> > libcall.  The PLT call overhead is simply too high.
>> >
>>
>> The x86 string/memory functions in the current glibc are
>> extremely fast and tuned for Core 2/Core i7.  GCC is having
>> a very hard time beating them with inlining:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
>
> Here we speak about memcpy/memset only.  I never got around to modernizing
> strlen and friends, unfortunately...
>
> memcmp and friends are different beasts.  They really need some TLC...

memcpy and memset in glibc are also extremely fast.
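
For anyone who wants to check the competing claims above on their own
machine, here is a minimal sketch of a timing harness. It is not the
memtest script mentioned earlier in the thread, which was not posted here.
The names time_copy, verify_copy and report, the buffer size, and the list
of sizes are all hypothetical choices for illustration. The copy length is
passed at run time, so whether the memcpy is expanded inline or left as a
library call should depend on the options GCC is given; that is worth
confirming by reading the generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical buffer size: large enough for the sizes discussed in the thread. */
enum { BUF_SIZE = 1 << 20 };

static unsigned char src_buf[BUF_SIZE];
static unsigned char dst_buf[BUF_SIZE];

/* Return 1 if an n-byte memcpy reproduces the source, 0 otherwise
   (including when n does not fit in the buffers). */
int verify_copy(size_t n)
{
    if (n > BUF_SIZE)
        return 0;
    for (size_t i = 0; i < n; i++)
        src_buf[i] = (unsigned char)(i * 31u + 7u);
    memset(dst_buf, 0, n);
    memcpy(dst_buf, src_buf, n);
    return memcmp(dst_buf, src_buf, n) == 0;
}

/* Copy n bytes `iters` times and return the average CPU time per copy
   in nanoseconds, or -1.0 if the arguments are out of range. */
double time_copy(size_t n, long iters)
{
    if (n > BUF_SIZE || iters <= 0)
        return -1.0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++) {
        memcpy(dst_buf, src_buf, n);
        /* Compiler barrier (GCC extension) so the copies are not optimized away. */
        __asm__ volatile("" : : "r"(dst_buf) : "memory");
    }
    clock_t stop = clock();

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    return seconds * 1e9 / (double)iters;
}

/* Print one line per block size; the sizes are arbitrary sample points. */
void report(void)
{
    static const size_t sizes[] = { 8, 16, 32, 256, 8192, 81920 };
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        printf("%6zu bytes: %8.2f ns/copy\n",
               sizes[i], time_copy(sizes[i], 1000000L));
}
```

Calling report() from main and building the same file (say, a hypothetical
bench.c) once with -O2 -mstringop-strategy=libcall and once with
-O2 -mstringop-strategy=rep_8byte gives two tables to compare. clock() is
coarse, so the iteration count needs to be large, and numbers from such a
loop only reflect hot caches and well-predicted branches.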


-- 
H.J.

Follow-Ups:
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jan Hubicka

References:
- [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Xinliang David Li
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jan Hubicka
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Xinliang David Li
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jan Hubicka
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Xinliang David Li
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jan Hubicka
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Xinliang David Li
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jakub Jelinek
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: H.J. Lu
- Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
  - From: Jan Hubicka
