This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: [Fwd: performance with gcc -O0/-O2]
- From: Andrew Haley <aph at redhat dot com>
- To: Howard Chu <hyc at highlandsun dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Tue, 27 Nov 2007 13:56:25 +0000
- Subject: Re: [Fwd: performance with gcc -O0/-O2]
- References: <474C1A62.9000300@highlandsun.com>
Howard Chu writes:
> A bit of a minor mystery. Not a problem, just a curiosity. If
> someone knew off the top of their head a reason for it, that'd be
> cool, but otherwise no sweat.
It's possible, although unlikely, that the optimized code has worse
cache behaviour. No way to know better without doing some profiling.
Andrew.
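
For scale, the throughput figures quoted in the forwarded message below can
be put side by side. This is only a minimal arithmetic sketch: the helper
name relative_change is hypothetical, and "37K" is assumed to mean roughly
37,000 auths/sec. It says nothing about the cause of the difference, which
still needs profiling.

```python
# Throughput figures (auths/sec) taken from the forwarded message.
def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0

windows_before = 7_800   # back-null, previous code, 8 client threads
windows_after = 11_140   # same test with the patch applied
linux_o0_peak = 40_101   # non-optimized build, peak on Linux
linux_o2_peak = 37_000   # "37K" in the message; assumed to be about 37,000

patch_gain = relative_change(windows_before, windows_after)
o2_vs_o0 = relative_change(linux_o0_peak, linux_o2_peak)

print(f"patch on Windows:         {patch_gain:+.1f}%")
print(f"-O2 vs -O0 peak on Linux: {o2_vs_o0:+.1f}%")
```

On these numbers the -O2 peak is under ten percent below the -O0 peak, so
the gap is small next to the gain from the patch itself.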
>
> -------- Original Message --------
> Subject: Re: commit: ldap/servers/slapd connection.c daemon.c proto-slap.h syncrepl.c
> Date: Tue, 27 Nov 2007 05:17:04 -0800
> From: Howard Chu <hyc@symas.com>
> To: OpenLDAP-devel@openldap.org
> References: <200711261603.lAQG3R7e010741@cantor.openldap.org>
> <474AFA54.6080805@symas.com> <474B0620.8030706@symas.com>
> <474B92F5.50306@symas.com>
>
> Howard Chu wrote:
> > Howard Chu wrote:
> >> Howard Chu wrote:
> >>> For reference, the peak throughput with back-null on the previous code was
> >>> only 7,800 auths/sec (with 8 client threads). With this patch it's 11,140
> >>> auths/sec.
>
> Those numbers are for Windows Server 2003 x86_64 on a Celestica A8440 with 4
> Opteron 875s, using OpenLDAP compiled with gcc 4.3.0. The following numbers
> are for Linux 2.6.23.1 x86_64, on the same machine, compiled first with gcc
> 4.1.2 and then later with gcc 4.2.2. There's no disk I/O in these tests.
>
> >>> In both cases the throughput declines as more client threads are
> >>> used. (Compare to 35,553 auths/sec for the same machine running Linux, and no
> >>> drop in throughput all the way up to hundreds/thousands of connections.)
>
> > Re-running on Linux with a non-optimized build, peaked at 40,101 auths/sec. (I
> > guess HEAD has sped up a bit more in the past week or so...)
>
> OK, this is odd. The code compiled without optimization peaks at 40K auths/sec
> at around 124-132 client threads. The code compiled with -O2 peaks at 37K auths/sec
> at around 128 client threads.
>
> The -O2 build is faster from about 4 to 24 client threads. From 28 on up, the
> nonoptimized code is faster at every load level. I was originally using gcc
> 4.1.2 but I'm seeing the same result now using gcc 4.2.2. Also, slapd is only
> configured with 8 worker threads in all of these tests. Strange that whatever
> optimizations the compiler has generated speed things up under lighter load,
> but work against it under heavier load.
> --
> -- Howard Chu
> Chief Architect, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903