This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [Fwd: performance with gcc -O0/-O2]


J.C. Pizarro wrote:
For your Opteron, try these options:

-O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions \
-fpeel-loops -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow

On Opteron hardware it is said to be better to use SSE2 than SSE3.
The MMX and 3DNow!+ instructions are shorter and older than the SSE2/SSE
instructions.

Interesting. With these flags the peak was 39K/sec, and throughput didn't top out until 272 client connections. (This is quite a lengthy test: I run 2 minutes per iteration with a fixed number of clients, increase the client count on the next iteration, and repeat until the transaction count stops growing. So this ran for over two hours before it finally maxed out.) I guess this is a pretty good setting for heavy scalability, even though it didn't quite reach 40K/sec.
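
For anyone wanting to reproduce this kind of ramp-up test, the loop described above could be sketched roughly as follows. This is only an illustration, not the harness actually used here: `ramp_until_plateau`, `measure`, and `simulated_measure` are hypothetical names, the start and step sizes are assumptions, and the simulated curve is made-up data that stands in for a real 2-minute load run.

```python
from typing import Callable, Tuple


def ramp_until_plateau(
    measure: Callable[[int], float],
    start: int = 10,
    step: int = 10,
    max_clients: int = 1000,
) -> Tuple[int, float]:
    """Raise the client count until throughput stops growing.

    measure(n) is a hypothetical callback that runs one fixed-length
    iteration with n clients and returns transactions per second.
    Returns (client count at the peak, peak rate).
    """
    best_clients, best_rate = 0, 0.0
    clients = start
    while clients <= max_clients:
        rate = measure(clients)
        if rate <= best_rate:
            # No growth over the previous iteration: the previous one was the peak.
            break
        best_clients, best_rate = clients, rate
        clients += step
    return best_clients, best_rate


def simulated_measure(clients: int) -> float:
    """Made-up throughput curve (assumption): linear up to 40 clients, then a slow decline."""
    return 100.0 * min(clients, 40) - 5.0 * max(0, clients - 40)


if __name__ == "__main__":
    peak_clients, peak_rate = ramp_until_plateau(simulated_measure)
    print(f"peak of {peak_rate:.0f}/sec at {peak_clients} clients")
```

In a real run, `measure` would start the load generator with the given number of clients, wait out the iteration, and return the observed rate. Stopping at the first iteration that fails to improve is the simplest rule; noisy measurements may call for a tolerance or a few confirming iterations.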


During these tests I see about 94% of one core consumed by interrupt processing, with 2% idle time left. I guess this rate of ~200K packets per second is pretty near the limit of what this system can handle on Gigabit Ethernet. I've seen this box hit as high as 43K auths/sec using 4 slapd processes with 3 threads each, as opposed to a single process with 8 threads. In that test 100% of a core was doing interrupt processing.
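
As a rough sanity check on those figures, the packet rate can be divided by the transaction rate to get packets per auth. The sketch below assumes that the ~200K packets/sec figure goes with the 39K/sec peak and that all of the traffic belongs to the test; `packets_per_auth` is a hypothetical helper name.

```python
def packets_per_auth(packets_per_sec: float, auths_per_sec: float) -> float:
    """Average number of packets handled per completed auth."""
    return packets_per_sec / auths_per_sec


# Approximate figures quoted above: ~200K packets/sec at a 39K auths/sec peak.
ratio = packets_per_auth(200_000, 39_000)

if __name__ == "__main__":
    print(f"about {ratio:.1f} packets per auth")
```

Under those assumptions the load works out to roughly five packets per auth.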

Anyway, thanks again for all your responses.
--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

