This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



alignment on Solaris x86


Hi, everybody!

As this doesn't seem to be news (I just figured out that the problem was
already reported and fixed), I'm posting it here on the general discussion
list. It's just a short story about how much misalignment of floating point
variables on Intel can cost. Those who hate HTML attachments can read it at
http://fy.chalmers.se/~appro/gcc-2.95.html.

Cheers. Andy.

P.S. I just double-checked 2.95.1 at the last second. Even though
http://egcs.cygnus.com/ml/gcc-patches/1999-07/msg00733.html claims that
a fix similar to mine was applied, it's not in 2.95.1!
Title: How can two lines make your code run 25% faster on Solaris x86

How can two lines make your code run 25% faster


A friend of mine loves to benchmark his programs on different computer platforms and compilers. Some time ago he took advantage of Sun's promotional offer and obtained and installed Solaris 7 on his brand new Pentium-II home computer. At work he has to use Windows NT, though. For this reason I used to ask him various small questions about NT, and he used to answer "It's a professional operating system, you know. Everything has to be slow and take a lot of memory." When he installed Solaris 7 and I asked him how it felt, he replied "Well... It seems to be a professional operating system. Everything is slow and takes a lot of memory." I asked "What's slow?" as I already knew it takes memory. It turned out that the very first thing he did was compile the program he had been benchmarking under Linux on the very same computer (I should have guessed:-) and found that it ran 33% slower. Yes, he was using the same compiler, namely egcs-1.1.2... I myself use Solaris at work, and even though I use the SPARC version I felt slightly embarrassed and responsible, as it was actually me who had encouraged him to install Solaris x86. No, 33% is no excuse even for a "professional" system, you know...

My first move was "a professional system needs a professional compiler." So I persuaded him to download the trial version of Workshop for Intel. No, I wasn't expecting him to pay 1,395 USD for a program to use at home; that 33% was simply bugging me (as well as him) that much... Workshop C exhibited a 15% improvement over egcs-1.1.2 but was still way (well, your mileage may vary) behind Linux. Something you don't want to learn about software after you've paid good money for it, huh?

But what does his program do? It's floating point intensive, it's C... He must have passed floating point arguments by value and/or had some local floating point variables... And it suddenly struck me! All those arguments and local variables reside on the stack, and how the hell are they aligned? I had myself reported that egcs doesn't keep track of it and that they (the variables) may or may not be aligned. The good news was that we already knew they had been working on it for a while:

March 23, 1999
Through the efforts of John Wehle and Bernd Schmidt, GCC will now attempt to keep the stack 64bit aligned on the x86 and allocate doubles on 64bit boundaries. This can significantly improve floating point performance on the x86. Work will continue on aligning the stack and floating point values in the stack.

So the next move was naturally to download gcc-2.95... Within a couple of hours my friend reported that on Linux his program ran 8% faster and my test program (from the bug report) exhibited perfect alignment of local variables. I was impatiently waiting for results from Solaris... An hour and a half passed... Yes! He called and said it's an 8% improvement on Solaris as well (i.e. still the very same 33% behind Linux), but the local variables turned out to be misaligned:-( How come?

Well, the question is how does this "attempt to keep the stack 64bit aligned" work? I could think of two ways. One can align it in every function with something like "movl %esp,%eax; andl $-8,%esp; pushl %eax" and then "popl %eax; movl %eax,%esp" at return. Or one can simply preserve alignment from some reference point. It looks like gcc-2.95 assumes and preserves double alignment of the first argument passed to any particular function. So all one has to do is make sure the first argument to main() is double aligned:

gcc-2.95.perf.patch
*** ./gcc/config/i386/sol2-c1.asm.orig Wed Dec 16 22:04:08 1998
--- ./gcc/config/i386/sol2-c1.asm Thu Aug 12 23:39:02 1999
***************
*** 59,64 ****
--- 59,70 ----
          pushl $0x0
          movl %esp,%ebp
  
+ ! Make sure the first argument to main is double aligned. I've counted 5
+ ! pushes in the original code so if I "issue" one more it's set.
+ !                                     <appro@fy.chalmers.se>
+         andl $-8,%esp
+         subl $4,%esp
+ 
  ! As specified per page 3-32 of the ABI, %edx contains a function
  ! pointer that should be registered with atexit(), for proper
  ! shared object termination. Just push it onto the stack for now
*** ./gcc/config/i386/sol2-gc1.asm.orig Wed Dec 16 22:04:12 1998
--- ./gcc/config/i386/sol2-gc1.asm Thu Aug 12 23:39:18 1999
***************
*** 62,67 ****
--- 62,73 ----
          pushl $0x0
          movl %esp,%ebp
  
+ ! Make sure the first argument to main is double aligned. I've counted 5
+ ! pushes in the original code so if I "issue" one more it's set.
+ !                                     <appro@fy.chalmers.se>
+         andl $-8,%esp
+         subl $4,%esp
+ 
  ! As specified per page 3-32 of the ABI, %edx contains a function
  ! pointer that should be registered with atexit(), for proper
  ! shared object termination. Just push it onto the stack for now

This was my present for my friend's birthday. Now his program runs only 5% slower on Solaris than on Linux:-) So 5% is what a professional system should cost nowadays, huh?


Andy. Doing his an(d)ything:-)
