This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: PR target/40838: gcc shouldn't assume that the stack is aligned
On Fri, Aug 7, 2009 at 5:53 AM, H.J. Lu<hjl.tools@gmail.com> wrote:
> On Fri, Aug 7, 2009 at 12:13 AM, Jakub Jelinek<jakub@redhat.com> wrote:
>> On Fri, Aug 07, 2009 at 02:54:46AM +0200, Mikulas Patocka wrote:
>>> > > In 32bit, the incoming stack may not be 16 byte aligned. This patch
>>> > > assumes the incoming stack is 4 byte aligned and realigns stack if any
>>> > > SSE variable is put on stack. Any comments?
>>> >
>>> > IMHO this is wrong, I could live with a non-default option for those who
>>> > don't care about performance and think a SCO document from 1996 has any
>>> > relevance to Linux these days. In reality a Linux ABI for years assumes
>>> > 16 byte stack alignment for 32-bit code.
>>>
>>> Tell me which Linux distribution did you run with 16-byte stack alignment
>>> checking (as proposed in bug 40838) and what was the result?
>>>
>>> For me, the result was that 75% of binaries in /bin in Debian Lenny do not
>>> align the stack on 16-byte boundary.
>>
>> Besides the obstack glibc bug which has been fixed since then you haven't
>> reported anything particular. It is true that parts of i?86 glibc are
>> compiled with -mpreferred-stack-boundary=2, but only parts that don't call
>> callbacks. Async signals AFAIK will align the stack properly.
>>
>> I simply don't trust your 75% claim, lots of stuff would break if things
>> weren't aligned properly.
>>
>
> From gcc 3.4:
>
>   /* Validate -mpreferred-stack-boundary= value, or provide default.
>      The default of 128 bits is for Pentium III's SSE __m128, but we
>      don't want additional code to keep the stack aligned when
>      optimizing for code size.  */
>   ix86_preferred_stack_boundary = (optimize_size
>                                    ? TARGET_64BIT ? 128 : 32
>                                    : 128);
>
> If you compile code with -Os, you will get 4 byte stack alignment.
> Just step back, we changed stack alignment from 4 byte to 16 byte
> for SSE since we couldn't realign stack at the time. Now we can
> realign the stack very efficiently. I think we should do it for SSE
> to support the existing Linux binaries which have 4 byte stack
> alignment. If it helps, I can compare -m32 -O3 -msse2 -mfpmath=sse
> results with SPEC CPU 2006, before and after my patch.
>
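For reference, the default selection in the gcc 3.4 excerpt quoted above can be restated as a small standalone function. This is only a sketch: the function name and its plain int parameters are made up for illustration (in GCC, optimize_size and TARGET_64BIT are compiler-internal), and the result is in bits, as in the original.

```c
#include <assert.h>

/* Hypothetical restatement of the quoted default: returns the preferred
   stack boundary in bits.  Only 32-bit code optimized for size gets
   32 bits (4 bytes); every other combination gets 128 bits (16 bytes). */
int default_preferred_stack_boundary(int optimize_size, int target_64bit)
{
    /* Illustrative sanity check: both arguments are used as booleans. */
    assert(optimize_size == 0 || optimize_size == 1);
    assert(target_64bit == 0 || target_64bit == 1);

    return optimize_size ? (target_64bit ? 128 : 32) : 128;
}
```

With optimize_size set and target_64bit clear, this yields 32 bits, which is the 4 byte alignment mentioned for -Os.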
Here are the differences of -m32 -O3 -msse2 -mfpmath=sse -ffast-math
-funroll-loops before and after my patch:
400.perlbench        -0.384615%
401.bzip2             0%
403.gcc              -0.362319%
429.mcf              -0.813008%
445.gobmk             0.921659%
456.hmmer             0.549451%
458.sjeng            -0.438596%
462.libquantum        0%
464.h264ref           0%
471.omnetpp          -0.478469%
473.astar            -0.645161%
483.xalancbmk        -0.727273%
SPECint(R)_base2006  -0.411523%

410.bwaves           -0.406504%
416.gamess            0%
433.milc             -1.36986%
434.zeusmp           -0.44843%
435.gromacs           0%
436.cactusADM         0%
437.leslie3d         -0.888889%
444.namd              1.20482%
447.dealII           -0.350877%
450.soplex           -0.31746%
453.povray            0.458716%
454.calculix          0%
459.GemsFDTD          0%
465.tonto             0%
470.lbm               0%
481.wrf               0.480769%
482.sphinx3           0.940439%
SPECfp(R)_base2006    0%
I think we should align the stack if SSE variables are put on the stack.
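
To make the arithmetic behind "aligning the stack" concrete, here is a minimal sketch in plain C. It is not the code the patch generates; the helper names are hypothetical and the addresses are ordinary integers standing in for a stack pointer. It only shows the rounding a realigning prologue has to perform when the incoming stack is 4 byte aligned but a 16 byte boundary is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a power-of-two boundary.
   The stack grows down on x86, so realigning means rounding down. */
uintptr_t align_down(uintptr_t addr, uintptr_t boundary)
{
    /* boundary must be a nonzero power of two for the mask to work. */
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return addr & ~(boundary - 1);
}

/* Hypothetical helper: nonzero if addr is a multiple of boundary. */
int is_aligned(uintptr_t addr, uintptr_t boundary)
{
    return align_down(addr, boundary) == addr;
}

/* Bytes skipped when an incoming stack pointer is rounded down
   to the requested boundary. */
uintptr_t realign_padding(uintptr_t incoming_sp, uintptr_t boundary)
{
    return incoming_sp - align_down(incoming_sp, boundary);
}
```

For example, an incoming stack pointer of 0x100c satisfies a 4 byte boundary but not a 16 byte one; rounding it down gives 0x1000 at a cost of 12 bytes of padding.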
--
H.J.