Bug 26650 - unaligned (SSE) stack access, smashing
Summary: unaligned (SSE) stack access, smashing
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.1.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-12 06:19 UTC by tbp
Modified: 2009-09-17 10:29 UTC

See Also:
Host:
Target: x86, x86_64
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
testcase #1 (114.72 KB, application/octet-stream)
2006-03-12 06:21 UTC, tbp
testcase #2 (114.75 KB, application/octet-stream)
2006-03-12 06:21 UTC, tbp

Description tbp 2006-03-12 06:19:33 UTC
This bug is transient and sensitive to code/structure rearrangements and to how things get inlined. In the included testcases it shows up as unaligned stack loads/stores, but in the current app I also have values being smashed on the stack with no segfaults.

Shows up with g++ 4.1.0 and 4.2-20060225 on x86 (cygwin) and x86-64 (linux), and in fact with every 4.2.x I have tried.

With this script...
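#!/usr/bin/perl
# Print movaps instructions whose %esp-relative offset is not a multiple of 16.
use strict;
use warnings;

while (<>) {
	chomp;
	next if !/movaps/;                 # only aligned packed moves
	next if !/esp/;                    # only stack-relative operands
	next if !/(0x\w+)/;                # capture the hex offset
	next if substr($1, -1, 1) eq '0';  # last hex digit 0 => multiple of 16
	print "$_\n";
}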

... and g++4.1.0 on cygwin...
/usr/local/gcc-4.1.0/bin/g++ -march=k8 -mfpmath=sse -msse3 -O2 -fomit-frame-pointer bogus1.ii -c -o tt1.o && objdump.exe -d --no-show-raw-insn tt1.o |./check_alignment.pl
    1664:       movaps %xmm0,0x7c8(%esp)
    2054:       movaps %xmm0,0x318(%esp)
    28cd:       movaps %xmm0,0x1f8(%esp)
    4579:       movaps %xmm0,0x338(%esp)
    513d:       movaps %xmm0,0x328(%esp)

/usr/local/gcc-4.1.0/bin/g++ -march=k8 -mfpmath=sse -msse3 -fomit-frame-pointer -Os bogus2.ii -c -o tt2.o && objdump.exe -d --no-show-raw-insn tt2.o |./check_alignment.pl
     274:       movaps %xmm5,0x74(%esp)
     281:       movaps %xmm1,0x64(%esp)
     2ac:       movaps %xmm4,0x84(%esp)
     2b8:       movaps %xmm4,0x84(%esp)
     2cf:       movaps %xmm5,0x54(%esp)
     2d8:       movaps %xmm5,0x54(%esp)
     2e9:       movaps %xmm0,0x44(%esp)
     2f1:       movaps %xmm0,0x44(%esp)
     3a3:       movaps %xmm3,0x34(%esp)
     3a8:       movaps %xmm1,0x24(%esp)
     426:       movaps 0x24(%esp),%xmm7
     475:       movaps 0x34(%esp),%xmm4
     4cf:       movaps 0x64(%esp),%xmm0
     851:       movaps %xmm0,0x18(%esp)
     859:       movaps 0x18(%esp),%xmm2
     865:       movaps %xmm0,0x28(%esp)
     879:       movaps 0x18(%esp),%xmm0
     903:       movaps 0x18(%esp),%xmm0
[snipped 300 more]

Excuse the large testcases, but I have no idea how to reproduce it otherwise, and it only happens in that rather large unit.
Comment 1 tbp 2006-03-12 06:21:13 UTC
Created attachment 11024
testcase #1
Comment 2 tbp 2006-03-12 06:21:34 UTC
Created attachment 11025
testcase #2
Comment 3 Andrew Pinski 2006-03-12 14:50:31 UTC
_mm_store_ss((float*)(((float*) &rays[0]) + 0), (pvx));

Comment 4 Andrew Pinski 2006-03-12 14:52:01 UTC
I don't think rays[0] is a POD so this might turn out to be a bug in your code.
Comment 5 Andreas Schwab 2006-03-12 15:45:34 UTC
vec_t is a non-POD type because it has a user-defined copy assignment operator, thus ray_t can't be a POD either.
Comment 6 tbp 2006-03-12 21:03:18 UTC
You're right, but that's a _mm_store_ss/movss asking for 4-byte alignment (which is satisfied), not a movaps with a 16-byte constraint. It is the movaps accesses that are causing problems.
Comment 7 tbp 2006-03-12 21:35:32 UTC
For clarification, I should say that rt::mono::ray_t, which uses vec_t etc., isn't a source of trouble; it's part of the single-ray path, where mostly scalar ops are used.

There's a symmetrical set of structures in rt::packet which deal with bundles of rays (i.e. 2x2) and use packed vectors; that's what that unit is massaging.
Some functions have a bunch of live 16-byte aligned data on the stack, and depending on how they get (force_)inlined, g++ goes nuts and forgets about those constraints.
Comment 8 Uroš Bizjak 2009-09-17 10:29:34 UTC
GCC versions <= 4.2.x are no longer supported. (BTW: a lot of alignment fixes went into gcc-4.4.x, so there is a good chance the bug is fixed there.)