alignment issues for sse

Eljay Love-Jensen
Wed Feb 16 15:32:00 GMT 2005

Hi Brian,

 >Could the problem be that a Camera class cannot be allocated on the heap 
in such a way that allows 16 byte alignment of the vector data types?

Oh yes, I believe that is very possibly the problem.

On my system, it appears that the memory allocation is fixed as if 
__attribute__((aligned(8))) is imposed on the allocation.  (This has no 
bearing on padding.)

For example:
struct three { char m[3]; };
three* p = new three[4];

The addresses could be...
&p[0] == 0x10008;
&p[1] == 0x1000B;
&p[2] == 0x1000E;
&p[3] == 0x10011;

Notice that only the first one is aligned on an 8-byte boundary; the later 
elements follow at 3-byte intervals because sizeof(three) is 3.
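
If you want to check what your own heap manager hands back, here is a minimal sketch. The helper name is_aligned is my own invention for illustration, not a standard function, and the addresses you see will differ from the ones above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is a multiple of `boundary` bytes.
inline bool is_aligned(const void* p, std::size_t boundary) {
    return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}

// Same struct as in the example above: three bytes, no padding expected.
struct three { char m[3]; };
```

Calling is_aligned(&p[0], 16) on the result of new three[4] tells you whether that particular allocation happened to land on a 16-byte boundary. A single lucky result proves nothing about what the heap manager promises.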

The alignment "promise" of the heap management subsystem is platform 
dependent.  As far as I am aware, there is no standard means to communicate 
alignment requirements to the heap manager.  :-(

Some heap managers, such as the one with SAS/C++, have lots of knobs to 
programmatically tweak heap manager behavior.  But that kind of API is not 
standard C or C++, and I'd be hesitant to rely upon it if portability is a 
concern (and for me, it is always a concern).

 >This occurs to me now because of what you said earlier about allocating 
by malloc, and also because my test program ONLY included objects on the stack.

Serendipitous comment!  :-)

 >If this is the case, do I need to use a special memory allocator that 
does aligned heap allocations?

Yes.  In C++, you can override the new, new[], delete and delete[] 
operators of your class and instrument in the desired alignment behavior.
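
A rough sketch of what that could look like, assuming nothing beyond malloc and free. The helper names aligned16_alloc and aligned16_free are hypothetical, and this Camera is only a stand-in with float arrays where the real class would hold __m128 members. The trick is to over-allocate, round up to a 16-byte boundary, and stash the original pointer just below the block handed out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: returns a 16-byte-aligned block of at least `size` bytes.
inline void* aligned16_alloc(std::size_t size) {
    const std::size_t align = 16;
    // Room for the payload, the worst-case alignment slack, and a saved pointer.
    void* raw = std::malloc(size + align + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned = (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
    // Remember what malloc returned so it can be freed later.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: releases a block obtained from aligned16_alloc.
inline void aligned16_free(void* p) {
    if (p) std::free(reinterpret_cast<void**>(p)[-1]);
}

// Stand-in for the Camera class; a real one would hold __m128 members.
struct Camera {
    float position[4];
    float direction[4];

    static void* operator new(std::size_t n)   { return aligned16_alloc(n); }
    static void* operator new[](std::size_t n) { return aligned16_alloc(n); }
    static void  operator delete(void* p)   noexcept { aligned16_free(p); }
    static void  operator delete[](void* p) noexcept { aligned16_free(p); }
};
```

With this in place, new Camera and new Camera[n] both go through the aligned path, while stack and static instances are unaffected and still depend on the compiler honoring the type's alignment.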

Alternatively, you can create your own custom allocator object -- but I'm 
not familiar with the caveats / pitfalls / worries of that technique.
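
For what it is worth, here is a sketch of a minimal allocator object for standard containers. It assumes a C++17 compiler, which postdates this message, and relies on the aligned forms of ::operator new and ::operator delete that C++17 added. The name Aligned16Allocator is made up for illustration.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical minimal allocator: every block it hands out is 16-byte aligned.
template <class T>
struct Aligned16Allocator {
    using value_type = T;

    Aligned16Allocator() = default;
    template <class U>
    Aligned16Allocator(const Aligned16Allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // C++17 aligned allocation function.
        return static_cast<T*>(::operator new(n * sizeof(T), std::align_val_t(16)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        // Must be released with the matching aligned deallocation function.
        ::operator delete(p, std::align_val_t(16));
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <class T, class U>
bool operator==(const Aligned16Allocator<T>&, const Aligned16Allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const Aligned16Allocator<T>&, const Aligned16Allocator<U>&) { return false; }
```

A std::vector<float, Aligned16Allocator<float> > then keeps its buffer on a 16-byte boundary. This only helps for storage the container allocates; it does nothing for objects created with plain new.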

Alternatively alternatively, you could perform the alignment yourself by 
kluge-magic, such as:

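struct my_m128 {
   char m[32]; // 16 bytes of payload plus up to 15 bytes of alignment slack.
   // Needs <xmmintrin.h> and <stdint.h>; skips to the first 16-byte boundary in m.
   operator __m128& () { return *(__m128*)(&m[(16 - ((uintptr_t)&m[0] & 0x0F)) & 0x0F]); }
};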

The gotcha is the wasted space, which is only worrisome for arrays.

I think your best bet is to manage your own __m128-only mini-heap manager.
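
To give a flavor of what such a mini-heap might look like, here is a sketch of a fixed-capacity pool that hands out 16-byte slots on 16-byte boundaries. The class name Pool16 and its interface are hypothetical; it is not thread-safe and does not grow, so treat it as a starting point only.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical fixed-capacity pool of 16-byte, 16-byte-aligned slots.
class Pool16 {
public:
    explicit Pool16(std::size_t slots)
        : storage_(slots * 16 + 15), free_head_(nullptr) {
        // Round the start of the buffer up to a 16-byte boundary.
        std::uintptr_t a = reinterpret_cast<std::uintptr_t>(storage_.data());
        unsigned char* base = storage_.data() + ((16 - (a & 15)) & 15);
        // Thread every slot onto an intrusive free list.
        for (std::size_t i = 0; i < slots; ++i)
            free_head_ = new (base + i * 16) Slot{free_head_};
    }
    Pool16(const Pool16&) = delete;            // slots point into storage_
    Pool16& operator=(const Pool16&) = delete;

    // Returns a 16-byte-aligned slot, or nullptr when the pool is exhausted.
    void* allocate() {
        if (!free_head_) return nullptr;
        Slot* s = free_head_;
        free_head_ = s->next;
        return s;
    }
    // Gives a slot previously obtained from allocate() back to the pool.
    void deallocate(void* p) {
        free_head_ = new (p) Slot{free_head_};
    }

private:
    struct Slot { Slot* next; };
    std::vector<unsigned char> storage_;
    Slot* free_head_;
};
```

A slot from allocate() can then hold one __m128 via placement new. Because every slot is the same size, there is no fragmentation and no per-allocation header, which addresses the wasted-space worry for arrays of vectors.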

 >Are any simple libraries available?

Not to my knowledge.  I do know that there are several high performance 
heap replacement libraries (each one is tuned for different performance 
characteristics) -- but I do not know the details about any of them.  I 
wouldn't be surprised if one or more of them can be tuned to allocate only 
on 16-byte boundaries.

Side note:  some heap management libraries are useful for debugging -- 
double deletes / double free, overruns, underruns, scrubbing deallocated 
memory with a known garbage value (e.g., 0xDEADBEEF), unreleased memory at 
program termination (leaks), et cetera.  These can be very useful tools 
in the developer's arsenal.

