This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: alignment issues for sse
Hi Brian,
>Could the problem be that a Camera class cannot be allocated on the heap
in such a way that allows 16 byte alignment of the vector data types?
Oh yes, I believe that is very possibly the problem.
On my system, every heap allocation appears to be handed out as if
__attribute__((aligned(8))) were imposed on it. (This has no bearing on
padding between array elements.)
For example:
struct three { char m[3]; };
three* p = new three[4];
The addresses could be...
&p[0] == 0x10008;
&p[1] == 0x1000B;
&p[2] == 0x1000E;
&p[3] == 0x10011;
Notice that the first one is aligned on an 8-byte boundary, and the rest
follow at 3-byte strides with no padding.
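If you want to see what your own heap manager actually hands back, a minimal sketch along these lines may help. The helper names is_aligned and observed_alignment are my own (hypothetical, not from any library), and the result only tells you what one particular allocation happened to get, not what the platform guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct three { char m[3]; };

// True when `p` sits on an `alignment`-byte boundary.
// `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Largest power of two (capped at 4096) that divides the address of `p`.
inline std::size_t observed_alignment(const void* p)
{
    std::size_t a = 1;
    while (a < 4096 && is_aligned(p, a * 2)) {
        a *= 2;
    }
    return a;
}
```

Passing the result of new three[4] to observed_alignment() on a system like the one described above would be expected to report 8 or more, while is_aligned(&p[1], 8) would be false because the elements are packed 3 bytes apart.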
The alignment "promise" of the heap management subsystem is platform
dependent. As far as I am aware, there is no standard means to communicate
alignment requirements to the heap manager. :-(
Some heap managers, such as the one with SAS/C++, have lots of knobs to
programmatically tweak heap manager behavior. But that kind of API is not
standard C or C++, and I'd be hesitant to rely upon it if portability is a
concern (and for me, it is always a concern).
>This occurs to me now because of what you said earlier about allocating
by malloc, and also because my test program ONLY included objects on the stack.
Serendipitous comment! :-)
>If this is the case, do I need to use a special memory allocator that
does aligned heap allocations?
Yes. In C++, you can override the new, new[], delete and delete[]
operators of your class and build the desired alignment behavior into them.
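As a rough sketch of that approach: the class below routes its own new/delete through a helper that over-allocates with malloc and rounds the result up to a 16-byte boundary. The helper names aligned16_alloc and aligned16_free and the Camera members are assumptions for illustration only; the real Camera class from this thread is not shown here, and the float arrays merely stand in for __m128 members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Over-allocate with malloc, round up to a 16-byte boundary, and stash the
// original malloc pointer just below the aligned block so it can be freed.
inline void* aligned16_alloc(std::size_t size)
{
    const std::size_t align = 16;
    void* raw = std::malloc(size + align + sizeof(void*));
    if (raw == nullptr) {
        throw std::bad_alloc();
    }
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
    assert(aligned % align == 0);
    reinterpret_cast<void**>(aligned)[-1] = raw;  // remember what to free
    return reinterpret_cast<void*>(aligned);
}

// Free a block obtained from aligned16_alloc(); a null pointer is ignored.
inline void aligned16_free(void* p)
{
    if (p != nullptr) {
        std::free(static_cast<void**>(p)[-1]);
    }
}

// Hypothetical stand-in for the Camera class discussed in the thread.
class Camera
{
public:
    float position[4];   // imagine __m128 members here
    float direction[4];

    static void* operator new(std::size_t size)   { return aligned16_alloc(size); }
    static void* operator new[](std::size_t size) { return aligned16_alloc(size); }
    static void  operator delete(void* p)         { aligned16_free(p); }
    static void  operator delete[](void* p)       { aligned16_free(p); }
};
```

With this in place, "new Camera" returns a 16-byte-aligned object. One caution for the array forms: for types with a non-trivial destructor the compiler may put a bookkeeping prefix in front of the first element, so it is worth checking the element addresses on your platform rather than assuming they are aligned.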
Alternatively, you can create your own custom allocator object -- but I'm
not familiar with the caveats / pitfalls / worries of that technique.
Alternatively alternatively, you could perform the alignment yourself by
kluge-magic, such as:
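#include <stdint.h>     // uintptr_t
#include <xmmintrin.h>  // __m128
struct my_m128
{
    char m[32]; // 16 bytes of payload plus slack to round up to a 16-byte boundary.
    operator __m128& () { return *(__m128*)(&m[(16 - ((uintptr_t)(&m[0]) & 0x0F)) & 0x0F]); }
};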
The gotcha is the wasted space (16 extra bytes per object), which is only
worrisome for arrays.
I think your best bet is to write your own mini-heap manager that serves
only __m128 allocations.
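To give an idea of what such a mini-heap could look like, here is a minimal sketch of a fixed-capacity pool that hands out 16-byte blocks on 16-byte boundaries. The class name Pool16 and the capacity of 64 are made up for illustration, it relies on C++11 alignas, and it is not thread-safe; treat it as a starting point under those assumptions rather than a finished allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity pool of 16-byte, 16-byte-aligned blocks.
class Pool16
{
public:
    static const std::size_t kBlocks = 64;  // arbitrary capacity for the sketch

    Pool16() : free_list_(nullptr)
    {
        // Thread every block onto the free list; block 0 ends up at the head.
        for (std::size_t i = kBlocks; i-- > 0; ) {
            Block* b = &blocks_[i];
            b->next = free_list_;
            free_list_ = b;
        }
    }

    // Returns a 16-byte block, or nullptr when the pool is exhausted.
    void* allocate()
    {
        if (free_list_ == nullptr) {
            return nullptr;
        }
        Block* b = free_list_;
        free_list_ = b->next;
        return b;
    }

    // Returns a block previously obtained from allocate() to the pool.
    void deallocate(void* p)
    {
        if (p == nullptr) {
            return;
        }
        assert(reinterpret_cast<std::uintptr_t>(p) % 16 == 0);
        Block* b = static_cast<Block*>(p);
        b->next = free_list_;
        free_list_ = b;
    }

private:
    union alignas(16) Block
    {
        Block* next;              // link, used while the block is free
        unsigned char bytes[16];  // payload, used while the block is allocated
    };

    Block  blocks_[kBlocks];
    Block* free_list_;
};
```

The pool object itself has to live somewhere that honors its 16-byte alignment (a static or stack object, for instance); putting the pool on a heap that only promises 8-byte alignment would bring back the original problem.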
>Are any simple libraries available?
Not to my knowledge. I do know that there are several high performance
heap replacement libraries (each one is tuned for different performance
characteristics) -- but I do not know the details about any of them. I
wouldn't be surprised if one or more of them can be tuned to allocate only
on 16-byte boundaries.
Side note: some heap management libraries are useful for debugging. They
can detect double deletes / double frees, overruns, underruns, and
unreleased memory at program termination (leaks), and can scrub deallocated
memory with a known garbage value (e.g., 0xDEADBEEF), et cetera. These can
be very useful tools for the developer's arsenal.
--Eljay