-falign-loops=16 on apple gcc still gives loops not aligned to 16 byte address boundaries -why?
Wed Mar 8 19:43:00 GMT 2006
> one loop running from a VTK lib
> which takes up much processor time
> has not been aligned to 16 byte boundary
The -falign-loops option is a request, not a requirement, so not every
loop ends up aligned to the specified value. GCC uses heuristics to decide
whether a loop should be aligned, and it aligns a loop only if doing so
does not require more than a certain number of padding bytes (nops).
Compiling with profile feedback can help, because it gives GCC measured
execution counts to base those decisions on instead of static estimates.
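
To make the padding arithmetic concrete, here is a minimal sketch in C. The
function names and the max_skip threshold are hypothetical and only
illustrate the kind of rule described above. They are not GCC's actual
implementation, and the real limit depends on the target and compiler
version.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of padding bytes needed to move addr up to the next multiple
   of alignment. alignment must be a power of two (e.g. 16). */
static size_t padding_needed(uintptr_t addr, size_t alignment)
{
    size_t mask = alignment - 1;
    return (alignment - (size_t)(addr & mask)) & mask;
}

/* Hypothetical decision rule (an assumption, not GCC's code): align the
   loop head only if the padding does not exceed max_skip bytes. */
static int would_align(uintptr_t addr, size_t alignment, size_t max_skip)
{
    return padding_needed(addr, alignment) <= max_skip;
}
```

With a loop head address taken from a disassembly or from Shark, a result of
0 from padding_needed(addr, 16) means the loop already sits on a 16-byte
boundary. A large result suggests why a compiler that limits padding might
have left the loop unaligned.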
> shark also tells me this loop contains a single-precision floating
> point computation that could be sped up using altivec
> -fast also turns on -maltivec
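-maltivec is not the same as auto-vectorization: it lets the compiler
accept AltiVec instructions and intrinsics, but it does not by itself
rewrite scalar loops into vector code. You can either try GCC's
auto-vectorizer (-ftree-vectorize in GCC 4.x) or manually convert the loop
to use AltiVec intrinsics.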
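
As an illustration of a loop shape that auto-vectorizers generally handle
well, here is a minimal sketch in C99. The function saxpy is a made-up
example, not the VTK loop from the original question. Whether a given
compiler actually vectorizes it depends on the compiler version, the target,
and the flags used (for example -O2 -ftree-vectorize -maltivec, assuming the
compiler build supports them).

```c
#include <stddef.h>

/* Single-precision y[i] = a * x[i] + y[i].
   The restrict qualifiers (C99) promise that x and y do not overlap,
   which removes an aliasing concern that can block vectorization.
   The loop has a simple counted form with unit-stride accesses. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    size_t i;
    for (i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

If the compiler does not vectorize a loop like this, inspecting the generated
assembly for vector instructions is one way to confirm it, and rewriting the
loop body with AltiVec intrinsics remains the manual alternative.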