This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: auto-vectorization analysis/__builtin_assume_aligned on gcc-4.7-20120114
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Alexander Herz <alexander dot herz at mytum dot de>
- Cc: gcc at gcc dot gnu dot org
- Date: Thu, 19 Jan 2012 11:29:09 +0100
- Subject: Re: auto-vectorization analysis/__builtin_assume_aligned on gcc-4.7-20120114
- References: <4F170362.6010000@mytum.de>
On Wed, Jan 18, 2012 at 6:37 PM, Alexander Herz <alexander.herz@mytum.de> wrote:
> Given this piece of code (gcc-4.7-20120114):
>
>     static void Test(Batch* block,Batch* new_block,const uint32 offs)
>     {
>
>         T* __restrict old_values =(T*)__builtin_assume_aligned(block->items,16);
>         T* __restrict new_values =(T*)__builtin_assume_aligned(new_block->items,16);
>
>         //assert(((uint64)(&block->items)%16)==0); //OK!!
>         //assert(((uint64)(&new_block->items)%16)==0);
>
>         for(uint32 c=0;c<(BS<<1);c++) //hopefully compiler applies SIMD here
>         {
>             new_values[c]=old_values[c]*old_values[c];
>         }
>
>     }
>
> I would assume that the loop is always vectorized (pointers tagged as
> restrict and aligned, loop over a fixed iteration space that is even a
> power of 2, so most likely divisible by 4). It is quite similar to
> vectorization example22
> (http://gcc.gnu.org/projects/tree-ssa/vectorization.html#vectorizab).
>
> I run the previously mentioned g++ version with this command line:
> -std=c++0x -g -O3 -msse -msse2 -msse3 -msse4.1 -Wall -Wstrict-aliasing=2
> -ftree-vectorizer-verbose=2
>
> Looking at the vectorizer output (and at the generated assembly) it looks as
> if the loop given above
> is indeed vectorized if Test() is called from main() (vectorized 1 loop).
>
> When the function Test() is called nested inside some complex code, it looks
> as if the vectorization analysis gives up because the code is too complex to
> analyze and never considers the loop inside Test() in this context, even
> though it should be easily vectorizable in any context given the hints
> inside Test().
>
> Is there anything I can do, so that Test() is analyzed in all contexts? I
> guess all methods that contain the
> __builtin_assume_aligned hint should be considered for vectorization,
> independent of their context.
Without a concrete example it is impossible to say. I suppose earlier
optimizations destroy loop structure too much?
> Thx for your help,
> Alex
>
>
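
For reference, a minimal sketch of what a self-contained test case for the loop above might look like. The definitions of T, uint32, BS and Batch are not given in the original post, so the ones below are hypothetical stand-ins. The noinline attribute is likewise an assumption of this sketch: it keeps Test() as a separate function so that its loop is analyzed on its own rather than after being inlined into a larger caller, which may or may not be what defeats the vectorizer in the reported case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: the original post does not show these definitions.
typedef float T;
typedef std::uint32_t uint32;
const uint32 BS = 64;

struct Batch {
    // Force 16-byte alignment so the __builtin_assume_aligned promise holds.
    T items[BS << 1] __attribute__((aligned(16)));
};

// Assumption: noinline keeps the loop in its own function body, so the
// vectorizer sees it independently of whatever code calls Test().
__attribute__((noinline))
void Test(Batch* block, Batch* new_block, const uint32 offs)
{
    (void)offs; // unused in the posted snippet as well

    // Runtime check of the alignment that is promised to the compiler below.
    assert(reinterpret_cast<std::uintptr_t>(block->items) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(new_block->items) % 16 == 0);

    T* __restrict old_values =
        static_cast<T*>(__builtin_assume_aligned(block->items, 16));
    T* __restrict new_values =
        static_cast<T*>(__builtin_assume_aligned(new_block->items, 16));

    // Fixed trip count (2 * BS), unit stride, no aliasing: element-wise square.
    for (uint32 c = 0; c < (BS << 1); c++) {
        new_values[c] = old_values[c] * old_values[c];
    }
}
```

Calling Test() from both a trivial main() and from the more complex call site, with the same command line as in the original message, and comparing the vectorizer output for the two builds should show whether the calling context is what makes the difference.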