This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: auto-vectorization analysis/__builtin_assume_aligned on gcc-4.7-20120114


On Thu, Jan 19, 2012 at 2:12 PM, Alexander Herz <alexander.herz@mytum.de> wrote:
> The generated non-vectorized assembly is simply the unrolled loop with >8
> iterations, so loop structure is pretty much intact (except for unrolling).
>
> Does the vectorizer fail on unrolled loops?
>
> Should I put together some assembly dumps showing both the vectorized and
> the unvectorized loop?

Assembly does not help.  Loop unrolling happens after vectorization.

Richard.

> Alex
>
>
> On 01/19/2012 11:29 AM, Richard Guenther wrote:
>>
>> On Wed, Jan 18, 2012 at 6:37 PM, Alexander Herz <alexander.herz@mytum.de>
>> wrote:
>>>
>>> Given this piece of code (gcc-4.7-20120114):
>>>
>>>    static void Test(Batch* block,Batch* new_block,const uint32 offs)
>>>    {
>>>
>>>        T* __restrict old_values =(T*)__builtin_assume_aligned(block->items,16);
>>>        T* __restrict new_values =(T*)__builtin_assume_aligned(new_block->items,16);
>>>
>>>        //assert(((uint64)(&block->items)%16)==0); //OK!!
>>>        //assert(((uint64)(&new_block->items)%16)==0);
>>>
>>>        for(uint32 c=0;c<(BS<<1);c++) //hopefully compiler applies SIMD here
>>>        {
>>>            new_values[c]=old_values[c]*old_values[c];
>>>        }
>>>
>>>    }
>>>
>>> I would assume that the loop is always vectorized (the pointers are marked
>>> __restrict and aligned, and the loop runs over a fixed iteration space that
>>> is even a power of 2, so most likely divisible by 4). It is quite similar
>>> to vectorization example22
>>> (http://gcc.gnu.org/projects/tree-ssa/vectorization.html#vectorizab).
>>>
>>> I run the previously mentioned g++ version with this command line:
>>> -std=c++0x -g -O3 -msse -msse2 -msse3 -msse4.1 -Wall -Wstrict-aliasing=2
>>> -ftree-vectorizer-verbose=2
>>>
>>> Looking at the vectorizer output (and at the generated assembly), it looks
>>> as if the loop given above is indeed vectorized if Test() is called from
>>> main() (vectorized 1 loop).
>>>
>>> When Test() is called from deep inside some complex code, it looks as if
>>> the vectorization analysis gives up because the code is too complex to
>>> analyze and never considers the loop inside Test() in this context, even
>>> though it should be easily vectorizable in any context given the hints
>>> inside Test().
>>>
>>> Is there anything I can do, so that Test() is analyzed in all contexts? I
>>> guess all methods that contain the
>>> __builtin_assume_aligned hint should be considered for vectorization,
>>> independent of their context.
>>
>> Without a concrete example it is impossible to say.  I suppose earlier
>> optimizations destroy loop structure too much?
>>
>>> Thx for your help,
>>> Alex
>>>
>>>
>
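
Since the reply above asks for a concrete example, a self-contained test case
along the following lines could serve as a starting point. This is only a
sketch under stated assumptions: the definitions of uint32, T, BS and Batch are
guesses (the original post does not show them), and RunSquareExample() is a
hypothetical driver added purely for illustration. Only Test() is taken from
the post.

```cpp
#include <cassert>
#include <stdint.h>

typedef uint32_t uint32;      // assumption: typedef not shown in the post
typedef float T;              // assumption: element type not shown in the post
static const uint32 BS = 64;  // assumption: block size not shown in the post

struct Batch {
    // Assumed layout. The 16-byte alignment makes the promise given to
    // __builtin_assume_aligned below actually true.
    T items[BS << 1] __attribute__((aligned(16)));
};

// The function from the post; offs is unused there as well.
static void Test(Batch* block, Batch* new_block, const uint32 offs)
{
    (void)offs;

    T* __restrict old_values = (T*)__builtin_assume_aligned(block->items, 16);
    T* __restrict new_values = (T*)__builtin_assume_aligned(new_block->items, 16);

    for (uint32 c = 0; c < (BS << 1); c++)
    {
        new_values[c] = old_values[c] * old_values[c];
    }
}

// Hypothetical driver: fills a block, squares it through Test(), and returns
// the number of elements that differ from the expected result.
int RunSquareExample()
{
    Batch in, out;

    // Same checks as the commented-out asserts in the post, applied to the
    // address of the first element.
    assert(reinterpret_cast<uintptr_t>(in.items) % 16 == 0);
    assert(reinterpret_cast<uintptr_t>(out.items) % 16 == 0);

    for (uint32 c = 0; c < (BS << 1); c++)
    {
        in.items[c] = (T)c;
        out.items[c] = 0;
    }

    Test(&in, &out, 0);

    int mismatches = 0;
    for (uint32 c = 0; c < (BS << 1); c++)
    {
        if (out.items[c] != in.items[c] * in.items[c])
            ++mismatches;
    }
    return mismatches;
}
```

To use it as a standalone reproducer, add a main() that calls
RunSquareExample() and compile with the command line quoted above. The
surrounding code from the real program could then be added back piece by piece
until the vectorizer report for the loop in Test() changes.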

