This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Fw: RFC: Representing vector lane load/store operations

From: Ira Rosen <ira dot rosen at linaro dot org>
To: Richard Guenther <richard dot guenther at gmail dot com>
Cc: gcc at gcc dot gnu dot org, rdsandiford at googlemail dot com
Date: Wed, 23 Mar 2011 12:01:59 +0200
Subject: Re: Fw: RFC: Representing vector lane load/store operations
References: <OF630F8944.96ACEF66-ONC225785C.00363150-C225785C.003637B4@il.ibm.com>

>> ...Ira would know best, but I don't think it would be used for this
>> kind of loop.  It would be more something like:
>>
>>   for (i=0; i<N; ++i)
>>     X[i] = Y[i].red + Y[i].blue + Y[i].green;
>>
>> (not a realistic example).  You'd then have:
>>
>>    compoundY = __builtin_load_lanes (Y);
>>    red = ARRAY_REF <compoundY, 0>
>>    green = ARRAY_REF <compoundY, 1>
>>    blue = ARRAY_REF <compoundY, 2>
>>    D1 = red + green
>>    D2 = D1 + blue
>>    MEM_REF <X> = D2;
>>
>> My understanding is that we'd never do any operations besides ARRAY_REFs
>> on the compound value, and that the individual vectors would be treated
>> pretty much like any other.
>
> Ok, I thought it might be used to have a larger vectorization factor for
> loads and stores, basically make further unrolling cheaper because you
> don't have to duplicate the loads and stores.

Right, we can do that using vld1/vst1 instructions (full load/store
with N=1) and operate on up to 4 doubleword vectors in parallel. But
at the moment we are concentrating on efficient support of strided
memory accesses.
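
To make the strided case concrete, the quoted loop can be modelled in
plain C roughly as below. This is only a sketch of the semantics, not
what GCC emits: LANES, struct compound, load_lanes3 and sum_fields are
hypothetical names chosen for illustration, the field order follows the
ARRAY_REF indices in the quoted example, and the inner loops stand in
for single vector instructions (a vld3-style load and two vector adds).

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4   /* assumed vector width, in elements */

struct pixel { int32_t red, green, blue; };

/* Hypothetical stand-in for the compound value: one vector per field. */
struct compound { int32_t vec[3][LANES]; };

/* Models a 3-way lane load: reads LANES interleaved pixels and
   de-interleaves them so that each field ends up in its own vector. */
static struct compound load_lanes3(const struct pixel *y)
{
    struct compound c;
    for (size_t lane = 0; lane < LANES; ++lane) {
        c.vec[0][lane] = y[lane].red;
        c.vec[1][lane] = y[lane].green;
        c.vec[2][lane] = y[lane].blue;
    }
    return c;
}

/* X[i] = Y[i].red + Y[i].green + Y[i].blue, LANES elements at a time. */
static void sum_fields(int32_t *x, const struct pixel *y, size_t n)
{
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        struct compound c = load_lanes3(&y[i]);
        /* Only element-wise vector ops are applied to the pieces. */
        for (size_t lane = 0; lane < LANES; ++lane) {
            int32_t d1 = c.vec[0][lane] + c.vec[1][lane];  /* D1 = red + green */
            x[i + lane] = d1 + c.vec[2][lane];             /* D2 = D1 + blue  */
        }
    }
    /* Scalar tail for the elements that do not fill a whole vector. */
    for (; i < n; ++i)
        x[i] = y[i].red + y[i].green + y[i].blue;
}
```

The compound value is only ever indexed to pull out its three vectors,
which matches the expectation above that nothing but ARRAY_REFs would
be applied to it.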

Ira
