This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, powerpc] Add -mmass to use XL's MASS vectorization library


On Wed, Aug 18, 2010 at 15:50, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On Wed, Aug 18, 2010 at 10:36:13PM +0200, Richard Guenther wrote:
>> On Wed, Aug 18, 2010 at 10:32 PM, Michael Meissner
>> <meissner@linux.vnet.ibm.com> wrote:
>> > This patch was cloned from the i386 -mveclibabi=<xxx> support, and it adds a
>> > new switch (-mmass) that says to vectorize various mathematical functions (sin,
>> > cos, etc.) on power7 systems.  This patch greatly speeds up 3 of the Spec 2006
>> > floating point benchmarks (tonto, wrf, GemsFDTD) that heavily use the math
>> > functions.  I have done bootstraps on my power systems, and comparison tests
>> > and there were no regressions.  Is it ok to install in the tree?
>>
>> In the case that we develop a common library for all archs it would be nice
>> to have the same switch for ppc as we have for x86, so why didn't you
>> use -mveclibabi=mass?
>
> That sounds reasonable.
>
> It isn't in this patch, but at some point, I think it would be useful to add
> an SSA pass to transform the code to call a function that takes
> pointers and a length argument, and eliminate the loop.  This way, the library
> can properly deal with load delays, etc.  If memory serves, the Intel and AMD
> optimized math libraries have similar functions, though the order of the
> arguments is different than the MASS arguments.  Is this the case?
>
> If I wasn't clear, consider the loop:
>
>        for (i = 0; i < size; i++)
>          a[i] = __builtin_sin (b[i]);
>
> right now gets transformed to:
>
>        V2DF_a_ptr = (V2DF *)a;
>        V2DF_b_ptr = (V2DF *)b;
>        for (i = 0; i < size/2; i++)
>          V2DF_a_ptr[i] = sind2 (V2DF_b_ptr[i]);
>
> and instead it should generate:
>
>        len_tmp = size;
>        vsin (a, b, &len_tmp);
>

I also thought about this transform, and I think it
could be called from loop distribution.

Sebastian

