This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Info], Add support for PowerPC IEEE 128-bit floating point


I did some timing tests to compare the new PowerPC IEEE 128-bit results to the
current implementation of long double using the IBM extended format.

The test consisted of a short loop doing the operation over arrays of 1,024
elements, reading in two values, doing the operation, and then storing the result back.
This loop in turn was done multiple times, with the idea that most of the
values would be in the cache, and we didn't have to worry about pre-fetching,
etc.

The float and double tests were done with vectorization disabled, while for the
vector float and vector double tests, the compiler was allowed to do the normal
auto vectorization.
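
For readers who want to reproduce this, here is a minimal sketch of the kind of
loop described above. It is not the actual test source: the names TYPE, N,
add_pass and time_add_loop are hypothetical, the timing uses plain clock(), and
the empty asm statement is a GCC extension added here only to keep the compiler
from collapsing the repeated passes. The scalar runs could presumably be built
with something like -fno-tree-vectorize, but the exact options used are not
stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define N 1024                  /* elements per array, as in the description */

/* Hypothetical knob: build with -DTYPE=float, -DTYPE="long double",
   -DTYPE=__float128, etc. to time a different type. */
#ifndef TYPE
#define TYPE double
#endif

static TYPE a[N], b[N], c[N];

/* One pass: read two values, do the operation, store the result back. */
static void add_pass(void)
{
    for (size_t i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}

/* Repeat the pass many times so the arrays stay in the cache.
   Returns the elapsed CPU time in seconds. */
static double time_add_loop(unsigned long reps)
{
    assert(reps > 0);

    clock_t start = clock();
    for (unsigned long r = 0; r < reps; r++) {
        add_pass();
        /* Compiler barrier (GCC extension): treat memory as clobbered so
           the stores in each pass are not optimized away. */
        __asm__ volatile ("" : : : "memory");
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Dividing the time measured for one type by the time measured for another gives
a ratio of the kind listed in the tables below.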

Each ratio reported below is how many times longer the first type took than the second.

Generally, __float128 is about 2x slower than the current IBM extended double
format, except for divide, where it is nearly 6x slower.  I must say, the
software floating point emulation routines worked well, and once the proper
macros were set up, I only needed to override the type used for IEEE 128-bit.

Add loop
========

float       vs double:          2.00x
float       vs vector float:    4.97x
double      vs vector double:   2.63x
long double vs double:         16.85x
__float128  vs double:         23.34x
__float128  vs long double:     1.39x

Subtract loop
=============

float       vs double:          1.99x
float       vs vector float:    4.66x
double      vs vector double:   2.63x
long double vs double:         14.47x
__float128  vs double:         27.65x
__float128  vs long double:     1.91x

Multiply loop
=============

float       vs double:          2.05x
float       vs vector float:    5.18x
double      vs vector double:   2.59x
long double vs double:         11.58x
__float128  vs double:         27.44x
__float128  vs long double:     2.37x

Divide loop
===========

float       vs double:          0.82x
float       vs vector float:    2.11x
double      vs vector double:   2.00x
long double vs double:          5.90x
__float128  vs double:         34.57x
__float128  vs long double:     5.86x

Maximum via comparison and ?:
=============================

float       vs double:          1.74x
float       vs vector float:    4.62x
double      vs vector double:   2.62x
long double vs double:          5.07x
__float128  vs double:         18.02x
__float128  vs long double:     3.55x

Minimum via comparison and ?:
=============================

float       vs double:          1.74x
float       vs vector float:    4.52x
double      vs vector double:   2.62x
long double vs double:          5.38x
__float128  vs double:         15.14x
__float128  vs long double:     2.82x
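
As an illustration of what "via comparison and ?:" means in the last two
tables, the loop bodies were presumably along the following lines. This is a
sketch only: TYPE, max_pass and min_pass are hypothetical names, not taken from
the actual test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical knob: the element type under test. */
#ifndef TYPE
#define TYPE double
#endif

/* Maximum via comparison and ?: (no fmax-style library call). */
static void max_pass(TYPE *dst, const TYPE *x, const TYPE *y, size_t n)
{
    assert(dst && x && y);
    for (size_t i = 0; i < n; i++)
        dst[i] = (x[i] > y[i]) ? x[i] : y[i];
}

/* Minimum via comparison and ?:. */
static void min_pass(TYPE *dst, const TYPE *x, const TYPE *y, size_t n)
{
    assert(dst && x && y);
    for (size_t i = 0; i < n; i++)
        dst[i] = (x[i] < y[i]) ? x[i] : y[i];
}
```

For the emulated types, each comparison becomes a call into the software
floating point routines rather than a single hardware compare.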



-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

