This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH/RFC], PR 42694/middle-end, make pow faster for some constant exponents


On Tue, Jan 26, 2010 at 1:57 AM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> I would like to get some comments on this patch that I plan to submit to GCC
> 4.6 when it is opened up.
>
> In looking at the various SPEC 2006 runs, I noticed one benchmark (bwaves) was
> spending a lot of time in the pow function.  I looked at the source, and saw
> that all of the pow's came from the use of x**0.75 in the Fortran source (x**2
> having already been optimized out by the compiler).
>
> This patch changes the compiler so that under -ffast-math:

Only when not optimizing for size.

>     pow (x, 0.25)      replaced by sqrt (sqrt (x))

Should always be profitable if an optab for sqrt exists.

>     pow (x, 0.125)     replaced by sqrt (sqrt (sqrt (x)))

I'm not sure.

>     pow (x, 0.75)      replaced by sqrt (x) * sqrt (sqrt (x))

always profitable

>     pow (x, 1./6.)     replaced by cbrt (sqrt (x))

I'm not sure.

Did all these happen in bwaves?

> And have the compiler stop replacing sqrt (sqrt (x)), sqrt (cbrt (x)), and
> cbrt (sqrt (x)) with equivalent pow calls.

These are for canonicalization, so please retain them.

To expose CSE opportunities, this kind of transformation should
probably be done earlier than expansion, while still in GIMPLE
(look at tree-ssa-math-opts.c where, for example, we do some
transformations only if there are CSE opportunities with other
expressions).

> I added switches to control this, but I could just as easily go with --param or
> target hooks.

I'd say avoid all of them by applying common sense; if anything,
use target hooks (I suppose a generic builtin_cost that returns
values that can be compared with rtx_cost would be useful).

> I looked at the differences between what the compiler is doing now, and with
> the optimization, and I feel they are not out of bounds (in the range of what
> the current -ffast-math optimizations do already), but I would welcome
> comments on how many sqrts/cbrts should be generated before the error gets out
> of hand.

We already do unbounded expansion of integer powers, so I guess
we don't care (with -ffast-math).
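
As a rough picture of what expanding an integer power into multiplies
looks like, here is a square-and-multiply sketch.  The function name is
made up, and the multiplication chains GCC actually emits may differ
from this simple binary scheme; the point is only that each extra
multiply can add rounding error and nothing bounds the chain length.

```c
/* Hypothetical sketch: compute x**n using only multiplications,
   by binary (square-and-multiply) exponentiation.  */
static double powi_sketch (double x, unsigned int n)
{
  double result = 1.0;

  while (n != 0)
    {
      if (n & 1)
        result *= x;    /* This bit of n is set: fold in the current power.  */
      x *= x;           /* Square to move on to the next bit of n.  */
      n >>= 1;
    }
  return result;
}
```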

> In general, on x86 and power systems, sqrt is an order of magnitude faster than
> pow, and cbrt is about 1.5 times faster.

Does power have a cbrt instruction or is it the libcall that is faster?

Thanks,
Richard.

