This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR25620, pow() expansion missed-optimization, 2nd try
- From: Richard Guenther <rguenther at suse dot de>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 13 Nov 2006 10:49:08 +0100 (CET)
- Subject: Re: [PATCH] Fix PR25620, pow() expansion missed-optimization, 2nd try
- References: <Pine.LNX.4.44.0611120926370.27268-100000@www.eyesopen.com>
On Sun, 12 Nov 2006, Roger Sayle wrote:
>
> Hi Richard,
>
> On Sun, 12 Nov 2006, Richard Guenther wrote:
> > Bootstrapped and regtested on x86_64-unknown-linux-gnu. SPEC 2k
> > doesn't show a difference, Polyhedron has an improvement for air
> > (down from 21s to 16.5s runtime).
>
> Cool! I hadn't appreciated this had such a significant impact on
> polyhedron.
>
>
> This is much better, thanks. However, there are still a number of
> (new) issues that need to be resolved.
>
> The first is a potential bug...
>
> > ! expand_simple_binop (mode, MULT, op, op2, op, 0, OPTAB_DIRECT);
>
> The expand_binop, and expand_simple_binop, APIs treat their target
> argument as a hint. Hence the return result of expand_simple_binop
> isn't guaranteed to be in target_rtx. Hence you need to use
>
> op = expand_simple_binop (....)
Whoops. Looks like I have to fix some of the i386 rounding expanders
as well here...
> in this and similar uses. It turns out for your uses in expand_pow,
> it makes sense to pass NULL_RTX as the target, allowing the expanders
> to select the best destination for these intermediate results, often
> a unique pseudo. This allows for more efficient initial RTL, and
> improves the change of CSEing the intermediate results.
Ok, I'll do so.
> Secondly, you'll need to use save_expr or some similar mechanism to
> avoid the argument arg0 from being evaluated multiple times. It may
> have side-effects (such as a function call or volatile), and your
> current implementation calls expand_expr on the same tree twice
> without taking care to avoid re-evaluation. See the uses of
> builtin_save_expr in builtins.c.
Ok.
> Next, I think the code could be made cleaner and more efficient
> by nesting conditionals. For example, you test for things such as
> the existance of a sqrt builtin and -Os very late, after we've done
> a significant amount of work in the soft-float emulation. It makes
> much more sense to test these conditions early, and avoid wasting
> fruitless effort. In fact, I'd take the same opportunity to allow
> flag_unsafe_math optimizations to fall through to the bottom of the
> function.
Ok, I'll see how I can make this some prettier. I wonder if it is
safe to expand pow (x, 1/3) and pow (x, 1/2) to cbrt / sqrt always?
I see in fold_builtin_pow we conditionalize both of them to
flag_unsafe_math_optimizations - but as for example both functions
set errno in the same conditions and exchanging one function call
for another shouldn't affect rounding I don't see why we are
overly careful here.
>
>
>
> Something like:
>
> if (flag_unsafe_math_optimizations && !optimize_size)
> {
> fn = mathfn_builtin (type, BUILT_IN_SQRT);
> if (fn != NULL_TREE)
> {
> real_arithmetic (...);
> ...
> }
> fn = mathfn_builtin (type, BUILT_IN_CBRT);
> if (fn != NULL_TREE)
> {
> ...
> }
> }
>
> Finally,
>
> > Index: testsuite/gcc.dg/builtins-58.c
> > + /* { dg-do compile } */
> > + /* { dg-options "-O -ffast-math -std=c99" } */
> > ...
> > + /* { dg-final { scan-assembler-times "cbrt" 3 } } */
>
>
> I think you need to do something more to make this test case portable.
> We'll only reduce to "cbrt" on targets that define TARGET_C99_FUNCTIONS,
> which isn't the same as passing -std=c99 on the command line. I think
> you need to use the builtins-config.h header so that you can guard this
> test with #ifdef HAVE_C99_RUNTIME.
Hm, yes.
> Sorry that this is becoming a bit of an effort. All of the points
> above are "new" with your latest "2nd try" attempt at this patch.
> Hopefully, third time is a charm. The patch is looking pretty good,
> and works fine on x86 on your testcases. It's just the corner cases
> and compile-time issues that need to be addressed.
No problem ;)
> Many thanks again for working on this. I think its reasonable to
> leave the cbrt bits in. Is it usage of sqrt or cbrt that benefits
> polyhedron's air?
It has lots of exponentiations of which at least one **1.5 is in a
hot loop (I didn't try finding out what the other exponents are). So
I guess it's sqrt.
Richard.
--
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs