[PATCH] Fix PR25620, pow() expansion missed-optimization, 2nd try

Roger Sayle roger@eyesopen.com
Sun Nov 12 18:10:00 GMT 2006


Hi Richard,

On Sun, 12 Nov 2006, Richard Guenther wrote:
> Bootstrapped and regtested on x86_64-unknown-linux-gnu.  SPEC 2k
> doesn't show a difference, Polyhedron has an improvement for air
> (down from 21s to 16.5s runtime).

Cool!  I hadn't appreciated this had such a significant impact on
polyhedron.


This is much better, thanks.  However, there are still a number of
(new) issues that need to be resolved.

The first is a potential bug...

> ! 	      expand_simple_binop (mode, MULT, op, op2, op, 0, OPTAB_DIRECT);

The expand_binop, and expand_simple_binop, APIs treat their target
argument as a hint.  Hence the return result of expand_simple_binop
isn't guaranteed to be in target_rtx.  Hence you need to use

	op = expand_simple_binop (....)

in this and similar uses.  It turns out for your uses in expand_pow,
it makes sense to pass NULL_RTX as the target, allowing the expanders
to select the best destination for these intermediate results, often
a unique pseudo.  This allows for more efficient initial RTL, and
improves the change of CSEing the intermediate results.


Secondly, you'll need to use save_expr or some similar mechanism to
avoid the argument arg0 from being evaluated multiple times.  It may
have side-effects (such as a function call or volatile), and your
current implementation calls expand_expr on the same tree twice
without taking care to avoid re-evaluation.  See the uses of
builtin_save_expr in builtins.c.

Next, I think the code could be made cleaner and more efficient
by nesting conditionals.  For example, you test for things such as
the existance of a sqrt builtin and -Os very late, after we've done
a significant amount of work in the soft-float emulation.  It makes
much more sense to test these conditions early, and avoid wasting
fruitless effort.  In fact, I'd take the same opportunity to allow
flag_unsafe_math optimizations to fall through to the bottom of the
function.



Something like:

	if (flag_unsafe_math_optimizations && !optimize_size)
	  {
	    fn = mathfn_builtin (type, BUILT_IN_SQRT);
	    if (fn != NULL_TREE)
	      {
		real_arithmetic (...);
		...
	      }
	    fn = mathfn_builtin (type, BUILT_IN_CBRT);
	    if (fn != NULL_TREE)
	      {
		...
	      }
	  }

Finally,

> Index: testsuite/gcc.dg/builtins-58.c
> + /* { dg-do compile } */
> + /* { dg-options "-O -ffast-math -std=c99" } */
> ...
> + /* { dg-final { scan-assembler-times "cbrt" 3 } } */


I think you need to do something more to make this test case portable.
We'll only reduce to "cbrt" on targets that define TARGET_C99_FUNCTIONS,
which isn't the same as passing -std=c99 on the command line.  I think
you need to use the builtins-config.h header so that you can guard this
test with #ifdef HAVE_C99_RUNTIME.


Sorry that this is becoming a bit of an effort.  All of the points
above are "new" with your latest "2nd try" attempt at this patch.
Hopefully, third time is a charm.  The patch is looking pretty good,
and works fine on x86 on your testcases.  It's just the corner cases
and compile-time issues that need to be addressed.

Many thanks again for working on this.  I think its reasonable to
leave the cbrt bits in.  Is it usage of sqrt or cbrt that benefits
polyhedron's air?

Roger
--



More information about the Gcc-patches mailing list