This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR25620, pow() expansion missed-optimization, 2nd try


On Sun, 12 Nov 2006, Roger Sayle wrote:

> 
> Hi Richard,
> 
> On Sun, 12 Nov 2006, Richard Guenther wrote:
> > Bootstrapped and regtested on x86_64-unknown-linux-gnu.  SPEC 2k
> > doesn't show a difference, Polyhedron has an improvement for air
> > (down from 21s to 16.5s runtime).
> 
> Cool!  I hadn't appreciated this had such a significant impact on
> polyhedron.
> 
> 
> This is much better, thanks.  However, there are still a number of
> (new) issues that need to be resolved.
> 
> The first is a potential bug...
> 
> > ! 	      expand_simple_binop (mode, MULT, op, op2, op, 0, OPTAB_DIRECT);
> 
> The expand_binop, and expand_simple_binop, APIs treat their target
> argument as a hint.  Hence the return result of expand_simple_binop
> isn't guaranteed to be in target_rtx.  Hence you need to use
> 
> 	op = expand_simple_binop (....)

Whoops.  Looks like I have to fix some of the i386 rounding expanders
as well here...

> in this and similar uses.  It turns out for your uses in expand_pow,
> it makes sense to pass NULL_RTX as the target, allowing the expanders
> to select the best destination for these intermediate results, often
> a unique pseudo.  This allows for more efficient initial RTL, and
> improves the change of CSEing the intermediate results.

Ok, I'll do so.

> Secondly, you'll need to use save_expr or some similar mechanism to
> avoid the argument arg0 from being evaluated multiple times.  It may
> have side-effects (such as a function call or volatile), and your
> current implementation calls expand_expr on the same tree twice
> without taking care to avoid re-evaluation.  See the uses of
> builtin_save_expr in builtins.c.

Ok.

> Next, I think the code could be made cleaner and more efficient
> by nesting conditionals.  For example, you test for things such as
> the existance of a sqrt builtin and -Os very late, after we've done
> a significant amount of work in the soft-float emulation.  It makes
> much more sense to test these conditions early, and avoid wasting
> fruitless effort.  In fact, I'd take the same opportunity to allow
> flag_unsafe_math optimizations to fall through to the bottom of the
> function.

Ok, I'll see how I can make this some prettier.  I wonder if it is
safe to expand pow (x, 1/3) and pow (x, 1/2) to cbrt / sqrt always?
I see in fold_builtin_pow we conditionalize both of them to
flag_unsafe_math_optimizations - but as for example both functions
set errno in the same conditions and exchanging one function call
for another shouldn't affect rounding I don't see why we are
overly careful here.


> 
> 
> 
> Something like:
> 
> 	if (flag_unsafe_math_optimizations && !optimize_size)
> 	  {
> 	    fn = mathfn_builtin (type, BUILT_IN_SQRT);
> 	    if (fn != NULL_TREE)
> 	      {
> 		real_arithmetic (...);
> 		...
> 	      }
> 	    fn = mathfn_builtin (type, BUILT_IN_CBRT);
> 	    if (fn != NULL_TREE)
> 	      {
> 		...
> 	      }
> 	  }
> 
> Finally,
> 
> > Index: testsuite/gcc.dg/builtins-58.c
> > + /* { dg-do compile } */
> > + /* { dg-options "-O -ffast-math -std=c99" } */
> > ...
> > + /* { dg-final { scan-assembler-times "cbrt" 3 } } */
> 
> 
> I think you need to do something more to make this test case portable.
> We'll only reduce to "cbrt" on targets that define TARGET_C99_FUNCTIONS,
> which isn't the same as passing -std=c99 on the command line.  I think
> you need to use the builtins-config.h header so that you can guard this
> test with #ifdef HAVE_C99_RUNTIME.

Hm, yes.

> Sorry that this is becoming a bit of an effort.  All of the points
> above are "new" with your latest "2nd try" attempt at this patch.
> Hopefully, third time is a charm.  The patch is looking pretty good,
> and works fine on x86 on your testcases.  It's just the corner cases
> and compile-time issues that need to be addressed.

No problem ;)

> Many thanks again for working on this.  I think its reasonable to
> leave the cbrt bits in.  Is it usage of sqrt or cbrt that benefits
> polyhedron's air?

It has lots of exponentiations of which at least one **1.5 is in a
hot loop (I didn't try finding out what the other exponents are).  So
I guess it's sqrt.

Richard.

--
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]