[PATCH][PING] Some pending middle-end patches

Richard Guenther rguenther@suse.de
Sat Nov 11 21:00:00 GMT 2006

On Sat, 11 Nov 2006, Roger Sayle wrote:

> Hi Richard,
> On Sat, 11 Nov 2006, Richard Guenther wrote:
> > [PATCH] Fix PR25620, pow() expansion missed-optimization
> > http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00216.html
> Sorry for the delay, but I think this can be implemented better.
> Consider the FORTRAN example X**1.5 described in the PR.
> With your patch we'll now generate the sequence:
> 	tmp = sqrt(x);
> 	return tmp*tmp*tmp;
> But notice that the PGF90 compiler that's reportedly 10 times
> faster than gfortran (in comment #9) expands the sequence:
> 	tmp = sqrt(x);
> 	return x*tmp;
> which is still better, as it only requires one multiplication
> and has less rounding.  I do like the idea of your patch.

Hm, you are right.  I'll try again.

> One other issue is that whilst I approve of the factoring of
> the common bits of expand_builtin_pow and expand_builtin_powi,
> I suspect that you've factored too much, and the processing
> of REAL_CST exponents at the top of expand_constant_power,
> should be hoisted back into build_builtin_pow.  All of the
> logic is only shared once we've an integer exponent, which is
> always the case for powi.
> I wonder if you can do some timings, for things like pow(x,5.0/3.0)
> vs. cbrt(powi(x,5))?  I'm especially cautious of your comments about
> forcing c99 functions in the gfortran front-end.  A fall-back in
> libgfortran's intrinsics/c99_functions.c is very likely to be
> implemented as pow(x,1.0/3.0), which means we'll ultimately be
> transforming one call to pow, into a call to pow and three additional
> multiplications :-(

I know...  it's hard to tell if the system math library has a
fast cbrt implementation.  As for the Fortran front-end problem,
I don't know - at the moment we won't create a call to cbrt because
the builtin is not available, so it is up to the fortran maintainers
to decide.

> > [PATCH] Fix PR29719, ICE on expanding __builtin_lfloor/ceil
> > http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00269.html
> > (don't know if this is really the way to go)
> I'm not sure this is the correct way to go either.  Firstly, the
> normal rule is that it is only safe to call __builtin_foo when
> foo is available on the system.  For example, when not optimizing
> we'll always call the function foo!  However, as an exception it
> might be reasonable to always inline something for lfloor and
> lceil.  However, I'm surprised at your choice of implementation,
> using:
> 	tmp = (int)x;
> 	return (double)tmp > x ? tmp-1 : tmp;
> instead of the alternative:
> 	return (int)floor(x);
> On some platforms, one may be faster than the other.  Admittedly,
> x87's floor implementation is ugly enough to prefer the first,
> but with hardware support or software floating point the second
> may be preferable.
> My immediate preference would be to correct the middle-end code
> that makes assumptions about __builtin_foo when used by the user,
> and ends up calling "foo" (i.e. a linker error), as we do with
> other builtins.  We certainly shouldn't ICE.

The problem is that we make __builtin_lfloorf and 
__builtin_lfloorl available to the user even if the target doesn't
provide a floorf or floorl implementation.  So I chose to expand
via truncation (which any target has to supply) and compensation.

Currently we have the possibility to go either via the lfloor optab
or by using floor{f,,l} and truncation.  An alternative solution
to the problem would be making the {l,ll}{floor,ceil}{f,,l} builtins
internal to gcc only.


Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
