This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] do not lower a/b to a*(1/b)

On Fri, 13 May 2005, Paolo Bonzini wrote:
> > We still don't do it for straight line code on the tree level.
> > For an example:
> > double f(double a, double b, double c)
> > {
> >   return a/c + b/c;
> > }
> I know, but the biggest benefit is to do it in loops; a function such as
> f will hopefully be inlined (and if in a loop, we'll optimize it).
> extract_muldiv could easily be improved to handle this; right now, I
> believe that the code in expr.c does more harm than good.

Unfortunately, David Edelsohn's benchmarking shows that this isn't
the case, which is why he didn't remove this code from expr.c when
he submitted his patch.  For example, normalizing vectors typically
uses an idiom such as:

	modulus = sqrt(x*x + y*y + z*z);
	x = x / modulus;
	y = y / modulus;
	z = z / modulus;

Feel free to post your own numbers that refute DJE's figures.

The compromise that I proposed on IRC was that I was happy to allow
this code to be removed provided that a "missed optimization" regression
PR is filed against mainline.  I'd asked Daniel Berlin to investigate
whether it was possible to perform the missing transformation in basic
blocks at the tree-level (either as its own pass or part of another).
It's a form of pseudo-CSE/PRE, where when a/b is evaluated, you record
that 1.0/b is potentially available, and only when a second use is
discovered do you decompose the original division into a reciprocal
and multiplication.

> 2005-05-13  Paolo Bonzini  <>
> 	* expr.c (expand_expr_real_1) <case RDIV_EXPR>: Never emit as
>	a*(1/b).

With the above caveat (and the stigma of slowing down the compiler :),
you should tweak your patch to avoid leaving:

	case RDIV_EXPR:
	  goto binop;

which is far better (re)written by placing the RDIV_EXPR case label
immediately before "binop:".  With that change and the missed optimization
regression PR filed in bugzilla, this patch is OK for mainline.

I totally agree that this particular optimization *during RTL expansion*
has been as much trouble to the RTL optimizers and combine, as it has
been a benefit to code performance.  But you also know my position on
removing RTL optimizations *before* they are replaced with similar
functionality elsewhere.  In several recent posting gcc 4.0 has been
reported slower than gcc 3.x, and it's a trend I'm not happy with, even
if attempts to "stop the rot" make reviewers unpopular.

Thanks in advance,


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]