This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optimize x*x*x*x*x*x using 3 multiplications.


On Wed, Jul 30, 2003 at 09:53:36PM -0600, Roger Sayle wrote:
> I'm not sure whether reassociation alone could work out that x^23
> is better calculated as x^13*x^10 instead of x^22*x.

Well, lets see:

   a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a
 = (a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*a

   b = a*a = a^2
 = b*b*b*b*b*b*b*b*b*b*b*a
 = (b*b)*(b*b)*(b*b)*(b*b)*(b*b)*(b*a)

   c = b*a = a^3
   d = b*b = a^4
 = d*d*d*d*d*c
 = (d*d)*(d*d)*(d*c)

   e = d*c = a^7
   f = d*d = a^8
 = f*f*e

Yields 7 multiplications via bottom-up reassociation.

   r = a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a
     = (a*a*a*a*a*a*a*a*a*a*a*a)*(a*a*a*a*a*a*a*a*a*a*a)
     = b * c
   b = a*a*a*a*a*a*a*a*a*a*a*a		= a^12
     = (a*a*a*a*a*a)*(a*a*a*a*a*a)
     = d * d
   c = a*a*a*a*a*a*a*a*a*a*a		= a^11
     = (a*a*a*a*a*a)*(a*a*a*a*a)
     = d * e
   d = a*a*a*a*a*a			= a^6
     = (a*a*a)*(a*a*a)
     = f * f
   e = a*a*a*a*a			= a^5
     = (a*a*a)*(a*a)
     = f * g
   f = a*a*a				= a^3
     = (a*a)*a
     = g * a
   g = a*a				= a^2

Yields a different set of 7 multiplications via top-down reassociation.

	r = x^23 = x^13 * x^10	= a * b
	a = x^13 = x^10 * x^3	= b * d
	b = x^10 = x^5 * x^5	= c * c
	c = x^5  = x^3 * x^2	= d * e
	d = x^3  = x^2 * x	= e * x
        e = x^2  = x * x	= x * x

Your power table yields 6 multiplications total.  Hmm...

So while reassociation can produce similar results for some inputs,
(e.g. the x^6 in the subject line) it can't for all inputs.  I can't
argue with results like that.

> I haven't checked but I'd guess GCC also handles "x+x+x+x+x+x" the same
> way as my patch, i.e. by representing it internally by a multiplication.

Yes, it should.

> It's probably not clear from the immediate context of the patch, but
> the transformation to convert x*x into pow(x,2) is already nested inside
> an "if (flag_unsafe_math_optimizations)" condition [fold-const.c:5944].

Ah.

> I could repeat the flag_unsafe_math_optimizations check redundantly
> if you think that would make things clearer?

No, that's ok.

> Ultimately, my suggestion of adding a __builtin_powi [or an equivalent
> EXP_EXPR tree node]

Or both.  Ask the Fortran folk what they need from such a node.
I'm not sure what they're doing with "**" at the moment...

> Similarly, do you have a good name for __builtin_memeq?

Huh?


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]