This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Optimize x*x*x*x*x*x using 3 multiplications.
On Wed, Jul 30, 2003 at 09:53:36PM -0600, Roger Sayle wrote:
> I'm not sure whether reassociation alone could work out that x^23
> is better calculated as x^13*x^10 instead of x^22*x.
Well, lets see:
a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a
= (a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*(a*a)*a
b = a*a = a^2
= b*b*b*b*b*b*b*b*b*b*b*a
= (b*b)*(b*b)*(b*b)*(b*b)*(b*b)*(b*a)
c = b*a = a^3
d = b*b = a^4
= d*d*d*d*d*c
= (d*d)*(d*d)*(d*c)
e = d*c = a^7
f = d*d = a^8
= f*f*e
Yields 7 multiplications via bottom-up reassociation.
r = a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a
= (a*a*a*a*a*a*a*a*a*a*a*a)*(a*a*a*a*a*a*a*a*a*a*a)
= b * c
b = a*a*a*a*a*a*a*a*a*a*a*a = a^12
= (a*a*a*a*a*a)*(a*a*a*a*a*a)
= d * d
c = a*a*a*a*a*a*a*a*a*a*a = a^11
= (a*a*a*a*a*a)*(a*a*a*a*a)
= d * e
d = a*a*a*a*a*a = a^6
= (a*a*a)*(a*a*a)
= f * f
e = a*a*a*a*a = a^5
= (a*a*a)*(a*a)
= f * g
f = a*a*a = a^3
= (a*a)*a
= g * a
g = a*a = a^2
Yields a different set of 7 multiplications via top-down reassociation.
r = x^23 = x^13 * x^10 = a * b
a = x^13 = x^10 * x^3 = b * d
b = x^10 = x^5 * x^5 = c * c
c = x^5 = x^3 * x^2 = d * e
d = x^3 = x^2 * x = e * x
e = x^2 = x * x = x * x
Your power table yields 6 multiplications total. Hmm...
So while reassociation can produce similar results for some inputs,
(e.g. the x^6 in the subject line) it can't for all inputs. I can't
argue with results like that.
> I haven't checked but I'd guess GCC also handles "x+x+x+x+x+x" the same
> way as my patch, i.e. by representing it internally by a multiplication.
Yes, it should.
> It's probably not clear from the immediate context of the patch, but
> the transformation to convert x*x into pow(x,2) is already nested inside
> an "if (flag_unsafe_math_optimizations)" condition [fold-const.c:5944].
Ah.
> I could repeat the flag_unsafe_math_optimizations check redundantly
> if you think that would make things clearer?
No, that's ok.
> Ultimately, my suggestion of adding a __builtin_powi [or an equivalent
> EXP_EXPR tree node]
Or both. Ask the Fortran folk what they need from such a node.
I'm not sure what they're doing with "**" at the moment...
> Similarly, do you have a good name for __builtin_memeq?
Huh?
r~