This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optimize x*x*x*x*x*x using 3 multiplications.


On Fri, 2003-08-01 at 21:33, Geoff Keating wrote:
> dewar@gnat.com (Robert Dewar) writes:
> 
> > > Optimize x*x*x*x*x*x using 3 multiplications.
> > 
> > It is interesting to note that in general this kind of optimization decreases
> > accuracy (it is more accurate to do the full 5 multiplications). Nevertheless
> > it is probably reasonable. In Ada 83, this optimization is in fact not
> > permitted, since it is not guaranteed to have sufficient accuracy. Ada 95
> > deliberately relaxed this accuracy rule to allow this kind of optimization.
> 
> We can model a floating-point multiplication '*' of two values, x and y,
> as computing
> 
> x y (1 + e) < x * y < x y (1 - e)
> 
> where 'e' is 1/2 ulp, that is 2^-24 for single-precision IIRC.
> 
> So, if we're computing x*x*x*x from left-to-right, we get:
> 
> x x (1 + e) < x * x < x x (1 - e)
> x x (1 + e) x (1 + e) < (x * x) * x < x x (1 - e) x (1 - e)
> x x (1 + e) x (1 + e) x (1 + e) 
>    < ((x * x) * x) * x 
>    < x x (1 - e) x (1 - e) x (1 - e)
> 
> If we compute it as (x * x) * (x * x), we get
> 
> x x (1 + e) x x (1 + e) (1 + e)
>   < (x * x) * (x * x)
>   < x x (1 - e) x x (1 - e) (1 - e)
> 
> which, since real arithmetic is associative, is exactly the same as
> the previous answer.

You've just showed that a simple error analysis gives the same
error bounds, Robert is likely to reference more advanced (understand
really ugly :) analysis, where all the "e" terms are not blindly set to
the same value for all the domain and their relative value carefully
checked. 

If you compute 14^4 using perfectly rounded 2 digit base 10
multiplication, you get the following:

repeated squaring gives 40e3 (14*14=196=>20e1, 20e1*20e1=40e3
sequential multiplication gives 39e3 (20e1*14=28e2, 28e2*14=392e2=>39e3)

The correctly rounded result is 38416=>38e3, so sequential
multiplication is more accurate here.

This small computation proves nothing general of course but you see the
first multiplication error gets amplified more by squaring seemingly
accordingly to Robert next email proposed intuitive reason :). 

Laurent








Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]