This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Optimize x*x*x*x*x*x using 3 multiplications.
- From: Laurent GUERBY <guerby at acm dot org>
- To: Geoff Keating <geoffk at geoffk dot org>
- Cc: Robert Dewar <dewar at gnat dot com>, gcc at gcc dot gnu dot org
- Date: Fri, 01 Aug 2003 23:40:45 +0200
- Subject: Re: [PATCH] Optimize x*x*x*x*x*x using 3 multiplications.
- References: <20030801041740.88506F2E2E@nile.gnat.com> <firstname.lastname@example.org>
- Reply-to: guerby at acm dot org
On Fri, 2003-08-01 at 21:33, Geoff Keating wrote:
> email@example.com (Robert Dewar) writes:
> > > Optimize x*x*x*x*x*x using 3 multiplications.
> > It is interesting to note that in general this kind of optimization decreases
> > accuracy (it is more accurate to do the full 5 multiplications). Nevertheless
> > it is probably reasonable. In Ada 83, this optimization is in fact not
> > permitted, since it is not guaranteed to have sufficient accuracy. Ada 95
> > deliberately relaxed this accuracy rule to allow this kind of optimization.
> We can model a floating-point multiplication '*' of two values, x and y,
> as computing
> x y (1 + e) < x * y < x y (1 - e)
> where 'e' is 1/2 ulp, that is 2^-24 for single-precision IIRC.
> So, if we're computing x*x*x*x from left-to-right, we get:
> x x (1 + e) < x * x < x x (1 - e)
> x x (1 + e) x (1 + e) < (x * x) * x < x x (1 - e) x (1 - e)
> x x (1 + e) x (1 + e) x (1 + e)
> < ((x * x) * x) * x
> < x x (1 - e) x (1 - e) x (1 - e)
> If we compute it as (x * x) * (x * x), we get
> x x (1 + e) x x (1 + e) (1 + e)
> < (x * x) * (x * x)
> < x x (1 - e) x x (1 - e) (1 - e)
> which, since real arithmetic is associative, is exactly the same as
> the previous answer.
You've just showed that a simple error analysis gives the same
error bounds, Robert is likely to reference more advanced (understand
really ugly :) analysis, where all the "e" terms are not blindly set to
the same value for all the domain and their relative value carefully
If you compute 14^4 using perfectly rounded 2 digit base 10
multiplication, you get the following:
repeated squaring gives 40e3 (14*14=196=>20e1, 20e1*20e1=40e3
sequential multiplication gives 39e3 (20e1*14=28e2, 28e2*14=392e2=>39e3)
The correctly rounded result is 38416=>38e3, so sequential
multiplication is more accurate here.
This small computation proves nothing general of course but you see the
first multiplication error gets amplified more by squaring seemingly
accordingly to Robert next email proposed intuitive reason :).