This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math



------- Comment #19 from rguenther at suse dot de  2007-06-10 21:39 -------
Subject: Re:  Use reciprocal and reciprocal square root
 with -ffast-math

On Sun, 10 Jun 2007, ubizjak at gmail dot com wrote:

> 
> 
> ------- Comment #18 from ubizjak at gmail dot com  2007-06-10 17:34 -------
> (In reply to comment #14)
> > The interesting difference between sqrtss, divss and rcpss, rsqrtss is that
> > the former have throughput of 1/16 while the latter are 1/1 (latencies compare
> > 21 vs. 3).  This is on K10.  The optimization guide only mentions calculating
> > the reciprocal y = a/b via rcpss and the square root (!) via rsqrtss
> > (sqrt a = 0.5 * a * rsqrtss(a) * (3.0 - a * rsqrtss(a) * rsqrtss(a)))
> > 
> > So the optimization would be mainly to improve instruction throughput, not
> > overall latency.
> 
> If this is the case, then the middle-end will need to fold sqrtss in a different
> way for targets that prefer rsqrtss. According to Comment #16, it is better to
> fold to 1.0/sqrt(c/b) instead of sqrt(b/c) because this way we lose one
> multiplication during NR expansion by rsqrt [due to sqrt(x) <=> x * (1.0 /
> sqrt(x))].
> 
> IMO we need a new tree code to handle reciprocal sqrt - RSQRT_EXPR, together
> with proper folding functionality that expands directly to (NR-enhanced) rsqrt
> optab. If we consider a*sqrt(b/c), then b/c will be expanded as b* NR-rcp(c)
> [where NR-rcp stands for NR enhanced rcp] and sqrt will be expanded as
> NR-rsqrt. In this case, I see no RTL pass that would be able to combine
> everything together in order to swap (b/c) operands to produce NR-enhanced
> a*rsqrt(c/b) equivalent.

We just need a new builtin function, __builtin_rsqrt, and at some stage
replace reciprocals of sqrt with the new builtin, for example in
tree-ssa-math-opts.c, which does the existing reciprocal transforms.
A target hook could be provided for this; it could look like

   tree target_fn_for_expr (tree expr);

and return a target builtin decl for the given expression.
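
To make the idea concrete, here is a rough, self-contained sketch in C of
such a rewrite on a toy expression tree. Everything in it is hypothetical:
struct expr, the EXPR_* kinds and the toy_* functions are invented for
illustration and are not GCC's tree representation or an existing hook.
The toy hook answers "which operation should replace this expression?"
and the fold step rewrites 1.0/sqrt(x) into rsqrt(x) when the hook says so.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL toy IR, not GCC's 'tree'. */
enum expr_kind { EXPR_CONST, EXPR_VAR, EXPR_DIV, EXPR_SQRT, EXPR_RSQRT };

struct expr {
    enum expr_kind kind;
    double value;       /* used by EXPR_CONST only */
    struct expr *op0;   /* dividend of DIV, argument of SQRT/RSQRT */
    struct expr *op1;   /* divisor of DIV, otherwise NULL */
};

/* Toy analogue of the proposed hook: return the kind the target prefers
   for E, or E's own kind when there is no better replacement. */
static enum expr_kind toy_target_fn_for_expr(const struct expr *e)
{
    assert(e != NULL);
    if (e->kind == EXPR_DIV
        && e->op0 != NULL && e->op0->kind == EXPR_CONST
        && e->op0->value == 1.0
        && e->op1 != NULL && e->op1->kind == EXPR_SQRT)
        return EXPR_RSQRT;
    return e->kind;
}

/* Rewrite 1.0/sqrt(x) into rsqrt(x) in place. Returns 1 if E was changed,
   0 otherwise. */
static int toy_fold_recip_sqrt(struct expr *e)
{
    if (toy_target_fn_for_expr(e) != EXPR_RSQRT || e->kind == EXPR_RSQRT)
        return 0;
    e->op0 = e->op1->op0;   /* the argument of sqrt becomes the argument */
    e->op1 = NULL;
    e->kind = EXPR_RSQRT;
    return 1;
}
```

A real implementation would work on GIMPLE statements and return a builtin
decl rather than an enum value; the sketch only shows the pattern match and
the replacement step.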

And we should start splitting this PR ;)  One for a/sqrt(b/c) and one
for the above transformation.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

