This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Constant fold sqrt at compile time (take 3)
- From: Roger Sayle <roger at eyesopen dot com>
- To: <gcc-patches at gcc dot gnu dot org>
- Cc: "Kaveh R. Ghazi" <ghazi at caip dot rutgers dot edu>, Richard Henderson <rth at redhat dot com>
- Date: Sun, 3 Nov 2002 12:50:23 -0700 (MST)
- Subject: [PATCH] Constant fold sqrt at compile time (take 3)
On Sun, 3 Nov 2002, Kaveh R. Ghazi wrote:
> Since 0.0 and 1.0 are always safe to optimize, rather than delete
> them IMO you should do this:
>
> if (0.0 || 1.0)
> return arg;
> else if (flag_unsafe_math_optimizations ...
>
> Only kill 0.0 & 1.0 if and when you get this working without
> flag_unsafe_math_optimizations.
An excellent suggestion.
The revision of the patch below has three changes since "take 2".
Firstly, sqrt(0.0) and sqrt(1.0) are always constant folded at
the tree-level, per Kaveh's suggestion. This code can be removed
once we're happy with real_sqrt and remove the "unsafe math" flags.
The second change is to use HONOR_SNANS(mode) rather than just
check flag_signaling_nans. This only disables this optimization
for modes that support signaling NaNs.
The third and final change is to implement Karp and Markstein's
final iteration method. The Newtonian iteration produces the
approximation 1/sqrt(x), and previously we'd get our final result
by just multiplying this value by the original argument x. The
"Karp and Markstein" approach is to use a slightly modified final
iteration to produce sqrt(x). Due to the self-correcting nature
of Newton-Raphson, this results in an answer accurate to 1 ulps.
For more details see equation (5) of "High Precision Division
and Square Root", Alan H. Karp and Peter Markstein, HP Lab Report
93-93-42, June 1993.
http://www.hpl.hp.com/techreports/93/HPL-93-42.pdf
This patch has been retested on the 3.4 BIB with a complete "make
bootstrap" (all languages except Ada and treelang) and "make -k
check" on i686-pc-linux-gnu with no new regressions. Once again
I've also checked by hand that we get sensible results with
-ffast-math.
Ok for the gcc-3_4-basic-improvements-branch?
2002-11-03 Roger Sayle <roger@eyesopen.com>
* real.c (real_sqrt): New function to calculate square roots.
* real.h (real_sqrt): Add function prototype.
* builtins.c (fold_builtin): Fold sqrt of constant argument.
* simplify-rtx.c (simplify_unary_operation): Simplify sqrt
of constant argument.
Index: real.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.c,v
retrieving revision 1.75.4.7
diff -c -3 -p -r1.75.4.7 real.c
*** real.c 28 Oct 2002 19:47:07 -0000 1.75.4.7
--- real.c 3 Nov 2002 15:26:13 -0000
*************** const struct real_format *real_format_fo
*** 4381,4383 ****
--- 4381,4464 ----
NULL, /* XFmode */
&ieee_quad_format /* TFmode */
};
+
+
+ /* Calculate the square root of X in mode MODE, and store the result
+ in R. */
+
+ void
+ real_sqrt (r, mode, x)
+ REAL_VALUE_TYPE *r;
+ enum machine_mode mode;
+ const REAL_VALUE_TYPE *x;
+ {
+ static REAL_VALUE_TYPE halfthree;
+ static REAL_VALUE_TYPE half;
+ static bool init = false;
+ REAL_VALUE_TYPE h, t, i;
+ int iter, exp;
+
+ /* sqrt(-0.0) is -0.0. */
+ if (real_isnegzero (x))
+ {
+ *r = *x;
+ return;
+ }
+
+ /* Negative arguments return NaN. */
+ if (real_isneg (x))
+ {
+ /* Mode is ignored for canonical NaN. */
+ real_nan (r, "", 1, SFmode);
+ return;
+ }
+
+ /* Infinity and NaN return themselves. */
+ if (real_isinf (x) || real_isnan (x))
+ {
+ *r = *x;
+ return;
+ }
+
+ if (!init)
+ {
+ real_arithmetic (&half, RDIV_EXPR, &dconst1, &dconst2);
+ real_arithmetic (&halfthree, PLUS_EXPR, &dconst1, &half);
+ init = true;
+ }
+
+ /* Initial guess for reciprocal sqrt, i. */
+ exp = real_exponent (x);
+ real_ldexp (&i, &dconst1, -exp/2);
+
+ /* Newton's iteration for reciprocal sqrt, i. */
+ for (iter = 0; iter < 16; iter++)
+ {
+ /* i(n+1) = i(n) * (1.5 - 0.5*i(n)*i(n)*x). */
+ real_arithmetic (&t, MULT_EXPR, x, &i);
+ real_arithmetic (&h, MULT_EXPR, &t, &i);
+ real_arithmetic (&t, MULT_EXPR, &h, &half);
+ real_arithmetic (&h, MINUS_EXPR, &halfthree, &t);
+ real_arithmetic (&t, MULT_EXPR, &i, &h);
+
+ /* Check for early convergence. */
+ if (iter >= 6 && real_identical (&i, &t))
+ break;
+
+ /* ??? Unroll loop to avoid copying. */
+ i = t;
+ }
+
+ /* Final iteration: r = i*x + 0.5*i*x*(1.0 - i*(i*x)). */
+ real_arithmetic (&t, MULT_EXPR, x, &i);
+ real_arithmetic (&h, MULT_EXPR, &t, &i);
+ real_arithmetic (&i, MINUS_EXPR, &dconst1, &h);
+ real_arithmetic (&h, MULT_EXPR, &t, &i);
+ real_arithmetic (&i, MULT_EXPR, &half, &h);
+ real_arithmetic (&h, PLUS_EXPR, &t, &i);
+
+ /* ??? We need a Tuckerman test to get the last bit. */
+
+ real_convert (r, mode, &h);
+ }
+
Index: real.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/real.h,v
retrieving revision 1.44.8.6
diff -c -3 -p -r1.44.8.6 real.h
*** real.h 28 Oct 2002 19:47:08 -0000 1.44.8.6
--- real.h 3 Nov 2002 15:26:13 -0000
*************** extern bool exact_real_inverse PARAMS ((
*** 346,350 ****
--- 346,354 ----
/* In tree.c: wrap up a REAL_VALUE_TYPE in a tree node. */
extern tree build_real PARAMS ((tree, REAL_VALUE_TYPE));
+ /* Calculate R as the square root of X in the given machine mode. */
+ extern void real_sqrt PARAMS ((REAL_VALUE_TYPE *,
+ enum machine_mode,
+ const REAL_VALUE_TYPE *));
#endif /* ! GCC_REAL_H */
Index: builtins.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/builtins.c,v
retrieving revision 1.159.4.3
diff -c -3 -p -r1.159.4.3 builtins.c
*** builtins.c 17 Sep 2002 22:58:36 -0000 1.159.4.3
--- builtins.c 3 Nov 2002 15:26:14 -0000
*************** fold_builtin (exp)
*** 4245,4250 ****
--- 4245,4267 ----
if (real_zerop (arg) || real_onep (arg))
return arg;
+ /* Optimize sqrt of constant value. */
+ if (flag_unsafe_math_optimizations
+ && TREE_CODE (arg) == REAL_CST
+ && ! TREE_CONSTANT_OVERFLOW (arg))
+ {
+ enum machine_mode mode;
+ REAL_VALUE_TYPE r, x;
+
+ x = TREE_REAL_CST (arg);
+ mode = TYPE_MODE (TREE_TYPE (arg));
+ if (!HONOR_SNANS (mode) || !real_isnan (&x))
+ {
+ real_sqrt (&r, mode, &x);
+ return build_real (TREE_TYPE (arg), r);
+ }
+ }
+
/* Optimize sqrt(exp(x)) = exp(x/2.0). */
fcode = builtin_mathfn_code (arg);
if (flag_unsafe_math_optimizations
Index: simplify-rtx.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/simplify-rtx.c,v
retrieving revision 1.118.4.6
diff -c -3 -p -r1.118.4.6 simplify-rtx.c
*** simplify-rtx.c 21 Oct 2002 17:52:02 -0000 1.118.4.6
--- simplify-rtx.c 3 Nov 2002 15:26:14 -0000
*************** simplify_unary_operation (code, mode, op
*** 571,585 ****
else if (GET_CODE (trueop) == CONST_DOUBLE
&& GET_MODE_CLASS (mode) == MODE_FLOAT)
{
! REAL_VALUE_TYPE d;
REAL_VALUE_FROM_CONST_DOUBLE (d, trueop);
switch (code)
{
case SQRT:
! /* We don't attempt to optimize this. */
! return 0;
!
case ABS:
d = REAL_VALUE_ABS (d);
break;
--- 571,589 ----
else if (GET_CODE (trueop) == CONST_DOUBLE
&& GET_MODE_CLASS (mode) == MODE_FLOAT)
{
! REAL_VALUE_TYPE d, t;
REAL_VALUE_FROM_CONST_DOUBLE (d, trueop);
switch (code)
{
case SQRT:
! if (! flag_unsafe_math_optimizations)
! return 0;
! if (HONOR_SNANS (mode) && real_isnan (&d))
! return 0;
! real_sqrt (&t, mode, &d);
! d = t;
! break;
case ABS:
d = REAL_VALUE_ABS (d);
break;
Roger
--
Roger Sayle, E-mail: roger@eyesopen.com
OpenEye Scientific Software, WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833