This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Implement "exp" builtin as x87 intrinsic


To complement last week's patch to implement log as an x87 intrinsic
(http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00983.html), the
following patch implements the exp, expf and expl mathematical
built-in functions as inline intrinsics on x86.

Originally, I tried to model "fscale" as taking two operands and
leaving both its result and its second input operand on the FP
stack.  Unfortunately, instructions that leave dead values on the
FP stack really confuse reg-stack.c.  The current code seems to
assume that if an FP register is dead, then the instruction that
created it must itself have been dead and eliminated.  Alas, this
is not the case if we have patterns with more than one result,
only some of which are actually unused.

En route, I tried adding instructions to pop the dead/unused values
off the stack at the end of the basic block.  I almost had it
working, but unfortunately, having several dead values in a single
basic block really screws up the register to stack slot mapping code
and occasionally resulted in either incorrect code generation or the
compiler calling abort.

I finally simplified everything and modelled the "*fscale_sfxf3",
"*fscale_dfxf3" and "*fscale_xf3" patterns as an "fscale" instruction
followed by an explicit pop ("fstp").  This dramatically reduced the
size of my changes to reg-stack.c, and allowed me to reuse the same
idiom as fpatan and fyl2x (including the explicit clobbers necessary
for PR opt/10764).


The following patch has been tested on i686-pc-linux-gnu, with a
complete "make bootstrap", all languages except treelang (including
ada), and regression tested with a top-level "make check" with no
new failures.  I've once again confirmed by hand that it produces
the expected results with -ffast-math, including "exp(exp(x))"
which was originally problematic for last week's log patch.

Ok for mainline?


2003-05-17  Roger Sayle  <roger@eyesopen.com>

	* config/i386/i386.md (expsf2, expdf2, expxf2): New patterns to
	implement exp, expf and expl built-ins as inline x87 intrinsics.
	(UNSPEC_FSCALE, UNSPEC_FRNDINT, UNSPEC_F2XM1): New unspecs to
	represent x87's fscale, frndint and f2xm1 insns respectively.
	(*fscale_sfxf3, *fscale_dfxf3, *fscale_xf3): New insn patterns
	to encode x87's "fscale" instruction followed by a pop.
	(*frndintxf2): New insn pattern for "frndint".
	(*f2xm1xf2): New insn pattern for "f2xm1".

	* reg-stack.c (subst_stack_regs_pat): Handle UNSPEC_FRNDINT and
	UNSPEC_F2XM1 like UNSPEC_{SIN,COS} and handle UNSPEC_FSCALE like
	UNSPEC_FPATAN.

	* gcc.dg/builtins-16.c: New test case.
	* gcc.dg/i386-387-1.c: Update to test exp.
	* gcc.dg/i386-387-2.c: Likewise.


Index: config/i386/i386.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.md,v
retrieving revision 1.460
diff -c -3 -p -r1.460 i386.md
*** config/i386/i386.md	14 May 2003 21:13:45 -0000	1.460
--- config/i386/i386.md	17 May 2003 19:23:04 -0000
***************
*** 113,118 ****
--- 113,121 ----
     ; x87 Floating point
     (UNSPEC_FPATAN		65)
     (UNSPEC_FYL2X		66)
+    (UNSPEC_FSCALE		67)
+    (UNSPEC_FRNDINT		68)
+    (UNSPEC_F2XM1		69)
    ])

  (define_constants
***************
*** 15634,15639 ****
--- 15637,15769 ----
    operands[2] = gen_reg_rtx (XFmode);
    temp = standard_80387_constant_rtx (4); /* fldln2 */
    emit_move_insn (operands[2], temp);
+ })
+
+ (define_insn "*fscale_sfxf3"
+   [(parallel [(set (match_operand:SF 0 "register_operand" "=f")
+ 		   (unspec:SF [(match_operand:XF 2 "register_operand" "0")
+ 			       (match_operand:XF 1 "register_operand" "u")]
+ 		    UNSPEC_FSCALE))
+ 	      (clobber (match_dup 1))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+   "fscale\;fstp\t%y1"
+   [(set_attr "type" "fpspc")
+    (set_attr "mode" "SF")])
+
+ (define_insn "*fscale_dfxf3"
+   [(parallel [(set (match_operand:DF 0 "register_operand" "=f")
+ 		   (unspec:DF [(match_operand:XF 2 "register_operand" "0")
+ 			       (match_operand:XF 1 "register_operand" "u")]
+ 		    UNSPEC_FSCALE))
+ 	      (clobber (match_dup 1))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+   "fscale\;fstp\t%y1"
+   [(set_attr "type" "fpspc")
+    (set_attr "mode" "DF")])
+
+ (define_insn "*fscale_xf3"
+   [(parallel [(set (match_operand:XF 0 "register_operand" "=f")
+ 		   (unspec:XF [(match_operand:XF 2 "register_operand" "0")
+ 			       (match_operand:XF 1 "register_operand" "u")]
+ 		    UNSPEC_FSCALE))
+ 	      (clobber (match_dup 1))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+   "fscale\;fstp\t%y1"
+   [(set_attr "type" "fpspc")
+    (set_attr "mode" "XF")])
+
+ (define_insn "*frndintxf2"
+   [(set (match_operand:XF 0 "register_operand" "=f")
+ 	(unspec:XF [(match_operand:XF 1 "register_operand" "0")]
+ 	 UNSPEC_FRNDINT))]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+   "frndint"
+   [(set_attr "type" "fpspc")
+    (set_attr "mode" "XF")])
+
+ (define_insn "*f2xm1xf2"
+   [(set (match_operand:XF 0 "register_operand" "=f")
+ 	(unspec:XF [(match_operand:XF 1 "register_operand" "0")]
+ 	 UNSPEC_F2XM1))]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+   "f2xm1"
+   [(set_attr "type" "fpspc")
+    (set_attr "mode" "XF")])
+
+ (define_expand "expsf2"
+   [(set (match_dup 2)
+ 	(float_extend:XF (match_operand:SF 1 "register_operand" "")))
+    (set (match_dup 4) (mult:XF (match_dup 2) (match_dup 3)))
+    (set (match_dup 5) (unspec:XF [(match_dup 4)] UNSPEC_FRNDINT))
+    (set (match_dup 6) (minus:XF (match_dup 4) (match_dup 5)))
+    (set (match_dup 7) (unspec:XF [(match_dup 6)] UNSPEC_F2XM1))
+    (set (match_dup 9) (plus:XF (match_dup 7) (match_dup 8)))
+    (parallel [(set (match_operand:SF 0 "register_operand" "")
+ 		   (unspec:SF [(match_dup 9) (match_dup 5)] UNSPEC_FSCALE))
+ 	      (clobber (match_dup 5))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+ {
+   rtx temp;
+   int i;
+
+   for (i=2; i<10; i++)
+     operands[i] = gen_reg_rtx (XFmode);
+   temp = standard_80387_constant_rtx (5); /* fldl2e */
+   emit_move_insn (operands[3], temp);
+   emit_move_insn (operands[8], CONST1_RTX (XFmode));  /* fld1 */
+ })
+
+ (define_expand "expdf2"
+   [(set (match_dup 2)
+ 	(float_extend:XF (match_operand:DF 1 "register_operand" "")))
+    (set (match_dup 4) (mult:XF (match_dup 2) (match_dup 3)))
+    (set (match_dup 5) (unspec:XF [(match_dup 4)] UNSPEC_FRNDINT))
+    (set (match_dup 6) (minus:XF (match_dup 4) (match_dup 5)))
+    (set (match_dup 7) (unspec:XF [(match_dup 6)] UNSPEC_F2XM1))
+    (set (match_dup 9) (plus:XF (match_dup 7) (match_dup 8)))
+    (parallel [(set (match_operand:DF 0 "register_operand" "")
+ 		   (unspec:DF [(match_dup 9) (match_dup 5)] UNSPEC_FSCALE))
+ 	      (clobber (match_dup 5))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+ {
+   rtx temp;
+   int i;
+
+   for (i=2; i<10; i++)
+     operands[i] = gen_reg_rtx (XFmode);
+   temp = standard_80387_constant_rtx (5); /* fldl2e */
+   emit_move_insn (operands[3], temp);
+   emit_move_insn (operands[8], CONST1_RTX (XFmode));  /* fld1 */
+ })
+
+ (define_expand "expxf2"
+   [(set (match_dup 3) (mult:XF (match_operand:XF 1 "register_operand" "")
+ 			       (match_dup 2)))
+    (set (match_dup 4) (unspec:XF [(match_dup 3)] UNSPEC_FRNDINT))
+    (set (match_dup 5) (minus:XF (match_dup 3) (match_dup 4)))
+    (set (match_dup 6) (unspec:XF [(match_dup 5)] UNSPEC_F2XM1))
+    (set (match_dup 8) (plus:XF (match_dup 6) (match_dup 7)))
+    (parallel [(set (match_operand:XF 0 "register_operand" "")
+ 		   (unspec:XF [(match_dup 8) (match_dup 4)] UNSPEC_FSCALE))
+ 	      (clobber (match_dup 4))])]
+   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
+    && flag_unsafe_math_optimizations"
+ {
+   rtx temp;
+   int i;
+
+   for (i=2; i<9; i++)
+     operands[i] = gen_reg_rtx (XFmode);
+   temp = standard_80387_constant_rtx (5); /* fldl2e */
+   emit_move_insn (operands[2], temp);
+   emit_move_insn (operands[7], CONST1_RTX (XFmode));  /* fld1 */
  })

  ;; Block operation instructions
Index: reg-stack.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reg-stack.c,v
retrieving revision 1.126
diff -c -3 -p -r1.126 reg-stack.c
*** reg-stack.c	12 May 2003 02:51:36 -0000	1.126
--- reg-stack.c	17 May 2003 19:23:07 -0000
*************** subst_stack_regs_pat (insn, regstack, pa
*** 1707,1712 ****
--- 1707,1714 ----
  	      {
  	      case UNSPEC_SIN:
  	      case UNSPEC_COS:
+ 	      case UNSPEC_FRNDINT:
+ 	      case UNSPEC_F2XM1:
  		/* These insns only operate on the top of the stack.  */

  		src1 = get_true_reg (&XVECEXP (pat_src, 0, 0));
*************** subst_stack_regs_pat (insn, regstack, pa
*** 1730,1735 ****
--- 1732,1738 ----

  	      case UNSPEC_FPATAN:
  	      case UNSPEC_FYL2X:
+ 	      case UNSPEC_FSCALE:
  		/* These insns operate on the top two stack slots.  */

  		src1 = get_true_reg (&XVECEXP (pat_src, 0, 0));

Index: gcc.dg/i386-387-1.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/i386-387-1.c,v
retrieving revision 1.5
diff -c -3 -p -r1.5 i386-387-1.c
*** gcc.dg/i386-387-1.c	12 May 2003 02:51:39 -0000	1.5
--- gcc.dg/i386-387-1.c	17 May 2003 23:13:56 -0000
***************
*** 6,14 ****
--- 6,16 ----
  /* { dg-final { scan-assembler "call\t_?sqrt" } } */
  /* { dg-final { scan-assembler "call\t_?atan2" } } */
  /* { dg-final { scan-assembler "call\t_?log" } } */
+ /* { dg-final { scan-assembler "call\t_?exp" } } */

  double f1(double x) { return __builtin_sin(x); }
  double f2(double x) { return __builtin_cos(x); }
  double f3(double x) { return __builtin_sqrt(x); }
  double f4(double x, double y) { return __builtin_atan2(x,y); }
  double f5(double x) { return __builtin_log(x); }
+ double f6(double x) { return __builtin_exp(x); }
Index: gcc.dg/i386-387-2.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.dg/i386-387-2.c,v
retrieving revision 1.4
diff -c -3 -p -r1.4 i386-387-2.c
*** gcc.dg/i386-387-2.c	12 May 2003 02:51:40 -0000	1.4
--- gcc.dg/i386-387-2.c	17 May 2003 23:13:56 -0000
***************
*** 6,14 ****
--- 6,16 ----
  /* { dg-final { scan-assembler "fsqrt" } } */
  /* { dg-final { scan-assembler "fpatan" } } */
  /* { dg-final { scan-assembler "fyl2x" } } */
+ /* { dg-final { scan-assembler "f2xm1" } } */

  double f1(double x) { return __builtin_sin(x); }
  double f2(double x) { return __builtin_cos(x); }
  double f3(double x) { return __builtin_sqrt(x); }
  double f4(double x, double y) { return __builtin_atan2(x,y); }
  double f5(double x) { return __builtin_log(x); }
+ double f6(double x) { return __builtin_exp(x); }


/* Related to PR optimization/10764  */

/* { dg-do compile } */
/* { dg-options "-O2 -ffast-math" } */

double exp(double x);

double foo(double x)
{
  return exp(exp(x));
}


Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

