This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[i386 PATCH] Expand negation directly to neg?f2_1 when appropriate


The following patch fixes a floating point code generation regression
in the x86 backend where we no longer optimize -(-x*y) to x*y in the
RTL optimizers when we don't care about sign-dependent rounding (the
default).  c.f. PR middle-end/21067.

The issue is that since the introduction of SSE support, the x86 backend
now represents floating point negation as a PARALLEL containing a USE
that represents the constant pool constant of the SSE mode sign bit.
Awkwardly, even when the SSE patterns aren't available, such as
!TARGET_SSE_MATH or even !TARGET_SSE, we continue to use the PARALLEL but
containing a use of const0_rtx, which eventually gets split post-reload
into the simpler x87 patterns.

The problem is that this USE complicates the task of the RTL optimizers
to perform simplifying transformations, as -(-x*y) can no longer be
recognized (by combine) as x*y, as the MULT preserves the USE 0!
I've a patch to recog_for_combine to more aggressively recognize such
parallels, but that's potentially risky as it'll affect all targets
and is therefore waiting for stage1.  The alternate approach to
resolve this particular regression is to avoid the (use (const_int 0))
and lower directly to the inevitable x87 form.
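Concretely, the difference looks something like the following (a
hand-written sketch in RTL notation, with illustrative register numbers,
not taken from an actual dump):

```lisp
;; Current expansion when the SSE patterns aren't usable: the dummy
;; USE of const0_rtx prevents combine from matching -(-x*y) as x*y.
(parallel [(set (reg:DF 60) (neg:DF (reg:DF 61)))
           (use (const_int 0))
           (clobber (reg:CC 17 flags))])

;; With this patch, the inevitable x87 form is emitted directly.
(set (reg:DF 60) (neg:DF (reg:DF 61)))
```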


Philosophically, this is similar to the recent SUBREG lowering discussion:
some choices need to be postponed until postreload, but if the outcome is
inevitable, it's better to bite the bullet and instantiate the final
pattern early, so the RTL optimizers have a clearer view of what's going
on.  There are analogies in chess, where the "horizon effect" involves
delaying inevitable decisions, and a common heuristic is that if there's
only one possible move, don't waste time figuring out how good it is.


The comment in the original code suggests that the reason for the current
strategy of keeping all the neg/abs patterns symmetric was to avoid
duplication in the splitters.  However, as shown below, we don't need
additional splitters or patterns if we expand to an instruction's final
form; we only have to update the existing patterns' insn conditions to
allow the x87 patterns pre-reload when the SSE patterns aren't an option.


The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default languages including Ada, and regression
tested with a top-level "make -k check" with no new failures.

Ok for mainline?  If this solution is acceptable, I can backport it
to the gcc-4_1-branch after a while.  Also any objections to more
aggressively stripping USEs in recog_for_combine, so that the testcase
below can also be optimized with -mfpmath=sse (i.e. on x86_64)?

Thanks in advance.



2006-05-30  Roger Sayle  <roger@eyesopen.com>

	* config/i386/i386.c (ix86_expand_fp_absneg_operator): When SSE
	isn't available, directly generate the simpler x87 patterns without
	the (use (const_int 0)).
	* config/i386/i386.md (*negsf2_1): Enable pre-reload if the SSE
	implementation isn't available.
	(*negdf2_1): Likewise, if the SSE2 pattern isn't available.
	(*negxf2_1): XF mode negation is always done using the x87.
	(*abssf2_1, *absdf2_1, *absxf2_1): Likewise for fabs.

	* gcc.target/i386/387-11.c: New test case.


Index: config/i386/i386.c
===================================================================
*** config/i386/i386.c	(revision 113957)
--- config/i386/i386.c	(working copy)
*************** ix86_expand_fp_absneg_operator (enum rtx
*** 9305,9326 ****
    if (use_sse)
      mask = ix86_build_signbit_mask (elt_mode, vector_mode, code == ABS);
    else
!     {
!       /* When not using SSE, we don't use the mask, but prefer to keep the
! 	 same general form of the insn pattern to reduce duplication when
! 	 it comes time to split.  */
!       mask = const0_rtx;
!     }

    dst = operands[0];
    src = operands[1];

    /* If the destination is memory, and we don't have matching source
!      operands, do things in registers.  */
    matching_memory = false;
    if (MEM_P (dst))
      {
!       if (rtx_equal_p (dst, src))
  	matching_memory = true;
        else
  	dst = gen_reg_rtx (mode);
--- 9305,9321 ----
    if (use_sse)
      mask = ix86_build_signbit_mask (elt_mode, vector_mode, code == ABS);
    else
!     mask = NULL_RTX;

    dst = operands[0];
    src = operands[1];

    /* If the destination is memory, and we don't have matching source
!      operands or we're using the x87, do things in registers.  */
    matching_memory = false;
    if (MEM_P (dst))
      {
!       if (use_sse && rtx_equal_p (dst, src))
  	matching_memory = true;
        else
  	dst = gen_reg_rtx (mode);
*************** ix86_expand_fp_absneg_operator (enum rtx
*** 9338,9346 ****
      {
        set = gen_rtx_fmt_e (code, mode, src);
        set = gen_rtx_SET (VOIDmode, dst, set);
!       use = gen_rtx_USE (VOIDmode, mask);
!       clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
!       emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, set, use, clob)));
      }

    if (dst != operands[0])
--- 9333,9347 ----
      {
        set = gen_rtx_fmt_e (code, mode, src);
        set = gen_rtx_SET (VOIDmode, dst, set);
!       if (mask)
!         {
!           use = gen_rtx_USE (VOIDmode, mask);
!           clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
!           emit_insn (gen_rtx_PARALLEL (VOIDmode,
! 				       gen_rtvec (3, set, use, clob)));
!         }
!       else
! 	emit_insn (set);
      }

    if (dst != operands[0])
Index: config/i386/i386.md
===================================================================
*** config/i386/i386.md	(revision 113957)
--- config/i386/i386.md	(working copy)
***************
*** 9905,9911 ****
  (define_insn "*negsf2_1"
    [(set (match_operand:SF 0 "register_operand" "=f")
  	(neg:SF (match_operand:SF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "SF")])
--- 9905,9911 ----
  (define_insn "*negsf2_1"
    [(set (match_operand:SF 0 "register_operand" "=f")
  	(neg:SF (match_operand:SF 1 "register_operand" "0")))]
!   "TARGET_80387 && (reload_completed || !TARGET_SSE_MATH || !TARGET_SSE)"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "SF")])
***************
*** 9913,9919 ****
  (define_insn "*negdf2_1"
    [(set (match_operand:DF 0 "register_operand" "=f")
  	(neg:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])
--- 9913,9919 ----
  (define_insn "*negdf2_1"
    [(set (match_operand:DF 0 "register_operand" "=f")
  	(neg:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387 && (reload_completed || !TARGET_SSE_MATH || !TARGET_SSE2)"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])
***************
*** 9921,9927 ****
  (define_insn "*negxf2_1"
    [(set (match_operand:XF 0 "register_operand" "=f")
  	(neg:XF (match_operand:XF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "XF")])
--- 9921,9927 ----
  (define_insn "*negxf2_1"
    [(set (match_operand:XF 0 "register_operand" "=f")
  	(neg:XF (match_operand:XF 1 "register_operand" "0")))]
!   "TARGET_80387"
    "fchs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "XF")])
***************
*** 9929,9935 ****
  (define_insn "*abssf2_1"
    [(set (match_operand:SF 0 "register_operand" "=f")
  	(abs:SF (match_operand:SF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "SF")])
--- 9929,9935 ----
  (define_insn "*abssf2_1"
    [(set (match_operand:SF 0 "register_operand" "=f")
  	(abs:SF (match_operand:SF 1 "register_operand" "0")))]
!   "TARGET_80387 && (reload_completed || !TARGET_SSE_MATH || !TARGET_SSE)"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "SF")])
***************
*** 9937,9943 ****
  (define_insn "*absdf2_1"
    [(set (match_operand:DF 0 "register_operand" "=f")
  	(abs:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])
--- 9937,9943 ----
  (define_insn "*absdf2_1"
    [(set (match_operand:DF 0 "register_operand" "=f")
  	(abs:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387 && (reload_completed || !TARGET_SSE_MATH || !TARGET_SSE2)"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])
***************
*** 9945,9951 ****
  (define_insn "*absxf2_1"
    [(set (match_operand:XF 0 "register_operand" "=f")
  	(abs:XF (match_operand:XF 1 "register_operand" "0")))]
!   "TARGET_80387 && reload_completed"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])
--- 9945,9951 ----
  (define_insn "*absxf2_1"
    [(set (match_operand:XF 0 "register_operand" "=f")
  	(abs:XF (match_operand:XF 1 "register_operand" "0")))]
!   "TARGET_80387"
    "fabs"
    [(set_attr "type" "fsgn")
     (set_attr "mode" "DF")])


/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 -mfpmath=387" } */

double foo(double x, double y)
{
  double t = -x * y;
  return -t;
}

/* { dg-final { scan-assembler-not "fchs" } } */

Roger
--

