[PATCH] Attempt harder to emit a cmove in emit_conditional_move (PR tree-optimization/79390)

Jakub Jelinek jakub@redhat.com
Wed Apr 12 16:33:00 GMT 2017


Hi!

As mentioned in the PR, for LU benchmark we generate worse code with -Ofast
compared to -O3, because in the former we don't use a conditional move.

The problem is during emit_conditional_move, while in both cases
swap_commutative_operands_p (op2, op3) tells us it might be better to swap
them, the difference between -Ofast and -O3 is that reversed_comparison_code_parts
returns in -Ofast case LE, but in -O3 UNKNOWN.  For -O3 this means we don't
swap o2 and o3, emit a GT and successfully emit a conditional move.
For -Ofast, we swap the arguments and attempt to emit a LE, but the
predicate on the cbranchdf4 expander fails in that case, because "LE" is not
considered a trivial comparison operator, is more expensive, thus the only
attempt we try fails and we don't emit a cmov at all.

The following patch will do the same thing as previously, but if we were to
return NULL_RTX, it will try to swap op2/op3 (either back, if we swapped
them first, effectively returning to the original comparison, or if we
haven't swapped them first, trying to reverse the comparison) and see if
that leads to a usable sequence.

The patch caused a FAIL in the pr70465-2.c testcase, where previously in ce1
we weren't able to emit any cmov and now are, which tiny bit changes the
IL that gets through regstack and thus the testcase is not testing anymore
what it wanted - the %st(1) and %st comparison arguments are swapped and
the fcmov operands too and thus a fxchg is seen as needed (before regstack
it is impossible to find out which ordering is more beneficial actually).
We could teach regstack to attempt to reverse the comparisons and swap
corresponding fcmov arguments, but that does look like a GCC8 material to
me.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-04-12  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/79390
	* optabs.c (emit_conditional_move): If the preferred op2/op3 operand
	order does not result in usable sequence, retry with reversed operand
	order.

	* gcc.target/i386/pr70465-2.c: Xfail the scan-assembler-not test.

--- gcc/optabs.c.jj	2017-02-02 08:44:12.000000000 +0100
+++ gcc/optabs.c	2017-04-12 14:30:14.905433771 +0200
@@ -4258,12 +4258,15 @@ emit_conditional_move (rtx target, enum
   if (cmode == VOIDmode)
     cmode = GET_MODE (op0);
 
+  enum rtx_code orig_code = code;
+  bool swapped = false;
   if (swap_commutative_operands_p (op2, op3)
       && ((reversed = reversed_comparison_code_parts (code, op0, op1, NULL))
           != UNKNOWN))
     {
       std::swap (op2, op3);
       code = reversed;
+      swapped = true;
     }
 
   if (mode == VOIDmode)
@@ -4272,45 +4275,62 @@ emit_conditional_move (rtx target, enum
   icode = direct_optab_handler (movcc_optab, mode);
 
   if (icode == CODE_FOR_nothing)
-    return 0;
+    return NULL_RTX;
 
   if (!target)
     target = gen_reg_rtx (mode);
 
-  code = unsignedp ? unsigned_condition (code) : code;
-  comparison = simplify_gen_relational (code, VOIDmode, cmode, op0, op1);
-
-  /* We can get const0_rtx or const_true_rtx in some circumstances.  Just
-     return NULL and let the caller figure out how best to deal with this
-     situation.  */
-  if (!COMPARISON_P (comparison))
-    return NULL_RTX;
-
-  saved_pending_stack_adjust save;
-  save_pending_stack_adjust (&save);
-  last = get_last_insn ();
-  do_pending_stack_adjust ();
-  prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
-		    GET_CODE (comparison), NULL_RTX, unsignedp, OPTAB_WIDEN,
-		    &comparison, &cmode);
-  if (comparison)
+  for (int pass = 0; ; pass++)
     {
-      struct expand_operand ops[4];
+      code = unsignedp ? unsigned_condition (code) : code;
+      comparison = simplify_gen_relational (code, VOIDmode, cmode, op0, op1);
 
-      create_output_operand (&ops[0], target, mode);
-      create_fixed_operand (&ops[1], comparison);
-      create_input_operand (&ops[2], op2, mode);
-      create_input_operand (&ops[3], op3, mode);
-      if (maybe_expand_insn (icode, 4, ops))
+      /* We can get const0_rtx or const_true_rtx in some circumstances.  Just
+	 punt and let the caller figure out how best to deal with this
+	 situation.  */
+      if (COMPARISON_P (comparison))
 	{
-	  if (ops[0].value != target)
-	    convert_move (target, ops[0].value, false);
-	  return target;
+	  saved_pending_stack_adjust save;
+	  save_pending_stack_adjust (&save);
+	  last = get_last_insn ();
+	  do_pending_stack_adjust ();
+	  prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
+			    GET_CODE (comparison), NULL_RTX, unsignedp,
+			    OPTAB_WIDEN, &comparison, &cmode);
+	  if (comparison)
+	    {
+	      struct expand_operand ops[4];
+
+	      create_output_operand (&ops[0], target, mode);
+	      create_fixed_operand (&ops[1], comparison);
+	      create_input_operand (&ops[2], op2, mode);
+	      create_input_operand (&ops[3], op3, mode);
+	      if (maybe_expand_insn (icode, 4, ops))
+		{
+		  if (ops[0].value != target)
+		    convert_move (target, ops[0].value, false);
+		  return target;
+		}
+	    }
+	  delete_insns_since (last);
+	  restore_pending_stack_adjust (&save);
 	}
+
+      if (pass == 1)
+	return NULL_RTX;
+
+      /* If the preferred op2/op3 order is not usable, retry with other
+	 operand order, perhaps it will expand successfully.  */
+      if (swapped)
+	code = orig_code;
+      else if ((reversed = reversed_comparison_code_parts (orig_code, op0, op1,
+							   NULL))
+	       != UNKNOWN)
+	code = reversed;
+      else
+	return NULL_RTX;
+      std::swap (op2, op3);
     }
-  delete_insns_since (last);
-  restore_pending_stack_adjust (&save);
-  return NULL_RTX;
 }
 
 
--- gcc/testsuite/gcc.target/i386/pr70465-2.c.jj	2017-02-22 18:15:13.000000000 +0100
+++ gcc/testsuite/gcc.target/i386/pr70465-2.c	2017-04-12 18:07:00.784761399 +0200
@@ -1,7 +1,7 @@
 /* PR target/70465 */
 /* { dg-do compile } */
 /* { dg-options "-Ofast -mfpmath=387 -fomit-frame-pointer" } */
-/* { dg-final { scan-assembler-not "fxch\t%st.1" } } */
+/* { dg-final { scan-assembler-not "fxch\t%st.1" { xfail *-*-* } } } */
 
 extern float d[1024];
 

	Jakub



More information about the Gcc-patches mailing list