This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive


Hi all,

This patch makes if-conversion more aggressive when handling code of the form:
if (test)
  x := a  //THEN
else
  x := b  //ELSE

Currently, we can handle this case only if x := a and x := b are simple single-set instructions.
With this patch we will be able to handle the cases where x := a and x := b each take multiple instructions.
This can be done under the condition that all the instructions in the THEN and ELSE basic blocks are
only used to compute a value for x.  I suppose we could generalise even further (perhaps to handle
cases where multiple x's are being set) but that's out of the scope of this patch.
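As a hypothetical illustration (not taken from the patch), each arm below needs more than one instruction at the RTL level, but the intermediate difference is used only to compute x, so the diamond becomes eligible for if-conversion under the new rule:

```c
/* Illustrative only: each arm is a subtract feeding an add, and the
   intermediate value does not live past its basic block, so the whole
   if/else can now be turned into a conditional move.  */
int
pick (int a, int b, int c)
{
  int x;
  if (a > b)
    x = (a - b) + c;	/* THEN: two insns, temporary dies here.  */
  else
    x = (b - a) + c;	/* ELSE: likewise.  */
  return x;
}
```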

This was sparked by some cases in aarch64 where the THEN or ELSE branches contained an extra
zero_extend operation after an arithmetic instruction which prevented if-conversion.
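A reduced example of the affected shape (this mirrors the ifcvt_csel_2.c testcase added below; the names are illustrative):

```c
/* The xor/shift on a sub-word type needs a following zero_extend on
   aarch64, making each arm two insns, which used to block if-conversion.  */
unsigned char
xtime (unsigned char byte, unsigned int generator)
{
  if (byte & 0x80)
    return byte ^ generator;	/* xor, then zero_extend to 8 bits.  */
  else
    return byte << 1;		/* shift, then zero_extend to 8 bits.  */
}
```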

To implement this approach noce_process_if_block in ifcvt.c is relaxed to allow multi-instruction
basic blocks when the intermediate values produced in them don't escape the basic block except
through x.  noce_process_if_block then calls a number of other functions to detect various
patterns and if-convert. Most of them don't actually make sense for multi-instruction basic blocks
so they are updated to reject them and operate only on the existing single-instruction case.

However, noce_try_cmove_arith can take advantage of multi-instruction basic blocks and is thus
updated to emit the basic blocks in full rather than just one instruction from each.

The transformation is, of course, guarded on a cost calculation.
The current code adds the costs of both the THEN and ELSE blocks and rejects the transformation
if their sum exceeds the branch cost. I don't think that's quite the right calculation:
we're going to be executing at least one of the basic blocks anyway.
With this patch we instead check the *maximum* of the two block costs against the branch cost.
This should still catch cases where a high latency instruction appears in one or both of
the paths.
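In abstract cost units (the real code scales the branch cost through COSTS_N_INSNS), the change amounts to the following sketch:

```c
/* Sketch of the cost gate, in plain units.  Illustrative only.  */

/* Old gate: the sum of both arm costs against the branch cost.  */
static int
accepted_old (int then_cost, int else_cost, int branch_cost)
{
  return then_cost + else_cost <= branch_cost;
}

/* New gate: only one arm ever executes, so compare the more
   expensive arm against the branch cost.  */
static int
accepted_new (int then_cost, int else_cost, int branch_cost)
{
  int max = then_cost > else_cost ? then_cost : else_cost;
  return max <= branch_cost;
}
```

So with a branch cost of 3, arms costing 2 and 3 were previously rejected (sum 5) but now pass (max 3), while a block containing a high-cost instruction still fails on its own.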


This transformation applies to targets with conditional move operations but no conditional
execution. Thus, it applies to aarch64 and x86_64, but not arm.

The effect of this patch is more noticeable when the backend branch cost is higher (as you'd expect).


Even without increasing the branch cost we still get more aggressive if-conversion.
Across the whole of SPEC2006 I saw a 5.8% increase in the number of csel instructions generated
(from 41242 to 43637).

Bootstrapped and tested on aarch64, x86_64, arm.
I've made the testcases aarch64-specific since they depend on backend branch costs that are hard
to predict across all platforms (we don't have a -mbranch-cost= option ;))
No performance regressions on SPEC2006 on aarch64 and x86_64.
On aarch64 I've seen 482.sphinx3 improve by 2.3% and 459.GemsFDTD by 2.1%.

Some of the testcases in aarch64.exp now fail their scan-assembler patterns due to if-conversion.
I've updated those testcases to properly generate the pattern they expect. The changes are mostly
due to add+compare-style instructions now appearing in the same basic blocks as their result uses,
which, I think, scares combine away from combining them into one.

Does this approach look reasonable?
If so, ok for trunk?

Thanks,
Kyrill


2015-07-10  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

    * ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
    then_cost, else_cost fields.
    (end_ifcvt_sequence): Call set_used_flags on each insn in the
    sequence.
    (noce_simple_bbs): New function.
    (noce_try_move): Bail if basic blocks are not simple.
    (noce_try_store_flag): Likewise.
    (noce_try_store_flag_constants): Likewise.
    (noce_try_addcc): Likewise.
    (noce_try_store_flag_mask): Likewise.
    (noce_try_cmove): Likewise.
    (noce_try_minmax): Likewise.
    (noce_try_abs): Likewise.
    (noce_try_sign_mask): Likewise.
    (noce_try_bitop): Likewise.
    (bbs_ok_for_cmove_arith): New function.
    (noce_emit_all_but_last): Likewise.
    (noce_emit_insn): Likewise.
    (noce_emit_bb): Likewise.
    (noce_try_cmove_arith): Handle non-simple basic blocks.
    (insn_valid_noce_process_p): New function.
    (bb_valid_for_noce_process_p): Likewise.
    (noce_process_if_block): Allow non-simple basic blocks
    where appropriate.


2015-07-10  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

    * gcc.target/aarch64/ifcvt_csel_1.c: New test.
    * gcc.target/aarch64/ifcvt_csel_2.c: New test.
    * gcc.target/aarch64/ifcvt_csel_3.c: New test.
commit b6fe0e0a5f64fdc11fbbd7c9e05caeeb23e21662
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Wed Jul 8 15:45:04 2015 +0100

    [PATCH][ifcvt] Make non-conditional execution if-conversion more aggressive

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 31849ee..3d324257 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -815,6 +815,15 @@ struct noce_if_info
      form as well.  */
   bool then_else_reversed;
 
+  /* True if the contents of then_bb and else_bb are a
+     simple single set instruction.  */
+  bool then_simple;
+  bool else_simple;
+
+  /* The total rtx cost of the instructions in then_bb and else_bb.  */
+  int then_cost;
+  int else_cost;
+
   /* Estimated cost of the particular branch instruction.  */
   int branch_cost;
 };
@@ -1036,6 +1045,10 @@ end_ifcvt_sequence (struct noce_if_info *if_info)
   set_used_flags (if_info->cond);
   set_used_flags (if_info->a);
   set_used_flags (if_info->b);
+
+  for (insn = seq; insn; insn = NEXT_INSN (insn))
+    set_used_flags (insn);
+
   unshare_all_rtl_in_chain (seq);
   end_sequence ();
 
@@ -1053,6 +1066,21 @@ end_ifcvt_sequence (struct noce_if_info *if_info)
   return seq;
 }
 
+/* Return true iff the THEN block and the ELSE block (if it
+   exists) each consist of a single simple set instruction.  */
+
+static bool
+noce_simple_bbs (struct noce_if_info *if_info)
+{
+  if (!if_info->then_simple)
+    return false;
+
+  if (if_info->else_bb)
+    return if_info->else_simple;
+
+  return true;
+}
+
 /* Convert "if (a != b) x = a; else x = b" into "x = a" and
    "if (a == b) x = a; else x = b" into "x = b".  */
 
@@ -1067,6 +1095,9 @@ noce_try_move (struct noce_if_info *if_info)
   if (code != NE && code != EQ)
     return FALSE;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   /* This optimization isn't valid if either A or B could be a NaN
      or a signed zero.  */
   if (HONOR_NANS (if_info->x)
@@ -1115,6 +1146,9 @@ noce_try_store_flag (struct noce_if_info *if_info)
   rtx target;
   rtx_insn *seq;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   if (CONST_INT_P (if_info->b)
       && INTVAL (if_info->b) == STORE_FLAG_VALUE
       && if_info->a == const0_rtx)
@@ -1163,6 +1197,9 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
   int normalize, can_reverse;
   machine_mode mode;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   if (CONST_INT_P (if_info->a)
       && CONST_INT_P (if_info->b))
     {
@@ -1291,6 +1328,9 @@ noce_try_addcc (struct noce_if_info *if_info)
   rtx_insn *seq;
   int subtract, normalize;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   if (GET_CODE (if_info->a) == PLUS
       && rtx_equal_p (XEXP (if_info->a, 0), if_info->b)
       && (reversed_comparison_code (if_info->cond, if_info->jump)
@@ -1382,6 +1422,9 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
   rtx_insn *seq;
   int reversep;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   reversep = 0;
   if ((if_info->branch_cost >= 2
        || STORE_FLAG_VALUE == -1)
@@ -1550,6 +1593,9 @@ noce_try_cmove (struct noce_if_info *if_info)
   rtx target;
   rtx_insn *seq;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   if ((CONSTANT_P (if_info->a) || register_operand (if_info->a, VOIDmode))
       && (CONSTANT_P (if_info->b) || register_operand (if_info->b, VOIDmode)))
     {
@@ -1584,6 +1630,92 @@ noce_try_cmove (struct noce_if_info *if_info)
   return FALSE;
 }
 
+/* Return true iff the registers that the insns in BB_A set do not
+   get used in BB_B.  */
+
+static bool
+bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
+{
+  rtx_insn *a_insn;
+  FOR_BB_INSNS (bb_a, a_insn)
+    {
+      if (!active_insn_p (a_insn))
+	continue;
+
+      rtx sset_a = single_set (a_insn);
+
+      if (!sset_a)
+	return false;
+
+      rtx dest_reg = SET_DEST (sset_a);
+      rtx_insn *b_insn;
+
+      FOR_BB_INSNS (bb_b, b_insn)
+	{
+	  if (!active_insn_p (b_insn))
+	    continue;
+
+	  rtx sset_b = single_set (b_insn);
+
+	  if (!sset_b)
+	    return false;
+
+	  if (reg_referenced_p (dest_reg, sset_b))
+	    return false;
+	}
+    }
+
+  return true;
+}
+
+/* Emit copies of all the active instructions in BB except the last.
+   This is a helper for noce_try_cmove_arith.  */
+
+static void
+noce_emit_all_but_last (basic_block bb)
+{
+  rtx_insn *last = last_active_insn (bb, FALSE);
+  rtx_insn *insn;
+  FOR_BB_INSNS (bb, insn)
+    {
+      if (insn != last && active_insn_p (insn))
+	{
+	  rtx_insn *to_emit = as_a <rtx_insn *> (copy_rtx (insn));
+
+	  emit_insn (PATTERN (to_emit));
+	}
+    }
+}
+
+/* Helper for noce_try_cmove_arith.  Emit the pattern TO_EMIT and return
+   the resulting insn or NULL if it's not a valid insn.  */
+
+static rtx_insn *
+noce_emit_insn (rtx to_emit)
+{
+  gcc_assert (to_emit);
+  rtx_insn *insn = emit_insn (to_emit);
+
+  if (recog_memoized (insn) < 0)
+    return NULL;
+
+  return insn;
+}
+
+/* Helper for noce_try_cmove_arith.  */
+
+static bool
+noce_emit_bb (rtx last_insn, basic_block bb, bool simple)
+{
+  if (bb && !simple)
+    noce_emit_all_but_last (bb);
+
+  if (last_insn && !noce_emit_insn (last_insn))
+    return false;
+
+  return true;
+}
+
 /* Try more complex cases involving conditional_move.  */
 
 static int
@@ -1594,9 +1726,12 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   rtx x = if_info->x;
   rtx orig_a, orig_b;
   rtx_insn *insn_a, *insn_b;
+  bool a_simple = if_info->then_simple;
+  bool b_simple = if_info->else_simple;
+  basic_block then_bb = if_info->then_bb;
+  basic_block else_bb = if_info->else_bb;
   rtx target;
   int is_mem = 0;
-  int insn_cost;
   enum rtx_code code;
   rtx_insn *ifcvt_seq;
 
@@ -1635,27 +1770,23 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   insn_a = if_info->insn_a;
   insn_b = if_info->insn_b;
 
-  /* Total insn_rtx_cost should be smaller than branch cost.  Exit
-     if insn_rtx_cost can't be estimated.  */
+  int then_cost;
+  int else_cost;
   if (insn_a)
-    {
-      insn_cost
-	= insn_rtx_cost (PATTERN (insn_a),
-      			 optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn_a)));
-      if (insn_cost == 0 || insn_cost > COSTS_N_INSNS (if_info->branch_cost))
-	return FALSE;
-    }
+    then_cost = if_info->then_cost;
   else
-    insn_cost = 0;
+    then_cost = 0;
 
   if (insn_b)
-    {
-      insn_cost
-	+= insn_rtx_cost (PATTERN (insn_b),
-      			  optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn_b)));
-      if (insn_cost == 0 || insn_cost > COSTS_N_INSNS (if_info->branch_cost))
-        return FALSE;
-    }
+    else_cost = if_info->else_cost;
+  else
+    else_cost = 0;
+
+  /* We're going to execute one of the basic blocks anyway, so
+     bail out if the most expensive of the two blocks is unacceptable.  */
+  if (MAX (then_cost, else_cost)
+      > COSTS_N_INSNS (if_info->branch_cost))
+    return FALSE;
 
   /* Possibly rearrange operands to make things come out more natural.  */
   if (reversed_comparison_code (if_info->cond, if_info->jump) != UNKNOWN)
@@ -1671,26 +1802,35 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
 	  code = reversed_comparison_code (if_info->cond, if_info->jump);
 	  std::swap (a, b);
 	  std::swap (insn_a, insn_b);
+	  std::swap (a_simple, b_simple);
+	  std::swap (then_bb, else_bb);
 	}
     }
 
+  if (!a_simple && then_bb && !b_simple && else_bb
+      && !bbs_ok_for_cmove_arith (then_bb, else_bb))
+    return FALSE;
+
   start_sequence ();
 
   orig_a = a;
   orig_b = b;
 
+  rtx emit_a = NULL_RTX;
+  rtx emit_b = NULL_RTX;
+
   /* If either operand is complex, load it into a register first.
      The best way to do this is to copy the original insn.  In this
      way we preserve any clobbers etc that the insn may have had.
      This is of course not possible in the IS_MEM case.  */
+
   if (! general_operand (a, GET_MODE (a)))
     {
-      rtx_insn *insn;
 
       if (is_mem)
 	{
 	  rtx reg = gen_reg_rtx (GET_MODE (a));
-	  insn = emit_insn (gen_rtx_SET (reg, a));
+	  emit_a = gen_rtx_SET (reg, a);
 	}
       else if (! insn_a)
 	goto end_seq_and_fail;
@@ -1700,21 +1840,17 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
 	  rtx_insn *copy_of_a = as_a <rtx_insn *> (copy_rtx (insn_a));
 	  rtx set = single_set (copy_of_a);
 	  SET_DEST (set) = a;
-	  insn = emit_insn (PATTERN (copy_of_a));
+
+	  emit_a = PATTERN (copy_of_a);
 	}
-      if (recog_memoized (insn) < 0)
-	goto end_seq_and_fail;
     }
+
   if (! general_operand (b, GET_MODE (b)))
     {
-      rtx pat;
-      rtx_insn *last;
-      rtx_insn *new_insn;
-
       if (is_mem)
 	{
           rtx reg = gen_reg_rtx (GET_MODE (b));
-	  pat = gen_rtx_SET (reg, b);
+	  emit_b = gen_rtx_SET (reg, b);
 	}
       else if (! insn_b)
 	goto end_seq_and_fail;
@@ -1723,26 +1859,39 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
           b = gen_reg_rtx (GET_MODE (b));
 	  rtx_insn *copy_of_insn_b = as_a <rtx_insn *> (copy_rtx (insn_b));
 	  rtx set = single_set (copy_of_insn_b);
+
 	  SET_DEST (set) = b;
-	  pat = PATTERN (copy_of_insn_b);
+	  emit_b = PATTERN (copy_of_insn_b);
 	}
+    }
 
-      /* If insn to set up A clobbers any registers B depends on, try to
-	 swap insn that sets up A with the one that sets up B.  If even
-	 that doesn't help, punt.  */
-      last = get_last_insn ();
-      if (last && modified_in_p (orig_b, last))
-	{
-	  new_insn = emit_insn_before (pat, get_insns ());
-	  if (modified_in_p (orig_a, new_insn))
-	    goto end_seq_and_fail;
-	}
-      else
-	new_insn = emit_insn (pat);
+    /* If insn to set up A clobbers any registers B depends on, try to
+       swap insn that sets up A with the one that sets up B.  If even
+       that doesn't help, punt.  */
 
-      if (recog_memoized (new_insn) < 0)
-	goto end_seq_and_fail;
-    }
+    if (emit_a && modified_in_p (orig_b, emit_a))
+      {
+	if (modified_in_p (orig_a, emit_b))
+	  goto end_seq_and_fail;
+
+	if (else_bb && !b_simple)
+	  {
+	    if (!noce_emit_bb (emit_b, else_bb, b_simple))
+	      goto end_seq_and_fail;
+	  }
+
+	if (!noce_emit_bb (emit_a, then_bb, a_simple))
+	  goto end_seq_and_fail;
+      }
+    else
+      {
+	if (!noce_emit_bb (emit_a, then_bb, a_simple))
+	  goto end_seq_and_fail;
+
+	if (!noce_emit_bb (emit_b, else_bb, b_simple))
+	  goto end_seq_and_fail;
+
+      }
 
   target = noce_emit_cmove (if_info, x, code, XEXP (if_info->cond, 0),
 			    XEXP (if_info->cond, 1), a, b);
@@ -1946,6 +2095,9 @@ noce_try_minmax (struct noce_if_info *if_info)
   enum rtx_code code, op;
   int unsignedp;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   /* ??? Reject modes with NaNs or signed zeros since we don't know how
      they will be resolved with an SMIN/SMAX.  It wouldn't be too hard
      to get the target to tell us...  */
@@ -2042,6 +2194,9 @@ noce_try_abs (struct noce_if_info *if_info)
   int negate;
   bool one_cmpl = false;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   /* Reject modes with signed zeros.  */
   if (HONOR_SIGNED_ZEROS (if_info->x))
     return FALSE;
@@ -2190,6 +2345,9 @@ noce_try_sign_mask (struct noce_if_info *if_info)
   enum rtx_code code;
   bool t_unconditional;
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   cond = if_info->cond;
   code = GET_CODE (cond);
   m = XEXP (cond, 0);
@@ -2273,6 +2431,9 @@ noce_try_bitop (struct noce_if_info *if_info)
   cond = if_info->cond;
   code = GET_CODE (cond);
 
+  if (!noce_simple_bbs (if_info))
+    return FALSE;
+
   /* Check for no else condition.  */
   if (! rtx_equal_p (x, if_info->b))
     return FALSE;
@@ -2523,6 +2684,121 @@ noce_can_store_speculate_p (basic_block top_bb, const_rtx mem)
   return false;
 }
 
+/* Helper for bb_valid_for_noce_process_p.  */
+
+static bool
+insn_valid_noce_process_p (rtx_insn *insn, rtx cc)
+{
+  if (!insn
+      || !NONJUMP_INSN_P (insn)
+      || (cc && set_of (cc, insn)))
+    return false;
+
+  rtx sset = single_set (insn);
+
+  /* Currently support only simple single sets in test_bb.  */
+  if (!sset
+      || !noce_operand_ok (SET_DEST (sset))
+      || !noce_operand_ok (SET_SRC (sset)))
+    return false;
+
+  return true;
+}
+
+/* Return true iff basic block TEST_BB is valid
+   for noce if-conversion.
+   The condition used in this if-conversion is in COND.
+   In practice, check that TEST_BB ends with a single set
+   x := a and all previous computations
+   in TEST_BB don't produce any values that are live
+   after TEST_BB.
+   In other words, all the insns in TEST_BB are there only
+   to compute a value for x.  Put the rtx cost of the insns
+   in TEST_BB into COST.  Record whether TEST_BB is a single
+   simple set instruction in SIMPLE_P.  */
+
+static bool
+bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
+			      int *cost, bool *simple_p)
+{
+  if (!test_bb)
+    return false;
+
+  rtx_insn *last_insn = last_active_insn (test_bb, FALSE);
+  rtx last_set = NULL_RTX;
+
+  rtx cc = cc_in_cond (cond);
+
+  if (!insn_valid_noce_process_p (last_insn, cc))
+    return false;
+  last_set = single_set (last_insn);
+
+  rtx x = SET_DEST (last_set);
+
+  rtx_insn *first_insn = first_active_insn (test_bb);
+  rtx first_set = single_set (first_insn);
+  bool speed_p = optimize_bb_for_speed_p (test_bb);
+
+  *cost = insn_rtx_cost (last_set, speed_p);
+  if (!first_set)
+    return false;
+  /* We have a single simple set, that's okay.  */
+  else if (first_insn == last_insn)
+    {
+      *simple_p = noce_operand_ok (SET_DEST (first_set));
+      *cost = insn_rtx_cost (first_set, speed_p);
+      return *simple_p;
+    }
+
+  *simple_p = false;
+
+  rtx_insn *prev_last_insn = PREV_INSN (last_insn);
+  gcc_assert (prev_last_insn);
+
+  /* For now, disallow setting x multiple times in test_bb.  */
+  if (REG_P (x) && reg_set_between_p (x, first_insn, prev_last_insn))
+    return false;
+
+  bitmap test_bb_temps = BITMAP_ALLOC (&reg_obstack);
+
+  /* The regs that are live out of test_bb.  */
+  bitmap test_bb_live_out = df_get_live_out (test_bb);
+
+  rtx_insn *insn;
+  FOR_BB_INSNS (test_bb, insn)
+    {
+      if (insn != last_insn)
+	{
+	  if (!active_insn_p (insn))
+	    continue;
+
+	  if (!insn_valid_noce_process_p (insn, cc))
+	    goto free_bitmap_and_fail;
+
+	  rtx sset = single_set (insn);
+	  gcc_assert (sset);
+
+	  if (MEM_P (SET_SRC (sset)) || MEM_P (SET_DEST (sset)))
+	    goto free_bitmap_and_fail;
+
+	  *cost += insn_rtx_cost (sset, speed_p);
+	  bitmap_set_bit (test_bb_temps, REGNO (SET_DEST (sset)));
+	}
+    }
+
+  /* If any of the intermediate results in test_bb are live after test_bb
+     then fail.  */
+  if (bitmap_intersect_p (test_bb_live_out, test_bb_temps))
+    goto free_bitmap_and_fail;
+
+  BITMAP_FREE (test_bb_temps);
+  return true;
+
+  free_bitmap_and_fail:
+    BITMAP_FREE (test_bb_temps);
+    return false;
+}
+
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
    it without using conditional execution.  Return TRUE if we were successful
    at converting the block.  */
@@ -2539,7 +2815,6 @@ noce_process_if_block (struct noce_if_info *if_info)
   rtx_insn *insn_a, *insn_b;
   rtx set_a, set_b;
   rtx orig_x, x, a, b;
-  rtx cc;
 
   /* We're looking for patterns of the form
 
@@ -2548,16 +2823,31 @@ noce_process_if_block (struct noce_if_info *if_info)
      (3) if (...) x = a;   // as if with an initial x = x.
 
      The later patterns require jumps to be more expensive.
-
+     For the if (...) x = a; else x = b; case we allow multiple insns
+     inside the then and else blocks as long as their only effect is
+     to calculate a value for x.
      ??? For future expansion, look for multiple X in such patterns.  */
 
-  /* Look for one of the potential sets.  */
-  insn_a = first_active_insn (then_bb);
-  if (! insn_a
-      || insn_a != last_active_insn (then_bb, FALSE)
-      || (set_a = single_set (insn_a)) == NULL_RTX)
+  bool then_bb_valid
+    = bb_valid_for_noce_process_p (then_bb, cond, &if_info->then_cost,
+				    &if_info->then_simple);
+
+  bool else_bb_valid = false;
+  if (else_bb)
+    else_bb_valid
+      = bb_valid_for_noce_process_p (else_bb, cond, &if_info->else_cost,
+				      &if_info->else_simple);
+
+  if (!then_bb_valid)
+    return FALSE;
+
+  if (else_bb && !else_bb_valid)
     return FALSE;
 
+  insn_a = last_active_insn (then_bb, FALSE);
+  set_a = single_set (insn_a);
+  gcc_assert (set_a);
+
   x = SET_DEST (set_a);
   a = SET_SRC (set_a);
 
@@ -2571,12 +2861,12 @@ noce_process_if_block (struct noce_if_info *if_info)
   set_b = NULL_RTX;
   if (else_bb)
     {
-      insn_b = first_active_insn (else_bb);
-      if (! insn_b
-	  || insn_b != last_active_insn (else_bb, FALSE)
-	  || (set_b = single_set (insn_b)) == NULL_RTX
-	  || ! rtx_interchangeable_p (x, SET_DEST (set_b)))
-	return FALSE;
+      insn_b = last_active_insn (else_bb, FALSE);
+      set_b = single_set (insn_b);
+      gcc_assert (set_b);
+
+      if (!rtx_interchangeable_p (x, SET_DEST (set_b)))
+        return FALSE;
     }
   else
     {
@@ -2651,20 +2941,14 @@ noce_process_if_block (struct noce_if_info *if_info)
   if_info->a = a;
   if_info->b = b;
 
-  /* Skip it if the instruction to be moved might clobber CC.  */
-  cc = cc_in_cond (cond);
-  if (cc
-      && (set_of (cc, insn_a)
-	  || (insn_b && set_of (cc, insn_b))))
-    return FALSE;
-
   /* Try optimizations in some approximation of a useful order.  */
   /* ??? Should first look to see if X is live incoming at all.  If it
      isn't, we don't need anything but an unconditional set.  */
 
   /* Look and see if A and B are really the same.  Avoid creating silly
      cmove constructs that no one will fix up later.  */
-  if (rtx_interchangeable_p (a, b))
+  if (noce_simple_bbs (if_info)
+      && rtx_interchangeable_p (a, b))
     {
       /* If we have an INSN_B, we don't have to create any new rtl.  Just
 	 move the instruction that we already have.  If we don't have an
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c
new file mode 100644
index 0000000..1836f57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-rtl-ce1 -O2" } */
+
+int
+foo (int x)
+{
+  return x > 100 ? x - 2 : x - 1;
+}
+
+/* { dg-final { scan-rtl-dump "3 true changes made" "ce1" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c
new file mode 100644
index 0000000..8c48270
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-rtl-ce1 -O2" } */
+
+
+typedef unsigned char uint8_t;
+typedef unsigned int uint16_t;
+
+uint8_t
+_xtime (const uint8_t byte, const uint16_t generator)
+{
+  if (byte & 0x80)
+    return byte ^ generator;
+  else
+    return byte << 1;
+}
+
+/* { dg-final { scan-rtl-dump "3 true changes made" "ce1" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c
new file mode 100644
index 0000000..1aecbc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_csel_3.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O2" } */
+
+
+typedef long long s64;
+
+int
+foo (s64 a, s64 b, s64 c)
+{
+ s64 d = a - b;
+
+  if (d == 0)
+    return a + c;
+  else
+    return b + d + c;
+}
+
+/* This test can be reduced to just return a + c;  */
+/* { dg-final { scan-assembler-not "sub\.*\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+\.*" } } */
