[PATCH] ARM cmpsi2_addneg fixes (PR target/89506)

Thu Feb 28 22:16:00 GMT 2019

Hi!

The following testcase ICEs on ARM, because the backend creates CONST_INTs
that aren't valid for SImode, in which they are used (0x80000000 rather than
the canonical -0x80000000).  This is fixed by the 3 gen_int_mode calls
instead of just GEN_INT.
Once that is fixed, we ICE because if both the CONST_INTs are -0x80000000,
then INTVAL (operands[2]) == -INTVAL (operands[3]) is no longer true and
thus cmpsi2_addneg doesn't match and no other pattern does.
Fixing that is pretty obvious thing to do as well, similarly to the recent
*subsi3_carryin_compare_const fix.

The next problem is that the result doesn't bootstrap.  The problem is that
the instruction emitted for that -0x80000000 immediate - adds reg, reg, #-0x80000000
actually doesn't do what the RTL pattern for it says.
The pattern is like subsi3_compare, except that the MINUS with
constant second argument is canonicalized as RTL normally does into PLUS of
the negated constant.  The mode on the condition code register is CCmode,
so all Z, N, C and V bits should be valid, and the pattern is like that of
a cmp instruction, so the behavior vs. condition codes should be like cmp
instruction, which is what the subs instruction does.  The adds r1, r2, #N
and subs r1, r2, #-N instructions actually behave the same most of the time.
The result is the same, Z and N flags are always the same as well.  And,
it seems that for all N except for 0 and 0x80000000 also the C and V bits
are the same (I've proved that by using the valgrind subs condition code
algorithm (which is the same as cmp) vs. valgrind adds condition code
algorithm (which is the same as cmn) and even emulated it using 8-bit x
8-bit exhaustive check).  For 0 and 0x80000000 it is different and as can be
seen by the bootstrap failure, it is significant.
Previously before the above change we got by by the pattern never triggering
by the combiner for 0x80000000 (because the combiner would always try
canonical CONST_INTs for both and those don't have INTVAL == -INTVAL), but
it worked for the *compare_scc splitter/*negscc/*thumb2_negscc patterns
(which created invalid RTL and used adds rather than subs, but didn't care,
as it only tested the Z bit using ne condition).

My next attempt was just to switch the two alternatives, so that if
operands[2] matches both "I" and "L" constraints and we can choose, we'd
use subs.  That cured the bootstrap failure, but introduced
+FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54
+FAIL: gcc.target/arm/thumb2-cmpneg2add-2.c scan-assembler adds
regressions, in those 2 testcases neither the problematic 0 nor INT_MIN
is used, so both adds and subs are equivalent, but
-	adds	r2, r4, #1
+	subs	r2, r4, #-1
results in larger code.
Note, I've seen the patch creating smaller code in some cases too,
when the combiner matched the cmpsi2_addneg pattern with INT_MIN and
previously emitted loading the constant into some register and performing
subs with 3 registers instead of 2 and immediate.

So, this is another alternative I'm proposing, just make sure we use
subs for #0 and #-2147483648 (where it is essential for correctness)
and for others keep previous behavior.

Ok for trunk?

Or is there an easy way to estimate if a constant satisfies both "I" and "L"
constraints at the same time and is not one of 0 and INT_MIN, which
of the two (positive or negated) would have smaller encoding and decide
based on that?

2019-02-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Use
	trunc_int_for_mode (-INTVAL (...), SImode) instead of -INTVAL (...).
	If operands[2] is 0 or INT_MIN, force use of subs.
	(*compare_scc splitter): Use gen_int_mode.
	(*negscc): Likewise.
	* config/arm/thumb2.md (*thumb2_negscc): Likewise.

	* gcc.dg/pr89506.c: New test.

--- gcc/config/arm/arm.md.jj	2019-02-28 14:13:08.368267536 +0100
+++ gcc/config/arm/arm.md	2019-02-28 22:09:03.089588596 +0100
@@ -867,10 +867,21 @@ (define_insn "cmpsi2_addneg"
    (set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(plus:SI (match_dup 1)
 		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
-  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
-  "@
-   adds%?\\t%0, %1, %3
-   subs%?\\t%0, %1, #%n3"
+  "TARGET_32BIT
+   && (INTVAL (operands[2])
+       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
+{
+  /* For 0 and INT_MIN it is essential that we use subs, as adds
+     will result in different condition codes (like cmn rather than
+     like cmp).  For other immediates, we should choose whatever
+     will have smaller encoding.  */
+  if (operands[2] == const0_rtx
+      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
+      || which_alternative == 1)
+    return "subs%?\\t%0, %1, #%n3";
+  else
+    return "adds%?\\t%0, %1, %3";
+}
   [(set_attr "conds" "set")
    (set_attr "type" "alus_sreg")]
 )
@@ -9302,7 +9313,7 @@ (define_split
    (cond_exec (ne:CC (reg:CC CC_REGNUM) (const_int 0))
 	      (set (match_dup 0) (const_int 1)))]
 {
-  operands[3] = GEN_INT (-INTVAL (operands[2]));
+  operands[3] = gen_int_mode (-INTVAL (operands[2]), SImode);
 })
 
 (define_split
@@ -10082,7 +10093,8 @@ (define_insn_and_split "*negscc"
         /* Emit subs\\t%0, %1, %2\;mvnne\\t%0, #0 */
         if (CONST_INT_P (operands[2]))
           emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
-                                        GEN_INT (- INTVAL (operands[2]))));
+                                        gen_int_mode (-INTVAL (operands[2]),
+						      SImode)));
         else
           emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));
 
--- gcc/config/arm/thumb2.md.jj	2019-02-28 08:14:55.970600405 +0100
+++ gcc/config/arm/thumb2.md	2019-02-28 21:51:20.387902673 +0100
@@ -913,7 +913,8 @@ (define_insn_and_split "*thumb2_negscc"
         /* Emit subs\\t%0, %1, %2\;it\\tne\;mvnne\\t%0, #0 */
         if (CONST_INT_P (operands[2]))
           emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
-                                        GEN_INT (- INTVAL (operands[2]))));
+                                        gen_int_mode (-INTVAL (operands[2]),
+						      SImode)));
         else
           emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));
 
--- gcc/testsuite/gcc.dg/pr89506.c.jj	2019-02-28 21:51:20.387902673 +0100
+++ gcc/testsuite/gcc.dg/pr89506.c	2019-02-28 21:51:20.387902673 +0100
@@ -0,0 +1,14 @@
+/* PR target/89506 */
+/* { dg-do compile } */
+/* { dg-options "-Og -g -w" } */
+
+long long a;
+int c;
+
+int
+foo (long long d, short e)
+{
+  __builtin_sub_overflow (0xffffffff, c, &a);
+  e >>= ~2147483647 != (int) a;
+  return d + e;
+}

	Jakub