This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Take known zero bits into account when checking extraction.


On 04/27/2016 02:20 AM, Dominik Vogt wrote:
The attached patch is a result of discussing an S/390 issue with
"and with complement" in some cases.

  https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html
  https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html

Combine would merge a ZERO_EXTEND and a SET taking the known zero
bits into account, resulting in an AND.  Later on,
make_compound_operation() fails to replace that with a ZERO_EXTEND
which we get for free on S/390 but leaves the AND, eventually
resulting in two consecutive AND instructions.

The current code in make_compound_operation() that detects
opportunities for ZERO_EXTEND does not work here because it does
not take the known zero bits into account:

      /* If the constant is one less than a power of two, this might be
	 representable by an extraction even if no shift is present.
	 If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
	 we are in a COMPARE.  */
      else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0)
	new_rtx = make_extraction (mode,
			       make_compound_operation (XEXP (x, 0),
							next_code),
			       0, NULL_RTX, i, 1, 0, in_code == COMPARE);

An attempt to use the zero bits in the above conditions resulted
in many situations that generated worse code, so the patch tries
to fix this in a more conservative way.  While the effect is
completely positive on S/390, this will very likely have
unforeseeable consequences on other targets.

Bootstrapped and regression tested on s390 and s390x only at the
moment.

Ciao

Dominik ^_^  ^_^

-- Dominik Vogt IBM Germany


0001-ChangeLog


gcc/ChangeLog

	* combine.c (make_compound_operation): Take known zero bits into
	account when checking for possible zero_extend.
I'd strongly recommend writing some tests for this. Extra credit if they can be run on an x86 target which gets more testing than s390.

If I go back to our original discussion, we have this going into combine:

(insn 6 3 7 2 (parallel [
            (set (reg:SI 64)
                (and:SI (mem:SI (reg/v/f:DI 63 [ a ]) [1 *a_2(D)+0 S4 A32])
                    (const_int -65521 [0xffffffffffff000f])))
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:21 1481 {*andsi3_zarch}
     (expr_list:REG_DEAD (reg/v/f:DI 63 [ a ])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))
(insn 7 6 12 2 (set (reg:DI 65)
(zero_extend:DI (reg:SI 64))) andc-immediate.c:21 1207 {*zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 64)
        (nil)))
(insn 12 7 13 2 (set (reg/i:DI 2 %r2)
        (reg:DI 65)) andc-immediate.c:22 1073 {*movdi_64}
     (expr_list:REG_DEAD (reg:DI 65)
        (nil)))

Which combine turns into:

(insn 6 3 7 2 (parallel [
            (set (reg:SI 64)
                (and:SI (mem:SI (reg:DI 2 %r2 [ a ]) [1 *a_2(D)+0 S4 A32])
                    (const_int -65521 [0xffffffffffff000f])))
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:21 1481 {*andsi3_zarch}
     (expr_list:REG_DEAD (reg:DI 2 %r2 [ a ])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))
(insn 12 7 13 2 (parallel [
            (set (reg/i:DI 2 %r2)
                (and:DI (subreg:DI (reg:SI 64) 0)
                 ^^^
                    (const_int 4294901775 [0xffff000f])))
                                           ^^^^^^^^^^
            (clobber (reg:CC 33 %cc))
        ]) andc-immediate.c:22 1474 {*anddi3}
     (expr_list:REG_UNUSED (reg:CC 33 %cc)
        (expr_list:REG_DEAD (reg:SI 64)
            (nil))))


Instead you want insn 12 to use a zero-extend to extend (reg:SI 64) into (reg:DI 2)?

Can't you achieve this in this clause:

     /* If the constant is one less than a power of two, this might be
         representable by an extraction even if no shift is present.
         If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
         we are in a COMPARE.  */

You extract the constant via UINTVAL (XEXP (x, 1)), then munge it based on nonzero_bits and pass the result to exact_log2?

Though I do like how you've conditionalized on the cost of the and vs the cost of hte zero-extend. So maybe your approach is ultimately better. Still curious your thoughts on doing it by just munging the constant you pass off to exact_log2 in that earlier clause.


+      /* If the one operand is a paradoxical subreg of a register or memory and
+	 the constant (limited to the smaller mode) has only zero bits where
+	 the sub expression has known zero bits, this can be expressed as
+	 a zero_extend.  */
+      else if (GET_CODE (XEXP (x, 0)) == SUBREG)
+	{
+	  rtx sub;
+
+	  sub = XEXP (XEXP (x, 0), 0);
+	  machine_mode sub_mode = GET_MODE (sub);
+	  if ((REG_P (sub) || MEM_P (sub))
+	      && GET_MODE_PRECISION (sub_mode) < mode_width
+	      && (UINTVAL (XEXP (x, 1))
+		  | (~nonzero_bits (sub, sub_mode) & GET_MODE_MASK (sub_mode))
+		  ) == GET_MODE_MASK (sub_mode))
I'd either factor something out or write a nested conditional to avoid the horrid line wrapping here and hopefully make the conditions easier to read.

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]