This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFA:] Fix PR rtl-optimization/17933: parallel set destinationsunhandled by dead_or_set_regno_p.

PR rtl-optimization/17933 is an abort on hppa64-hp-hpux11.11 at -O2 for
a detected inconsistency in the number of REG_DEAD notes before and after
the first scheduling pass (which doesn't rearrange anything in this case).
The problem is not otherwise related to scheduling.  The number of
REG_DEAD notes change on one particular insn.  Before scheduling:

(call_insn 18 17 19 0 (parallel [
            (set (parallel:BLK [
                        (expr_list:REG_DEP_TRUE (reg:DI 28 %r28)
                            (const_int 0 [0x0]))
                (call (mem:SI (reg:DI 28 %r28) [0 S4 A32])
                    (const_int 64 [0x40])))
            (clobber (reg:DI 2 %r2))
            (clobber (reg:DI 4 %r4))
            (use (reg:DI 27 %r27))
            (use (reg/f:DI 29 %r29))
            (use (const_int 1 [0x1]))
        ]) 263 {call_val_reg_64bit} (insn_list:REG_DEP_TRUE 17 (insn_list:REG_DEP_TRUE 16 (insn_list:REG_DEP_TRUE 13 (nil))))
    (expr_list:REG_UNUSED (reg:DI 4 %r4)
        (expr_list:REG_UNUSED (reg:DI 2 %r2)
            (expr_list:REG_UNUSED (reg:DI 28 %r28)
                (expr_list:REG_DEAD (reg:DI 28 %r28)
                    (expr_list:REG_UNUSED (reg:DI 28 %r28)
                        (expr_list:REG_UNUSED (reg:DI 2 %r2)
                            (expr_list:REG_UNUSED (reg:DI 4 %r4)
                                (expr_list:REG_DEAD (reg:DI 27 %r27)
                                    (expr_list:REG_DEAD (reg/f:DI 29 %r29)
                                        (expr_list:REG_DEAD (reg:SI 26 %r26)
    (expr_list:REG_DEP_TRUE (use (reg:SI 26 %r26))

(Yes, a parallel like that for a return value is valid.)  After scheduling:

(call_insn 18 17 28 0 (parallel [
            (set (parallel:BLK [
                        (expr_list:REG_DEP_TRUE (reg:DI 28 %r28)
                            (const_int 0 [0x0]))
                (call (mem:SI (reg:DI 28 %r28) [0 S4 A32])
                    (const_int 64 [0x40])))
            (clobber (reg:DI 2 %r2))
            (clobber (reg:DI 4 %r4))
            (use (reg:DI 27 %r27))
            (use (reg/f:DI 29 %r29))
            (use (const_int 1 [0x1]))
        ]) 263 {call_val_reg_64bit} (insn_list:REG_DEP_ANTI 12 (insn_list:REG_DEP_TRUE 17 (insn_list:REG_DEP_TRUE 16 (insn_list:R\
EG_DEP_TRUE 13 (nil)))))
    (expr_list:REG_DEAD (reg:SI 26 %r26)
        (expr_list:REG_DEAD (reg/f:DI 29 %r29)
            (expr_list:REG_DEAD (reg:DI 27 %r27)
                (expr_list:REG_UNUSED (reg:DI 4 %r4)
                    (expr_list:REG_UNUSED (reg:DI 2 %r2)
                        (expr_list:REG_UNUSED (reg:DI 28 %r28)
    (expr_list:REG_DEP_TRUE (use (reg:SI 26 %r26))

The REG_DEAD note for register 28 is gone.  The REG_DEAD notes after the
scheduling pass are computed from scratch by update_life_info.  Before
that, the REG_DEAD notes were last modified by combine.  That's where
things went wrong.  You guessed it -- it's REG_DEAD-updating code doesn't
handle the SET destination being a parallel.  This is dead_or_set_regno_p.

In order to make the PARALLEL check happen at the right spot for
dead_or_set_regno_p, I noticed I had to introduce an extra copy of the
regno/endregno test, and then I noticed that I *should* also (though not
trigged by the test-case) cover the case where the (set (parallel ...)
...) is by itself and not in a parallel, so I'd have to introduce another
copy of the code I wrote (or rather, copied and adjust from mark_set_1).
I just had to break out the code copies to a separate helper function.

Beware that the helper function does recursive calls, so it might have a
negative effect on compilation times due to loss of inlineability of the
broken-out functions, not counting the added code.  This being a
correctness patch, I just wanted to mention it, not make it an argument
for or against the patch.  (It could be tweaked by breaking out code to a
separate function, but that would be less readable and premature at this

Bootstrapped and tested (no Ada) on i686-pc-linux-gnu (FC2) which alas
AFAIK doesn't use this kind of return-value-parallel construct (and ditto
mmix-knuth-mmixware FWIW).  The PR is for hppa64-hp-hpux11.11, which I do
not have access to.  I need access to one, or have someone else to
bootstrap and make check there.  (Unfortunately I see no such system among
Hewlett-Packard's testdrive systems; their installed gcc-3.4.2 on
hppa2.0w-hp-hpux11.11 does not expose the bug.)  The test-case succeeds in
a cross cc1 and seems to give the expected results, of course.  I've also
started a test on ia64-unknown-linux-gnu (RH E AS 3.0) (no ada or fortran;
hacked around the libunwind issue) which does have this parallel construct
(according to ia64_function_value), but the baseline (unpatched) results
are so bad that I believe the result will be of marginal interest, unless
of course a regression shows up.

It's true that the original test is in the ObjC testsuite, so there might
be arguments against adding this derived C test.  Some counterarguments:
that test in the ObjC testsuite isn't expected to expose this specific
bug.  Also, it's better to have a dedicated test written in C, one that
other (more) people than ObjC aficionados can digest.

Ok to commit?

Footnote1: Exactly why both distribute_notes (from combine) and
mark_used_reg (from update_life_info) deliberately avoid having REG_DEAD
notes on insns where the reg is also set (i.e. SET_DEST is not used and
the register is also mentioned in SET_SRC) I do not understand, but this
behavior of both of them is deliberate and noted in comments.  For
mark_used_reg: "If it was set in this insn, we do not make a REG_DEAD
note; likewise if we already made such a note." (it = the register) and in
combine's distribute_notes: "If the register is set or already dead at
PLACE, we needn't do anything with this note if it is still a REG_DEAD
note."  So they both avoid putting REG_DEAD notes on an insn that sets the
register in the note, despite this being valid (at least according to
rtl.texi; cf. REG_DEAD, REG_UNUSED).

A better but more intrusive solution for the future might be to make
combine rebuild the REG_DEAD notes by using the same machinery as the
scheduling pass, through update_life_info et al.  At present it uses the
notes itself, so they need to be up-to-date after every successful insn

I see semi-random SEGV:s during bootstrap, happened for both
baseline and patched compilers.  Actually, once it killed emacs.
Maybe bad machine/memory here, maybe some oom-killer.  Ugh.

	PR rtl-optimization/17933
	* rtlanal.c (dead_or_set_regno_p): Break out common code to...
	(covers_regno_p): New function.  Handle SETs of PARALLEL.

testsuite (with pinskia as author):
	PR rtl-optimization/17933
	* gcc.dg/torture/pr17933-1.c: New test.

Index: rtlanal.c
RCS file: /mnt/auto/localgcccvs/gcc/gcc/rtlanal.c,v
retrieving revision 1.203
diff -u -p -r1.203 rtlanal.c
--- rtlanal.c	10 Oct 2004 22:06:04 -0000	1.203
+++ rtlanal.c	31 Oct 2004 13:44:28 -0000
@@ -41,6 +41,7 @@ Software Foundation, 59 Temple Place - S
 /* Forward declarations */
 static int global_reg_mentioned_p_1 (rtx *, void *);
 static void set_of_1 (rtx, rtx, void *);
+static bool covers_regno_p (rtx, unsigned int);
 static int rtx_referenced_p_1 (rtx *, void *);
 static int computed_jump_p_1 (rtx);
 static void parms_set (rtx, rtx, void *);
@@ -1549,13 +1550,53 @@ dead_or_set_p (rtx insn, rtx x)
   return 1;

+/* Return TRUE iff DEST is a register or subreg of a register and
+   doesn't change the number of words of the inner register, or a
+   parallel containing an instance of a register, and any part of the
+   register is TEST_REGNO.  */
+static bool
+covers_regno_p (rtx dest, unsigned int test_regno)
+  unsigned int regno, endregno;
+  if (GET_CODE (dest) == SUBREG
+      && (((GET_MODE_SIZE (GET_MODE (dest))
+	       + UNITS_PER_WORD - 1) / UNITS_PER_WORD)))
+    dest = SUBREG_REG (dest);
+  else if (GET_CODE (dest) == PARALLEL)
+    {
+      /* Some targets place small structures in registers for return
+	 values of functions, and those registers are wrapped in
+	 PARALLELs that we may see as the destination of a SET.  */
+      int i;
+      for (i = XVECLEN (dest, 0) - 1; i >= 0; i--)
+	{
+	  rtx inner = XEXP (XVECEXP (dest, 0, i), 0);
+	  if (inner != NULL_RTX && REG_P (inner)
+	      && covers_regno_p (inner, test_regno))
+	    return true;
+	}
+    }
+  if (!REG_P (dest))
+    return false;
+  regno = REGNO (dest);
+  endregno = (regno >= FIRST_PSEUDO_REGISTER ? regno + 1
+	      : regno + hard_regno_nregs[regno][GET_MODE (dest)]);
+  return (test_regno >= regno && test_regno < endregno);
 /* Utility function for dead_or_set_p to check an individual register.  Also
    called from flow.c.  */

 dead_or_set_regno_p (rtx insn, unsigned int test_regno)
-  unsigned int regno, endregno;
   rtx pattern;

   /* See if there is a death note for something that includes TEST_REGNO.  */
@@ -1572,28 +1613,7 @@ dead_or_set_regno_p (rtx insn, unsigned
     pattern = COND_EXEC_CODE (pattern);

   if (GET_CODE (pattern) == SET)
-    {
-      rtx dest = SET_DEST (pattern);
-      /* A value is totally replaced if it is the destination or the
-	 destination is a SUBREG of REGNO that does not change the number of
-	 words in it.  */
-      if (GET_CODE (dest) == SUBREG
-	  && (((GET_MODE_SIZE (GET_MODE (dest))
-	      == ((GET_MODE_SIZE (GET_MODE (SUBREG_REG (dest)))
-	dest = SUBREG_REG (dest);
-      if (!REG_P (dest))
-	return 0;
-      regno = REGNO (dest);
-      endregno = (regno >= FIRST_PSEUDO_REGISTER ? regno + 1
-		  : regno + hard_regno_nregs[regno][GET_MODE (dest)]);
-      return (test_regno >= regno && test_regno < endregno);
-    }
+    return covers_regno_p (SET_DEST (pattern), test_regno);
   else if (GET_CODE (pattern) == PARALLEL)
       int i;
@@ -1605,27 +1625,9 @@ dead_or_set_regno_p (rtx insn, unsigned
 	  if (GET_CODE (body) == COND_EXEC)
 	    body = COND_EXEC_CODE (body);

-	  if (GET_CODE (body) == SET || GET_CODE (body) == CLOBBER)
-	    {
-	      rtx dest = SET_DEST (body);
-	      if (GET_CODE (dest) == SUBREG
-		  && (((GET_MODE_SIZE (GET_MODE (dest))
-		      == ((GET_MODE_SIZE (GET_MODE (SUBREG_REG (dest)))
-			   + UNITS_PER_WORD - 1) / UNITS_PER_WORD)))
-		dest = SUBREG_REG (dest);
-	      if (!REG_P (dest))
-		continue;
-	      regno = REGNO (dest);
-	      endregno = (regno >= FIRST_PSEUDO_REGISTER ? regno + 1
-			  : regno + hard_regno_nregs[regno][GET_MODE (dest)]);
-	      if (test_regno >= regno && test_regno < endregno)
-		return 1;
-	    }
+	  if ((GET_CODE (body) == SET || GET_CODE (body) == CLOBBER)
+	      && covers_regno_p (SET_DEST (body), test_regno))
+	    return 1;

--- /dev/null	2004-02-23 22:02:56.000000000 +0100
+++ gcc.dg/torture/pr17933-1.c	2004-10-31 14:53:53.000000000 +0100
@@ -0,0 +1,14 @@
+/* PR rtl-optimization/17933.
+   Test-case from the ObjC test-suite, execute/class_self-2.m,
+   translated to C from and reduced by Andrew Pinski.  */
+struct d
+{ int a; };
+void abort(void);
+typedef struct d (*f) (int i);
+f ff(void);
+void test1()
+  f t = ff();
+  t(0);

brgds, H-P

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]