This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] PR optimization/11741: Fix PRE multiple sets (take 2)


On Fri, 26 Sep 2003, Richard Henderson wrote:
> On Sat, Sep 20, 2003 at 11:28:56AM -0600, Roger Sayle wrote:
> > http://gcc.gnu.org/ml/gcc-patches/2003-09/msg00071.html
>
> I've had another hard look at this test case.
>
> I think that this *particular* change is safe.

Very many thanks.


> There is one problem with the patch for mainline: We don't verify
> that we can replace registers in the middle of the parallel (such
> patterns often make heavy use of match_dup).  This point post-dates
> Roger's posting by one day, due to
>
> 	2003-09-03  Mostafa Hagog  <mustafa@il.ibm.com>
>         * gcse.c (replace_one_set): New function.
>         (pre_insert_copy_insn): Change the order of copying
>         to make copy propagation discover additional PRE opportunities.
>
> Nevertheless this must now be addressed on mainline.  I believe that
> you could validate_change the substitution into the parallel and
> if that fails, fall back to the previous method of generating a copy
> to the reaching reg.

I've implemented this suggestion, and updated the patch to account for
Mostafa's changes to pre_insert_copy_insn since I developed the original
version of my patch.

Mostafa introduced a new abort checking that the newly created insn
is a single_set.  Its not clear that this check is needed or what
exactly he was concerned about.  However, I've kept a equivalent
test in the code below with a ??? comment.


> I see two outstanding problems with this deconstruction of parallels
> thing that Kenner introduced.
>
> First, if we were to try to make a set of a parallel redundant, we'd
> hit the abort in the middle of pre_delete.  I believe it would be safe,
> though inefficient, to skip processing this insn.  If we do that, we
> will have redundant computations (though no worse than before), and
> most likely two regs live (where before we had one).
>
>    [ As an aside, I wonder how much change to the solution there would
>      be to recompute redundancy and placement without the insn that we
>      discovered that we can't kill.  Given we've never hit this case,
>      it's clearly not worth implementing; just an idle thought.  ]
>
> Second, if we were to try to insert a new computation of the expression
> on an edge, we don't know for sure that it is possible to evaluate the
> expression outside of the parallel.  We will properly abort in
> process_insert_insn if this ever happens.  The fix would seem to be to
> verify the possibility *before* entering the expression into the hash
> table in the first place.
>
> I won't hold you responsible for either of these.

As a gesture of gratitude for taking the time to look over my patch
a second time, which has obviously included a significant audit
of the code paths in GCSE, I've also implemented the first of these
optional suggestions in my revised patch below.  Thanks again.


The following patch has been tested on i686-pc-linux-gnu with a complete
"make bootstrap", all languages except treelang, and regression tested
with a top-level "make -k check" with no new failures.  Once again this
patch fixes the new testcase extracted from PR 11741.  I believe that it
will also resolve PRs 6860 and 10467, but I'll need to build a cross
compiler for the ARM to confirm it.

Ok for mainline?


2003-09-26  Roger Sayle  <roger@eyesopen.com>
	    Richard Henderson  <rth@redhat.com>

	PR optimization/11741
	* gcse.c (pre_insert_copy_insn): Tweak the logic for finding the
	appropriate set to match that in hash_scan_insn.  Fall back to
	the original copy method, if we can't validate changing insn.
	(pre_delete): Only delete instructions that have a single_set,
	instead of aborting when we encounter an PARALLEL insn with more
	then one SET.

	* gcc.dg/20030926-1.c: New test case.


Index: gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/gcse.c,v
retrieving revision 1.274
diff -c -3 -p -r1.274 gcse.c
*** gcse.c	15 Sep 2003 23:07:29 -0000	1.274
--- gcse.c	26 Sep 2003 12:49:01 -0000
*************** pre_edge_insert (struct edge_list *edge_
*** 5353,5366 ****
    return did_insert;
  }

! /* Copy the result of INSN to REG.  INDX is the expression number.
     Given "old_reg <- expr" (INSN), instead of adding after it
       reaching_reg <- old_reg
     it's better to do the following:
       reaching_reg <- expr
       old_reg      <- reaching_reg
     because this way copy propagation can discover additional PRE
!    opportunuties.  */

  static void
  pre_insert_copy_insn (struct expr *expr, rtx insn)
--- 5353,5366 ----
    return did_insert;
  }

! /* Copy the result of EXPR->EXPR generated by INSN to EXPR->REACHING_REG.
     Given "old_reg <- expr" (INSN), instead of adding after it
       reaching_reg <- old_reg
     it's better to do the following:
       reaching_reg <- expr
       old_reg      <- reaching_reg
     because this way copy propagation can discover additional PRE
!    opportunuties.  But if this fails, we try the old way.  */

  static void
  pre_insert_copy_insn (struct expr *expr, rtx insn)
*************** pre_insert_copy_insn (struct expr *expr,
*** 5368,5394 ****
    rtx reg = expr->reaching_reg;
    int regno = REGNO (reg);
    int indx = expr->bitmap_index;
!   rtx set = single_set (insn);
!   rtx new_insn;
!   rtx new_set;
    rtx old_reg;

!   if (!set)
      abort ();

    old_reg = SET_DEST (set);
!   new_insn = emit_insn_after (gen_move_insn (old_reg,
!                                              reg),
!                               insn);
!   new_set = single_set (new_insn);
!
!   if (!new_set)
!     abort();
!   SET_DEST (set) = reg;
!
!   /* Keep register set table up to date.  */
!   replace_one_set (REGNO (old_reg), insn, new_insn);
!   record_one_set (regno, insn);

    gcse_create_count++;

--- 5368,5424 ----
    rtx reg = expr->reaching_reg;
    int regno = REGNO (reg);
    int indx = expr->bitmap_index;
!   rtx pat = PATTERN (insn);
!   rtx set, new_insn;
    rtx old_reg;
+   int i;

!   /* This block matches the logic in hash_scan_insn.  */
!   if (GET_CODE (pat) == SET)
!     set = pat;
!   else if (GET_CODE (pat) == PARALLEL)
!     {
!       /* Search through the parallel looking for the set whose
! 	 source was the expression that we're interested in.  */
!       set = NULL_RTX;
!       for (i = 0; i < XVECLEN (pat, 0); i++)
! 	{
! 	  rtx x = XVECEXP (pat, 0, i);
! 	  if (GET_CODE (x) == SET
! 	      && expr_equiv_p (SET_SRC (x), expr->expr))
! 	    {
! 	      set = x;
! 	      break;
! 	    }
! 	}
!     }
!   else
      abort ();

    old_reg = SET_DEST (set);
!
!   /* Check if we can modify the set destination in the original insn.  */
!   if (validate_change (insn, &SET_DEST (set), reg, 0))
!     {
!       new_insn = gen_move_insn (old_reg, reg);
!       new_insn = emit_insn_after (new_insn, insn);
!
!       /* Keep register set table up to date.  */
!       replace_one_set (REGNO (old_reg), insn, new_insn);
!       record_one_set (regno, insn);
!     }
!   else
!     {
!       new_insn = gen_move_insn (reg, old_reg);
!       new_insn = emit_insn_after (new_insn, insn);
!
!       /* Keep register set table up to date.  */
!       record_one_set (regno, new_insn);
!     }
!
!   /* ??? I'm not sure if/why this is still needed.  */
!   if (! single_set (new_insn))
!     abort ();

    gcse_create_count++;

*************** pre_delete (void)
*** 5505,5511 ****

    changed = 0;
    for (i = 0; i < expr_hash_table.size; i++)
!     for (expr = expr_hash_table.table[i]; expr != NULL; expr = expr->next_same_hash)
        {
  	int indx = expr->bitmap_index;

--- 5535,5543 ----

    changed = 0;
    for (i = 0; i < expr_hash_table.size; i++)
!     for (expr = expr_hash_table.table[i];
! 	 expr != NULL;
! 	 expr = expr->next_same_hash)
        {
  	int indx = expr->bitmap_index;

*************** pre_delete (void)
*** 5518,5529 ****
  	    rtx set;
  	    basic_block bb = BLOCK_FOR_INSN (insn);

! 	    if (TEST_BIT (pre_delete_map[bb->index], indx))
  	      {
- 		set = single_set (insn);
- 		if (! set)
- 		  abort ();
-
  		/* Create a pseudo-reg to store the result of reaching
  		   expressions into.  Get the mode for the new pseudo from
  		   the mode of the original destination pseudo.  */
--- 5550,5559 ----
  	    rtx set;
  	    basic_block bb = BLOCK_FOR_INSN (insn);

! 	    /* We only delete insns that have a single_set.  */
! 	    if (TEST_BIT (pre_delete_map[bb->index], indx)
! 		&& (set = single_set (insn)) != 0)
  	      {
  		/* Create a pseudo-reg to store the result of reaching
  		   expressions into.  Get the mode for the new pseudo from
  		   the mode of the original destination pseudo.  */


/* PR optimization/11741  */
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 -minline-all-stringops -march=pentium4" } */

void
foo (char *p)
{
  for (;;)
    {
      memcpy (p, p + 1, strlen (p));
      p++;
    }
}


Roger
--


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]