This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)


On 01/13/15 09:18, Jakub Jelinek wrote:
Hi!

My PR60663 fix unfortunately stopped CSE of all inline-asms, even when
they e.g. only have the clobbers added by default.

This patch attempts to restore the old behavior, with the exceptions:
1) as always, asm volatile is not CSEd
2) inline-asm with multiple outputs are not CSEd
3) on request from Richard (which Segher on IRC argues against), "memory"
    clobber also prevents CSE; this can be removed by removing the
    int j, lim = XVECLEN (x, 0); and loop below it
4) inline-asm with clobbers is never copied into an insn that wasn't
    inline-asm before, so if there are clobbers, we allow CSEing of
    e.g. two same inline-asms, but only by reusing results of one
    of those

Bootstrapped/regtested on x86_64-linux and i686-linux, tested also
with arm cross after reverting the PR60663 arm cost fix.

Ok for trunk this way, or with 3) removed?

2015-01-13  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/63637
	PR rtl-optimization/60663
	* cse.c (merge_equiv_classes): Set new_elt->cost to MAX_COST
	if elt->cost is MAX_COST for ASM_OPERANDS.
	(find_sets_in_insn): Fix up comment typo.
	(cse_insn): Don't set src_volatile for all non-volatile
	ASM_OPERANDS in PARALLELs, but just those with multiple outputs
	or with "memory" clobber.  Set elt->cost to MAX_COST
	for ASM_OPERANDS in PARALLEL.  Set src_elt->cost to MAX_COST
	if new_src is ASM_OPERANDS and elt->cost is MAX_COST.

	* gcc.dg/pr63637-1.c: New test.
	* gcc.dg/pr63637-2.c: New test.
	* gcc.dg/pr63637-3.c: New test.
	* gcc.dg/pr63637-4.c: New test.
	* gcc.dg/pr63637-5.c: New test.
	* gcc.dg/pr63637-6.c: New test.
	* gcc.target/i386/pr63637-1.c: New test.
	* gcc.target/i386/pr63637-2.c: New test.
	* gcc.target/i386/pr63637-3.c: New test.
	* gcc.target/i386/pr63637-4.c: New test.
	* gcc.target/i386/pr63637-5.c: New test.
	* gcc.target/i386/pr63637-6.c: New test.

--- gcc/cse.c.jj	2015-01-09 21:59:44.000000000 +0100
+++ gcc/cse.c	2015-01-13 13:26:23.391216064 +0100
@@ -1792,6 +1792,8 @@ merge_equiv_classes (struct table_elt *c
  	    }
  	  new_elt = insert (exp, class1, hash, mode);
  	  new_elt->in_memory = hash_arg_in_memory;
+	  if (GET_CODE (exp) == ASM_OPERANDS && elt->cost == MAX_COST)
+	    new_elt->cost = MAX_COST;
  	}
      }
  }
@@ -4258,7 +4260,7 @@ find_sets_in_insn (rtx_insn *insn, struc
      {
        int i, lim = XVECLEN (x, 0);

-      /* Go over the epressions of the PARALLEL in forward order, to
+      /* Go over the expressions of the PARALLEL in forward order, to
  	 put them in the same order in the SETS array.  */
        for (i = 0; i < lim; i++)
  	{
@@ -4634,12 +4636,27 @@ cse_insn (rtx_insn *insn)
  	  && REGNO (dest) >= FIRST_PSEUDO_REGISTER)
  	sets[i].src_volatile = 1;

-      /* Also do not record result of a non-volatile inline asm with
-	 more than one result or with clobbers, we do not want CSE to
-	 break the inline asm apart.  */
        else if (GET_CODE (src) == ASM_OPERANDS
  	       && GET_CODE (x) == PARALLEL)
-	sets[i].src_volatile = 1;
+	{
+	  /* Do not record result of a non-volatile inline asm with
+	     more than one result.  */
+	  if (n_sets > 1)
+	    sets[i].src_volatile = 1;
+
+	  int j, lim = XVECLEN (x, 0);
+	  for (j = 0; j < lim; j++)
+	    {
+	      rtx y = XVECEXP (x, 0, j);
+	      /* And do not record result of a non-volatile inline asm
+		 with "memory" clobber.  */
+	      if (GET_CODE (y) == CLOBBER && MEM_P (XEXP (y, 0)))
Can you please add a comment here which references the full form of the "memory" tag. (clobber (mem:BLK (scratch))).

If we ever have to look at this again (say perhaps to break out the read anything vs write anything into separate tags :-) it'll save considerable time and angst trying to track all this stuff down.

The tests you've got are a step forward, but there's obviously a lot more we could do. For example testing DSE around ASMs without and without a memory "clobber", testing CSE of unrelated memory references around an ASM without and without a memory clobber come to mind. You don't have to add them to get approval, but if you were to take the time to cobble them together it'd be hugely appreciated.


Given the discussion with Segher, let's give him a chance to chime in on tonight's messages before we make a final decision.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]