This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Add "Implicit sets" to GCSE's cprop.


The following patch is the third and final installment of my
"jump bypassing" trilogy.

This patch is an improvement to GCSE to consider "implicit
sets" or "pseudo assignments".  These are constraints on
register values that are introduced by conditional jumps on
their equality with a constant.  For example, in the C
statement "if (x == 2) y = x+1;", we know that in the
assignment x must have the value 2, and therefore y must
always be assigned the value 3.

It turns out that GCC already takes advantage of these implicit
sets in its common subexpression elimination (CSE) pass, but the
patch below modifies GCC's implementation of Morel and Renvoise's
PRE to allow them to be handled in gcse.c.

To demonstrate the difference consider the following:

void foo(int x, int y)
{
  if (x == 2)
    {
      z1 = x*384;
      if (y == 1)
        a ();
      else
        b ();
      z2 = x*913;
    }
}

On mainline CVS the assignment to z1 is known to be a constant
thank's to CSE, however the fact that z2 is also a constant is
lost as CSE only considers extended basic blocks.  GCSE, however,
doesn't suffer from this limitation, and the patch below allows
GCC to optimize the assignment to z2.


Whilst this is cool, the greatest benefit of "implicit sets in
GCSE" comes from its interaction with the jump bypassing pass.
Consider for example the following example,

void foo(int x)
{
  if (x == 0) bar();
  if (x == 1) baz();
  if (x == 2) baf();
}

On i686-pc-linux-gnu, current mainline generates the following
code with -O2 -fomit-frame-pointer:

foo:	push	%ebx
	subl	$8, %esp
	movl	16(%esp), %ebx
	testl	%ebx, %ebx
	je	.L5
.L2:
	cmpl	$1, %ebx
	je	.L6
.L3:
	cmpl	$2, %ebx
	je	.L7
	addl	$8, %esp
	popl	%ebx
	ret
.L7:
	addl	$8, %esp
	popl	%ebx
	jmp	baf
.L6:
	call	baz
	jmp	.L3
.L5:
	call	bar
	jmp	.L2


but with the addition of the patch below, we now generate:

foo:	subl	$12, %esp
	movl	16(%esp), %eax
	testl	%eax, %eax
	je	.L5
	cmpl	$1, %ebx
	je	.L6
	cmpl	$2, %ebx
	je	.L7
.L1:
	addl	$12, %esp
	ret
.L7:
	addl	$12, %esp
	jmp	baf
.L6:
	call	baz
	jmp	.L1
.L5:
	call	bar
	jmp	.L1


The significant difference is that code following .L6 and .L5
now jumps to .L1 the end of the function.  Effectively, we've
turned the sequence of if-then statements, into a chain of
if-then-elses, and only test the value of x once.


void foo(int x)
{
  if (x == 0) bar();
  else if (x == 1) baz();
  else if (x == 2) baf();
}


This occurs the information provided by the "implicit set" of
the then block, can be used by jump bypassing to forward the
edges of the control flow graph.


Now to explain how the magic works.  As Jeff Law pointed out in
response to my original post describing this strategy last May,
http://gcc.gnu.org/ml/gcc/2002-05/msg01658.html, changing the
PRE infrastructure to operate on edges rather than blocks would
be a large and invasive task.  Instead, we only consider the
"implicit sets" on edges that lead to basic blocks with only a
single predecessor.  Then instead of associating the "implicit
set" with the edge, we record and process it as taking place at
the beginning of the destination basic block.

This makes for a very small patch, as most of the GCSE infrastructure
is unaffected, but we need keep track of a single additional SET rtx
per basic block in the CFG.  Unfortunately, it means we do miss some
cases, such as the one below:

	done = false;
	do {
	  x = done;
	  done = bar();
	} while (!done);

In this example the assignment to x is constant, as done must always be
false at that point, but unfortunately the implicit set on the backedge
of the loop leads to a basic block with more than one predecessor.


Unfortunately, there's a 23 second increase in bootstrap time (about
0.7%), but this is more than reasonable considering the improvement
in code quality.  [Indeed the reason the overhead is so low, is
that GCC benefits a lot from these optimizations].  The plan to
recover these 23 seconds (and more) is to realize that following this
patch there's now very little benefit in CSE operating on extended
basic blocks.  i.e. in my first example above, the assignment to z1
can be optimized by GCSE rather than CSE.


The following patch has been retested on i686-pc-linux-gnu with
a full bootstrap, all languages except Ada and treelang, and by
a complete "make -k check" regression test, with no new regressions.


Ok for mainline?


2003-01-25  Roger Sayle  <roger@eyesopen.com>

	* gcse.c (implicit_sets): New global array of implicit set patterns
	indexed by basic block index.
	(find_implicit_sets): New routine to determine the implicit sets of
	a function, i.e. "if (x == 2)".
	(compute_hash_table_work): Include implicit sets in hash table of
	available sets for GCSE's constant/copy propagation passes.
	(one_cprop_pass): Allocate implicit_sets and call find_implicit_sets
	before building set_hash_table for cprop.  Free implicit_sets after
	the hash table is built.


Index: gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/gcse.c,v
retrieving revision 1.231
diff -c -3 -p -r1.231 gcse.c
*** gcse.c	25 Jan 2003 21:49:52 -0000	1.231
--- gcse.c	26 Jan 2003 01:10:22 -0000
*************** struct null_pointer_info
*** 551,556 ****
--- 551,559 ----
    sbitmap *nonnull_local;
    sbitmap *nonnull_killed;
  };
+
+ /* Array of implicit set patterns indexed by basic block index.  */
+ static rtx *implicit_sets;

  static void compute_can_copy	PARAMS ((void));
  static char *gmalloc		PARAMS ((unsigned int));
*************** static int load_killed_in_block_p    PAR
*** 614,619 ****
--- 617,623 ----
  static void canon_list_insert        PARAMS ((rtx, rtx, void *));
  static int cprop_insn		PARAMS ((rtx, int));
  static int cprop		PARAMS ((int));
+ static void find_implicit_sets	PARAMS ((void));
  static int one_cprop_pass	PARAMS ((int, int, int));
  static bool constprop_register	PARAMS ((rtx, rtx, rtx, int));
  static struct expr *find_bypass_set PARAMS ((int, int));
*************** record_last_set_info (dest, setter, data
*** 2470,2476 ****

     Currently src must be a pseudo-reg or a const_int.

-    F is the first insn.
     TABLE is the table computed.  */

  static void
--- 2474,2479 ----
*************** compute_hash_table_work (table)
*** 2532,2537 ****
--- 2535,2546 ----
  	  note_stores (PATTERN (insn), record_last_set_info, insn);
  	}

+       /* Insert implicit sets in the hash table.  */
+       if (table->set_p
+ 	  && implicit_sets[current_bb->index] != NULL_RTX)
+ 	hash_scan_set (implicit_sets[current_bb->index],
+ 		       current_bb->head, table);
+
        /* The next pass builds the hash table.  */

        for (insn = current_bb->head, in_libcall_block = 0;
*************** cprop (alter_jumps)
*** 4477,4482 ****
--- 4486,4581 ----
    return changed;
  }

+ /* Find the implicit sets of a function.  An "implicit set" is a constraint
+    on the value of a variable, implied by a conditional jump.  For example,
+    following "if (x == 2)", the then branch may be optimized as though the
+    conditional performed an "explicit set", in this example, "x = 2".  This
+    function records the set patterns that are implicit at the start of each
+    basic block.  */
+
+ static void
+ find_implicit_sets ()
+ {
+   basic_block bb, dest_bb;
+   unsigned int count;
+   rtx insn, setcc;
+   rtx new, cond;
+   rtx dest_rtx;
+
+   count = 0;
+   FOR_EACH_BB (bb)
+     /* Check for more than one sucessor.  */
+     if (bb->succ && bb->succ->succ_next)
+       {
+ 	setcc = NULL_RTX;
+ 	for (insn = bb->head;
+ 	     insn != NULL && insn != NEXT_INSN (bb->end);
+ 	     insn = NEXT_INSN (insn))
+ 	  if (GET_CODE (insn) == INSN
+ 	      && GET_CODE (PATTERN (insn)) == SET)
+ 	    {
+ 	      setcc = PATTERN (insn);
+ 	      if (! REG_P (SET_DEST (setcc))
+ 		  && ! CC0_P (SET_DEST (setcc)))
+ 		setcc = NULL_RTX;
+ 	    }
+ 	  else if (GET_CODE (insn) == JUMP_INSN
+ 		   && any_condjump_p (insn))
+ 	    {
+ 	      new = SET_SRC (pc_set (insn));
+ 	      if (setcc != NULL_RTX)
+ 		new = simplify_replace_rtx (new,
+ 					    SET_DEST (setcc),
+ 					    SET_SRC (setcc));
+ 	      if (GET_CODE (new) == IF_THEN_ELSE)
+ 		{
+ 		  cond = XEXP (new, 0);
+ 		  if (GET_CODE (cond) == EQ)
+ 		    dest_rtx = XEXP (new, 1);
+ 		  else if (GET_CODE (cond) == NE)
+ 		    dest_rtx = XEXP (new, 2);
+ 		  else
+ 		    dest_rtx = NULL_RTX;
+
+ 		  if (dest_rtx == NULL_RTX)
+ 		    dest_bb = NULL;
+ 		  else if (dest_rtx == pc_rtx)
+ 		    dest_bb = FALLTHRU_EDGE (bb)->dest;
+ 		  else if (GET_CODE (dest_rtx) == LABEL_REF)
+ 		    dest_bb = BRANCH_EDGE (bb)->dest;
+ 		  else
+ 		    dest_bb = NULL;
+
+ 		  if (dest_bb && ! dest_bb->pred->pred_next
+ 		      && dest_bb != EXIT_BLOCK_PTR
+ 		      && GET_CODE (XEXP (cond, 0)) == REG
+ 		      && REGNO (XEXP (cond, 0)) >= FIRST_PSEUDO_REGISTER
+ 		      && CONSTANT_P (XEXP (cond, 1)))
+ 		    {
+ 		      new = gen_rtx_SET (VOIDmode,
+ 					 XEXP (cond, 0),
+ 					 XEXP (cond, 1));
+ 		      implicit_sets[dest_bb->index] = new;
+ 		      if (gcse_file)
+ 			{
+ 			  fprintf(gcse_file, "Implicit set of reg %d",
+ 				  REGNO (XEXP (cond, 0)));
+ 			  fprintf(gcse_file, " in basic block %d\n",
+ 				  dest_bb->index);
+ 			}
+ 		      count++;
+ 		    }
+ 		}
+ 	      break;
+ 	    }
+ 	  else if (INSN_P (insn))
+ 	    setcc = NULL_RTX;
+       }
+
+   if (gcse_file)
+     fprintf (gcse_file, "Found %d implicit sets\n", count);
+ }
+
  /* Perform one copy/constant propagation pass.
     PASS is the pass count.  If CPROP_JUMPS is true, perform constant
     propagation into conditional jumps.  If BYPASS_JUMPS is true,
*************** one_cprop_pass (pass, cprop_jumps, bypas
*** 4495,4502 ****
--- 4594,4610 ----

    local_cprop_pass (cprop_jumps);

+   /* Determine implicit sets.  */
+   implicit_sets = (rtx *) xcalloc (last_basic_block, sizeof (rtx));
+   find_implicit_sets ();
+
    alloc_hash_table (max_cuid, &set_hash_table, 1);
    compute_hash_table (&set_hash_table);
+
+   /* Free implicit_sets before peak usage.  */
+   free (implicit_sets);
+   implicit_sets = NULL;
+
    if (gcse_file)
      dump_hash_table (gcse_file, "SET", &set_hash_table);
    if (set_hash_table.n_elems > 0)

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]