This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Enable bbro for -Os

From: Eric Botcazou <ebotcazou at adacore dot com>
To: "Zhenqiang Chen" <zhenqiang dot chen at arm dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Mon, 10 Sep 2012 19:00:59 +0200
Subject: Re: [PATCH] Enable bbro for -Os
References: <000001cd79e9$0be832c0$23b89840$@arm.com> <201209051538.55955.ebotcazou@adacore.com> <000001cd8c15$edab49c0$c901dd40$@arm.com>

> All other comments are accepted.
> 
> The updated patch is attached. Is it OK?

As you probably gathered, I had missed that Steven and Richard had already 
commented on your patch before posting my message.  Sorry about that...

I think that the patch is interesting because, even if it doesn't exactly 
implement what the comment in gate_handle_reorder_blocks was talking about, it 
fixes code layout regressions without increasing the code size (and even 
decreasing it).  So, assuming that Steven and Richard don't strongly oppose, I 
think the patch is OK modulo the following nits:

+   The above description is for the full algorithm, which is used when the
+   function is optimized for speed.  When the function is optimized for size,
+   in order to reduce long jump and connect more fall through edges, the

long jumps... bb-reorder.c uses "fallthru edges" consistently.

+   algorithm is modified as follows:
+   (1) Break long trace to short ones.  The trace is broken at a block, which
+   has multi-predecessors/successors during finding traces.

long traces... A trace is broken at a block that has multiple predecessors/ 
successors during trace discovery.

+   (2) Ignore the edge probability and frequency for fall through edges.

fallthru

+   (3) Keep its original order when there is no chance to fall through.  bbro

Keep the original order of blocks...  We rely on the results of cfg_cleanup

+   bases on the result of cfg_cleanup, which does lots of optimizations on 
cfg.
+   So the order is expected to be kept if no fall through.
+
+   To implement the change for code size optimization, block's index is
+   selected as the key and all traces are found in one round.


+	  /* If the best destination has multiple successors or predecessors,
+	     don't allow it to be added when optimizing for size.  This makes
+	     sure predecessors with smaller index handled before the best
+	     destination.  It breaks long trace and reduces long jumps.

missing "are" before "handled"


+	     After removing the best edge, the final result will be ABCD/ACBD.
+	     It does not add jump compared with the previous order. But it
+	     reduce the possibility of long jump.  */

Double space before "But".


+  if (optimize_function_for_size_p (cfun))
+    {
+      e_index = src_index_p ? e->src->index : e->dest->index;
+      b_index = src_index_p ? cur_best_edge->src->index
+			      : cur_best_edge->dest->index;
+      /* The smaller one is better to keep the original order.  */
+      return b_index > e_index;
+    } 

Trailing space after the last parenthesis.


+		  /* If dest has multiple predecessors, skip it.  We expect
+		     that one predecessor with smaller index connect with it
+		     later.  */

connects


+	      /* Only connect Trace n with Trace n + 1.  It is conservative
+		 to keep the order as close as possible to the original order.
+		 It also helps to reduce long jump.  */

long jumps


Thanks for working on this.

-- 
Eric Botcazou

Follow-Ups:
- RE: [PATCH] Enable bbro for -Os
  - From: Zhenqiang Chen

References:
- Re: [PATCH] Enable bbro for -Os
  - From: Eric Botcazou
- RE: [PATCH] Enable bbro for -Os
  - From: Zhenqiang Chen

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]