This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge


On 09/19/14 10:02, Jiong Wang wrote:

On 19/09/14 16:49, Jeff Law wrote:
On 09/19/14 05:27, Jiong Wang wrote:
On 19/09/14 06:51, Jeff Law wrote:
On 09/16/14 00:55, Zhenqiang Chen wrote:
-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
owner@gcc.gnu.org] On Behalf Of Jiong Wang
Sent: Monday, September 15, 2014 11:28 PM
To: Zhenqiang Chen
Cc: gcc-patches@gcc.gnu.org; Jeff Law
Subject: Re: [PATCH, 2/2] shrink wrap a function with a single loop:
split
live_edge

On 08/05/14 09:07, Zhenqiang Chen wrote:
     static bool
     move_insn_for_shrink_wrap (basic_block bb, rtx insn,
                               const HARD_REG_SET uses,
                               const HARD_REG_SET defs,
-                          HARD_REG_SET *last_uses)
+                          HARD_REG_SET *last_uses,
+                          bool *split_p)
{
       rtx set, src, dest;
       bitmap live_out, live_in, bb_uses, bb_defs;
       unsigned int i, dregno, end_dregno, sregno, end_sregno;
       basic_block next_block;
+  edge live_edge;

       /* Look for a simple register copy.  */
       set = single_set (insn);
@@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb,
rtx insn,
           || overlaps_hard_reg_set_p (defs, GET_MODE (dest),
dregno))
         return false;

-  /* See whether there is a successor block to which we could move
INSN.  */
-  next_block = next_block_for_reg (bb, dregno, end_dregno);
-  if (!next_block)
+  live_edge = next_block_for_reg (bb, dregno, end_dregno);  if
+ (!live_edge)
          return false;

+  next_block = live_edge->dest;
+
       /* If the destination register is referred in later insn,
          try to forward it.  */
       if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest),
dregno)
           && !try_copy_prop (bb, insn, src, dest, last_uses))
         return false;

+  /* Create a new basic block on the edge.  */  if (EDGE_COUNT
+ (next_block->preds) == 2)
+    {
+      next_block = split_edge (live_edge);
+
+      bitmap_copy (df_get_live_in (next_block), df_get_live_out
+ (bb));
(re-sent, looks like the first send not received by the server...)

     for the function "_IO_wdefault_xsputn" in glibc wgenops.c (.dot
file
attached)

     174: NOTE_INSN_BASIC_BLOCK 21 are the new created basic block
because
of the sink of move instruction "ax:DI = 0" and split edge.

     but the live_in of this BB is copied from live_out of BB 2 which
is too big, and
actually prevent the later sink of "16: r12:DI=dx:DI".

     Should it be better to copy live_in from "next_block->next_bb"
instead of
live_out from "bb", as it will model what's needed more accurately?
According to the algorithm, next_block->next_bb (which is
live_edge->dest) should have two predecessors. Live_in of
next_block->next_bb would include live_out of the other edge,  which
is not necessary accurately. To be more accurate, you may need an
intersection of df_get_live_out (bb) and df_get_live_in
(next_block->next_bb).

Presumably we can't recompute the liveness info here as it's too
expensive to do that each time we split an edge to sink a statement?
ie, we need the more accurate liveness within this pass so that we can
sink further instructions?
for a single case, x86 bootstrap, looks like these code are not compile
time critical
as for all 4762 live edges which could sink instructions

4139: EDGE_COUNT (next_block->preds) == 1
270: EDGE_COUNT (next_block->preds) == 2
353: EDGE_COUNT (next_block->preds) > 2

and if we make the live info more accurate here, just very few
additional functions (< 10)
shrink-wrapped from glibc build & gcc bootstrap test.
Perhaps I wasn't clear.

The whole point behind the proposed changes is to enable additional
sinking that is currently blocked because of the overly large set of
live objects.

So my question was an attempt to understand why we didn't just a normal
liveness update -- I speculated that we aren't doing a normal liveness
update because of the compile-time cost.



So, is it OK we simply change "bitmap_copy  (df_get_live_in
(next_block),  df_get_live_out  (bb));"
into "bitmap_and  (df_get_live_in  (next_block),  df_get_live_out
(bb),  df_get_live_in  (next_block->next_bb));" ?
Probably.  Though I'd be a bit concerned with next_block->next_bb.
Wouldn't it be safer to stash away the relevant basic block prior to the
call to split_edge, then use that saved copy.  Something like this
(untested):

basic_block old_dest = live_edge->dest;
next_block = split_edge (live_edge);

/* We create a new basic block.  Call df_grow_bb_info to make sure
     all data structures are allocated.  */
df_grow_bb_info (df_live);
bitmap_and (df_get_live_in (next_block),
              df_get_live_out (bb),
              df_get_live_in (old_dest));


The idea being we don't want to depend on the precise ordering blocks in
the block chain.

Could you try that and see if it does what you need?
Jeff,

   Thanks, verified, it works.
Great.  Can you send an updated patchkit for review.

Thanks,

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]