This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [SH] PR 53976 - Add RTL pass to eliminate clrt, sett insns


Steven,

Thanks for the feedback.  I've committed the original patch as-is, but
I'm happy to improve it with follow up patches.

On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
> On Wed, Nov 20, 2013 at 8:29 PM, Oleg Endo wrote:
> >
> > The attached patch adds another SH specific RTL pass which is supposed
> > to optimize clrt and sett instruction usage.  As a start, it currently
> > just eliminates redundant clrt and sett instructions in cases where the
> > T bit value is known.  However, I'm planning on extending it a little in
> > order to e.g. hoist clrt/sett insns out of loops etc.
> 
> +#include <vector>

Is there something wrong with <vector> ?


> > +#define log_msg(...)\
> > +  do { if (dump_file != NULL) fprintf (dump_file, __VA_ARGS__); } while (0)
> 
> Is that valid C++98, a varags macro?

No it's not.  It's C99 / C++11.  However, most compilers support
__VA_ARGS__.  If it causes too much trouble I'll remove it of course
(sh_treg_combine.cc would also be affected).
Having to write "if (dump_file ...)" stuff over and over again is
annoying and impacts the readability of code in my opinion.  So if the
__VA_ARGS__ usage has to go, there should be some more or less
alternative.  I think other passes would also benefit from that.
BTW something std::ostream like would be nice for logging, like...

  log_msg ("updated ccreg mode: ");
  log_rtx (m_ccreg);
  log_msg ("\n\n");

... could be something like

log (<< "updated ccreg mode: " << m_ccreg << "\n\n");

where log is:
#define log(expr)\
  do { if (dump_file != NULL) <logging ostream ref> expr; } while (0)

Since there are various ways of printing an rtx (print_rtl,
print_rtl_single), this could be done with a wrapper object around the
logged rtx:

log (<< "got insn: \n" << rtx::log_single (my_insn) << "\n");

or iomanip style (requires custom ostream):

log (<< "got insn: \n" << rtx::log_single << my_insn << "\n");

... but I'm not sure how others think about that.  If this is of
interest I could do some work in that direction.

> > +    // Notice that the CFG might be invalid at late RTL stages and
> > +    // BLOCK_FOR_INSN might return null.  Thus the basic block are recorded
> > +    // here while traversing them.
> > +    basic_block bb;
> 
> You insert your pass just before sched2. The CFG is perfectly fine at
> that stage. So you shouldn't need this. (And if you would need to
> record bb, then this solution wouldn't be GC-safe).

Why is that?  AFAIK GC is done after the pass has finished?  The pass
doesn't keep any state beyond the current pass execution.

> 
> BLOCK_FOR_INSN will only be NULL for things that are not inside a
> basic block (some notes, deleted labels, etc.).
> 
> That all said and done: AFAICT you don't actually use BLOCK_FOR_INSN
> anywhere :-)

Sorry for the confusion.  Initially I had it inserted right before
machine dependent reorg, at which point I was getting nullptr from
BLOCK_FOR_INSN all over the place.  So not using it was the fix.
However, only after I've noticed that SH's machine dependent reorg
leaves the basic block structure in a 'broken' state (PR 59189) and thus
I moved the pass before sched2.
I'll try using BLOCK_FOR_INSN again.

> 
> > +    for (edge_iterator ei = ei_start (bb->preds); !ei_end_p (ei); ei_next (&ei))
> 
> FOR_EACH_EDGE
> 
> Declaring the edge_iterator inside the for() is not a good argument
> against FOR_EACH_EDGE. 

But that was the only reason! ;)
(I've done the same in sh_treg_combine.cc)
Can we ...

> Of course, brownie points are up for grabs for
> the brave soul daring enough to make edge iterators be proper C++
> iterators... ;-)

... fix this first and use the SH RTL passes as examples.  Then migrate
other existing code?  Or do you insist on FOR_EACH_EDGE usage for the
time being?

> > +    if (pred_bb->index == ENTRY_BLOCK)
> 
> I used to dislike this idiom of checking bb->index against fixed block
> numbers. But now that ENTRY_BLOCK_PTR and EXIT_BLOCK_PTR actually
> require two pointer dereferences (cfun->cfg->...) it's the better
> option.
> 
> /me adds to TODO list...

Actually I don't need any of that.  I will re-test the following
(although it's quite obvious):

Index: gcc/config/sh/sh_optimize_sett_clrt.cc
===================================================================
--- gcc/config/sh/sh_optimize_sett_clrt.cc	(revision 205191)
+++ gcc/config/sh/sh_optimize_sett_clrt.cc	(working copy)
@@ -309,9 +309,6 @@
 			  std::vector<ccreg_value>& values_out,
 			  basic_block prev_visited_bb) const
 {
-  if (start_insn == NULL_RTX)
-    return;
-
   log_msg ("looking for ccreg values in [bb %d]\n", bb->index);
 
   for (rtx i = start_insn; i != NULL_RTX && i != PREV_INSN (BB_HEAD (bb));
@@ -376,9 +373,6 @@
     for (edge_iterator ei = ei_start (bb->preds); !ei_end_p (ei); ei_next (&ei))
       {
 	basic_block pred_bb = ei_edge (ei)->src;
-	if (pred_bb->index == ENTRY_BLOCK)
-	  continue;
-
 	pred_bb_count += 1;
 	find_last_ccreg_values (BB_END (pred_bb), pred_bb, values_out, bb);
       }


> 
> > +::remove_ccreg_dead_unused_notes (std::vector<ccreg_value>& values) const
> 
> Maybe put a generic version of this in rtlanal.c, next to
> remove_reg_equal_equiv_notes_for_regno?

Yes.  In fact I wanted to propose moving some of the helper functions
from sh_treg_combine.cc and sh_optimize_sett_clrt.cc to rtl.h /
rtlanal.c at some point in time.

> On the whole: The pass will have worst-case run time quadratic in the
> number of basic blocks, due to the recursion in
> find_last_ccreg_values. You'll probably want to limit the depth of the
> iteration somehow (hard-coded depth, common dominator, first
> post-dominated block, whatever...) or sooner-or-later you'll have a PR
> assigned to you for a test case that makes compile time explode in
> this pass...

A PR with a test case would be nice.  In the worst case it'll have to be
an iteration limit, but I can also imagine that caching results
(hash_table) for bb subtrees could be useful here, couldn't it?

Cheers,
Oleg


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]