This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Value Range Propagation Patch -- Part 2 of 2



  In message <200004031754.NAA08589@jwlab.FEITH.COM>you write:
At a high level some comments at the start of vrp.c about how it works
would b greatly appreciated.  I was able to figure most of it out by
inspection, but it would have helped to have some high level overview
of the optimization pass.

You've tended to write a number of loops like:

+ 	      for (rvrp = *(bb_reg_vr_table + bb); rvrp; rvrp = rvrp->next)
+ 		if (! TEST_BIT (bb_reg_vr_unknown[bb], rvrp->regno))
+ 		  *(reg_vr_table + rvrp->regno) = rvrp->vr;
+ 
+ 	      for (prvrp = *(previous_bb_reg_vr_table + bb);
+ 		   prvrp; prvrp = prvrp->next)
+ 		if (TEST_BIT (bb_reg_vr_unknown[bb], prvrp->regno))
+ 		  *(reg_vr_table + prvrp->regno) = prvrp->vr;
+ 	    }

Is there any reason to use *(table + index) instead of table[offset]?

Presumably you're using tree nodes to record value ranges because you can
then call into routines in fold-const.c and friends to do simplifications?
Any reason you didn't use fold_ranges/merge_ranges & friends?  It would
require some tweaks, but might simplify the code.

Did you give any thought to providing analogous routines in RTL?  I'm not
saying one would be better than the other, I'm just trying to get a feel
for what drove certain decisions.

I'm sure I'll have more comments in the future -- I had to cut this short
as I'm about to leave town again.


  > + 
  > + static rtx
  > + get_jump_table_offset (insn, earliest)
  > +      rtx insn;
  > +      rtx *earliest;
  > + {
  > +   rtx label;
  > +   rtx table;
  > +   rtx set;
  > +   rtx x;
  > +   rtx old_x;
  > +   int i;
  > +   int j;
  > +   rtx current_insn;
  > +   rtx current_x;
  > + 
  > +   if (GET_CODE (insn) != JUMP_INSN
  > +       || ! (label = JUMP_LABEL (insn))
  > +       || ! (table = NEXT_INSN (label))
  > +       || GET_CODE (table) != JUMP_INSN
  > +       || (GET_CODE (PATTERN (table)) != ADDR_VEC
  > + 	  && GET_CODE (PATTERN (table)) != ADDR_DIFF_VEC)
  > +       || ! (set = single_set (insn)))
  > +     return NULL_RTX;
Few notes.  It is possible for JUMP_LABEL to be null.  For example a
return insn will have a null jump label.

I believe that you can also end up with a NULL table since we could have a
jump to the last insn in the chain (that's how returns look early in
rtl generation).  So the GET_CODE (table) might segfault in some cases.

It might be better to write a series of simple conditionals rather than one
large conditional.

  > +   x = SET_SRC (set);
  > + 
  > +   if (GET_CODE (x) == IF_THEN_ELSE
  > +       && GET_CODE (XEXP (x, 2)) == LABEL_REF)
  > +     x = XEXP (x, 1);
  > + 
  > +   for (old_x = NULL_RTX; GET_CODE (x) == REG && x != old_x;
  > +        old_x = x, x = find_last_value (x, &insn, NULL_RTX, 0))
  > +     ;
Presumably you're trying to find the register that holds the index with
this code?  A comment might make the intent of this code clearer.

Presumably the code later in that function is just trying to compute the
original index if the value it finds isn't simple? 

That function really does need some comments explaining what it is trying
to do.  I can make educated guesses about it's behavior, but we're better
off not leaving it to folks to guess the code's intent.


  > + int
  > + value_range_prop (f, alter_jumps, file)
  > +      rtx f;
  > +      int alter_jumps;
  > +      FILE *file;
  > + {
  > +   int *abnormal;
  > +   int *ignore;
  > +   int *order;
  > +   int *visited;
  > +   int bb;
  > +   int changed;
  > +   int i;
  > +   int j;
  > +   sbitmap *dom;
  > +   struct reg_value_range **previous_reg_vr;
  > +   struct reg_value_range **reg_vr;
  > +   struct reg_value_range *next;
  > +   struct reg_value_range *prvrp;
  > +   struct reg_value_range *rvrp;
  > + 
  > +   vr_file = file;
  > + 
  > +   max_vr_regno = max_reg_num ();
  > + 
  > +   find_basic_blocks (f, max_vr_regno, file);
  > +   cleanup_cfg (f);
  > + 
  > +   abnormal = (int *)alloca(n_basic_blocks * sizeof(int));
  > +   ignore = (int *)alloca(n_basic_blocks * sizeof(int));
  > +   order = (int *)alloca(n_basic_blocks * sizeof(int));
  > +   visited = (int *)alloca(n_basic_blocks * sizeof(int));
Whitespace between the function name and the open paren for its argument
list.  We're also trying to avoid alloca, so using an xmalloc + free pair
would probably be better.

 > +   compute_value_range_bb_order (order, abnormal, dom);
I doubt this is really necessary.  Or more correctly I doubt ordering of
basic blocks is all that important to most code.  It can be shown that
the number of iterations of a dataflow problem is bounded by the nesting
of backedges in the cfg.

If we really want to have code to determine orders in which to process
blocks for dataflow problems, then we should make it generic so that other
optimizers can use it.  A DFS search ought to be sufficient to give a good
ordering for most dataflow problems -- so we might just want to have a
routine to give us a DFS ordering of blocks in the cfg.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]