Loop optimizer tidy up

Thu Nov 11 22:50:00 GMT 1999

  In message <14288.45634.600193.497873@andromeda.elec.canterbury.ac.nz> you write:
  > Sat Sep  4 17:08:11 1999  Michael Hayes  <m.hayes@elec.canterbury.ac.nz>
  > 
  > 	* loop.c (loop_used_count_register): Delete array.
  > 	(insert_bct, instrument_loop_bct): Delete functions replacing
  > 	functionality with doloop_optimize in doloop.c.
  > 	(strength_reduce): Call doloop_optimize if HAVE_doloop_end defined.
  > 
  > 	* loop.h (doloop_optimize): Declare.
  > 
  > 	* doloop.c: New file to supersede functionality for low-overhead
  >  	looping that was in loop.c.
  > 	
  > 	* Makefile.in (OBJS): Add doloop.o to list of object files.
  > 	(doloop.o): Add dependencies.
  > 
  > 	* config/i386/i386.md (doloop_end): New pattern.
  > 	(decrement_and_branch_on_count): Rename as loop.
  > 
  > 	* config/rs6000/rs6000.md (doloop_end): Rename from
  > 	decrement_and_branch_on_count; add additional operands.

In doloop_optimize you have a check that the last instruction is a jump
instruction.  It seems to me that this check should move into doloop_valid_p
along with the other similar checks.

Your changes to loop.c that remove the old bct code will need minor
updates: loop_used_count_register is allocated with xmalloc, so the part of
the patch that removed the alloca call doesn't apply.  We also need to make
sure the corresponding free call gets removed.

Can you give me some examples of loops with nontrivial exit conditions that
your code will successfully convert to use a low-overhead loop?  Presumably
if the iteration bounds are not compile-time constants you're only converting
loops that count down to zero (those are the only ones I can manage to get
to work on the ppc port).
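
To make the question concrete, here is a hand-written sketch of the kind of
conversion I have in mind.  It is not taken from the patch; the function
names are made up for illustration, and it assumes the trip count can be
computed once before the loop is entered.  The second function has the shape
a decrement-and-branch instruction wants: a dedicated counter that runs down
to zero with the exit test at the bottom.

```c
#include <stddef.h>

/* Original form: counts up to a bound known only at run time.  */
long
sum_up (const int *a, size_t n)
{
  long s = 0;
  size_t i;

  for (i = 0; i < n; i++)
    s += a[i];
  return s;
}

/* Hypothetical hand-converted form: a separate counter is loaded once
   with the trip count and decremented to zero; the exit test sits at
   the bottom of the loop.  */
long
sum_doloop (const int *a, size_t n)
{
  long s = 0;
  size_t i = 0;
  size_t count = n;	/* Trip count, computed before the loop.  */

  if (count == 0)	/* Guard: a bottom-tested loop runs at least once.  */
    return s;

  do
    s += a[i++];
  while (--count != 0);

  return s;
}
```

Note that the original induction variable i is still live inside the loop
body; only the exit test has been rewritten to use the down-counter.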

Please do not discuss using the machine dependent reorg pass to clean up
after doloop.  If the target needs that kind of support, then the comments
belong in the expander and/or that port's machdep reorg pass, not in the
loop optimizer.  I.e., you're discussing a fairly port-specific
implementation detail, and I think that discussion belongs elsewhere.

You need to make sure to include tm_p.h in the new file (and update the
dependencies as appropriate).  The new file also has to be added to
po/POTFILES.in.

I don't particularly like the idea that doloop_modify knows the internal
structure of the looping patterns.  That's generally a bad idea.  Is there
a better way for it to get the information it needs?  It seems to me that
information can just be passed into the routine instead of trying to extract
it from the pattern.

Under what conditions is the number of iterations "an estimate"?  Either we
know the number of iterations or we do not.  I don't see that there's a
gray area in between.  If we really do estimate, it seems to me that we must
ensure that we over-estimate the number of iterations -- otherwise backends
cannot depend on that value when making decisions about whether or not a
low overhead looping instruction is appropriate.
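
As an illustration of why the direction of the error matters, consider a
backend whose loop counter is narrower than a word.  The sketch below is
hypothetical (neither function exists in GCC, and the counter width is an
assumed parameter); it only shows that the check is sound when the value
passed in is an upper bound on the real trip count.

```c
#include <stdbool.h>
#include <limits.h>

/* Largest trip count a COUNTER_BITS-wide hardware loop counter can hold.
   Hypothetical helper for illustration.  */
unsigned long
max_hw_count (unsigned int counter_bits)
{
  if (counter_bits >= sizeof (unsigned long) * CHAR_BIT)
    return ULONG_MAX;
  return (1UL << counter_bits) - 1;
}

/* Backend-style decision.  If UPPER_BOUND is guaranteed to be >= the
   real trip count, a true result implies the real count fits as well.
   If UPPER_BOUND could be an under-estimate, a true result proves
   nothing.  */
bool
hw_loop_ok_p (unsigned long upper_bound, unsigned int counter_bits)
{
  return upper_bound <= max_hw_count (counter_bits);
}
```

With a 16-bit counter, an over-estimate of 65535 is accepted and 65536 is
rejected; an under-estimate of 60000 for a loop that really runs 70000 times
would be accepted and the generated code would be wrong.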

What code verifies that the increment/decrement is an exact power of two in
the case where we have to compute the iteration count at runtime?
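
For reference, the kind of check I would expect to see looks roughly like
the following.  This is a standalone sketch, not code from the patch: the
function names are made up, it assumes an up-counting loop of the form
"for (i = initial; i < final; i += step)" with unsigned values, and it
ignores overflow in the rounding addition.

```c
#include <stdbool.h>

/* True if X is an exact power of two (1, 2, 4, ...).  */
bool
exact_pow2_p (unsigned long x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Base-2 logarithm of X.  X must be an exact power of two.  */
unsigned int
pow2_log (unsigned long x)
{
  unsigned int n = 0;

  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Compute the trip count of the assumed loop form using a shift instead
   of a division.  Return false, leaving *COUNT untouched, when STEP is
   not an exact power of two or the bounds are reversed, so the caller
   can decline the transformation.  */
bool
runtime_trip_count (unsigned long initial, unsigned long final,
		    unsigned long step, unsigned long *count)
{
  unsigned long diff;

  if (!exact_pow2_p (step) || final < initial)
    return false;

  diff = final - initial;
  /* Round up: ceil (diff / step), done as a shift.  */
  *count = (diff + (step - 1)) >> pow2_log (step);
  return true;
}
```

For example, initial 0, final 10 and step 4 give a count of 3 (i takes the
values 0, 4 and 8), while a step of 3 is rejected.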

Basically it looks reasonable.  There's a lot of work that can still be done,
but it's a noticeable improvement over what we have. 

jeff