Low overhead looping support

Jeffrey A Law law@cygnus.com
Fri Feb 19 01:03:00 GMT 1999

  In message <13998.57803.940385.940959@ongaonga.elec.canterbury.ac.nz>you write:

  > Most of the changes are to correctly calculate the number of
  > iterations at run-time, handling all the little tricky boundary
  > conditions. 
As expected.

  > I've added another argument to the decrement_and_branch_on_count named
  > pattern which specifies the number of enclosed loops.  A innermost
  > loop has an argument of 1.
Huh?  Why would you name the argument pattern which it indicate the number
of enclosed loops?  Did I misread something.  Note it doesn't look like
you called it pattern in your patch.

  > I've also included another [optional]
  > named pattern 'init_branch_on_count' so that specialised looping
  > registers can be initialised to keep the register allocator happy.
Any specific reason not to pass the iteration count (either as a constant rtx
or an varying rtx if it isn't constant) to a branch_on_count expander?  I'd
like to see some discussion about the pros/cons for each approach.

  > I've tested the code with the C4x and did some superficial tests
  > with the rs6000.
Do you have a ppc/rs6000?  Since it's the only native target which currently
supports branch_on_count, I'd kind of like to know it bootstraps with this
change.  Alternately I could cobble up branch on count support for the PA
real quick and see if it bootstraps with your changes.

What does BCT_CHECK_LOOP_ITERATIONS do?  Ie, what behavior does it control.
It is something that should be defined by a target file?  If so, you need to
update tm.texi.

Similarly if we do end up with a new pattern to initialize the counter
register, we need to update md.texi/rtl.texi.  I suspect we don't have
decrement_and_branch_on_count in the docs either, could you please add it
if that is the case?

Could you please document the arguments and return value for
check_branch_on_count?   I know you didn't originally introduce it without
docs, but we should try to fix up the docs when we find them lacking.
Similarly for insert_branch_on_count.

It appears as if you've got two tests in check_branch_on_count to detect
completely unrolled loops.  It seems to me you only need one.  It seems to
me that we only completely unroll the loop when we know its iteration count
and that you now store that tidbit of information.  So only the check that
the iteration count == the unroll factor should be necessary to detect that
a loop was completely unrolled.

It probably still makes sense to verify there's a conditional branch at the
end of the loop, but the comment before that code needs to be updated.

Can you explain why you want to use the exiting precondition test from unroll.c
in the low overhead looping support?  I think I know why, but folks reading
this code should not have to guess.

In at least one place your wrote "abort();".  You need a space between the
function name and the open paren.

check_branch_on_count seems too big for its own good.

For example, you've got a large hunk of code which checks the iteration
count, comparison code, etc.  That could/should probably break into its
own function.

Similarly for the later code which generates the insn, then veriifies what
it got back was reasonable.  I'll note that you check the iteration count
again.  Seems to me we should probably look for a better way to check
that the iteration count is big enough to be useful and small enough for
the target machine.  I guess in general, I'd like to know how this code
differs from the iteration and comparison code checks you did earlier in
this function.

Then you've got the "handle the complex case" code which should probably break
out into its own subroutine.

It seems like check_branch_on_count is poorly named since it performs the
check *and* inserts the low overhead looping instruction if the check are
all positive.

This was just an initial scan.  I think it'll need to be looked at again
in detail.


More information about the Gcc-patches mailing list