This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Two other optimization questions.
- To: law at cygnus dot com
- Subject: Re: Two other optimization questions.
- From: Toon Moene <toon at moene dot indiv dot nluug dot nl>
- Date: Wed, 07 Apr 1999 22:01:25 +0200
- CC: egcs at egcs dot cygnus dot com
- Organization: Moene Computational Physics, Maartensdijk, The Netherlands
- References: <29563.923475347@upchuck>
Jeffrey A Law wrote:
> I wrote:
> > Loop unrolling, like function inlining, is an optimization that
> > favors running time over code size; so they more or less "belong
> > together". Wouldn't it be intuitive to *both* enable them by
> > default when -O3 is specified ?
> Actually I want to see a few things happen in this area:
>
> 1. Allow some loops to be unrolled at -O2/-O3. Specifically those
> which don't bloat the code too much and which we are confident
> will win when unrolled.
>
> Basically I'm thinking about single block loops which can be
> trivially unrolled.
That would certainly cover most of the "interesting" (for loop
unrolling) loops in Fortran code.
However, this is not the last word on loop unrolling. Just to quote a
(bilateral) discussion I had with Jim Wilson in February last year:
<QUOTE MY_COMMENTS_NOW=IN_BRACKETS>
Jim:
...two months ago you asked...
> me:
[ about limiting loop unrolling to 4 times, quoting unroll.c: ]
> /* Limit loop unrolling to 4, since this will make 7 copies of
> the loop body. */
> if (unroll_number > 4)
> unroll_number = 4;
> I do not completely understand the comment - probably it
> means to say that unrolling 8 times will lead to 7 extra copies of
> the loop body (because mod(number-of-iterations, number-of-unrolls)
> is maximally 7).
Jim again:
We get 2N-1 copies of the loop when preconditioning is performed. Given a
loop like this:
    for (; i < 100; i++)
        <body>
we can unroll it only if we do preconditioning. We then get code something
like this (not checked for correctness):
    tmp = i % 4;
    if (tmp <= 1) {
        if (tmp == 1) goto three;
        else goto loop;
    } else {
        if (tmp == 2) goto two;
        else goto one;
    }
three:
    <body>
two:
    <body>
one:
    <body>
loop:
    for (; i < 100; i += 4)
    {
        <body>
        <body>
        <body>
        <body>
    }
Note that this has 7 copies of the original loop body.
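[ A minimal runnable sketch of the preconditioning scheme Jim describes, for an unroll factor of 4. It is an illustration only, not the code the loop unroller emits. The function names and the summing loop body are hypothetical. As a labeled variation, the sketch derives the number of leftover iterations from the remaining trip count, (100 - i) & 3, rather than from i % 4, and a switch with fall-through stands in for the branch tree and labels. It assumes 0 <= start <= 100. ]

```c
#include <stddef.h>

/* Reference: the plain loop, summing a[i] for i in [start, 100). */
static long sum_plain(const int *a, int start)
{
    long s = 0;
    for (int i = start; i < 100; i++)
        s += a[i];
    return s;
}

/* Preconditioned version, unrolled by 4 (hypothetical example).
   Three peeled copies of the body plus four in the main loop
   give 2*4 - 1 = 7 copies in total. Assumes 0 <= start <= 100. */
static long sum_preconditioned(const int *a, int start)
{
    long s = 0;
    int i = start;

    /* Run the 0..3 leftover iterations first, so that the trip count
       of the main loop is an exact multiple of 4. */
    switch ((100 - i) & 3) {
    case 3: s += a[i++]; /* fall through */
    case 2: s += a[i++]; /* fall through */
    case 1: s += a[i++]; /* fall through */
    case 0: break;
    }

    /* Main loop: four copies of the body per trip. */
    for (; i < 100; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}
```

[ For any start value in range, both functions should return the same sum; only the number of loop-body copies and branches differs. ]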
> It is not clear to me why this isn't implemented
> as:
> do i = 1, mod(number-of-iterations, number-of-unrolls)
> <body>
> enddo
> do i = mod(number-of-iterations, number-of-unrolls) + 1,
> x number-of-iterations, number-of-unrolls
> <unrolled-body>
> enddo
> especially since, when unrolling naively, the number-of-unrolls is
> 2, 4 or 8, so that mod(number-of-iterations, number-of-unrolls)
> effectively is iand(number-of-iterations, number-of-unrolls - 1),
> which is a very cheap computation compared to mod(...).
[I was pointing out that one only needed N+1 loop bodies here, because
the "preconditioning" could also be done by a loop. ]
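[ The two-loop alternative can be sketched in C as follows. This is an illustration under stated assumptions, not an existing implementation: the axpy-style body, the function names and the UNROLL macro are hypothetical. A short remainder loop runs first, then the unrolled loop, so there are N+1 copies of the body. Because UNROLL is a power of two, the remainder is computed with a bitwise AND instead of a modulo. ]

```c
#include <stddef.h>

#define UNROLL 4 /* hypothetical unroll factor; must be a power of two */

/* Reference: the plain loop, y[i] += a * x[i]. */
static void axpy_plain(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Two-loop version: one body copy in the remainder loop plus
   UNROLL copies in the main loop, i.e. UNROLL + 1 copies. */
static void axpy_two_loops(size_t n, double a, const double *x, double *y)
{
    /* Same value as n % UNROLL, since UNROLL is a power of two. */
    size_t rem = n & (UNROLL - 1);
    size_t i = 0;

    /* Remainder loop: at most UNROLL - 1 iterations. */
    for (; i < rem; i++)
        y[i] += a * x[i];

    /* Main loop: n - rem is an exact multiple of UNROLL. */
    for (; i < n; i += UNROLL) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}
```

[ Each element is updated exactly once by the same expression in both versions, so the results should be identical for any n. ]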
Jim:
I am not sure what you are asking for here, because the loop unroller
already does this, as I mentioned above. The only difference is that it
is emitting multiple copies of the loop body instead of adding a second
loop. I did it this way because I thought it would give better code, though
I may not have thought through all of the implications of this.
This makes sense for small loop bodies, but is a disadvantage for
large loop bodies. Using a second loop also works better for larger
number-of-unrolls, since it avoids needing log2(number-of-unrolls) branches
at the beginning. Hmm, maybe this should be done for certain cases.
[ Well, another thing I thought I pointed out was that the current
code contains too many taken forward branches. The Alpha hates them,
and probably so do other architectures. ]
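[ To put numbers on the trade-off discussed in this exchange, here is a small sketch of the counts involved for an unroll factor N that is a power of two. The helper names are hypothetical; the formulas are the 2N-1 and N+1 copy counts and the log2(N) entry branches mentioned above. ]

```c
/* Body copies when the leftover iterations are peeled into
   straight-line code: (N - 1) peeled + N unrolled = 2N - 1. */
static unsigned copies_peeled(unsigned n)
{
    return 2 * n - 1;
}

/* Body copies when the leftover iterations run in a second loop:
   1 in the remainder loop + N unrolled = N + 1. */
static unsigned copies_second_loop(unsigned n)
{
    return n + 1;
}

/* Branches taken through a binary decision tree that selects one
   of N entry points: log2(N), for N a power of two. */
static unsigned entry_branches(unsigned n)
{
    unsigned b = 0;
    while (n > 1) {
        n >>= 1;
        b++;
    }
    return b;
}
```

[ For N = 4 this gives 7 versus 5 copies and 2 entry branches; for N = 8 it gives 15 versus 9 copies and 3 entry branches. ]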
</QUOTE>
[ errno setting not useful for Fortran ]
> > Would it be possible to implement a test settable by a Front End
> > that no errno setting is called for by its language ?
> Hmmm. Yes, we're trying to be ANSI/ISO compliant. Note it's not just errno,
> but the matherr handling by ANSI/ISO which mandates this behavior.
>
> This probably isn't appropriate for Fortran ;-) I've got no objection to
> some way for the backend to influence this code. I do wonder what the Fortran
> standards say about the behavior of sqrt (-1).
Craig already wrote you the answer to that. The full expression is:
"An undefined operation allows the compiler to do anything, up to and
including starting WW III, if the appropriate optional hardware is
installed".
Needless to say, that's a paraphrase of what's in the Standard.
--
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com