This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Modulo-scheduling improvements. Patch 2 of 2.
- From: "Andrew Pinski" <pinskia at gmail dot com>
- To: "Mircea Namolaru" <NAMOLARU at il dot ibm dot com>
- Cc: "Andrey Belevantsev" <abel at ispras dot ru>, "Andrew Pinski" <andrew_pinski at playstation dot sony dot com>, "Ayal Zaks" <ZAKS at il dot ibm dot com>, "David Edelsohn" <dje at watson dot ibm dot com>, "Dorit Nuzman" <dorit at il dot ibm dot com>, gcc-patches at gcc dot gnu dot org, "Zdenek Dvorak" <rakdver at atrey dot karlin dot mff dot cuni dot cz>, Trevor_Smigiel at playstation dot sony dot com, "Vladimir Yanovsky" <yanov at il dot ibm dot com>, vmakarov at toronto dot redhat dot com, "Vladimir Yanovsky" <volodyan at gmail dot com>
- Date: Sat, 23 Jun 2007 02:34:49 -0700
- Subject: Re: [PATCH] Modulo-scheduling improvements. Patch 2 of 2.
- References: <de8d50360706221600k1e897f8cq4892dfb32254a489@mail.gmail.com> <OF677E8DE4.AA4C6C2C-ON42257303.0027E606-42257303.0037E2EA@il.ibm.com>
On 6/23/07, Mircea Namolaru <NAMOLARU@il.ibm.com> wrote:
> The "do-loop" optimization brings the loop to a specific form
> (so that it is suitable for modulo-scheduling):
> 1] a new induction variable is introduced, whose single purpose
> is to control the number of iterations executed by the loop.
> 2] this induction variable is initialized, prior to entering
> the loop, with the number of iterations to be executed by the loop.
> It is decremented by one on each iteration, and the loop is exited
> when this induction variable reaches zero.
> The main function of the "do-loop" pattern is to enable the doloop
> optimization and to bring a loop to the above specific form.
> As this optimization introduces a "do-loop" pattern at the end
> of the loop, this pattern can also serve as a simple and
> quick way to recognize loops for which conditions 1) and
> 2) are met. Of course it is possible to use alternative ways
> of detecting such loops.
Actually, the do-loop optimization is not for other passes to use; rather, it optimizes target-specific cases like bdnz/bdz. There is no reason why we really need a do-loop pattern at all for other optimizations. On some targets (like k6 [x86], which no longer defines a do-loop pattern), do-loop patterns are limited by the length of the jump.
> I will also note that information about loops/iterations is not
> preserved across optimizations.
And why is this a problem? And really, it could be solved in other ways.
> In addition to the loop
> structure (computed by modulo-scheduling too), the doloop
> optimization requires information about iterations that
> is provided by induction variable analysis etc. This
> information is not computed by modulo-scheduling, as it
> is not needed (or is needed only in a very limited form).
Why can't it be computed? I still don't understand why SMS has to depend
on do-loop patterns. There is still no reason why SMS should depend on
another target-specific optimization happening. It is not as though the
current loop infrastructure lacks that information; it is all there
already. I mean seriously, if we did not depend on do-loop patterns,
then all targets would benefit right away from SMS, instead of what is
currently needed, which can reduce performance.
> If we do add a do_loop_end pattern to the spu back-end in the end, can
> we then make it dependent on sms being enabled?
> I think that even for SPU the "do-loop" optimization may help.
Why? There is no reason to add an extra IV (as you mentioned already)
when IV-opts should figure out the register pressure and everything.
Yes, SPU has many registers, but guess what: this adds one extra one, and
in some cases you can cause register pressure and extra spilling if you
now have more IVs than registers. This is unlike PPC, where you have a
dedicated register for the do-loop pattern (CTR), so the pattern can
decrease register pressure there slightly; but in the SPU case, you have
increased it.
I don't have the benchmark results showing why Trevor disabled the do-loop
pattern for non-SMS cases.
-- Pinski