This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Loop optimizer issues
- From: Jan Hubicka <jh at suse dot cz>
- To: Richard Henderson <rth at redhat dot com>, Jan Hubicka <jh at suse dot cz>,law at redhat dot com, Zack Weinberg <zack at codesourcery dot com>,Jason Merrill <jason at redhat dot com>,Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>,Daniel Berlin <dberlin at dberlin dot org>,Diego Novillo <dnovillo at redhat dot com>,"gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, pop at gauvain dot u-strasbg dot fr
- Date: Wed, 30 Jul 2003 16:48:09 +0200
- Subject: Re: Loop optimizer issues
- References: <87oezd40qr.fsf@egil.codesourcery.com> <200307291908.h6TJ8aFv013178@speedy.slc.redhat.com> <20030729194300.GO2231@kam.mff.cuni.cz> <20030729204935.GC29675@redhat.com>
> > - my webizer pass (coming from 3.2 times; IMO ready for merge, but
> > there are issues with SSA-RTL making it perhaps redundant one day in
> > the future). It has measurable improvements in performance, especially
> > in combination with Zdenek's new loop unroller, which lacks induction
> > variable splitting
>
> I'm seriously considering allowing (most of) these to be merged
> into mainline during stage 2, under the argument that at present
> we have performance regressions.
Cool. The last incarnation sits at
http://gcc.gnu.org/ml/gcc-patches/2003-02/msg00501.html
>
> This should mostly cover re-use needs of a tree-based optimizer
> for the tree-ssa branch, I would think.
>
> > - Zdenek's value range profiling code. This one is halfway into the
> > mainline. I really hoped to see this merged in 3.4 but I am not
> > quite sure this will happen.
> > This is a major problem in merging mainline, as Nathanel's work on gcov
> > caused a great amount of conflicts.
> [...]
> > - Josef's variable tracking and Daniel's location lists. I believe
> > a usable version has been out for review for some time, but it has
> > a problem with losing track when a value is copied to a temporary
> > location that is later killed. This is a relatively minor problem in
> > the algorithm, as the interfaces are the major problem here right now.
> >
> > I believe we have to solve this in order to get nice debugging out of
> > de-SSA once we drop limitations on de-SSAizing. It also solves the
> > problems with -frename-registers, as well as with the webizer, making
> > variable values appear random in the debugger.
>
> I'm not sure what to do about these. Clearly the location list
> support is highly desirable, but I thought you said, when I pinged
> you last, that it didn't work reliably.
The version Josef sent for review into mainline is not perfect (it
cannot be), but it appears to be practically usable and gives better
data than what we have right now, I would say.
Now the problem. Consider code like:
r1 [a] = r2 [a]
...
kill r1
In this case the value tracking dataflow remembers only one location for
variable a. It is always the last location a was stored into, so at the
place of kill r1 it is r1, and the dataflow concludes that the value of a
has been lost forever.
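To make the failure concrete, here is a minimal Python sketch of
single-location tracking on the example above. It is an illustration
only, not the code Josef sent for review: the instruction tuples and the
name track_single_location are made up for this example.

```python
# Hypothetical model: each variable maps to exactly one location,
# namely the last place it was stored into.

def track_single_location(insns):
    """Return the remembered location of each variable after insns.

    insns is a list of tuples (an assumed encoding, not GCC's):
      ("copy", dst, src, var)  -- dst [var] = src [var]
      ("kill", loc)            -- loc is overwritten by an unrelated value
    """
    location = {}  # var -> the single remembered location, or None if lost
    for insn in insns:
        if insn[0] == "copy":
            _, dst, _src, var = insn
            # Only the destination is remembered; the source is forgotten.
            location[var] = dst
        elif insn[0] == "kill":
            _, loc = insn
            for var, where in list(location.items()):
                if where == loc:
                    # Treated as lost, although r2 may still hold the value.
                    location[var] = None
    return location


insns = [("copy", "r1", "r2", "a"), ("kill", "r1")]
result = track_single_location(insns)
print(result)
```

After the kill, the tracker reports a as lost even though r2 was never
touched.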
The problem of having multiple copies of the same variable lurking around
registers and memory is quite difficult to solve, especially because each
copy may contain a different value - imagine the case
i++
use i-1
In this case i-1 will likely be CSEd into the old copy of variable i when
the webizer is in use and i is not a trivial induction variable.
I guess we need some experience and experimenting with this algorithm,
as it is very difficult to judge its effectiveness.
Josef made a different version of the dataflow, and that one has the
problem of infinite looping.
I was thinking about updating this in a way that the exact list of
locations is computed in one pass and the "preferred" copy is computed in
another pass in some stupid way so we don't get into looping, but I didn't
have time to implement this yet because the C++ inlining stuff took me
much more time than I originally expected and I considered it more critical.
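The two-pass idea could look roughly like this in Python. This is only a
sketch under simplifying assumptions (straight-line code, no control
flow, so no dataflow iteration is needed); the tuple encoding and the
names compute_location_sets and choose_preferred are hypothetical, and
picking the smallest location name merely stands in for the "stupid"
choice of the preferred copy.

```python
def compute_location_sets(insns):
    """Pass 1: compute, per variable, every location holding its value.

    insns is a list of tuples (an assumed encoding, not GCC's):
      ("copy", dst, src, var)  -- dst [var] = src [var]
      ("kill", loc)            -- loc is overwritten by an unrelated value
    """
    locations = {}  # var -> set of locations currently holding var
    for insn in insns:
        if insn[0] == "copy":
            _, dst, src, var = insn
            # Whatever dst held before is gone.
            for locs in locations.values():
                locs.discard(dst)
            # Both the source and the destination now hold var.
            locations.setdefault(var, set()).update((src, dst))
        elif insn[0] == "kill":
            _, loc = insn
            for locs in locations.values():
                locs.discard(loc)
    return locations


def choose_preferred(locations):
    """Pass 2: pick one copy per variable in a fixed, simple-minded way."""
    return {var: (min(locs) if locs else None)
            for var, locs in locations.items()}


insns = [("copy", "r1", "r2", "a"), ("kill", "r1")]
sets = compute_location_sets(insns)
preferred = choose_preferred(sets)
print(sets, preferred)
```

Because the choice in the second pass never feeds back into the first,
the selection itself cannot cause the analysis to loop.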
In the next two to three weeks I won't be able to do much coding, as I
am at conferences :(
It would be nice to include the code in GCC at least as an experimental
feature so we see how much of a problem this is in practice - it is very
difficult to estimate this IMO, and the majority of the work (location
lists, GDB, register value tracking) is there and needs to stay alive to
avoid bit rot.
>
> > - Zdenek's cleanup of GCSE. Improved store motion has been merged but
> > his breakup of GCSE into multiple files wasn't. I guess in its
> > current shape it would need to be redone.
> > - My code for GCSE on parallels, which is actually in the cfg branch
> > only; the first half of the changes went into mainline (basic code
> > motion infrastructure)
>
> I'm not sure these are worthwhile long term. I expect the rtl GCSE
> optimizer to collapse to almost nothing with the tree-ssa merge.
OK,
I personally believe that the RTL lowering pass will still invent a
lot of (global) CSE opportunities by producing new constants and
temporaries, but I agree with waiting for tree-SSA to shape up before
we move in this direction. GCSE is already much more complex than I
would like it to be, so adding new complexity should be done with care.
Honza
>
>
> r~