This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: An issue for the SC: horrible documentation quality of GCC


> Hi,
> 
> On Fri, 9 May 2003, Richard Kenner wrote:
> 
> > Then it what sense is it "local"?  Do you know?
> 
> Maybe the initial author felt it was local in the sense that it doesn't
> use real and global data flow information, but instead a conservative
> subset, like implemented with cselib.

Well, I actually originally did it purely local but later found that
a) initialization of cselib at each basic block is expensive
   (I fixed this later so this should be no longer so hot issue)
b) doing superblock CSE makes the whole thing to converge faster and is
   completely for free.
I failed to document that somehow that is a mistake.  I will add a
documentation.
It duplicates what cse already knows to do for three reason - at first it
needs to be fast, at second it needs to be consistent with what global
optimization expect.  The global optimization works optimally only when
constants are put as constants and copies uses the oldest value inside
BB.  CSE fails to do that.  It uses heuristics often leaving two copies
in the same code..
At third I would like to see CSE replaced by standard and documented
alrogrithms.  Still I preffer to have copy propagation where I can open
a book and see what copy propagation does over having CSE that is not
doing CSE at all.

I will send patch updating the documentation.
Also concerning the gcse.c file, Zdenek also did series of patches to
break it up into multiple files each for individual pass it does (like
cprop, pre, unification, store motion, null pointer ellimination) all
folded together.  These somehow got suck in the review process
unfortunately.  I believe splitting the whole thing and adding
introducionary comment for each file with reference to apropriate
paper/book would help a lot.
> 
> >
> >       p1 <-- 100
> >       p2 <-- p1 + p3
> >     into
> >       p1 <-- 100
> >       p2 <-- 100 + p3
> >
> > For many machines, the former is better.
> 
> Well, when dead code is removed this will look like:
> 
>   p2 <-- 100 + p3
> 
> i.e. without setting p1 at all.  I guess there are not many machines where
> the latter is not better (or at least not worse) than the initial sequence
> with the explicit setup of p1.

Yes, I think we should have simple specialized pass here instead of the
crazy heuristics we have in cse.  For example for i386 this is still
true, in the case constant 100 is used many times in the code.  This is
something CSE can't decide having just the local information in the
mind, but cost of using constant and cost of loading constant to
register would.

Honza
> 
> Also what is better depends on when this optimization is run.  To clean up
> code, which usually is the goal for optimizations run early you want to do
> this copy propagation done unconditionally to better expose the
> redundancies.
> 
> If it then is determined that for a certain machine emitting that constant
> instead if a register reference is worse, and that constant still sits
> around in some register (for instance if the p1-setup could not be
> removed for some reason), that conversion can be reversed again, i.e. the
> register can be used again, or even a copy can again be inserted.
> 
> But this needs to be determined at a later point, instead of cluttering
> all the optimization passes with machine dependend checks.
> 
> > CSE has lots of logic to do this and knows which is better.
> 
> Yes, and now look at CSE's code.
> 
> > What is this code doing, why does it to things differently than CSE, and
> > how does it interact with CSE?  These are both questions and things that
> > need to be documented in the code.
> >
> > I'd expect gcse to only operate on the *start* of a basic block and let
> > CSE make the decisions *within* the block.
> 
> CSE as we implement it is fundamentally different from gcse.  For one our
> CSE is not any CSE but a kind of ad hoc value numbering intertwined with
> expression simplification.
> 
> That having said, the local_cprop pass _does_ use cselib to determine if a
> pseudo is the same as another pseudo or equal to a constant.  And I can't
> find a function in cse.c which does something similar (i.e. just
> propagating constants and copies).  For the gcse we don't want to run a
> full featured CSE pass in each iteration (because CSE is one of the
> slowest things in gcc overall), so someone has obviously implemented
> the missing functionality using as much code from cse (cselib namely) as
> possible.  It just was placed into gcse.c, although it doesn't really has
> any connection for it.  It just is a small cleanup pass.
> 
> > But the code doesn't seem to do that.  Without the missing
> > documentation, I can't tell if that's a bug in the code or is the way it
> > was designed to work.
> 
> Like I said, local_cprop* has nothing to do with gcse itself, except being
> run to clean up the code between the gcse iterations.
> 
> 
> Ciao,
> Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]