This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Some 4.4 project musings

From: Jan Hubicka <hubicka at ucw dot cz>
To: Andrew MacLeod <amacleod at redhat dot com>
Cc: Diego Novillo <dnovillo at google dot com>, gcc at gcc dot gnu dot org
Date: Fri, 15 Feb 2008 02:37:13 +0100
Subject: Re: Some 4.4 project musings
References: <47A38749.1020308@redhat.com> <b798aad50802110549u8f40de4le5c9b733b9265ef8@mail.gmail.com> <47B07D2A.1080509@redhat.com>

> Diego Novillo wrote:
> >On Fri, Feb 1, 2008 at 3:55 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
> >
> >  
> >> 1 - Pass cleanup.  There have been rumblings about this, but I haven't
> >>    
> >
> >Yes, this is an area that is in desperate need of TLC.  Your plan
> >looks good to me.  We need to have a mechanism to determine whether a
> >pass did something, we need to be able to have different pipelines for
> >different architectures.
> >
> >Do you have anything specific in mind?  Create a branch?  Work
> >directly on mainline?
> >  
> 
> I think I'll create a branch since it's not completely clear if the 
> chosen information will be sufficient.  First convert all the passes to 
> pass some info back in an organized way and then there will be some 
> collecting/experimenting to do.  I'll do the trees first, but make sure 
> it can be applied to RTL as well.

Note that at the RTL level I made some limited progress on this with
the BB_DIRTY flag, which was used to keep cleanup_cfg and some other
passes from rescanning the function body. It is now used by the DF
infrastructure even though the original use in cfg cleanup has
disappeared.

Perhaps most of the tree cleanups can be scheduled based on knowing
whether any basic blocks are dirty at all?
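
To make the idea concrete, here is a minimal sketch of dirty-flag driven
scheduling. It is not GCC code: the toy_bb type, the TOY_BB_DIRTY flag
and all function names are hypothetical and only illustrate skipping a
cleanup when no block was touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag and block type; not GCC's real basic_block.  */
#define TOY_BB_DIRTY 0x1u

struct toy_bb {
    unsigned flags;
    int index;
};

/* A pass calls this whenever it changes a block.  */
static void toy_mark_dirty(struct toy_bb *bb)
{
    assert(bb != NULL);
    bb->flags |= TOY_BB_DIRTY;
}

/* Return true if at least one block changed since the last cleanup.  */
static bool toy_any_dirty(const struct toy_bb *bbs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (bbs[i].flags & TOY_BB_DIRTY)
            return true;
    return false;
}

/* Skip the cleanup entirely when nothing is dirty; otherwise revisit
   only the dirty blocks, clear their flags, and return how many blocks
   were revisited.  */
static size_t toy_cleanup_if_dirty(struct toy_bb *bbs, size_t n)
{
    size_t visited = 0;

    if (!toy_any_dirty(bbs, n))
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (bbs[i].flags & TOY_BB_DIRTY) {
            /* The real cleanup work for this block would go here.  */
            bbs[i].flags &= ~TOY_BB_DIRTY;
            visited++;
        }
    }
    return visited;
}
```

A pass manager built this way would call toy_cleanup_if_dirty after each
pass, so a pass that changed nothing costs only the flag scan.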

> That's a lot more work, and by itself may actually increase pressure in 
> the average case :-)  It will be interesting to see if anything has 
> changed in the past few years regarding RTL that gets generated without 
> TER. It would be nice if it wasn't needed, but I'm not aware of work 
> which has changed enough to help select better RTL patterns. we'll see 

With SSA info in place, RTL expansion can actually walk the graph and
combine as needed. RTL expansion itself is not terribly smart about
combining patterns either.

It seems to me that TER is combining many effects:

 1) It produces complex expressions to allow folding that we missed,
 since we can't really do algebraic simplification at the gimple level
 (i.e. complex patterns we don't match before gimplification have to
 wait for late fold).
 2) Re-combining complex expressions helps to simplify the conflict
 graph when a variable ends up being both set and used in the same
 instruction, avoiding the conflict and improving coalescing of global
 registers.  This comes at the cost of some extra copies that RTL
 expansion will produce within the basic block, but these are usually
 better handled by the backend.

 One particular case I hit while looking at gzip was something like

 loop {
   if (var1 != memory_var
       || var2 != memory_var2)
     continue;
   if (test)
     var1 = something;
   if (test)
     var2 = something2;
 }

 We propagated the PHI controlling the loop to use the values loaded
 from memory_var and memory_var2 on some of the paths through the body
 (those not modifying var1/var2), which results in an extra conflict
 and the need for an extra pseudo.
 In simple cases TER simplifies it.  It is probably quite a special
 case, but can we do something like not recording the conflict when the
 BB is dominated by a check for equivalence of the two values?
 3) Some overall code movement due to reordering, which sometimes moves
 uses closer to defs as you mention.
 4) RTL in some cases handles expansion better when a complex
 expression is seen instead of simple gimple operands.

 I believe that there are not that many such cases and most of them
 belong to the category of algebraic simplifications that are not that
 dependent on the expansion itself.  Among the most interesting
 examples is do_jump, which can be completely reorganized to work at
 the GIMPLE level.
 5) anything else I missed?
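
For readers less familiar with TER, the core of it is substituting a
temporary that has a single use back into that use within a basic block.
The sketch below only finds such candidates in a toy statement list. It
assumes every use lives in the same block and ignores the other
conditions the real pass checks (intervening stores, for instance). The
toy_stmt type and all names are hypothetical, not taken from
tree-ssa-ter.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NAMES 64

/* One toy statement: def = use[0] op use[1].  Names are small integers,
   and -1 means "no name in this slot".  */
struct toy_stmt {
    int def;
    int use[2];
};

/* Mark replaceable[i] when the name defined by stmts[i] is used exactly
   once in the block.  Returns the number of candidates found.  */
static size_t toy_ter_candidates(const struct toy_stmt *stmts, size_t n,
                                 bool *replaceable)
{
    int uses[TOY_MAX_NAMES] = {0};
    size_t count = 0;

    /* First pass: count how often each name appears as an operand.  */
    for (size_t i = 0; i < n; i++) {
        for (int k = 0; k < 2; k++) {
            int u = stmts[i].use[k];
            if (u >= 0) {
                assert(u < TOY_MAX_NAMES);
                uses[u]++;
            }
        }
    }

    /* Second pass: a definition with exactly one use is a candidate.  */
    for (size_t i = 0; i < n; i++) {
        int d = stmts[i].def;
        assert(d < TOY_MAX_NAMES);
        replaceable[i] = (d >= 0 && uses[d] == 1);
        if (replaceable[i])
            count++;
    }
    return count;
}
```

For t1 = a + b; t2 = t1 * c; x = t2 + t2, only t1 qualifies, so the
expander would see (a + b) * c as one expression while t2 stays a
separate temporary.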

Perhaps we can get harder data on the individual cases?  For example,
by disabling folding during expansion it is easy to see how much 1)
matters (I did that some time ago).

BTW I also used to have a patch that expanded from SSA: it was quite
easy to add code to expand an SSA name (according to the partitioning
decision) and, after RTL expansion, to insert the proper copy
instructions on the edges.  Those were a bit trickier to locate, as each
basic block in the RTL CFG expands to a superblock, but it is still
doable.  The overall plan seemed to work pretty well.
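
As a rough illustration of the edge-copy step (not the patch itself,
which is not shown here): once every SSA name has a partition, a PHI
needs a copy on each incoming edge whose argument landed in a different
partition than the PHI result. All types and names below are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One PHI argument: the SSA name flowing in along a given edge.  */
struct toy_phi_arg {
    int edge;
    int ssa_name;
};

/* A copy to be inserted on an edge, expressed in partition numbers
   (think: pseudo registers).  */
struct toy_copy {
    int edge;
    int dest_part;
    int src_part;
};

/* partition[] maps an SSA name to its partition.  For each PHI argument
   whose partition differs from that of the PHI result, record a copy on
   the incoming edge.  Returns the number of copies written to out.  */
static size_t toy_phi_edge_copies(const int *partition, int result_name,
                                  const struct toy_phi_arg *args,
                                  size_t nargs,
                                  struct toy_copy *out, size_t out_cap)
{
    size_t n = 0;
    int dest = partition[result_name];

    for (size_t i = 0; i < nargs; i++) {
        int src = partition[args[i].ssa_name];

        if (src == dest)
            continue;  /* Coalesced into one partition: no copy.  */

        assert(n < out_cap);
        out[n].edge = args[i].edge;
        out[n].dest_part = dest;
        out[n].src_part = src;
        n++;
    }
    return n;
}
```

The better the partitioning coalesces PHI results with their arguments,
the fewer copies end up on the edges.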

I however abandoned it after some discussion with Zdenek, when we
concluded that we don't want to make expansion any nastier than it was
at the time of writing the patch.  Tuples will hopefully clean it up :)

Honza
> :-) It will be interesting to see what happens.  As an easy experiment, 
> have you tried a straight comparison with -fno-tree-ter on SPEC or 
> anything like that?  It may give you some idea of what to expect.
> 
> The pressure is likely to go up since TER usually moves defs closer to 
> the use... I have no data whatsoever, but I would think  that TER 
> reduces the pressure slightly in the average case. The larger 
> pathological cases are a little more unpredictable.
> 
> 
> Andrew

References:
- Some 4.4 project musings
  - From: Andrew MacLeod
- Re: Some 4.4 project musings
  - From: Diego Novillo
- Re: Some 4.4 project musings
  - From: Andrew MacLeod
