This is the mail archive of the
mailing list for the GCC project.
Re: [tree-ssa] Removal of gotos from cfg based ir
> On Fri, 2003-11-14 at 15:33, Jan Hubicka wrote:
> > > On Fri, 2003-11-14 at 11:27, Jan Hubicka wrote:
> > > > > On Fri, 2003-11-14 at 11:04, Jan Hubicka wrote:
> > >
> > > If you collect profiling information in the front-end, by the time the
> > > scheduler, for instance, goes to use it, hasn't it been watered down by
> > > loop replication and other things that have added branches where you
> > > have had to estimate how the original data is propagated to new
> > > branches?
> > Yes, this is a problem. Fortunately we don't need the CFG to be perfect
> > and we definitely need the profile to assist inlining, as it is reported by
> > several sources as the most important optimization of all, followed by
> > basic block reordering, with register allocation third.
> > This pretty much says how wide the profile lifetime has to be.
> You can always do profiling in 2 different places :-)
Profiling is CFG sensitive, so earlier optimization decisions invalidate
the profile. Requiring the user to go through a
compile/train/compile/train/compile/use sequence is unlikely to work very
well, given that users refuse to profile even once.
We can have this as an alternative to see how much difference it makes.
> > > How would maintaining the CFG with branch taken counts be any different
> > > than annotating every jump with a count of the number of times it's taken
> > > when generating RTL? When you build the CFG in rtl, you would end up
> > > with exactly the same information wouldn't you?
> > This is not at all easy to do. The expansion invents new jumps, and a
> > single conditional jump can be split into multiple jumps, and so on.
> > The strategy is to know exactly the counts at the entry point of the
> > original basic block and propagate them over the new control flow, re-using
> > the old edges. (See the find_sub_basic_blocks implementation.)
> If the RTL expansion isn't a one-to-one correspondence with the tree IL
> jumps, then the CFG is going to have to be updated by the expanders too?
No, it is updated by split_basic_blocks, which looks at the basic block
and discovers new basic block boundaries, so expanders don't need to be
aware of this. We already use this scheme when splitting instructions
or expanding some new code.
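
To illustrate the boundary discovery described above, here is a minimal
sketch in Python rather than GCC's C. The (kind, text) instruction tuples
and the name find_sub_blocks are hypothetical and do not mirror GCC's RTL
structures or its real implementation. It only shows how a scan over one
expanded block can find the new boundaries by itself, so the code that
emitted the instructions never has to track them.

```python
# Hypothetical toy instruction stream; this is not GCC's RTL representation.
def find_sub_blocks(insns):
    """Split one straight-line list of instructions into sub-blocks.

    Each instruction is a (kind, text) pair with kind in
    {"insn", "label", "jump"}. A new block starts at every label
    and after every jump.
    """
    blocks, current = [], []
    for kind, text in insns:
        # A label is a possible jump target, so it must begin a block.
        if kind == "label" and current:
            blocks.append(current)
            current = []
        current.append((kind, text))
        # A jump ends the block: control may leave here.
        if kind == "jump":
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# One original conditional, expanded into two conditional jumps.
expanded = [
    ("insn", "t1 = a < b"),
    ("jump", "if t1 goto L1"),
    ("insn", "t2 = c < d"),
    ("jump", "if t2 goto L1"),
    ("insn", "x = 0"),
    ("label", "L1"),
    ("insn", "y = 1"),
]
blocks = find_sub_blocks(expanded)
for number, block in enumerate(blocks):
    print(number, [text for _, text in block])
```

The expander here just emits a flat list; the four sub-blocks are
recovered afterwards by the scan.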
> In which case they have to be aware of how to translate the profile
> information anyway, so I still don't see how there is any difference
> between maintaining the CFG through the expanders or attaching branch
> counts to the branches, and propagating the information the same way to
> the RTL generated.
They can use branch prediction notes, and other branch probabilities can
be guessed in the usual way.
Surely you can store the original CFG basic blocks and edge
probabilities completely as notes on the produced RTL, but that is just
converting the CFG into a different form, so there is not much benefit in it.
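
As a rough sketch of that propagation (again in Python, with hypothetical
names): the count at the entry of the original block is known exactly,
edges that carry a prediction note use its probability, and edges without
a note share the remainder evenly. The even split is only an assumed
stand-in for whatever guessing heuristic is really used, and the sketch
handles acyclic sub-graphs only.

```python
# Hypothetical sketch; the even-split fallback is an assumption, not GCC's heuristic.
def propagate_counts(entry_count, edges, order):
    """Push a known entry count through newly created sub-blocks.

    edges maps a block to a list of (successor, probability) pairs,
    where probability is None if no prediction note is available.
    order lists the blocks in topological order, entry block first.
    """
    counts = {block: 0.0 for block in order}
    counts[order[0]] = float(entry_count)
    for block in order:
        succs = edges.get(block, [])
        noted = sum(p for _, p in succs if p is not None)
        guessed = [s for s, p in succs if p is None]
        for succ, prob in succs:
            if prob is None:
                # No note on this edge: split what is left evenly.
                prob = (1.0 - noted) / len(guessed)
            counts[succ] += counts[block] * prob
    return counts


# A: first test, B: second test, C: fall-through code, D: join point.
edges = {
    "A": [("D", 0.25), ("B", 0.75)],   # probabilities taken from a note
    "B": [("D", None), ("C", None)],   # no note, so guessed
    "C": [("D", 1.0)],
}
counts = propagate_counts(1000, edges, ["A", "B", "C", "D"])
print(counts)
```

The join block ends up with the same count as the entry, which is the
property that matters: only the split among the invented sub-blocks is
estimated.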
Knowing the original basic block and edges is both important for the
> > >
> > > > > same data structures? I don't think at -O0 I want a CFG in either IL, and
> > > >
> > > > You have been getting a CFG at -O0 in RTL for a long time already. I think
> > > > we always did it in reload, in fact, and now we do the usual register
> > > > allocation too.
> > > > > it seems like a restriction to force entry into RTL to have a CFG
> > > > > already created. I don't think the front end and the back end ought to
> > > > > be that tightly coupled...
> > > >
> > > > I would think of GIMPLE as the backend already. It is all about
> > > > optimization, not parsing. GENERIC is the interface between the frontend
> > > > and the backend.
> > >
> > > I think of every different IL as a different 'end'. We're more of a
> > > middle end, and a totally different optimizer than the RTL driven engine
> > > is. I don't think passing shared data structures like the CFG back and
> > Why? We do pass other data structures as well (aliasing information,
> > debug information and more)
> None of which are required to generate correct code. They are provided
> to help optimize the code because they cannot be easily produced from
> the IL. You can still generate code without them, it just may not be
> optimized, or contain debug info, etc.
We are not capable of doing it nor planning to implement completely CFG
Doing CFG free expansion is relatively easy (the CFG aware expander code
is less than 100 lines and is unlikely to get much more complicated.
This is not too expensive a thing to duplicate), but the trend seems to be
undoing such decisions we made in the past. (We killed the old jump
optimizer and got a speedup from it on non-optimizing compilation, and we
replaced stupid register allocation because it was a maintenance headache.)