This is the mail archive of the
mailing list for the GCC project.
Re: [tree-ssa] Removal of gotos from cfg based ir
> On Fri, 2003-11-14 at 15:51, Jan Hubicka wrote:
> > > > > How would maintaining the CFG with branch taken counts be any different
> > > > > than annotating every jump with a count of the number of times it's taken
> > > > > when generating RTL? When you build the CFG in rtl, you would end up
> > > > > with exactly the same information wouldn't you?
> > > >
> > > > This is not at all easy to do. The expansion invents new jumps and
> > > > a single conditional jump can be split into multiple jumps and so on.
> > > > The strategy is to know exactly the counts at the entry point of
> > > > original basic block and propagate it over the new control flow re-using
> > > > old edges. (see find_sub_basic_blocks implementation)
> > >
> > > If the RTL expansion isn't a one-to-one correspondence with the tree IL
> > > jumps, then the CFG is going to have to be updated by the expanders too?
> > No, it is updated by split_basic_blocks that looks at the basic block
> > and discovers new basic block boundaries, so expanders don't need to be
> > aware of this. We use this scheme already when splitting instructions
> > or expanding some new code.
> no, I think we are miscommunicating.
> If the tree IL has a COND_EXPR, it has an execution count for each path
> taken. When we generate the rtl for this COND_EXPR, we can attach a note
> or whatever to the RTL generated indicating the execution counts on each
> Then when the cfg is built by the rtl pass, if that information is
> there, it simply picks it up.
> So if a single conditional jump in tree form can be split into multiple
> jumps, doesn't that affect the CFG for RTL? Are you saying after we
> generate rtl, we have to make a pass through the IL looking for basic
> blocks which need to be split because of the way they were expanded?
Yes, this is basically right. Expansion of almost any tree expression
may introduce new control flow depending on what machine description
does. These, however, remain single entry / single exit regions
for non-control flow nodes and single entry / same exits regions for
control flow nodes.
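
As a rough illustration of the strategy mentioned earlier in the thread (the count at the entry of the original basic block is known and gets propagated over the control flow that expansion introduced), here is a minimal, self-contained sketch in C. It is not GCC's find_sub_basic_blocks code: the types `sub_block` and `sub_edge`, the function `propagate_counts`, and the percent-based edge probabilities are all hypothetical names and assumptions made up for this example. It also assumes the expanded region is acyclic and that its sub-blocks are listed in topological order.

```c
#include <assert.h>

/* Hypothetical types for illustration only; not GCC data structures. */
#define MAX_SUCCS 2

struct sub_edge {
    int dest;          /* index of the destination sub-block, -1 = leaves region */
    int prob_percent;  /* assumed branch probability, in percent */
};

struct sub_block {
    long count;                      /* execution count, filled in below */
    int n_succs;                     /* number of outgoing edges */
    struct sub_edge succs[MAX_SUCCS];
};

/* Propagate a known entry count over the sub-blocks found inside one
   original basic block. blocks[0] is the region entry and the blocks
   must be in topological order (every edge goes to a later block).
   Returns the total count leaving the region. */
long propagate_counts(struct sub_block *blocks, int n, long entry_count)
{
    long exit_count = 0;

    assert(n > 0);
    for (int i = 0; i < n; i++)
        blocks[i].count = 0;
    blocks[0].count = entry_count;

    for (int i = 0; i < n; i++) {
        long remaining = blocks[i].count;

        assert(blocks[i].n_succs <= MAX_SUCCS);
        for (int s = 0; s < blocks[i].n_succs; s++) {
            const struct sub_edge *e = &blocks[i].succs[s];
            /* The last edge takes whatever is left, so integer rounding
               never loses or invents executions. */
            long flow = (s == blocks[i].n_succs - 1)
                            ? remaining
                            : blocks[i].count * e->prob_percent / 100;

            remaining -= flow;
            if (e->dest < 0) {
                exit_count += flow;
            } else {
                assert(e->dest > i && e->dest < n);
                blocks[e->dest].count += flow;
            }
        }
    }
    return exit_count;
}
```

For example, if one conditional jump entered 1000 times expands into two jumps, with 60% of executions reaching the second jump, the second sub-block gets a count of 600 and the full 1000 still leaves the region, so the counts on the original edges stay consistent.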
This is not that painful to update while keeping the profile mostly intact.
My scheme does not save the full basic block discovery pass on RTL, just
> > > > > I think of every different IL as a different 'end'. We're more of a
> > > > > middle end, and a totally different optimizer than the RTL driven engine
> > > > > is. I don't think passing shared data structures like the CFG back and
> > > > Why? We do pass other data structures as well (aliasing information,
> > > > debug information and more)
> > > >
> > > None of which are required to generate correct code. They are provided
> > > to help optimize the code because they cannot be easily produced from
> > > the IL. You can still generate code without them, it just may not be
> > > optimized, or contain debug info, etc.
> > We are neither capable of doing it nor planning to implement completely
> > CFG free compilation.
> > Doing CFG free expansion is relatively easy (the CFG aware expander code
> > is less than 100 lines and is unlikely to get much more complicated.
> > This is not too expensive a thing to duplicate), but the trend seems to be
> > undoing such decisions we made in the past. (we killed the old jump
> > optimizer and got a speedup from it on non-optimizing compilation and we
> > replaced stupid register allocation because it was a maintenance headache)
> OK, I'm not suggesting we have to have a CFG free pass in the compiler,
> or that we can't justify the existence of the CFG at -O0. I was merely
> saying it would not be possible if we hooked the CFG and the IL
> intimately for lack of a good reason, should we want it. That's just one
> example I threw out because I've worked on a compiler that was blindingly
> fast at -O0 because it did almost nothing except create a tree and
> generate an object file from it.
> Bottom line is I'm just trying to find good justifiable reasons why we
> have to keep the CFG intact throughout compilation phases using
> different ILs, and that we should modify the IL to do this.
The reason is to give a mechanism to pass control flow specific
information to optimizers (like the profile).
Of course with non-optimizing compilation it makes less sense, but given
the complexity of our code generation backend we don't want to do too
much code duplication when it does not save us any significant amount of
cycles. But in my eyes, we should make GCC optimizer friendly, while
trying to keep non-optimizing compilation fast, rather than making GCC
friendly to non-optimizing compilation and making optimizers difficult.