This is the mail archive of the
mailing list for the GCC project.
Re: [lto][patch] Move the call to execute_all_ipa_transforms to cgraphunit.c
> > OK, do you think you could implement this solution to extern inlines?
> > For now, running the inliner early will get rid of the immediate
> > problem we are having. If you don't have a lot of time, could you
> > send an outline of what needs to be done?
> I think I still prefer to run the inliner and drop the extern inline
> functions. The option of fully transferring them to wpa would be
> harder, since now the compiler would see more function bodies than the
> linker and would need to decide what to do with them. I am afraid that
> converting extern inline into static functions would break some code
> that has unreasonable expectations about a function defined in another
> file being called.
Well, the scheme that I think would be ideal should work here.
Basically we should
1) Make the C and C++ front ends keep both copies of a function of the
same name (the extern inline body and the non-extern-inline body) if
available.
2) Teach LTO to output the extern inlines as static functions and to
direct all calls to the extern inline variant.
3) Teach the inliner to redirect all the remaining calls to the
externally visible body. We already do work on removing out-of-line
extern inlines, so this can be bundled there.
This should have precisely the intended semantics of extern inline,
eliminating the current bug that extern inlines are ignored with
--combine and when an offline body is present.
This (except for the LTO bits) has been on my TODO list for years, but
I never got around to implementing 1) correctly. 3) is very easy to do...
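
To make the two bodies from 1) concrete, here is a minimal sketch of the
source pattern in question. The names square, sum_of_squares and
self_check are made up for illustration, and the gnu_inline attribute is
only there to request the GNU89 meaning of extern inline under newer C
dialects. Both bodies compute the same value on purpose, so the result
does not depend on which one the compiler ends up using.

```c
#include <assert.h>

/* Inline variant: with GNU89 semantics this body is used only for
   inlining and no out-of-line copy is emitted from it. */
extern inline __attribute__((gnu_inline)) int square(int x)
{
    return x * x;
}

/* Externally visible (offline) body of the same name. In real code this
   usually lives in a library source file; defining it in the same
   translation unit mirrors the "offline body is present" case. */
int square(int x)
{
    return x * x;
}

/* A caller: under the scheme above, calls that get inlined would use the
   first body and the remaining calls would go to the second. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

/* Small self-check for the sketch (hypothetical helper). */
void self_check(void)
{
    assert(sum_of_squares(3, 4) == 25);
}
```

In practice the two bodies should stay equivalent, as they are here.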
> Doing an early inline (and possibly other optimizations) also has the
> benefit of reducing the size of the IL that is written to disk.
We already do this kind of early inlining during the einline pass.
The einline pass basically does the work that pays back in both size
and speed, plus the work that will always be done (always_inline and
flatten attribute processing), while the full inliner is there to trade
size for speed based on knowledge of the whole unit.
So you should not need to reorder passes here at all; you will just lose
some extern inline opportunities. (I think the C front end makes extern
inline functions always inline, so the early inliner will pick them up
except for weird recursive cases where the topological order does not
allow it. In C++ I think they are not always inline, so the early
inliner will pick only very small ones. If you need an immediate
workaround, perhaps making the C++ ones disregard inline limits too
would work.)
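
For reference, the two attributes mentioned above look like this in
source. This is only a sketch of the user-visible syntax with made-up
function names (add_one, twice, pipeline, self_check); it does not show
anything about the passes themselves.

```c
#include <assert.h>

/* always_inline: calls to this function are inlined regardless of the
   usual inline size limits. */
static inline __attribute__((always_inline)) int add_one(int x)
{
    return x + 1;
}

/* An ordinary helper with no inline hints. */
static int twice(int x)
{
    return 2 * x;
}

/* flatten: asks the compiler to inline, where possible, every call made
   inside this function into its body. */
__attribute__((flatten)) int pipeline(int x)
{
    return twice(add_one(x));
}

/* Small self-check for the sketch (hypothetical helper). */
void self_check(void)
{
    assert(pipeline(3) == 8);
}
```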
> > Thanks. Diego.
> Rafael Avila de Espindola
> Google | Gordon House | Barrow Street | Dublin 4 | Ireland
> Registered in Dublin, Ireland | Registration Number: 368047