This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: C compile time

Jan Hubicka wrote:

The slowdown at -O3 is of course due to unit-at-a-time, which can be a big memory sucker if there are lots of opportunities to inline static

I was doing some estimates there, and for compiling SPEC sources the
slowdown is proportinal to the code size growth caused by
unit-at-a-time.  So the problem does not seem to be increased memory
overhead, just the fact that we find more inlining opurtunities and
inline more. GCC is probably good testcase for this as many parts are
ordered top-down instead of bottom-up or old inliner likes.

You can also use the test case for PR 10155 to see the RAM abuse (and PR 11121 while you're at it :-)

I seem to recall what happens for GCC compilation - in my original
unit-at-a-time benchmarking I did same test and found almost all
slowdown to come from insn-recog. That one is large file divided into
few functions that all inline together (as they are called once) and we
run into several quadratic bottlenecks.  Originally it took about 1 hour
to compile on my K6 test machine.   I implemented
-finline-functions-called-once for this reason but during the review
process we concluded to remove it.  Perhaps this is good reason to put
it back.

Maybe there should be an expand_function_called_once_inline() function that does not duplicate the whole function body for inlining. If you already know that you're only going to inline that function and not output it, then this should help a lot for memory consumption. Of course that doesn't help the backend...

I did elliminated majority of these (there was reload-cse, local cse,
sheduler problems) with single exception of GCSE that has problems with
doing transparent bitmaps for memories.  I didn't see direct way to get
out of that - you do have too many memories and to many alias classes to
care about.  The function analyses transparency of all these memories
even in the areas they are not live - this is how does PRE algorithm work
and without SSAPRE there is probably not too much to invent about.  The
algorithm simply is N^3.

One sollution would be to limit growth of the function I am inlining
once into so functions won't get too large and we won't exhibit this too
often.  I seen some of the testcases at unit-at-a-time and they usually
seem to exhibit similar behaviour.

I believe that this is the solution you should persue for now.

There are several other things we can do with unit-at-a-time - limit
overall growth of compilation unit or decide whether to inline or not
based on overall growth, not function size.

With unit-at-a-time, do we start inlining from the callgraph leaves or from the root? If we do the former, than you could simply stop inining if a function grows too large. Another thing you could try is to first compute what the total size of all the functions would be after inlining everything (all the information you need is available) and make inline decisions based on that.

Unfortunate thing on this path is that I will probably not be able to do
unit-at-a-time for C++ without some help and thus this work won't help
C++ where we get major problems.

Yes -- you recently mentioned that Mark had some ideas about C++ unit-at-a-time. Could you tell a bit more about that?

Other sollution would be to not give up so easilly and try to push more
on keeping compiler linear on the compilation.  For GCSE we probably can
give up when having too many memory references and simply do not record
the others.  Similarly we can deal with quadratic bottlenecks in the
scheduler, but I am not sure whether we will run into problems in
register allocation that is unavoidably quadratic by definition but do
far does not appear to cause major obstackles.

Well if GCSE is O(N^3) then that is obviously a good place to start and see what the effects are for the generated code. But eventually, keeping N small is the only thing that really helps if you have unavoidable quadratic or worse algorithms. So that means, make sure functions don't grow excessively large. That helps reducing memory consumption, too.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]