This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Combine four insns


On 08/07/2010 10:10 AM, Eric Botcazou wrote:
> So I think that the patch shouldn't go in at this point.

Richard has approved it.  I'll wait a few more days to see if anyone
else agrees with your position.

> Combining Steven and Bernd's figures, 1% of a bootstrap time is 37% of the 
> combiner's time.  The result is 0.18% more combined insns.  It seems to me 
> that we are already very far into the territory of diminishing returns.

Better to look at actual code generation results IMO.  Do you have an
opinion on the examples I included with the patch?

> Bernd is essentially of the opinion that compilation time doesn't matter.

In a sense, the fact that a single CPU can bootstrap gcc in under 15
minutes is evidence that it doesn't matter.
However, what I'm actually saying is that we shouldn't prioritize
compile time over producing good code, based on what I think users want
more.

> It seems to me that, even if we were to adopt this position, this shouldn't
> mean wasting compilation time, which I think is the case here.

Compile time is wasted only when it's spent on something that has no
user-visible impact.  For all the talk about how important it is, no one
seems to have made an effort to eliminate some fairly obvious sources of
waste, such as excessive use of ggc.  I suspect that some of the time
lost in combine is simply due to inefficient allocation and collection
of all the patterns it creates.

The following crude proof-of-concept patch moves rtl generation back to
obstacks.  (You may need --disable-werror which I just noticed I have in
the build tree).
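For readers unfamiliar with the distinction, here is a minimal sketch of obstack-style (region) allocation. It is not code from the attached patch, nor GCC's actual obstack or ggc implementation; the names `region`, `region_alloc`, `region_release` and the 4096-byte chunk size are hypothetical, chosen only for illustration. The point it tries to show is that allocation is a pointer bump and that everything is released in one sweep, with no marking or collection pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical region allocator, for illustration only.  */
#define CHUNK_SIZE 4096

struct chunk
{
  struct chunk *prev;   /* Previously filled chunk, or NULL.  */
  size_t used;          /* Bytes handed out from DATA so far.  */
  size_t size;          /* Total bytes available in DATA.  */
  max_align_t data[];   /* Payload, aligned for any object type.  */
};

struct region
{
  struct chunk *cur;    /* Chunk currently being filled.  */
};

/* Return N bytes from region R.  The common case is a pointer bump;
   a new chunk is obtained from malloc only when the current one is full.  */
void *
region_alloc (struct region *r, size_t n)
{
  const size_t align = sizeof (max_align_t);
  void *p;

  /* Round the request up so that the next object stays aligned.  */
  n = (n + align - 1) / align * align;

  if (r->cur == NULL || r->cur->size - r->cur->used < n)
    {
      size_t size = n > CHUNK_SIZE ? n : CHUNK_SIZE;
      struct chunk *c = malloc (sizeof *c + size);
      assert (c != NULL);
      c->prev = r->cur;
      c->used = 0;
      c->size = size;
      r->cur = c;
    }

  p = (unsigned char *) r->cur->data + r->cur->used;
  r->cur->used += n;
  return p;
}

/* Free every object in region R at once.  Returns the number of chunks
   released, which is typically far smaller than the number of objects.  */
size_t
region_release (struct region *r)
{
  size_t freed = 0;
  while (r->cur != NULL)
    {
      struct chunk *prev = r->cur->prev;
      free (r->cur);
      r->cur = prev;
      freed++;
    }
  return freed;
}
```

In this sketch a pass that creates many short-lived patterns would call `region_alloc` for each one and `region_release` once at the end, so the cost of discarding them does not depend on how many were created.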

Three runs with ggc:
real 14m8.202s   user 99m23.408s   sys 3m4.175s
real 14m25.045s  user 100m14.608s  sys 3m7.654s
real 14m2.115s   user 99m9.492s    sys 3m4.461s

Three runs with obstacks:
real 13m49.718s user 97m10.766s sys 3m4.311s
real 13m42.406s user 96m39.082s sys 3m3.908s
real 13m49.806s user 97m1.344s sys 3m2.731s

Combiner patch on top of the obstacks patch:
real 13m51.508s user 97m25.865s sys 3m5.938s
real 13m47.367s user 97m28.612s sys 3m7.298s

(The numbers are not comparable to the ones included with the combiner
patch last week, as that tree contained some i386 backend changes as
well which I've removed for this test.)

Even if you take the 96m39s outlier as the obstacks baseline, I think the
numbers show that the overhead of the combine-4 patch is somewhat reduced
when RTL allocation is restored to sanity.
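As a rough cross-check of the comparison, the user times listed above can be averaged per configuration. The following is only a sketch of that arithmetic; the helper names `to_seconds`, `mean_seconds` and `print_summary` are hypothetical, and averaging two or three runs says nothing about statistical significance.

```c
#include <assert.h>
#include <stdio.h>

/* User times copied from the runs listed in this message.  */
const char *const ggc_user[] = { "99m23.408s", "100m14.608s", "99m9.492s" };
const char *const obstack_user[] = { "97m10.766s", "96m39.082s", "97m1.344s" };
const char *const combine_user[] = { "97m25.865s", "97m28.612s" };

/* Convert a `time'-style string such as "97m10.766s" to seconds.
   Returns a negative value if the string does not parse.  */
double
to_seconds (const char *s)
{
  int minutes;
  double seconds;

  if (sscanf (s, "%dm%lfs", &minutes, &seconds) != 2)
    return -1.0;
  return minutes * 60.0 + seconds;
}

/* Mean of the N timing strings in RUNS, in seconds.  */
double
mean_seconds (const char *const *runs, int n)
{
  double sum = 0.0;
  int i;

  assert (n > 0);
  for (i = 0; i < n; i++)
    sum += to_seconds (runs[i]);
  return sum / n;
}

/* Print the mean user time of each configuration and the differences.  */
void
print_summary (void)
{
  double ggc = mean_seconds (ggc_user, 3);
  double obst = mean_seconds (obstack_user, 3);
  double comb = mean_seconds (combine_user, 2);

  printf ("ggc mean user time:        %.3fs\n", ggc);
  printf ("obstacks mean user time:   %.3fs\n", obst);
  printf ("obstacks + combine-4 mean: %.3fs\n", comb);
  printf ("saved by obstacks:         %.3fs\n", ggc - obst);
  printf ("added by combine-4:        %.3fs\n", comb - obst);
}
```

Calling `print_summary` from a `main` function shows the means. By this arithmetic the obstacks runs average roughly 159 seconds less user time than the ggc runs, while the combiner patch adds roughly 30 seconds on top of the obstacks average.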

Since I didn't know what kinds of problems to expect, I've only tried to
find some kind of fix for whatever showed up, not necessarily the best
possible one.  A second pass over everything would be necessary to clean
it up a little.  I'm somewhat disinclined to spend much more than the
one weekend on this; after all I don't care about compile time.


Bernd

Attachment: rtlobst4.diff
Description: Text document

