This is the mail archive of the mailing list for the GCC project.


Re: Proposal for a 'gcc -O4': interprocedural optimization

Hi Chris,

Chris Lattner wrote:
On Sat, 24 Aug 2002, Dave Hudson wrote:

Ok, that makes sense.  In this case you would still benefit a lot from
elimination of loads and stores.  Not only do the actual loads and stores
consume instruction space, but folding two loads together has a nice
cascading effect on other optimizations that can be performed (mostly
scalar simplifications arising from better value-numbering information).
Hmm - interesting. Part of what I set out to understand with my recent code is exactly how much we can gain in such situations. "Register" moves are so absurdly expensive that every one I can eliminate makes me very happy (e.g. on the IP2022 each 16-bit reg-to-reg, reg-to-mem or mem-to-mem copy costs 4 opcodes). Just recently I've had some pretty surprising success with some constant propagation code I wrote, so allowing this to span multiple functions could be *very* useful.
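
To illustrate what constant propagation spanning multiple functions could buy, here is a minimal C sketch. The names (scale, read_sensor_scaled, read_sensor_scaled_folded) are hypothetical and not taken from any real code base, and the "folded" version only shows what an optimizer might produce once it can see both caller and callee; it is not a claim about what GCC actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scales a 16-bit value by a power of two. */
static uint16_t scale(uint16_t value, unsigned shift)
{
    assert(shift < 16);                       /* precondition on the shift count */
    return (uint16_t)((unsigned)value << shift);
}

/* The only caller always passes the constant 3. Without looking across
 * the call, the compiler must set up both arguments and perform the
 * call, which costs register moves on every invocation. */
uint16_t read_sensor_scaled(uint16_t raw)
{
    return scale(raw, 3);
}

/* Hand-written equivalent of what an interprocedural optimizer may
 * reduce the function above to: the constant has been propagated into
 * the callee and the call itself has disappeared. */
uint16_t read_sensor_scaled_folded(uint16_t raw)
{
    return (uint16_t)((unsigned)raw << 3);
}
```

Both versions return the same value for every input; the difference is only in how many moves and calls are needed to get there, which is where the saving would come from on a part where each 16-bit copy is expensive.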

You'd be amazed how much of our code uses library code that calls back
into places - almost every event that doesn't happen synchronously
triggers callbacks.  With that said, however, I can suddenly see a whole
range of potential improvements here (at the moment such improvements
Sure, that's absolutely no problem.  Library code is really easy to handle:
just compile the library code and interprocedurally optimize it with the
rest of the application.  Shared objects are the problem, because
currently there is no good way to specify which externally visible
functions may be called by dynamically loaded code.
Hmm - some of our apps have started to use DLLs because we've been tight on space, but if the interprocedural wins were sufficiently large, that would make a very strong case for eliminating them, at least in places. I guess, though, that even in these cases the DLLs are being used more as a paging mechanism than anything else, so their use could be worked out statically anyway.
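
The library-callback case discussed above can be sketched in C as follows. This is only an illustration under stated assumptions: lib_dispatch, count_event and app_handle are hypothetical names, and the "library" and "application" halves are shown in one file to stand in for the two being compiled and optimized together.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type used by the hypothetical library. */
typedef void (*event_cb)(void *ctx, int event);

/* "Library" code: delivers an event through a caller-supplied callback.
 * Compiled on its own, the call through cb is opaque to the optimizer. */
void lib_dispatch(event_cb cb, void *ctx, int event)
{
    assert(cb != NULL);   /* the caller must supply a callback */
    cb(ctx, event);
}

/* "Application" code: the callback the library calls back into. */
static void count_event(void *ctx, int event)
{
    int *counter = ctx;
    *counter += event;
}

/* When library and application are optimized as one unit, the compiler
 * can in principle see that cb is always count_event at this call site,
 * turn the indirect call into a direct one and then simplify further. */
int app_handle(int event)
{
    int counter = 0;
    lib_dispatch(count_event, &counter, event);
    return counter;
}
```

If lib_dispatch lived in a shared object instead, the compiler could not assume it knows every caller or every callback, which is the difficulty with dynamically loaded code raised above.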

This is why I think this sort of optimization has great potential in
many of these sorts of embedded apps.
Not _just_ embedded apps!  :)
True, but I think embedded apps are very good examples of where the wins are huge: code and data space are typically limited, and product volumes are sufficiently large and costs sufficiently tight that moving up to the next-sized processor just isn't an option.

