This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Proposal for a 'gcc -O4': interprocedural optimization


Hi Chris,

Chris Lattner wrote:
> On Sat, 24 Aug 2002, Dave Hudson wrote:
>
> Wow.  It certainly sounds like an interesting architecture! :)
It's definitely interesting - actually, if you can keep things to 8 bits so that the accumulator register can be used, then things work surprisingly well. With this one, though, we hide the accumulator from the register allocator completely and only make it visible late in the machine-dependent reorg. Fortunately we have the best push and pop opcodes I've found, and these can work wonders to keep code sizes down.
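To make the 8-bit point concrete, here is a minimal sketch (my own illustration - the types and commentary are generic, not lifted from our actual port): as long as every operand and result fits in 8 bits, the expression can usually stay in the accumulator, but the moment a wider value appears the compiler has to build it in a register pair, with the extra moves, pushes and pops that implies.

#include <stdint.h>

/* Stays 8-bit: on an accumulator machine the whole expression can
   usually be evaluated in the accumulator with no wider temporaries. */
uint8_t sum8(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint8_t)(a + b + c);
}

/* The 16-bit result forces the value into a register pair, which on a
   small accumulator-based target typically costs extra moves and
   pushes/pops. */
uint16_t sum16(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint16_t)a + b + c;
}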

> ... I haven't actually played with interprocedural constant propagation
> myself, but I expect that the opportunities are fairly limited without
> using function cloning: basically you would only use it when the
> parameter to a function is _always_ the _same_ constant.  That said, it
> sounds like you could get some impressive savings just from some simple
> interprocedural register allocation...
Well, most of the library code that we ship with our SDK is written to be completely general-purpose, because we only want one function to handle each type of operation. In practice, however, with networking code (which is what most of our stuff is), most applications tend to use functions in a stylized way appropriate to the problem at hand. As an example, a lot of our code takes a pointer to a datalink layer because we have some apps that run with 8 or even 9 such link layers; in the majority of cases, though, the code only uses one, so if this could be analyzed correctly we could effectively eliminate every single use of such pointers as parameters. There are plenty of similar situations elsewhere.
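As a sketch of the sort of thing I mean (the names here are made up for illustration, not taken from our real SDK): the library routine is written against an abstract datalink, but a typical application only ever hands it one concrete link, so interprocedural constant propagation - with cloning for the apps that really do drive 8 or 9 links - could fold the parameter away and turn the indirect call into a direct one.

/* Hypothetical SDK-style code, purely illustrative. */
struct datalink {
    int mtu;
    int (*send)(const void *buf, int len);
};

/* Written to be completely general: works with any link layer.
   In practice this would live in the library, not the app. */
int dl_send_packet(struct datalink *dl, const void *buf, int len)
{
    if (len > dl->mtu)
        return -1;
    return dl->send(buf, len);
}

/* A typical application only ever uses one link... */
extern struct datalink eth0;

int app_send(const void *buf, int len)
{
    /* ...so every call site passes the same &eth0.  Whole-program
       analysis could propagate that constant, drop the dl parameter,
       and turn the indirect dl->send call into a direct one. */
    return dl_send_packet(&eth0, buf, len);
}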

One of the reasons I suspect more code is not written in a completely general way is that, when tools can't run the sort of analysis we're considering here, the costs of that generality are usually pretty terrible.

>> True, but I think embedded apps are very good examples of where the wins
>> are huge - typically code and data spaces are limited and also product
>> volumes are sufficiently large and costs sufficiently tight that moving
>> to the next sized processor up just isn't an option.
>
> That's actually a good point that I hadn't considered.  With desktops and
> scientific applications, it's nice for things to be a few percent faster,
> but not critical.  With embedded apps, if you bump over the line of what
> your architecture can support, you end up having to move to a different
> processor/architecture/system, which could add to the final cost of the
> product...
Right - more importantly, if some new feature is required then this can mean very costly re-engineering. Another issue is that many smaller embedded-systems companies try to use the same processor family for almost everything they do; generating better code gives them far more options for avoiding a move to multiple architectures.

Of course speed is always useful too :-)


Regards,
Dave

