[PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls

Michael Meissner meissner@linux.vnet.ibm.com
Thu Jul 7 15:50:00 GMT 2011


On Thu, Jul 07, 2011 at 10:59:36AM +0200, Richard Guenther wrote:
> On Thu, Jul 7, 2011 at 12:29 AM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > This patch adds an option to not load the static chain (r11) for 64-bit PowerPC
> > calls through function pointers (or virtual functions).  Most of the languages
> > on the PowerPC do not need the static chain to be loaded when called, and
> > adding this instruction can slow down code that calls very short functions.
> >
> > In addition, if the function does not call alloca, setjmp or deal with
> > exceptions where the stack is modified, the compiler can move the store of the
> > TOC value for the current function to the prologue of the function, rather than
> > at each call site.
> >
> > The effect of these patches is to speed up 464.h264ref in the SPEC 2006
> > benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the
> > save of the TOC register is hoisted).  I believe this is due to the load of the
> > current function's TOC (r2) having to wait until the store queue is drained
> > with the store just before the call.
> >
> > Unfortunately, I do see a 3% slowdown in 429.mcf, and I don't know what the
> > cause is.
> >
> > I have bootstrapped the compiler and saw that there were no regressions in make
> > check.  Is it ok to install in the trunk?
> 
> Hum.  Can't the compiler figure this out itself per-call-site?  At least
> the name of the command-line switch -m[no-]r11 is meaningless to me.
> Points-to information should be able to tell you if the function pointer
> points to a nested function.

No, the compiler cannot figure it out.  Consider the case where a function is
passed a pointer to a function, such as the standard library function qsort.
The call may come from any module that isn't part of the current compilation,
for example when the function receiving the pointer lives in a shared library.
You don't know whether the function pointed to uses the static chain (e.g. a
nested function called through a trampoline, or a function written in PL/I or
another language that does use the static chain, which is part of the ABI).
The switch is similar in spirit to -ffast-math: you say you are willing to
ignore some corner cases in the standard in order to get better performance.
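
To make the qsort-style problem concrete, here is a rough, portable C sketch.
It does not use the real ppc64 calling convention; it only models it.  The
struct callback type and all function names are made up for illustration: the
chain member stands in for the environment pointer that an indirect call would
load into r11, and call_without_chain models what a caller compiled with the
proposed switch would do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an indirect call target: the code address plus the
   environment pointer (static chain) that the caller would load into r11. */
struct callback {
    int (*code)(void *chain, int value);
    void *chain;            /* NULL when the callee has no enclosing frame */
};

/* Ordinary function: it never looks at the static chain. */
static int twice(void *chain, int value)
{
    (void)chain;
    return 2 * value;
}

/* Stand-in for a nested function: it reads a variable that belongs to its
   enclosing frame, reached through the static chain. */
static int add_offset(void *chain, int value)
{
    return value + *(int *)chain;
}

/* Library-style caller, analogous to qsort.  It only sees the pointer it was
   handed and cannot tell which kind of callee is behind it, so the
   conservative choice is to always pass the chain along. */
int call_through_pointer(const struct callback *cb, int value)
{
    assert(cb != NULL && cb->code != NULL);
    return cb->code(cb->chain, value);
}

/* Models the proposed switch: the chain is not loaded at all.  This is only
   safe when every callee behaves like twice() above. */
int call_without_chain(const struct callback *cb, int value)
{
    assert(cb != NULL && cb->code != NULL);
    return cb->code(NULL, value);
}
```

In this model, passing add_offset to call_without_chain would dereference a
null chain, which is the corner case the user accepts by using the switch.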

I certainly can call the switch -mno-static-chain, which is perhaps more
meaningful (at least to us compiler folk; I'm not sure static chain means much
to the normal programmer).

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meissner@linux.vnet.ibm.com	fax +1 (978) 399-6899


