[PATCH] Omit frame pointer and fix %ebp by default on x86 (take 3)

Roger Sayle roger@eyesopen.com
Mon Aug 16 01:39:00 GMT 2004


Have you ever had one of those days where you spend several hours
implementing a new -mfixed-ebp backend flag, bootstrap, regression
test and whilst analysing the testsuite failures you discover the
pre-existing -ffixed-ebp flag?  Doh!


The following patch is the latest revision of a patch to enable
-fomit-frame-pointer by default on x86.  The GDB and GCC debugging
folks have done an impressive job supporting debugging without a
frame pointer, and it would be a shame if 3.5 didn't benefit from
those efforts.  As recently as a few hours ago, one of GCC's
benchmarking gurus reported new performance figures of GCC vs icc
without using "-fomit-frame-pointer", reflecting the need to get
better optimization with GCC's default flags.

As the old adage goes, "it's not the bullet that kills you, it's the
hole that it leaves behind".  In GCC's case, it's not -fomit-frame-pointer
that causes problems for stack unwinding and glibc's backtrace, but
the use of %ebp once it isn't required as a frame pointer.
If a function allocates %ebp as a general purpose register across
a function call, that function will dutifully preserve %ebp as a
callee-saved register in its stack frame, resulting in garbage in
the linked list of frames that backtrace(3) expects to be able to
traverse.
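
To make the failure mode concrete, here is a minimal, portable C sketch
of the kind of walk a frame-pointer-based backtrace performs.  It is a
simulation under stated assumptions, not glibc's implementation: the
names struct frame, walk_frames and END_OF_CHAIN are hypothetical, the
"stack" is an ordinary array, and links are array indices rather than
real addresses so that a garbage link can be detected safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for the outermost frame.  */
#define END_OF_CHAIN (-1)

/* Hypothetical model of a frame record: the saved %ebp of the caller
   (here an index into the simulated stack) and the return address.  */
struct frame
{
  long saved_ebp;
  unsigned long return_address;
};

/* Follow the saved-%ebp links starting at frame EBP, storing at most
   MAX return addresses in OUT.  The walk stops at END_OF_CHAIN or at
   any link that does not name a valid frame, which is how this sketch
   models "garbage" in the chain.  Returns the number of frames seen.  */
static size_t
walk_frames (const struct frame *stack, size_t nframes, long ebp,
             unsigned long *out, size_t max)
{
  size_t n = 0;

  assert (stack != NULL && out != NULL);
  while (ebp >= 0 && (size_t) ebp < nframes && n < max)
    {
      out[n++] = stack[ebp].return_address;
      ebp = stack[ebp].saved_ebp;
    }
  return n;
}
```

With three correctly linked frames the walk reports all three return
addresses.  If the innermost frame instead saved an arbitrary value
that its caller had been keeping in %ebp, the walk ends after a single
frame; a real unwinder following a raw pointer has no such bounds
check and may report nonsense or fault.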

The solution is to consider the ability to avoid saving and restoring
%ebp in a function as independent of whether we allow %ebp to be used
as a spare register.  Fortunately, GCC already allows this through
the combination of "-fomit-frame-pointer" and "-ffixed-ebp".  This
allows GCC to avoid the overhead of building stack frames, but still
avoid problems for languages/run-times that expect to unwind
through the stack.

In this light, the i386 backend's -momit-leaf-frame-pointer can be
seen as avoiding problems, as the use of %ebp to hold arbitrary
values in a leaf function avoids corrupting the backtrace chain.
Unfortunately, this is only a partial solution as non-call traps,
interrupts and signals can still potentially encounter problems.


The simple patch below tweaks the i386 backend, such that we now
default to the equivalent of "-fomit-frame-pointer -ffixed-ebp" on
32-bit targets, when optimizing and the user hasn't explicitly
specified a frame pointer option, either -fomit-frame-pointer,
-fno-omit-frame-pointer or -momit-leaf-frame-pointer.  If any of
these options is specified, or we're targeting x86_64, there's no
functional change.
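
The decision just described can be sketched as a small stand-alone C
function.  This is only an illustration of the 32-bit branch touched by
the patch, not code from i386.c: struct fp_defaults, resolve_fp_defaults
and FLAG_UNSET are hypothetical names, and the optimization level and
the x86_64 case are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Sentinel meaning "not set on the command line", mirroring the value
   2 that the backend tests flag_omit_frame_pointer against.  */
#define FLAG_UNSET 2

/* Hypothetical result type: the resolved flag and whether %ebp is
   marked as a fixed (and call-used) register.  */
struct fp_defaults
{
  int omit_frame_pointer;
  bool fix_ebp;
};

/* Resolve the defaults for a 32-bit target.  An explicit 0 or 1 from
   the user is passed through untouched; only the sentinel is replaced.  */
static struct fp_defaults
resolve_fp_defaults (int flag_omit_frame_pointer, bool omit_leaf_fp)
{
  struct fp_defaults d = { flag_omit_frame_pointer, false };

  assert (flag_omit_frame_pointer >= 0
          && flag_omit_frame_pointer <= FLAG_UNSET);
  if (flag_omit_frame_pointer == FLAG_UNSET)
    {
      if (!omit_leaf_fp)
        {
          /* New default: omit the frame pointer but keep %ebp fixed.  */
          d.omit_frame_pointer = 1;
          d.fix_ebp = true;
        }
      else
        /* -momit-leaf-frame-pointer given: behave as before.  */
        d.omit_frame_pointer = 0;
    }
  return d;
}
```

Only the (FLAG_UNSET, no -momit-leaf-frame-pointer) combination yields
a fixed %ebp; every other input leaves the register allocatable, which
matches the "no functional change" claim for explicit options.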

Many thanks to RTH for help with fixed_regs[HARD_FRAME_POINTER_REGNUM].


There are perhaps some subtle points with the patch below.  The first
is that we don't define CAN_DEBUG_WITHOUT_FP in i386.h; its use in
opts.c's decode_options interferes with the ability of i386.c's
override_options to determine whether the user explicitly altered the
value of flag_omit_frame_pointer [the i386.c backend uses a rogue value
of 2 to determine if the default has been unmodified].

The second point is that we update both fixed_regs and call_used_regs.
The failures mentioned above are caused by the fact that reload
assumes that call_used_regs is a superset of fixed_regs, and so
just modifying fixed_regs leads to a few additional testsuite
failures.
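
The invariant in question is easy to state as a check.  The following
is a hedged sketch only: fixed_implies_call_used is a hypothetical
helper written for this message, not a function in GCC, and the arrays
are plain char arrays standing in for the real register tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if every register marked in FIXED is also marked in
   CALL_USED, i.e. if call_used is a superset of fixed.  */
static bool
fixed_implies_call_used (const char *fixed, const char *call_used,
                         size_t nregs)
{
  size_t i;

  assert (fixed != NULL && call_used != NULL);
  for (i = 0; i < nregs; i++)
    if (fixed[i] && !call_used[i])
      return false;
  return true;
}
```

Marking a register in the fixed table alone makes the check fail, and
marking it in both tables, as the patch does for
HARD_FRAME_POINTER_REGNUM, restores it.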

Finally, the third point is that it's appropriate to set these values
to one.  OVERRIDE_OPTIONS is called just before backend_init in
do_compile, which in turn eventually calls CONDITIONAL_REGISTER_USAGE.
In the x86 backend, prior to CONDITIONAL_REGISTER_USAGE this array
actually holds a bitmask that is converted to a boolean value by that
macro.  The bit "1" applies to 32-bit targets, and bit "2" applies to
64-bit x86_64 targets (i.e. TARGET_64BIT).  At this point, storing
"1" is both the correct and safe thing to do.

[Note there's a latent bug here, "-ffixed-$reg$" doesn't work on
x86_64, as opts.c's calls to fix_register with the value "1" occur
before the backend is initialized.  I'll submit a fix separately].
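
The bitmask-to-boolean conversion described above can be sketched as
follows.  This is an illustration based only on the description in
this message, not the text of the macro itself: convert_reg_masks,
MASK_32BIT and MASK_64BIT are hypothetical names, and a plain char
array stands in for fixed_regs or call_used_regs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit meanings as described above: 1 for 32-bit, 2 for x86_64.  */
#define MASK_32BIT 1
#define MASK_64BIT 2

/* Collapse each entry of REGS from a per-target bitmask to a 0/1
   boolean for the selected target.  */
static void
convert_reg_masks (char *regs, size_t nregs, bool target_64bit)
{
  const int bit = target_64bit ? MASK_64BIT : MASK_32BIT;
  size_t i;

  assert (regs != NULL);
  for (i = 0; i < nregs; i++)
    regs[i] = (regs[i] & bit) != 0;
}
```

Under this model an entry holding 1 stays set when converting for a
32-bit target but is cleared when converting for x86_64, which is
consistent both with storing "1" in the patch and with the latent
"-ffixed-$reg$" problem noted above.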

The user is still able to specify that we want to create a linked
stack frame for every function by using "-fno-omit-frame-pointer" (as
is already done by libjava).  The -momit-leaf-frame-pointer option
works as it has before, but is no longer strictly faster than the
default.  The fastest performance still comes from -fomit-frame-pointer,
but this is unsafe w.r.t. backtrace(3), as it's always been.


As described in the postings of previous versions of this patch, this
approach produces both smaller and faster code on IA-32.  My apologies
to Zack for not knowing exactly how much faster relative to full
"-fomit-frame-pointer" or "-momit-leaf-frame-pointer".


The following patch has been tested on i686-pc-linux-gnu with a full
"make bootstrap", all default languages, and regression checked with a
top-level "make -k check" with no new failures.  I've also run numerous
tests by hand.

Ok for mainline?



2004-08-15  Roger Sayle  <roger@eyesopen.com>
	    Richard Henderson  <rth@redhat.com>

	PR middle-end/16373
	* config/i386/i386.c (override_options): Default 32-bit targets to
	"-fomit-frame-pointer -ffixed-ebp" when optimizing if the user
	hasn't explicitly specified a frame pointer command line option.


Index: config/i386/i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.708
diff -c -3 -p -r1.708 i386.c
*** config/i386/i386.c	13 Aug 2004 04:29:01 -0000	1.708
--- config/i386/i386.c	15 Aug 2004 21:24:12 -0000
*************** override_options (void)
*** 1204,1210 ****
    else
      {
        if (flag_omit_frame_pointer == 2)
! 	flag_omit_frame_pointer = 0;
        if (flag_asynchronous_unwind_tables == 2)
  	flag_asynchronous_unwind_tables = 0;
        if (flag_pcc_struct_return == 2)
--- 1204,1223 ----
    else
      {
        if (flag_omit_frame_pointer == 2)
! 	{
! 	  /* On 32-bit targets, if the user hasn't specified either
! 	     -fomit-frame-pointer, -fno-omit-frame-pointer nor
! 	     -momit-leaf-frame-pointer, default to omitting the frame
! 	     pointer but with -ffixed-ebp to preserve backtrace.  */
! 	  if (!TARGET_OMIT_LEAF_FRAME_POINTER)
! 	    {
! 	      call_used_regs[HARD_FRAME_POINTER_REGNUM] = 1;
! 	      fixed_regs[HARD_FRAME_POINTER_REGNUM] = 1;
! 	      flag_omit_frame_pointer = 1;
! 	    }
! 	  else
! 	    flag_omit_frame_pointer = 0;
! 	}
        if (flag_asynchronous_unwind_tables == 2)
  	flag_asynchronous_unwind_tables = 0;
        if (flag_pcc_struct_return == 2)


Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


