This is the mail archive of the gcc-patches
mailing list for the GCC project.
RFA: Reduce iterations of elimination bookkeeping
- From: Jeff Law <law at redhat dot com>
- To: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 03 Apr 2009 10:29:01 -0600
- Subject: RFA: Reduce iterations of elimination bookkeeping
A few weeks ago I instrumented the top section of the main reload
loop. Specifically, I was looking to see how often we have to do
elimination bookkeeping.
For an x86 bootstrap, we call reload 101k times and we perform
elimination bookkeeping 151k times. Obviously, we're iterating less
than half the times we call reload -- that was somewhat of a surprise.
However, there's still 50k iterations of elimination bookkeeping that
are worth looking at.
We will iterate on elimination bookkeeping any time the size of the
frame changes: when we spill a pseudo to memory, allocate a
caller-save slot, align the stack, or spill a memory address. We also
iterate for a variety of other reasons, such as unexpected changes in
elimination offsets, a previously eliminable register no longer being
eliminable, or any spill code generation.
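To make the "iterate whenever the frame size changes" rule concrete, here is a toy model of the loop shape. It is only a sketch and not code from reload1.c: the function name and the grow[] array are hypothetical, and each array entry stands for whatever a pass adds to the frame (a spill, a save slot, alignment padding).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not GCC code: grow[i] is the number of bytes that pass i
   adds to the frame after its bookkeeping has run.  Returns how many
   bookkeeping passes are needed before the frame size is stable.  */
static int
count_bookkeeping_passes (const int *grow, size_t n)
{
  int frame_size = 0;
  int passes = 0;
  size_t i = 0;

  assert (grow != NULL || n == 0);

  for (;;)
    {
      int starting_frame_size = frame_size;

      passes++;                     /* one round of elimination bookkeeping */
      if (i < n)
        frame_size += grow[i++];    /* something allocated a stack slot */
      if (frame_size == starting_frame_size)
        break;                      /* frame is stable, offsets are valid */
    }
  return passes;
}
```

In this model, two passes that each grow the frame cost three rounds of bookkeeping in total, since a final pass is needed to confirm that nothing changed.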
About 13k of the iterations occur because we allocated a stack slot for
caller-save registers. Yup, that's right, 13k just because we allocated
a slot for a caller-save. These iterations can be trivially avoided by
allocating caller-save slots before we do the elimination bookkeeping.
About 26k iterations occur because of requested stack alignments. Yes,
we allocate a slot to ensure stack alignment *after* elimination
bookkeeping. Interestingly enough, we used to align prior to
elimination bookkeeping, but the code was moved in response to pr29248
and pr28966. Fortunately, all that was really necessary to fix those
PRs was to avoid aligning the stack if none had yet been allocated --
changing the sequencing of elimination bookkeeping & stack alignment was
not necessary and obviously was making more work for reload than was
necessary.
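The reason for skipping alignment on an empty frame can be shown with a small worked model. This is a simplification under stated assumptions, not GCC code: the frame is assumed to grow upward from a starting offset, and the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: the frame occupies
   [start_offset, start_offset + size) and grows upward.  Returns the
   frame size after an alignment request of ALIGN bytes.  */
static int
frame_size_after_align (int start_offset, int size, int align,
                        bool skip_if_empty)
{
  assert (start_offset >= 0 && size >= 0 && align > 0);

  if (skip_if_empty && size == 0)
    return 0;                   /* no frame yet, so do not create one */

  int end = start_offset + size;
  int aligned_end = (end + align - 1) / align * align;
  return aligned_end - start_offset;
}
```

With a starting offset of 4 and a 16-byte alignment, an unconditional request turns an empty frame into a 12-byte one, while the guarded version leaves it at 0. A frame that already holds 8 bytes is padded to 12 either way.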
With those two fixes to sequencing, we can eliminate 38k (of the 50k)
iterations of elimination bookkeeping. The remaining iterations are
almost exclusively due to spilling.
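The effect of the two sequencing changes can be sketched with a toy pass counter. This is an illustration only, not the reload code: it ignores spills entirely, and the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Round SIZE up to a multiple of ALIGN (ALIGN > 0).  */
static int
round_up (int size, int align)
{
  return (size + align - 1) / align * align;
}

/* Toy model, not GCC code.  Count bookkeeping passes for one function
   whose frame starts at FRAME bytes, needs SAVE_BYTES of caller-save
   slots, and must be padded to ALIGN bytes.  Spills are ignored.  */
static int
bookkeeping_passes (bool allocate_first, int frame, int save_bytes,
                    int align)
{
  bool save_done = false;
  int passes = 0;

  assert (frame >= 0 && save_bytes >= 0 && align > 0);

  if (allocate_first)
    {
      /* New order: settle the frame before any bookkeeping.  */
      frame += save_bytes;
      save_done = true;
      if (frame > 0)
        frame = round_up (frame, align);
    }

  for (;;)
    {
      int starting_frame_size = frame;

      passes++;                 /* elimination bookkeeping runs here */

      if (!allocate_first)
        {
          /* Old order: save slots and alignment come after bookkeeping,
             and any change in frame size forces another pass.  */
          if (!save_done)
            {
              frame += save_bytes;
              save_done = true;
            }
          if (frame != starting_frame_size)
            continue;
          if (starting_frame_size > 0)
            frame = round_up (frame, align);
          if (frame != starting_frame_size)
            continue;
        }
      return passes;
    }
}
```

For an empty frame that needs a 12-byte save area and 16-byte alignment, the old ordering takes three passes in this model (one for the save slot, one for the alignment padding, one to confirm stability), while the new ordering takes one.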
From a code generation standpoint, these changes permute where objects
land in the frame, so it's possible we can get minor code generation
differences on targets with restricted displacements in reg+d addressing
modes. We can also get some changes in cache behaviour. I would expect
both effects to be neutral overall.
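As a rough illustration of the displacement concern, the check below asks whether a frame offset fits in a signed displacement field of a given width. The helper name and the 8-bit width used in the note are assumptions for the example, not a description of any particular target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does OFFSET fit in a signed displacement field
   that is BITS wide?  */
static bool
fits_signed_displacement (long offset, unsigned bits)
{
  assert (bits >= 2 && bits < 32);

  long limit = 1L << (bits - 1);
  return offset >= -limit && offset < limit;
}
```

With an 8-bit field, an object at offset 120 is reachable in a single reg+d access, but the same object moved to offset 136 is not, so a reordering of frame objects can change which accesses need extra address arithmetic.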
From a compile-time standpoint, we're clearly doing less work, so I'd
expect some minor (possibly unmeasurable) compile-time improvements.
Bootstrapped and regression tested on i686-pc-linux-gnu. I also
verified pr29248 and pr28966 continue to generate the desired code.
* reload1.c (reload): Allocate caller-save areas and conditionally
align the stack before elimination bookkeeping.
--- reload1.c (revision 145487)
+++ reload1.c (working copy)
@@ -965,11 +965,31 @@
- starting_frame_size = get_frame_size ();
+ /* Set up the caller-save areas before elimination bookkeeping.
+ This eliminates about 25% of the iterations of the elimination
+ bookkeeping code as we no longer have to iterate the bookkeeping
+ if CALLER_SAVE_NEEDED is true. */
+ if (caller_save_needed)
+ setup_save_areas ();
+ /* If we have a stack frame, go ahead and align it before we
+ handle elimination bookkeeping. This avoids another 50% of the
+ iterations of the bookkeeping code.
+ We don't align if there is no stack, as that will cause a stack
+ frame when none is needed should STARTING_FRAME_OFFSET not be
+ already aligned to STACK_BOUNDARY. */
+ if (get_frame_size () && crtl->stack_alignment_needed)
+ assign_stack_local (BLKmode, 0, crtl->stack_alignment_needed);
+ /* Do this after we have set up the caller-save areas and handled
+ the stack alignment requests. This allows elimination bookkeeping
+ to stabilize without iterating much more often. */
+ starting_frame_size = get_frame_size ();
/* For each pseudo register that has an equivalent location defined,
try to eliminate any eliminable registers (such as the frame pointer)
assuming initial offsets for the replacement register, which
@@ -1025,26 +1045,9 @@
- if (caller_save_needed)
- setup_save_areas ();
/* If we allocated another stack slot, redo elimination bookkeeping. */
if (starting_frame_size != get_frame_size ())
- if (starting_frame_size && crtl->stack_alignment_needed)
- /* If we have a stack frame, we must align it now. The
- stack size may be a part of the offset computation for
- register elimination. So if this changes the stack size,
- then repeat the elimination bookkeeping. We don't
- realign when there is no stack, as that will cause a
- stack frame when none is needed should
- STARTING_FRAME_OFFSET not be already aligned to
- STACK_BOUNDARY. */
- assign_stack_local (BLKmode, 0, crtl->stack_alignment_needed);
- if (starting_frame_size != get_frame_size ())