Stack Reorganization Patch

Tue Feb 11 18:00:00 GMT 2003

"Naveen Sharma, Noida" <naveens@noida.hcltech.com> writes:

> Hi,
>
> This is a re-post of my previous patch at 
>   http://gcc.gnu.org/ml/gcc-patches/2003-01/msg00019.html
>
> I would request the review of the patch and let me know 
> the issues in its acceptance.

I do not have authority to approve this patch myself but I can make
some comments.

(0) The code in stack-reorg.c is clean and readable.  Good job!

(1) I do not see why you have introduced a mechanism for targets to
    disable this optimization.  It shouldn't have machine-specific
    reasons why it won't work or isn't an improvement.  This merely
    adds complexity and makes it harder to test.

(2) You need testing on more architectures.  If you turn on the
    optimization unconditionally, you can test against most
    interesting architectures by using cross compilers to simulator
    targets.  There are detailed instructions for this on the GCC
    website.  Pick five popular architectures, verify that there are
    no regressions, and do a code size comparison for each one.
    (Speed measurements are unlikely to be meaningful for a
    simulator.)

(3) We are not kidding when we say to use diff -crN or -urN.  It is
    significantly harder to work with a submission in which the new
    files appear in a different format from the changes.  Also, please
    do not use -c2.  If you do not have write access, you can fake the
    effect of 'cvs add' by editing CVS/Entries for the directory
    containing the file -- add a line

    /stack-reorg.c/0/dummy timestamp//

    at the end.  Note that as a matter of convention we prefer to use
    dashes in file names, not underscores.

(4) I do not understand why you mark all stack slots live at the end
    of every basic block.  Doesn't this prevent deletion of dead stack
    slots?  We want to be able to discover that a large stack-local
    variable is never used and not even allocate space for it.

(5) Must you call stack_reorg() from the middle of reload?  Can't you
    do it immediately before or after?

(6) You've got code in stack_reorg() to fix up DECL_RTL.  This
    *should* be unnecessary -- nothing should be looking at DECL nodes
    this late in rest_of_compilation.  However, I could be unaware of
    some horrible interaction issue.  Please take that out and see if
    anything breaks.

(7) You added a long complicated block of code to
    assign_stack_local_1; it should be broken out to a separate
    function.

(8) Some of this conditional

+   if (flag_stack_reorg
+       /* Disabling for new regalloc due to some interface problems.  */
+       && !flag_new_regalloc
+       && targetm.reorganize_stack_p
+       && (*targetm.reorganize_stack_p) ()
+       && !stack_reorg_completed
+       && !function->contains_functions
+       && DECL_CONTEXT(function->decl) == NULL_TREE
+       && !function->x_nonlocal_labels)

   is redundant with

+   /* Set stack_reorg_completed to 1 if it is to be disabled 
+      for a target. This will inhibit generation of stack 
+      pseudos.  */
+   if (flag_stack_reorg
+       && targetm.reorganize_stack_p
+       && (*targetm.reorganize_stack_p) ())
+     stack_reorg_completed = 0;
+   else
+     stack_reorg_completed = 1;

   It looks like the latter chunk can be totally removed.  That avoids
   action at a distance.

(9) Documentation issues:

+   { "stack-reorg", &flag_stack_reorg, 1,
+    N_("Allow locals to be reorganized.") },

   This does not explain the optimization very well: suggest instead
   "Compact the stack frame after register allocation."

+ @item -fstack-reorg
+ @opindex fstack-reorg
+ Allow variables on stack to be reorganized. When supported for a target,
+ variables are assigned offsets based on their size and frequency of
+ references. This helps generate better code for targets with short
+ displacements. This option is meant for testing at present.

   Same deal - use active voice, like so:

 Reorganize stack frames, assigning offsets to local variables based
 on their size and frequency of reference.  This produces better code
 especially for targets with short displacements.

+ Nonzero in a @code{reg} if the register holds address of a stack slot.
+ These are generated when hard stack slots are assigned after register
+ allocation.
+ Stored in @code{call} field and printed as @samp{/c}.

   You need 'the' before both 'address' and '@code{call}'.

+ /* The file contains routines to re-order the stack. The stack is sorted
+    by increasing size and decreasing number of references .i.e most
+    frequently referenced variables are near the stack.

  /* This file contains routines to re-order stack frames.  Stack
     local variables are sorted by increasing size and decreasing
     number of references: small, frequently used variables are
     nearer the stack pointer.

+    We need to reorder the stack because it is beneficial to do so for
+    targets which have small displacements. It has following advantages.

     It is especially beneficial to reorder the stack frame for
     targets with small displacements.  The major advantages are:

+    a. Decrease in Code Size: Adjusting a register (w.r.t. frame
+       pointer) too many times so that variable to be accessed lies
+       within short displacement range increases code size. There
+       are number of instructions whose sole purpose is to reach a local
+       variable.
+ 
+    b. Faster Execution: Adjusting large offsets requires a register and this 
+       adds to the register pressure during register allocation phase, 
+       which results in greater number of spills into memory resulting in
+       slower execution. Reducing most commonly referenced variables to 
+       smaller offsets helps generate better code.  */

   I don't understand what you are trying to say in these paragraphs
   so I cannot edit them.

   You have more grammar and style issues in the comments in
   stack-reorg.c but we can worry about them later.

zw