Bug 30736 - stack references scheduled below stack pointer adjustment
Summary: stack references scheduled below stack pointer adjustment
Status: RESOLVED DUPLICATE of bug 30282
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-08 17:24 UTC by Michael Matz
Modified: 2007-02-08 19:17 UTC (History)
3 users (show)

See Also:
Host: powerpc-linux
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Matz 2007-02-08 17:24:58 UTC
This was initially seen internally with a 4.1.2 gcc plus power6 patches,
but with some adjustments to haifa-sched.c (only touching the order in the
ready list) can be reproduced with trunk.  The underlying issue is, that
the stack pointer restore instruction in the rs6000 backend does not conflict
with all insns accessing stack memory, but only with those created by
gen_frame_mem().  Attached is a C++ testcase (from khtml).  Bear with me while
explaining the error, it's hard to construct a small runtime testcase.

Suppose this patch is applied to trunk:

-------------------------------------
Index: haifa-sched.c
===================================================================
--- haifa-sched.c       (revision 121710)
+++ haifa-sched.c       (working copy)
@@ -798,6 +798,8 @@ priority (rtx insn)
            }
          while (twin != prev_first);
        }
+if (INSN_UID(insn) == 235 || INSN_UID(insn)==230)
+  this_priority += 10;
       INSN_PRIORITY (insn) = this_priority;
       INSN_PRIORITY_KNOWN (insn) = 1;
     }
@@ -930,7 +932,7 @@ rank_for_schedule (const void *x, const
   /* If insns are equally good, sort by INSN_LUID (original insn order),
      so that we make the sort stable.  This minimizes instruction movement,
      thus minimizing sched's effect on debugging and cross-jumping.  */
-  return INSN_LUID (tmp) - INSN_LUID (tmp2);
+  return INSN_LUID (tmp2) - INSN_LUID (tmp);
 }

 /* Resort the array A in which only element at index N may be out of order.  */
@@ -2402,7 +2404,7 @@ schedule_block (basic_block *target_bb,
          if (ready.n_ready > 0)
            ready_sort (&ready);

-         if (targetm.sched.reorder2
+         if (0 && targetm.sched.reorder2
              && (ready.n_ready == 0
                  || !SCHED_GROUP_P (ready_element (&ready, 0))))
            {
-----------------------------------------

I.e. switch off target specific reorder2, reorder insns by _inverse_ uid
and special case two instructions by priority in order to force them to be
output earlier.  The compile the attached testcase with:

% ./gcc/cc1plus -O2 -quiet -fsched-verbose=6 -da bug.ii

Examine the assembler output for the docWidth method (not the thunk):

_ZNK5khtml12RenderCanvas8docWidthEv:
   ....
        addi 9,1,12                 r9 = r1+12
.L22:
        lwz 0,52(1)
        lwz 31,44(1)
        lwz 30,40(1)
        lwz 29,36(1)
        lwz 28,32(1)
        mtlr 0
        addi 1,1,48                 r1 = r1+48
        lwz 3,0(9)                  access [r9] == [r1-36]
        blr

Note that the access to [r9] is below the adjustment to r1 (stack pointer).
When a signal occurs at exactly that place this will most probably have been
overwritten.  I'm aware of the red zone on ppc64 (and in fact also on ppc
linux) but with a more complicated function the invalidly accessed offset
could be made larger.

The reason why this happens can be seen in the dumps.  The relevant insns are
(before sched2):

(insn:HI 153 152 154 18 (set (reg/v/f:SI 9 9 [orig:123 a ] [123])
        (plus:SI (reg/f:SI 1 1)
            (const_int 12 [0xc]))) 79 {*addsi3_internal1} (nil)
    (expr_list:REG_EQUAL (plus:SI (reg/f:SI 1 1)
            (const_int 12 [0xc]))
        (nil)))
...
(insn:HI 157 155 228 19 (set (reg:SI 3 3 [orig:135 D.62311 ] [135])
        (mem:SI (reg/v/f:SI 9 9 [orig:123 a ] [123]) [4 S4 A32])) 310
       {*movsi_internal1} (nil)
    (expr_list:REG_DEAD (reg/v/f:SI 9 9 [orig:123 a ] [123])
        (nil)))

(insn 228 157 229 19 (use (reg/i:SI 3 3 [ <result> ])) -1 (nil)
    (nil))
...
(insn 231 230 232 19 (set (reg:SI 28 28)
        (mem/c:SI (plus:SI (reg/f:SI 1 1)
                (const_int 32 [0x20])) [69 S4 A8])) 310 {*movsi_internal1}
        (nil)(nil))

[a couple more loads from callee saved reg from [r1+offset]

(insn 235 234 236 19 (set (reg/f:SI 1 1)
        (plus:SI (reg/f:SI 1 1)
            (const_int 48 [0x30]))) 79 {*addsi3_internal1} (nil)
    (nil))

(jump_insn 236 235 239 19 (parallel [
            (return)
            (use (reg:SI 65 lr))
        ]) 544 {*return_internal_si} (nil)
    (nil))

Note how the second last insn, the stack pointer adjustment can't cause any
dependency of any sort with insn 157, as the latter uses register 9 to access
the stack.  Note further that all accesses to the saved registers are using
alias set 69, while the read from [r9] (for the to be returned value) is using
alias set 4.  From inside the compiler it can be checked that both alias sets
do not conflict.

That is important to know, because the rs6000 backend has provisions to
generate a sort of schedule barrier instruction supposed to clobber stack
memory in order to not resort the saved register instructions with the stack
adjustment.  That's the 'stack_tie' instruction.  That one uses
gen_frame_mem() to create a MEM rtx which it can clobber, but that will have
(in this case) alias set 4 (get_frame_alias_set() ).  So, even if it would
have been emitted it wouldn't stop this particular reordering (because not
also clobbering stuff in alias set 69).

But that reorder barrier isn't emitted by the backend anyway (see
rs6000.c:rs6000_emit_epilogue) as frame_reg_rtx and sp_reg_rtx are equal for
our case.

I think the right solution here is to
1) always emit the reorder barrier
2) make the reorder barrier use alias set 0

Alternatively (but more difficult to implement) would be to not clobber
alias set 0, but another new alias set conflicting with _all_ frame related
alias sets.  That could also be reached by not using alias set 4 for the
load of the returned value in this case, but in general the alias set comes
from the middle end (the decls) already.

The proposed solution is for instance also used in the i386 backend, where in
fact the stack pointer adjusting instructions itself clobbers all memory.

The same problem also exists in the 4.1 branch (and probably 4.2, but haven't
checked).
Comment 1 Michael Matz 2007-02-08 17:35:17 UTC
I can't currently attach anything, somehow bugzilla is broken.
Anyway, perhaps the description of what happens is clear enough to see the
error.  Btw: this patch will solve the problem on trunk per my proposal:

Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c      (revision 121710)
+++ config/rs6000/rs6000.c      (working copy)
@@ -14112,6 +14112,7 @@ rs6000_emit_stack_tie (void)
   rtx mem = gen_frame_mem (BLKmode,
                           gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));

+  set_mem_alias_set (mem, 0);
   emit_insn (gen_stack_tie (mem));
 }

@@ -14561,9 +14562,8 @@ rs6000_emit_prologue (void)
                                       || info->first_fp_reg_save < 64
                                       || info->first_gp_reg_save < 32
                                       )));
-      if (frame_reg_rtx != sp_reg_rtx)
-       rs6000_emit_stack_tie ();
     }
+  rs6000_emit_stack_tie ();

   /* Handle world saves specially here.  */
   if (WORLD_SAVE_P (info))
@@ -15501,11 +15501,11 @@ rs6000_emit_epilogue (int sibcall)

   /* If this is V.4, unwind the stack pointer after all of the loads
      have been done.  */
+  rs6000_emit_stack_tie ();
   if (frame_reg_rtx != sp_reg_rtx)
     {
       /* This blockage is needed so that sched doesn't decide to move
         the sp change before the register restores.  */
-      rs6000_emit_stack_tie ();
       emit_move_insn (sp_reg_rtx, frame_reg_rtx);
     }
   else if (sp_offset != 0)
-------------------------------------------
Comment 2 Andrew Pinski 2007-02-08 19:17:15 UTC

*** This bug has been marked as a duplicate of 30282 ***