[PATCH] Fix ARM_DOUBLEWORD_ALIGN with nested functions

Olivier Hainque hainque@adacore.com
Thu Jan 17 14:42:00 GMT 2008


Hello,

 This is a followup on the "Problem with ARM_DOUBLEWORD_ALIGN on ARM"
 thread at http://gcc.gnu.org/ml/gcc/2007-11/msg00551.html.

 This is all related to the processing of cases with
 frame_pointer_needed, TARGET_ARM and a nested function, in particular
 case #2 in

      arm_expand_prologue
      ...
      else if (IS_NESTED (func_type))
	{
	  /* [...] need to find somewhere to store IP whilst the frame
             is being created.  We try the following places in order:
             ...
	       2. A slot on the stack above the frame.

 In this case, there is a discrepency between how the prologue assigns
 the hard frame pointer, where it locates the registers save area and
 how arm_get_frame_offsets sets the corresponding "offsets".

 The prologue expander sets the frame pointer to designate the topmost
 entry in the registers save area:

      arm_expand_prologue
      ...
      if (frame_pointer_needed && TARGET_ARM)
      {
        /* Create the new frame pointer.  */
        insn = GEN_INT (-(4 + args_to_push + fp_offset));
	insn = emit_insn (gen_addsi3 (hard_frame_pointer_rtx, ip_rtx, insn));

 while arm_get_frame_offsets sets offsets->frame to designate the first
 slot past the arguments area:

      offsets->frame = offsets->saved_args + (frame_pointer_needed ? 4 : 0);

 These two locations are not the same when an intermediate slot is used
 to temporarily save "ip".

 The saved-regs area offset is likewise off:

      offsets->saved_regs = offsets->saved_args + saved;

 ... as "saved" only counts the regular registers-save ops here, not
 the extra temporary space we use for "ip".

 This eventually becomes visible as a stack pointer misalignment wrt
 ARM_DOUBLEWORD_ALIGN on the simple C testcase below:

 << 
     extern void bar (int a, int b, int c, int d, int e, int f);

     void foo ()
     {
       int x;

       void nest (int a, int b, int c, int d, int e, int f)
       {
	 a = 1;
	 b = 2;
	 c = 3;
	 d = 4;
	 e = 5;
	 f = 6;

	 x = 2;
	 bar (a, b, c, d, e, f);
       }

       nest (1, 2, 3, 4, 5, 6);
     }
 >>

 For which an arm-elf mainline cross compiler with -mabi=aapcs produces 
 ...
 nest.1185:
        @ Nested: function declared inside another function.
        @ args = 8, pretend = 0, frame = 16
        @ frame_needed = 1, uses_anonymous_args = 0
        str     ip, [sp, #-4]!            # push 4bytes
        add     ip, sp, #4
        stmfd   sp!, {fp, ip, lr, pc}     # +16 = 20bytes
        sub     fp, ip, #8
        ldr     ip, [fp, #4]
        @ ip needed for prologue
        sub     sp, sp, #24               # +24  = 44bytes
        ...

 Assuming the hard frame pointer assignment prevails, the suggested patch
 addresses this by teaching arm_get_frame_offsets to account for the
 possible distance between the arguments and the registers areas,
 exposed by a dedicated function reflecting the prologue expander decisions.

 The patch also synchronizes arm_compute_initial_elimination_offset,
 introducing a common way of expressing the computations for the
 "from ARG_POINTER_REGNUM" case.

 The new expression is equivalent to the previous one for the "to
 STACK_POINTER_REGNUM" case and adjusted for the others.

 The "to ARM_HARD_FRAME_POINTER_REGNUM" case is simplified and has now
 a straight connection with offsets->frame.

 The "to FRAME_POINTER_REGNUM" case is adjusted to account for
 FIRST_PARM_OFFSET, which it used not to do. I couldn't see a reason
 why this offset was ignored and testing didn't reveal any problem
 with this change.

 The patch was initially developped on a GCC 4.1 based compiler, running
 ACATS for Ada on a real ARM board where very significant behavior
 improvements were observed (almost clean run for an ongoing port
 development, remaning failures unrelated to this issue).

 After adaptation to mainline, it was tested using the arm-elf cross
 simulator on a i586-suse-linux host, where I checked that it does fix
 the simple C case above and that it introduces no regression in a
 "check-gcc RUNTESTFLAGS=--target_board=arm-sim" run.

 (The simtest-howto instructions and capabilities are very nice, btw)

 A couple of aspects are still a bit unclear to me. To start with,
 what is offsets->frame expected to designate when !frame_pointer_needed.

 Thanks in advance for your feedback,

 Olivier


 2008-01-16  Olivier Hainque  <hainque@adacore.com>

	* arm.c (args_to_rsa_distance): New function. Distance between
	the arguments and the registers save area.
	(arm_get_frame_offsets): Account for this distance in the frame
	pointer and registers area offset computations.
	(arm_expand_prologue): Count the "ip" push in the amount of space
	we use to save registers past the arguments area.
	(arm_compute_initial_elimination_offset) <from ARG_POINTER_REGNUM>:
	Rewrite expressions along the lines of a straight common pattern.
 
 







 
-------------- next part --------------
Index: arm.c
===================================================================
*** arm.c	(revision 131526)
--- arm.c	(working copy)
*************** void (*arm_lang_output_object_attributes
*** 63,68 ****
--- 63,69 ----
  
  /* Forward function declarations.  */
  static arm_stack_offsets *arm_get_frame_offsets (void);
+ static int arm_args_to_rsa_distance (void);
  static void arm_add_gc_roots (void);
  static int arm_gen_constant (enum rtx_code, enum machine_mode, rtx,
  			     HOST_WIDE_INT, rtx, rtx, int, int);
*************** arm_get_frame_offsets (void)
*** 11941,11946 ****
--- 11942,11950 ----
    int saved;
    HOST_WIDE_INT frame_size;
  
+   /* Space between the arguments and the register save areas.  */
+   int args_to_rsa_distance;
+ 
    offsets = &cfun->machine->stack_offsets;
  
    /* We need to know if we are a leaf function.  Unfortunately, it
*************** arm_get_frame_offsets (void)
*** 11965,11973 ****
    /* Space for variadic functions.  */
    offsets->saved_args = current_function_pretend_args_size;
  
!   /* In Thumb mode this is incorrect, but never used.  */
!   offsets->frame = offsets->saved_args + (frame_pointer_needed ? 4 : 0);
! 
    if (TARGET_32BIT)
      {
        unsigned int regno;
--- 11969,11975 ----
    /* Space for variadic functions.  */
    offsets->saved_args = current_function_pretend_args_size;
  
!   /* Size of the saved registers save area.  */
    if (TARGET_32BIT)
      {
        unsigned int regno;
*************** arm_get_frame_offsets (void)
*** 12008,12016 ****
  	saved += 16;
      }
  
!   /* Saved registers include the stack frame.  */
!   offsets->saved_regs = offsets->saved_args + saved;
    offsets->soft_frame = offsets->saved_regs + CALLER_INTERWORKING_SLOT_SIZE;
    /* A leaf function does not need any stack alignment if it has nothing
       on the stack.  */
    if (leaf && frame_size == 0)
--- 12010,12030 ----
  	saved += 16;
      }
  
!   /* Distance between the arguments and the registers save area.  */
!   args_to_rsa_distance = arm_args_to_rsa_distance ();
! 
!   /* The hard frame pointer offset, to the topmost slot in the registers save
!      area in ARM mode, incorrect in Thumb mode but never used.  */
!   offsets->frame
!     = (offsets->saved_args + args_to_rsa_distance
!        + (frame_pointer_needed ? 4 : 0));
!        
!   /* Offset to the bottommost slot of the registers save area.  */
!   offsets->saved_regs = offsets->saved_args + args_to_rsa_distance + saved;
! 
!   /* Offset to the soft frame pointer.  */
    offsets->soft_frame = offsets->saved_regs + CALLER_INTERWORKING_SLOT_SIZE;
+ 
    /* A leaf function does not need any stack alignment if it has nothing
       on the stack.  */
    if (leaf && frame_size == 0)
*************** arm_compute_initial_elimination_offset (
*** 12058,12087 ****
    switch (from)
      {
      case ARG_POINTER_REGNUM:
        switch (to)
  	{
  	case THUMB_HARD_FRAME_POINTER_REGNUM:
  	  return 0;
  
  	case FRAME_POINTER_REGNUM:
! 	  /* This is the reverse of the soft frame pointer
! 	     to hard frame pointer elimination below.  */
! 	  return offsets->soft_frame - offsets->saved_args;
  
  	case ARM_HARD_FRAME_POINTER_REGNUM:
! 	  /* If there is no stack frame then the hard
! 	     frame pointer and the arg pointer coincide.  */
! 	  if (offsets->frame == offsets->saved_regs)
! 	    return 0;
! 	  /* FIXME:  Not sure about this.  Maybe we should always return 0 ?  */
! 	  return (frame_pointer_needed
! 		  && cfun->static_chain_decl != NULL
! 		  && ! cfun->machine->uses_anonymous_args) ? 4 : 0;
  
  	case STACK_POINTER_REGNUM:
! 	  /* If nothing has been pushed on the stack at all
! 	     then this will return -4.  This *is* correct!  */
! 	  return offsets->outgoing_args - (offsets->saved_args + 4);
  
  	default:
  	  gcc_unreachable ();
--- 12072,12097 ----
    switch (from)
      {
      case ARG_POINTER_REGNUM:
+       /* We use offsets->saved_args as a reference offset to the actual
+ 	 arguments area then adjust by the difference with what the arg
+ 	 pointer designates.  */
+ 	 
        switch (to)
  	{
  	case THUMB_HARD_FRAME_POINTER_REGNUM:
  	  return 0;
  
  	case FRAME_POINTER_REGNUM:
! 	  return (offsets->soft_frame - offsets->saved_args
! 		  - FIRST_PARM_OFFSET (current_function_decl));
  
  	case ARM_HARD_FRAME_POINTER_REGNUM:
! 	  return (offsets->frame - offsets->saved_args
! 		  - FIRST_PARM_OFFSET (current_function_decl));
  
  	case STACK_POINTER_REGNUM:
! 	  return (offsets->outgoing_args - offsets->saved_args 
! 		  - FIRST_PARM_OFFSET (current_function_decl));
  
  	default:
  	  gcc_unreachable ();
*************** thumb_set_frame_pointer (arm_stack_offse
*** 12242,12247 ****
--- 12252,12275 ----
    RTX_FRAME_RELATED_P (insn) = 1;
  }
  
+ /* How many bytes the prologue expansion below pushes between
+    the arguments and the registers save area.  */
+ static int
+ arm_args_to_rsa_distance (void)
+ {
+   int distance = 0;
+ 
+   /* Account for our possible push of the static chain in ARM mode.  */
+   if (TARGET_ARM && frame_pointer_needed
+       && IS_NESTED (arm_current_func_type ())
+       && df_regs_ever_live_p (3) != 0
+       && current_function_pretend_args_size == 0)
+     distance += 4;
+ 
+   return distance;
+ }
+ 
+ 
  /* Generate the prologue instructions for entry into an ARM or Thumb-2
     function.  */
  void
*************** arm_expand_prologue (void)
*** 12351,12359 ****
  
  	      insn = gen_rtx_PRE_DEC (SImode, stack_pointer_rtx);
  	      insn = emit_set_insn (gen_frame_mem (SImode, insn), ip_rtx);
- 	      fp_offset = 4;
  
! 	      /* Just tell the dwarf backend that we adjusted SP.  */
  	      dwarf = gen_rtx_SET (VOIDmode, stack_pointer_rtx,
  				   plus_constant (stack_pointer_rtx,
  						  -fp_offset));
--- 12379,12393 ----
  
  	      insn = gen_rtx_PRE_DEC (SImode, stack_pointer_rtx);
  	      insn = emit_set_insn (gen_frame_mem (SImode, insn), ip_rtx);
  
! 	      /* This push doesn't account as a regular register save in the
! 		 save_reg_mask sense, so shifts the "frame" location.  */
! 	      fp_offset += 4;
! 	      
! 	      /* We need to account for the space it takes, still, ...  */
! 	      saved_regs += 4;
! 
! 	      /* ... and tell the dwarf backend that we adjusted SP.  */
  	      dwarf = gen_rtx_SET (VOIDmode, stack_pointer_rtx,
  				   plus_constant (stack_pointer_rtx,
  						  -fp_offset));


More information about the Gcc-patches mailing list