[PATCH/RFC] ARM -- Implement ATPCS stack alignment rules
Jason R Thorpe
thorpej@wasabisystems.com
Fri Sep 6 15:40:00 GMT 2002
This is another patch that originates from Richard Earnshaw, and
that has been in use in NetBSD's 2.95.3-based compiler for some time.
It implements the ATPCS stack alignment rules, which is to say, the stack
must always be aligned to an 8-byte boundary on entry to a function call
(and thus non-leaf functions must have stack frames which are multiples
of 8 bytes).
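To make the rule concrete, here is a minimal sketch of the padding
arithmetic. It is not the code from the patch: the name
pad_locals_for_atpcs and its parameters are hypothetical, and it assumes
(as the patch below does) that every component of the frame is already a
multiple of 4 bytes, so a single padding word is always enough.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only.
   entry_bytes:    bytes pushed on entry (saved registers, pretend args)
   local_bytes:    local-variable area, already rounded to 4 bytes
   outgoing_bytes: outgoing-argument area, assumed to be a multiple of 4
   Returns the local-area size, padded so that the whole frame is a
   multiple of 8 bytes.  */
long
pad_locals_for_atpcs (long entry_bytes, long local_bytes, long outgoing_bytes)
{
  /* Everything is assumed to be word-sized already.  */
  assert ((entry_bytes & 3) == 0);
  assert ((local_bytes & 3) == 0);
  assert ((outgoing_bytes & 3) == 0);

  /* If the total is an odd number of words, add one word of padding.  */
  if ((entry_bytes + local_bytes + outgoing_bytes) & 7)
    local_bytes += 4;

  /* The frame as a whole must now be a multiple of 8 bytes.  */
  assert (((entry_bytes + local_bytes + outgoing_bytes) & 7) == 0);
  return local_bytes;
}
```

For example, with three saved registers (12 bytes), 8 bytes of locals and
no outgoing arguments, the total is 20 bytes, so the local area grows to
12 and the frame becomes 24 bytes.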
I'm not asking for approval of the patch yet, because I have more testing
to do. I've tested it with the arm-elf sim, with no regressions, to make
sure the non-ATPCS case is not broken, but I have not yet actually tested
the ATPCS case, and the function prologue/epilogue code in 3.3 is quite a
bit different from that in 2.95.3. There are also some things I'm unsure
about and would like to get some feedback on.
First of all, note that since arm_get_frame_size() always returns a rounded
size, some uses of the ROUND_UP() macro were eliminated.
Also, the new arm_get_frame_size() function always rounds to at least a
4-byte boundary. The diff will show that there are some parts of the code
which were using get_frame_size() in an unprotected fashion (that is, not
4-byte-rounded). These places are:
- use_return_insn: This one should be no problem, because the
test was really just for "was there a stack frame".
- arm_output_epilogue: This one actually used the raw frame
size from get_frame_size to generate an add insn to pop the
stack. As I understand it, get_frame_size() isn't guaranteed
to return a value that is rounded to STACK_BOUNDARY, so this
one definitely needs to be sanity-checked. Does the old code
actually have a bug?
- arm_expand_prologue: This also used the output of get_frame_size()
to generate a value directly used in a stack-adjusting insn.
- thumb_expand_prologue: This takes the raw size from
get_frame_size and then applies ROUND_UP() to it later. I guess
I can now eliminate the ROUND_UP() of the value. A sanity-check
here is also appreciated.
- thumb_expand_epilogue: This is the same situation as
thumb_expand_prologue. Same question applies :-)
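The concern running through the list above is the difference between a raw
and a word-rounded frame size. A minimal sketch, assuming a 4-byte round-up
equivalent to the (x + 3) & ~3 expression that appears in the diff; the
names round_up_word and sp_after_adjust are hypothetical and are not GCC
functions.

```c
/* Hypothetical stand-in for a word round-up such as (x + 3) & ~3.  */
long
round_up_word (long bytes)
{
  return (bytes + 3) & ~3L;
}

/* Model of the stack pointer after "sub sp, sp, #bytes".  */
unsigned long
sp_after_adjust (unsigned long sp, long bytes)
{
  return sp - (unsigned long) bytes;
}
```

Starting from sp = 0x1000, a raw size of 10 bytes gives 0xff6, which is
not word-aligned; adjusting by round_up_word (10), which is 12, gives
0xff4. If get_frame_size() can return a value like 10, any place that
feeds it directly into a stack-adjusting insn would have this problem.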
Also, there's an "XXXJRT" in arm_get_frame_size() that should be
checked -- basically, I'm wondering if we can skip checking for FPA
regs if TARGET_SOFT_FLOAT.
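To show why the FPA question matters, here is a rough model of how the
entry size is counted in the patch: 4 bytes per pushed core register and
12 bytes per saved FPA register. The function entry_bytes and its
parameters are hypothetical and only mirror the arithmetic, not the
register-liveness checks.

```c
/* Hypothetical model of the entry-size count; not GCC code.
   core_mask: bit N is set if core register rN (r0-r15) is pushed
   fpa_saved: number of live, call-saved FPA registers  */
long
entry_bytes (unsigned int core_mask, int fpa_saved)
{
  long bytes = 0;
  int regno;

  for (regno = 0; regno <= 15; regno++)
    if (core_mask & (1u << regno))
      bytes += 4;		/* Each core register takes one word.  */

  bytes += 12L * fpa_saved;	/* Each FPA register takes three words.  */
  return bytes;
}
```

With r4 and lr pushed, entry_bytes (0x4010, 0) is 8; counting one FPA
register makes it 20, which changes whether the padding word is needed.
So whatever is decided for TARGET_SOFT_FLOAT, the count has to match what
the prologue actually saves.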
Anyway, to test, I'm going to enable TARGET_ATPCS in the arm-elf config
and run the testsuite with the arm-elf sim.
* config/arm/arm-protos.h (arm_get_frame_size): New prototype.
* config/arm/arm.c (arm_get_frame_size): New function.
(use_return_insn, arm_output_epilogue, arm_output_function_epilogue)
(arm_compute_initial_elimination_offset, arm_expand_prologue)
(thumb_expand_prologue, thumb_expand_epilogue): Use arm_get_frame_size.
* config/arm/arm.h (PREFERRED_STACK_BOUNDARY): Define.
(THUMB_INITIAL_ELIMINATION_OFFSET): Use arm_get_frame_size.
--
-- Jason R. Thorpe <thorpej@wasabisystems.com>
Index: config/arm/arm-protos.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm-protos.h,v
retrieving revision 1.32
diff -c -r1.32 arm-protos.h
*** config/arm/arm-protos.h 6 Sep 2002 14:54:48 -0000 1.32
--- config/arm/arm-protos.h 6 Sep 2002 21:37:27 -0000
***************
*** 31,36 ****
--- 31,37 ----
extern int arm_volatile_func PARAMS ((void));
extern const char * arm_output_epilogue PARAMS ((int));
extern void arm_expand_prologue PARAMS ((void));
+ extern HOST_WIDE_INT arm_get_frame_size PARAMS ((void));
/* Used in arm.md, but defined in output.c. */
extern void assemble_align PARAMS ((int));
extern const char * arm_strip_name_encoding PARAMS ((const char *));
Index: config/arm/arm.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.c,v
retrieving revision 1.227
diff -c -r1.227 arm.c
*** config/arm/arm.c 6 Sep 2002 14:54:48 -0000 1.227
--- config/arm/arm.c 6 Sep 2002 21:37:35 -0000
***************
*** 919,925 ****
/* Of if the function calls __builtin_eh_return () */
|| ARM_FUNC_TYPE (func_type) == ARM_FT_EXCEPTION_HANDLER
/* Or if there is no frame pointer and there is a stack adjustment. */
! || ((get_frame_size () + current_function_outgoing_args_size != 0)
&& !frame_pointer_needed))
return 0;
--- 919,925 ----
/* Of if the function calls __builtin_eh_return () */
|| ARM_FUNC_TYPE (func_type) == ARM_FT_EXCEPTION_HANDLER
/* Or if there is no frame pointer and there is a stack adjustment. */
! || ((arm_get_frame_size () + current_function_outgoing_args_size != 0)
&& !frame_pointer_needed))
return 0;
***************
*** 7545,7551 ****
frame that is $fp + 4 for a non-variadic function. */
int floats_offset = 0;
rtx operands[3];
! int frame_size = get_frame_size ();
FILE * f = asm_out_file;
rtx eh_ofs = cfun->machine->eh_epilogue_sp_ofs;
--- 7545,7551 ----
frame that is $fp + 4 for a non-variadic function. */
int floats_offset = 0;
rtx operands[3];
! int frame_size = arm_get_frame_size ();
FILE * f = asm_out_file;
rtx eh_ofs = cfun->machine->eh_epilogue_sp_ofs;
***************
*** 7818,7823 ****
--- 7818,7826 ----
}
else
{
+ /* We need to take into account any stack-frame rounding. */
+ frame_size = arm_get_frame_size ();
+
if (use_return_insn (FALSE)
&& return_used_this_function
&& (frame_size + current_function_outgoing_args_size) != 0
***************
*** 8060,8066 ****
unsigned int from;
unsigned int to;
{
! unsigned int local_vars = (get_frame_size () + 3) & ~3;
unsigned int outgoing_args = current_function_outgoing_args_size;
unsigned int stack_frame;
unsigned int call_saved_registers;
--- 8063,8069 ----
unsigned int from;
unsigned int to;
{
! unsigned int local_vars = arm_get_frame_size ();
unsigned int outgoing_args = current_function_outgoing_args_size;
unsigned int stack_frame;
unsigned int call_saved_registers;
***************
*** 8181,8186 ****
--- 8184,8255 ----
}
}
+ /* Calculate the size of the stack frame, taking into account any
+ padding that is required to ensure stack-alignment. */
+
+ HOST_WIDE_INT
+ arm_get_frame_size ()
+ {
+ int regno;
+
+ int base_size = ROUND_UP (get_frame_size ());
+ int entry_size = 0;
+ int live_regs_mask = 0;
+ int volatile_func = (optimize > 0
+ && TREE_THIS_VOLATILE (current_function_decl));
+
+ if (! TARGET_ATPCS)
+ return base_size;
+
+ /* We know that SP will be word aligned on entry, and we must
+ preserve that condition at any subroutine call. But those are
+ the only constraints. */
+
+ /* Space for variadic functions. */
+ if (current_function_pretend_args_size)
+ entry_size += current_function_pretend_args_size;
+
+ if (! volatile_func)
+ {
+ for (regno = 0; regno <= 10; regno++)
+ if (regs_ever_live[regno] && ! call_used_regs[regno])
+ live_regs_mask |= 1 << regno;
+
+ if (flag_pic && regs_ever_live[PIC_OFFSET_TABLE_REGNUM])
+ live_regs_mask |= 1 << PIC_OFFSET_TABLE_REGNUM;
+
+ if (regs_ever_live[14])
+ live_regs_mask |= 1 << 14;
+ }
+
+ if (frame_pointer_needed)
+ live_regs_mask |= 0xd800;
+
+ /* If we have to push any registers, we must also push lr as well. */
+ if (live_regs_mask)
+ live_regs_mask |= 1 << 14;
+
+ for (regno = 0; regno <= LAST_ARM_REGNUM; regno++)
+ if (live_regs_mask & (1 << regno))
+ entry_size += 4;
+
+ /* XXXJRT Should this also be conditional on TARGET_HARD_FLOAT? What
+ happens when extended asm uses FPA regs even if TARGET_SOFT_FLOAT? */
+ if (! volatile_func)
+ {
+ for (regno = 23; regno > LAST_ARM_REGNUM; regno--)
+ if (regs_ever_live[regno] && ! call_used_regs[regno])
+ entry_size += 12;
+ }
+
+ if ((entry_size + base_size + current_function_outgoing_args_size) & 7)
+ base_size += 4;
+ if ((entry_size + base_size + current_function_outgoing_args_size) & 7)
+ abort ();
+
+ return base_size;
+ }
+
/* Generate the prologue instructions for entry into an ARM function. */
void
***************
*** 8416,8422 ****
}
}
! amount = GEN_INT (-(get_frame_size ()
+ current_function_outgoing_args_size));
if (amount != const0_rtx)
--- 8485,8491 ----
}
}
! amount = GEN_INT (-(arm_get_frame_size ()
+ current_function_outgoing_args_size));
if (amount != const0_rtx)
***************
*** 10167,10173 ****
void
thumb_expand_prologue ()
{
! HOST_WIDE_INT amount = (get_frame_size ()
+ current_function_outgoing_args_size);
unsigned long func_type;
--- 10236,10242 ----
void
thumb_expand_prologue ()
{
! HOST_WIDE_INT amount = (arm_get_frame_size ()
+ current_function_outgoing_args_size);
unsigned long func_type;
***************
*** 10262,10268 ****
void
thumb_expand_epilogue ()
{
! HOST_WIDE_INT amount = (get_frame_size ()
+ current_function_outgoing_args_size);
/* Naked functions don't have prologues. */
--- 10331,10337 ----
void
thumb_expand_epilogue ()
{
! HOST_WIDE_INT amount = (arm_get_frame_size ()
+ current_function_outgoing_args_size);
/* Naked functions don't have prologues. */
Index: config/arm/arm.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm.h,v
retrieving revision 1.159
diff -c -r1.159 arm.h
*** config/arm/arm.h 6 Sep 2002 14:54:48 -0000 1.159
--- config/arm/arm.h 6 Sep 2002 21:37:38 -0000
***************
*** 689,694 ****
--- 689,696 ----
#define STACK_BOUNDARY 32
+ #define PREFERRED_STACK_BOUNDARY (TARGET_ATPCS ? 64 : 32)
+
#define FUNCTION_BOUNDARY 32
/* The lowest bit is used to indicate Thumb-mode functions, so the
***************
*** 1681,1687 ****
if ((TO) == STACK_POINTER_REGNUM) \
{ \
(OFFSET) += current_function_outgoing_args_size; \
! (OFFSET) += ROUND_UP (get_frame_size ()); \
} \
}
--- 1683,1689 ----
if ((TO) == STACK_POINTER_REGNUM) \
{ \
(OFFSET) += current_function_outgoing_args_size; \
! (OFFSET) += arm_get_frame_size (); \
} \
}