This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [ARM] Fix register r3 wrongly used to save ip in nested APCS frame
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Eric Botcazou <ebotcazou at adacore dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 19 Dec 2013 15:38:26 +0000
- Subject: Re: [ARM] Fix register r3 wrongly used to save ip in nested APCS frame
- Authentication-results: sourceware.org; auth=none
- References: <2528794 dot 36Q92I15i4 at polaris> <1601733 dot pWYsbSa9Xg at polaris> <52975D28 dot 10505 at arm dot com> <17166670 dot OmqLAYtanj at polaris>
On 02/12/13 10:11, Eric Botcazou wrote:
>> My apologies for taking so long to look at this.
>
> No problem, all the more so that...
>
>>> 2013-09-05 Eric Botcazou <ebotcazou@adacore.com>
>>>
>>> * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
>>> arguments to push onto the stack and no varargs, save ip into a stack
>>> slot if r3 isn't available on entry.
>>
>> Sorry, but this is not quite right either, as shown by the attached
>> testcase (in C this time, so we can commit it to gcc.target/arm :-)
>>
>> The problem is that if we have some alignment padding we end up storing
>> ip in one location but restoring it from another.
>>
>> str ip, [sp, #4]
>> add ip, sp, #8
>> stmfd sp!, {fp, ip, lr, pc}
>> sub fp, ip, #12
>> ldr ip, [fp, #4] // <==== Should be fp + 8
>> @ ip needed
>> str r3, [fp, #8]
>> ldr ip, [ip]
>
> ...ugh, right, the computation of the slot address was supposed to be clever
> but is totally broken instead. :-( Thanks for spotting it, I don't understand
> how we haven't seen this on VxWorks (although your testcase does not fail on
> VxWorks, since args_to_push is 4 instead of 8 as on arm-eabi).
>
> The computation of the slot address is:
>
> + {
> + rtx x = plus_constant (Pmode, stack_pointer_rtx,
> + args_to_push - 4);
> + insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
> + stack_pointer_rtx,
> + GEN_INT (- args_to_push)));
> + emit_set_insn (gen_frame_mem (SImode, x), ip_rtx);
> + }
>
> meaning that IP will be saved into the first slot on the stack. But the
> restore code is:
>
> /* Recover the static chain register. */
> if (!arm_r3_live_at_start_p () || saved_pretend_args)
> insn = gen_rtx_REG (SImode, 3);
> else
> {
> insn = plus_constant (Pmode, hard_frame_pointer_rtx, 4);
> insn = gen_frame_mem (SImode, insn);
> }
> emit_set_insn (ip_rtx, insn);
>
> meaning that IP is restored from the last slot on the stack, so this can work
> only if args_to_push == 4.
>
> Revised patch attached, your testcase passes on arm-eabi with it. Does it
> look OK to you? If so, I'll run a testing cycle on arm-vxworks and arm-eabi.
>
>
> * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
> arguments to push onto the stack and no varargs, save ip into the last
> stack slot if r3 isn't available on entry.
>
>
This generates:
sub sp, sp, #n
str ip, [sp]
which can be done in one instruction, vis:
str ip, [sp, #-n]!
Very similar code to do this essentially already exists a few lines
earlier (in the args_to_push == 0 case), but needs tweaking to use
pre-modify rather than pre-dec when n != 4.
I'll pre-approve a patch that makes that change.
R.