This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: PR target/24419: ix86 prologue clobbers memory when it shouldn't
On Fri, Oct 21, 2005 at 07:38:10PM +0200, Jan Hubicka wrote:
> and I think logic here is wrong too. If frame->nregs is 1 and we need
> to allocate additional 8 bytes of stack frame we must end up using push
> on TARGET_SUB_ESP_8 as otherwise the combined stack adjustment of 16
> would end up converted to double push and we would kill the saved
> argument. I think it is easiest to simply see how things sums up and
> then undo the decision. Does something like this work for you?
>
> Index: config/i386/i386.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
> retrieving revision 1.861
> diff -c -3 -p -r1.861 i386.c
> *** config/i386/i386.c 25 Sep 2005 20:01:07 -0000 1.861
> --- config/i386/i386.c 21 Oct 2005 17:31:29 -0000
> *************** ix86_compute_frame_layout (struct ix86_f
> *** 4682,4687 ****
> --- 4682,4703 ----
> else
> frame->red_zone_size = 0;
> frame->to_allocate -= frame->red_zone_size;
> +
> + /* If push/pop sequence would end up being cheaper, use it. */
> + if (frame->save_regs_using_mov
> + && ((frame->to_allocate + frame.nregs * UNITS_PER_WORD == UNITS_PER_WORD
> + && !TARGET_SUB_ESP_4)
> + || (frame->to_allocate + frame.nregs * UNITS_PER_WORD == 2 * UNITS_PER_WORD
> + && !TARGET_SUB_ESP_8)))
frame.nregs should be frame->nregs.
> + {
> + frame->save_regs_using_mov = false;
> + if (TARGET_RED_ZONE && current_function_sp_is_unchanging
> + && current_function_is_leaf)
> + {
> + frame->to_allocate += frame->nregs * UNITS_PER_WORD;
> + frame->red_zone_size -= frame->nregs * UNITS_PER_WORD;
> + }
> + }
> frame->stack_pointer_offset -= frame->red_zone_size;
> #if 0
> fprintf (stderr, "nregs: %i\n", frame->nregs);
>
> The pop conditional has similar issue but there it would result in
> suboptimal code sequence only and to be honest I am not sure it matters
> fixing until we find some real world usage for this combination of
> flags. Does this combnation seem to win in your benchmarks?
>
I am trying to find it out. But if the combination generates
suboptimal code sequence, I will never know. I am getting
movq (%rsp), %rbx
movq 8(%rsp), %rbp
popq %rdx
popq %rcx
ret
when TARGET_EPILOGUE_USING_MOVE is true and TARGET_ADD_ESP_8 is false.
H.J.