Bug 45701 - [4.6 Regression] Fail to prefer using r3 for padding a push/pop multiple to 8-byte alignment
Summary: [4.6 Regression] Fail to prefer using r3 for padding a push/pop multiple to 8...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.6.0
: P1 normal
Target Milestone: 4.6.0
Assignee: Jakub Jelinek
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2010-09-17 10:51 UTC by Yao Qi
Modified: 2011-02-02 17:46 UTC (History)
4 users (show)

See Also:
Host: i486-build_pc-linux-gnu
Target: arm-unknown-linux-gnueabi
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-12-14 09:34:41


Attachments
Test case (231 bytes, text/plain)
2010-09-17 10:52 UTC, Yao Qi
Details
gcc46-pr45701.patch (1.28 KB, patch)
2010-12-21 12:30 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yao Qi 2010-09-17 10:51:54 UTC
This was implemented in gcc-4.5, still works in gcc-4.5 branch.
2009-06-02 Richard Earnshaw <rearnsha@arm.com>

        * arm.c (arm_get_frame_offsets): Prefer using r3 for padding a
        push/pop multiple to 8-byte alignment.
However, it doesn't work any longer on gcc-4.6 trunk.  Reproduced on the following steps,

$ arm-none-linux-gnueabi-gcc -Os -mthumb -march=armv7-a bashline.c -S

On gcc-4.5 branch, r3 is used to keep stack alignment, which is expected.
history_expand_line_internal:
        @ args = 0, pretend = 0, frame = 0 
        @ frame_needed = 0, uses_anonymous_args = 0 
        push    {r3, r4, r5, r6, r7, lr} 

However, on gcc-4.6 trunk, we can see that, r8 is used to keep stack alignment, which is *not* expected.
history_expand_line_internal:
        @ args = 0, pretend = 0, frame = 0 
        @ frame_needed = 0, uses_anonymous_args = 0 
        push    {r4, r5, r6, r7, r8, lr}
Comment 1 Yao Qi 2010-09-17 10:52:44 UTC
Created attachment 21816 [details]
Test case
Comment 2 Yao Qi 2010-09-19 13:19:22 UTC
In arm.c:arm_get_frame_offsets (void)

	  if (!crtl->tail_call_emit
	      && arm_size_return_regs () <= 12
	      && (offsets->saved_regs_mask & (1 << 3)) == 0)
	    {
	      reg = 3;
	    }

In gcc-4.5, crtl->tail_call_emit is always false, so the condition is true.  However, in gcc-4.6, crtl->tail_call_emit can be true.  My stupid question is 'why can't we use r3 when we do tail call optimization?'.  I read one sentence in comment, 'We try to avoid using the arg registers (r0 -r3) as they might be  used to pass values in a tail call.'.  Is it the answer to my question?

If that is the answer, in oder to fix this bug, we may refine the condition to 'tail call insns are emitted, and r3 is used to pass values to tail call', like this,

  if (!(crtl->tail_call_emit && used_to_pass_values (3))
       && ...
       && ...)

Is this a correct fix?
Comment 3 Yao Qi 2010-09-22 02:18:36 UTC
Patch is submitted to gcc-patches for review
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01741.html
Comment 4 Jakub Jelinek 2010-12-21 12:30:25 UTC
Created attachment 22835 [details]
gcc46-pr45701.patch

Remembering pointers to tail call insns is both wasteful (allocates stuff that is never used anywhere but on arm), error-prone (nothing adjusts the list if
tail calls are deleted, or duplicated, etc.) and unnecessary, because
the list is trivially available from the IL.

Here is completely untested patch (tested just on the 3 testcases with a cross).
Could anyone interested in ARM target please bootstrap/regtest this?
We have way too many open P1 bugs.
Comment 5 Ramana Radhakrishnan 2010-12-24 11:04:08 UTC
Ok I'll test this. (In reply to comment #4)
> Created attachment 22835 [details]
> gcc46-pr45701.patch
> 
> Remembering pointers to tail call insns is both wasteful (allocates stuff that
> is never used anywhere but on arm), error-prone (nothing adjusts the list if
> tail calls are deleted, or duplicated, etc.) and unnecessary, because
> the list is trivially available from the IL.
> 
> Here is completely untested patch (tested just on the 3 testcases with a
> cross).
> Could anyone interested in ARM target please bootstrap/regtest this?
> We have way too many open P1 bugs.

I'll test this.
Comment 6 Jakub Jelinek 2011-01-04 10:37:36 UTC
(In reply to comment #5)
> I'll test this.

Have you managed to test it already?  Thanks.
Comment 7 Jakub Jelinek 2011-01-15 13:35:46 UTC
On IRC you've mentioned some TLS test issues, were they caused by this patch or unrelated?
Comment 8 Ramana Radhakrishnan 2011-01-15 15:16:59 UTC
On Sat, Jan 15, 2011 at 1:36 PM, jakub at gcc dot gnu.org
<gcc-bugzilla@gcc.gnu.org> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45701
>
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-15 13:35:46 UTC ---
> On IRC you've mentioned some TLS test issues, were they caused by this patch or
> unrelated?

Been travelling this week. Should be able to get answers once I can
access my board next week.


cheers
Ramana

>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
>
Comment 9 Ramana Radhakrishnan 2011-01-19 10:39:28 UTC
(In reply to comment #7)
> On IRC you've mentioned some TLS test issues, were they caused by this patch or
> unrelated?

Sorry about the delay in responding. With trunk I see no new regressions with my testing configurations . The tls failures were unrelated to this patch. 

Ok to submit 



Ramana
Comment 10 Jakub Jelinek 2011-01-25 16:22:45 UTC
Author: jakub
Date: Tue Jan 25 16:22:34 2011
New Revision: 169240

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169240
Log:
	PR target/45701
	* config/arm/arm.c (any_sibcall_uses_r3): New function.
	(arm_get_frame_offsets): Use it.

2011-01-25  Yao Qi  <yao@codesourcery.com>

        PR target/45701
        * gcc.target/arm/pr45701-1.c: New test.
        * gcc.target/arm/pr45701-2.c: New test.
	* gcc.target/arm/pr45701-3.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/arm/pr45701-1.c
    trunk/gcc/testsuite/gcc.target/arm/pr45701-2.c
    trunk/gcc/testsuite/gcc.target/arm/pr45701-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/testsuite/ChangeLog
Comment 11 Jakub Jelinek 2011-01-25 16:24:30 UTC
Fixed.
Comment 12 Diego Novillo 2011-02-02 17:46:52 UTC
Author: dnovillo
Date: Wed Feb  2 17:46:50 2011
New Revision: 169583

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169583
Log:
	PR target/45701
	* config/arm/arm.c (any_sibcall_uses_r3): New function.
	(arm_get_frame_offsets): Use it.

2011-01-25  Yao Qi  <yao@codesourcery.com>

        PR target/45701
        * gcc.target/arm/pr45701-1.c: New test.
        * gcc.target/arm/pr45701-2.c: New test.
	* gcc.target/arm/pr45701-3.c: New test.

Added:
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-1.c
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-2.c
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-3.c
Modified:
    branches/google/integration/gcc/ChangeLog
    branches/google/integration/gcc/config/arm/arm.c
    branches/google/integration/gcc/testsuite/ChangeLog