Bug 38644 - [4.6 Regression] Optimization flag -O1 -fschedule-insns2 causes wrong code
Summary: [4.6 Regression] Optimization flag -O1 -fschedule-insns2 causes wrong code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.3.2
: P2 normal
Target Milestone: 4.6.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
: 42155 42452 44091 50081 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-12-27 22:09 UTC by Dave Murphy
Modified: 2022-10-14 18:23 UTC (History)
21 users (show)

See Also:
Host:
Target: arm-none-eabi
Build:
Known to work: 4.2.4, 4.7.0
Known to fail: 4.3.2, 4.4.2, 4.5.0, 4.6.0
Last reconfirmed: 2010-08-12 10:08:35


Attachments
preprocessed source for arm-eabi testcase (150 bytes, application/octet-stream)
2008-12-27 22:10 UTC, Dave Murphy
Details
proposed 4.6 fix for PR38644 (446 bytes, patch)
2010-05-26 12:44 UTC, Mikael Pettersson
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Murphy 2008-12-27 22:09:46 UTC
The -fschedule-insns2 optimisation causes wrong code to be emitted for the following testcase. The assembly code loads a value from a stack frame which has already been deallocated.

This is similar to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30282 for powerpc-eabi.

extern int doStreamReadBlock(int *, char *, int size, int);

char readStream(int *s)
{
    char c = 0;
    doStreamReadBlock(s, &c, 1, *s);
    return c;
}

arm-eabi davem$ arm-eabi-gcc -v -O2 -mthumb -c test.c 
Using built-in specs.
Target: arm-eabi
Configured with: ../../gcc-4.3.2/configure --enable-languages=c,c++
--with-cpu=arm7tdmi --enable-interwork --enable-multilib --with-gcc
--with-gnu-ld --with-gnu-as --disable-shared --disable-threads
--disable-win32-registry --disable-nls --disable-debug --disable-libmudflap
--disable-libssp --disable-libgomp --disable-libstdcxx-pch --target=arm-eabi
--with-newlib --prefix=/opt/devkitpro/devkitARM
--with-bugurl=http://wiki.devkitpro.org/index.php/Bug_Reports
--with-pkgversion='devkitARM release 24'
Thread model: single
gcc version 4.3.2 (devkitARM release 24) 

00000000 <readStream>:
   0:   b510            push    {r4, lr}
   2:   b082            sub     sp, #8
   4:   466c            mov     r4, sp
   6:   3407            adds    r4, #7
   8:   2300            movs    r3, #0
   a:   7023            strb    r3, [r4, #0]
   c:   1c21            adds    r1, r4, #0
   e:   6803            ldr     r3, [r0, #0]
  10:   2201            movs    r2, #1
  12:   f7ff fffe       bl      0 <doStreamReadBlock>
  16:   b002            add     sp, #8          <--- stack frame deallocated
  18:   7820            ldrb    r0, [r4, #0]    <--- value loaded from stack frame
  1a:   bc10            pop     {r4}
  1c:   bc02            pop     {r1}
  1e:   4708            bx      r1

The same thing happens with mainline

 /opt/devkitpro/devkitARM_mainline/bin/arm-eabi-gcc -v -O2 -mthumb -c test.c 
Using built-in specs.
Target: arm-eabi
Configured with: ../../../gcc_mainline/configure --disable-nls
--target=arm-eabi --prefix=/opt/devkitPro/devkitARM --enable-languages=c,c++
--with-cpu=arm7tdmi --enable-interwork --enable-multilib --with-gcc
--with-gnu-ld --with-gnu-as --disable-shared --disable-threads
--disable-win32-registry --disable-nls --disable-debug --disable-libmudflap
--disable-libssp --disable-libgomp --disable-libstdcxx-pch
Thread model: single
gcc version 4.4.0 20081223 (experimental) (GCC) 

00000000 <readStream>:
   0:   b510            push    {r4, lr}
   2:   b082            sub     sp, #8
   4:   466c            mov     r4, sp
   6:   3407            adds    r4, #7
   8:   2300            movs    r3, #0
   a:   7023            strb    r3, [r4, #0]
   c:   1c21            adds    r1, r4, #0
   e:   6803            ldr     r3, [r0, #0]
  10:   2201            movs    r2, #1
  12:   f7ff fffe       bl      0 <doStreamReadBlock>
  16:   b002            add     sp, #8
  18:   7820            ldrb    r0, [r4, #0]
  1a:   bc10            pop     {r4}
  1c:   bc02            pop     {r1}
  1e:   4708            bx      r1
Comment 1 Dave Murphy 2008-12-27 22:10:19 UTC
Created attachment 16993 [details]
preprocessed source for arm-eabi testcase
Comment 2 Richard Earnshaw 2009-03-17 00:03:45 UTC
Confirmed, this is a nasty bug that might silently bite users after a long period of apparently correct operation.
Comment 3 Andrew Pinski 2009-11-23 16:42:08 UTC
*** Bug 42155 has been marked as a duplicate of this bug. ***
Comment 4 Andrew Pinski 2009-12-21 21:46:54 UTC
*** Bug 42452 has been marked as a duplicate of this bug. ***
Comment 5 Richard Earnshaw 2009-12-22 11:16:40 UTC
IMO this is a generic bug in the scheduler.  The code in sched-deps.c should note that STACK_POINTER_RTX is being changed and insert a memory barrier that prevents migration of stack-related memory accesses across the change.

Of course, determining what memory accesses are stack-related is quite hard, and it may be that all memory accesses in the same address space as the stack will need to be restricted.
Comment 6 Steven Bosscher 2009-12-22 11:38:28 UTC
If this is a generic bug, why are all dups of this for ARM targets?
Comment 7 Richard Earnshaw 2009-12-22 11:58:43 UTC
(In reply to comment #6)
> If this is a generic bug, why are all dups of this for ARM targets?
> 

Just because it's only been reported against ARM doesn't mean it's not a generic problem.
Comment 8 Jakub Jelinek 2009-12-22 12:07:38 UTC
No, it likely means other backends insert memory barriers where needed themselves.
Comment 9 Richard Earnshaw 2009-12-22 13:33:40 UTC
I've looked at several backends and certainly not all do (sparc for example).

I think they get away with it because the stack pointer is valid in all addressing constructs -- that's not true for Thumb where SP can only be used for 32-bit loads.  However, that doesn't mean there isn't an underlying bug.
Comment 10 Steven Bosscher 2009-12-22 14:08:19 UTC
Re. #7 I'm not implying there is no generic bug. I just noticed the pattern (all reports for ARM) and thought that maybe the solution can be found in at least one of the other back ends.
Comment 11 Chengfa Wang 2009-12-22 14:35:29 UTC
My friend also found it with powerpc.
Comment 12 Andrew Pinski 2009-12-22 14:58:43 UTC
(In reply to comment #11)
> My friend also found it with powerpc.

Well some (most?) PPC ABIs there is a red zone which allows this to valid.
Comment 13 Andrew Pinski 2010-05-13 14:22:06 UTC
*** Bug 44091 has been marked as a duplicate of this bug. ***
Comment 14 Sebastian Huber 2010-05-17 09:04:54 UTC
This bug is present since r130052 which is related to:

http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01814.html

If this is a generic bug or not is quite irrelevant since this is a very serious bug on ARM/Thumb.  I am a bit irritated why this bug survived the 4.4.0 and 4.5.0 release.  In a multi threaded environment it is pretty hard to find these kind of problems.  The time frame in which an interrupt can corrupt the stack frame lies between two instructions.

Is -fno-schedule-insns2 a workaround for this problem?
Comment 15 Mikael Pettersson 2010-05-26 12:44:33 UTC
Created attachment 20749 [details]
proposed 4.6 fix for PR38644

PR38644 and its dupes are very similar to PowerPC PR44199 and PR30282.  PR44199 was recently fixed by conditionally emitting stack ties in epilogues.  I first intended to simply clone that approach for Thumb1, but it turns out there's already a conditional barrier in thumb1_expand_epilogue ().  So for now I simply extended that condition to also trigger whenever there's stack pointer adjustment in the epilogue.  I've confirmed that this fixes the test cases in PR38644, PR42155, and PR44091.

I know that Richard Earnshaw has stated that he considers this a middle-end rather than a back-end bug, and I agree with that.  However, given that this wrong-code bug has been known for so long with no middle-end fix in sight, I think solving it in the back-end is appropriate for now, at least for 4.4/4.5.

The current patch is a little too heavy in that it also blocks non memory accesses from being scheduled past the stack pointer adjustment -- I saw an example of that in the large PR44091 test case.  Using a stack tie instead of a full barrier should hopefully fix that.

So far only tested with 4.4/4.5/4.6 crosses to armv5tel-unknown-linux-gnueabi.  I'll try this in a 4.5 native bootstrap soonish (4.6 bootstraps are currently broken on ARM, see PR44255).
Comment 16 Mikael Pettersson 2010-06-06 19:16:49 UTC
Patch posted:
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00481.html

I tried to use the existing stack tie instead of a full barrier, but it had no effect at all, causing the mis-schedules to reappear.  I also tried to port over the PowerPC version of the stack tie, but that ICEd the compiler; I'm not yet good enough at .md hackery to resolve that one.  So I went back to the initial patch, and bootstrapped and regtested it in native builds of 4.6, 4.5, and 4.4 on armv5tel-linux-gnueabi.
Comment 17 Sebastian Huber 2010-06-10 07:13:41 UTC
Thank you for your investigations.  A special case fix is better than nothing.  I am not a GCC expert but it seems that this kind of bug appears quite regularly on different platforms and all use special case code to avoid the evil consequences.  If it is a middle-end bug it should draw more attention by the middle-end developers.  You cannot detect this bug through simple test cases like a flow control or math problem.  With normal unit tests you will likely not find this bug in your application.  You need at least two threads of execution and you have to do certain things in between a wee bit of machine instructions.

Comment 18 Sebastian Huber 2010-08-12 08:19:03 UTC
This bug is still present in GCC 4.6.0 20100807 (arm-eabi-gcc -O1 -fschedule-insns2 -mthumb):

readStream:
	push	{r4, lr}
	sub	sp, sp, #8
	mov	r4, sp
	mov	r3, #0
	strb	r3, [r4, #7]
	add	r4, r4, #7
	ldr	r3, [r0]
	mov	r1, r4
	mov	r2, #1
	bl	doStreamReadBlock
	add	sp, sp, #8
	ldrb	r0, [r4]
	@ sp needed for prologue
	pop	{r4}
	pop	{r1}
	bx	r1
	.size	readStream, .-readStream
	.ident	"GCC: (GNU) 4.6.0 20100807 (experimental)"
Comment 19 Steven Bosscher 2010-08-12 10:00:34 UTC
According to comment#14, a patch from Alexander Monakov introduced this bug, therefore:

1. this is a regression on a primary platform => priority should be set P1
Comment 20 Steven Bosscher 2010-08-12 10:00:54 UTC
...and
2. Add richi and amonakov to CC:
Comment 21 Steven Bosscher 2010-08-12 10:08:35 UTC
Re. comment #14 "I am a bit irritated why this bug survived the 4.4.0
and 4.5.0 release.": Yes, well, ARM maintainers have been in the CC-list for this bug since the beginning, and apparently it was even too much trouble for them to see if this is a regression or not... :-(

Anyway, many thanks to Sebastian Huber for identifying the revision that introduced (or exposed) this bug.
Comment 22 Alexander Monakov 2010-08-12 10:12:10 UTC
It looks like patch from comment #16 should fix the problem, but was not reviewed and/or applied.
Comment 23 Steven Bosscher 2010-08-12 11:37:42 UTC
The patch from comment #16 only fixes the symptom, and only on ARM. It is not a proper fix for the generic problem that is apparently also visible on POWER.
Comment 24 Jakub Jelinek 2010-08-12 11:47:52 UTC
It is not visible on POWER, because it has been fixed there.
Comment 25 Alexander Monakov 2010-08-12 12:00:08 UTC
(In reply to comment #23)
> The patch from comment #16 only fixes the symptom, and only on ARM. It is not a
> proper fix for the generic problem that is apparently also visible on POWER.

PR30282 audit trail contains more discussion of this problem.  Jim Wilson argues that this problem should be addressed by emitting stack ties in epilogues for targets that suffer from this problem (other targets apparently do not thanks to red zone).  POWER was fixed that way (PR44199).
Comment 26 Richard Biener 2010-08-12 12:04:29 UTC
(In reply to comment #19)
> According to comment#14, a patch from Alexander Monakov introduced this bug,
> therefore:
> 
> 1. this is a regression on a primary platform => priority should be set P1

It's not P1 because P1 is reserved for serious bugs that were never in any
release which isn't true here.  P1 _block_ a release, it is unreasonable
to do so in general if a previous release shipped with that bug.
Comment 27 Richard Earnshaw 2010-08-12 12:13:13 UTC
(In reply to comment #21)
> Re. comment #14 "I am a bit irritated why this bug survived the 4.4.0
> and 4.5.0 release.": Yes, well, ARM maintainers have been in the CC-list for
> this bug since the beginning, and apparently it was even too much trouble for
> them to see if this is a regression or not... :-(
> 
> Anyway, many thanks to Sebastian Huber for identifying the revision that
> introduced (or exposed) this bug.
> 

So this ARM maintainer, proposed a fix for the problem (a generic bug, not a back-end bug).  But because it seems that generating correct code on all targets isn't a priority, it was rejected.

The compiler shouldn't be generating unsafe code by default; back-ends shouldn't need to paper over bugs in the MI code.



Comment 28 Jakub Jelinek 2010-08-12 12:26:18 UTC
The problem is that stuff like red-zone presence and size isn't known to the middle-end, all that stuff is backend private, so I think the right way is to handle this in the backends and most of the backends managed to handle it.
Comment 29 Richard Earnshaw 2010-08-12 12:30:58 UTC
(In reply to comment #28)
> The problem is that stuff like red-zone presence and size isn't known to the
> middle-end, all that stuff is backend private, so I think the right way is to
> handle this in the backends and most of the backends managed to handle it.
> 

No, the middle end code must fail safe.  If targets don't need that, then they should have the ability to turn it off; not the other way around.

This is critical because it leads to silent failures otherwise.
Comment 30 Richard Biener 2010-08-30 15:48:34 UTC
A regression but no known-to-work version?
Comment 31 Mikael Pettersson 2010-08-30 18:59:41 UTC
(In reply to comment #30)
> A regression but no known-to-work version?

4.2.4 is known to work.  See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44091#c5
Comment 32 Sebastian Huber 2010-09-30 15:36:02 UTC
Which target milestone do you intend for a fix?  It is still present in 4.6.0 20100925.
Comment 33 Joel Sherrill 2011-01-31 21:26:08 UTC
Any chance this can get some attention before 4.6 branches?  Please.
Comment 34 Jeffrey A. Law 2011-02-07 16:27:24 UTC
Typically the right thing to do is to block all memory motions across a change in the stack pointer.    It's somewhat overly pessimistic, but in reality the few motions lost aren't going to be performance critical.

In the past each backend has emitted the blockage/barrier and it typically happened soon after the port was converted to use RTL prologues/epilogues...  That's probably the main reason why this was never fixed in the scheduler itself -- the first couple ports emitted a blockage and after that it became normal practice.

I would support both emitting the suitable blockage insn in the ARM backend or adding a dependency between the stack pointer adjustment insn and all memory insns in the scheduler.  Either is IMHO acceptable given history.  The first would be slightly preferred during this late stage of development with the latter being more appropriate in early stage development.
Comment 35 Jiangning Liu 2011-04-26 15:13:41 UTC
I verified that two patches in #38644 (back end) and #30282 (middle end) both work for attached cases. Here comes my two cents,

1) The concept of red zone is to control whether instructions can write memory below current stack frame or not, and it is only being supported by ABIs for some particular ISAs, so it shouldn't be enabled in middle end by default for all targets. At this point, middle end should be fixed to avoid doing things unwanted in general for all targets.

2) Red zone is an orthogonal concept to prologue/epilogue, so it is not good to fix this issue in prologue/epilogue back end code. At this point, we shouldn't simply fix it in back end by adding barrier to implicitly disable red zone. Instead, some hooks should be abstracted in middle end to support it in scheduling dependence (middle end code). Back end like X86-64 should enable it through hooks by itself. 

The key here is red zone should be a clean feature to be supported in middle end. Exposing this kind of stuff to back end through hooks can improve code quality for middle end and avoid bringing the bugs to back-end. 

This bug has long history, and it is being now or has ever been exposed on ARM, POWER and X86(with some options combination). Fixing it in middle end is not only a bug fix, but a simple infrastructure improvement. Due to the long duration and the extensive impact for different targets, I don't see good reason of not fixing it in mainline ASAP.
Comment 36 Richard Biener 2011-06-27 12:14:12 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 37 Sebastian Huber 2011-08-04 12:30:29 UTC
Is this problem also related to this bug (GCC 4.6.1 20110627) with comments inside:

objdump -d -C /opt/rtems-4.11/lib/gcc/arm-rtems4.11/4.6.1/thumb/vfp/libstdc++.a
| grep -A 94 new_opnt.o

new_opnt.o:     file format elf32-littlearm


Disassembly of section .text._ZnwmRKSt9nothrow_t:

00000000 <operator new(unsigned long, std::nothrow_t const&)>:
   0:   b5f0            push    {r4, r5, r6, r7, lr}
   2:   465f            mov     r7, fp
   4:   4656            mov     r6, sl
   6:   464d            mov     r5, r9
   8:   4644            mov     r4, r8
   a:   b4f0            push    {r4, r5, r6, r7}
   c:   b090            sub     sp, #64 ; 0x40
   e:   4b2a            ldr     r3, [pc, #168]  ; (b8 <operator new(unsigned
long, std::nothrow_t const&)+0xb8>)
  10:   af00            add     r7, sp, #0
  12:   627b            str     r3, [r7, #36]   ; 0x24
  14:   4b29            ldr     r3, [pc, #164]  ; (bc <operator new(unsigned
long, std::nothrow_t const&)+0xbc>)
  16:   62bb            str     r3, [r7, #40]   ; 0x28
  18:   4b29            ldr     r3, [pc, #164]  ; (c0 <operator new(unsigned
long, std::nothrow_t const&)+0xc0>)
  1a:   6078            str     r0, [r7, #4]
  1c:   2240            movs    r2, #64 ; 0x40
  1e:   1c38            adds    r0, r7, #0
  20:   19d2            adds    r2, r2, r7
  22:   633b            str     r3, [r7, #48]   ; 0x30
  24:   300c            adds    r0, #12
  26:   466b            mov     r3, sp
  28:   62fa            str     r2, [r7, #44]   ; 0x2c
  2a:   637b            str     r3, [r7, #52]   ; 0x34
  2c:   f7ff fffe       bl      0 <_Unwind_SjLj_Register>
  30:   687a            ldr     r2, [r7, #4]
  32:   2a00            cmp     r2, #0
  34:   d101            bne.n   3a <operator new(unsigned long, std::nothrow_t
const&)+0x3a>
  36:   2301            movs    r3, #1
  38:   607b            str     r3, [r7, #4]
  3a:   6878            ldr     r0, [r7, #4]
  3c:   f7ff fffe       bl      0 <malloc>
  40:   6038            str     r0, [r7, #0]
  42:   2800            cmp     r0, #0
  44:   d123            bne.n   8e <operator new(unsigned long, std::nothrow_t
const&)+0x8e>
  46:   4a1f            ldr     r2, [pc, #124]  ; (c4 <operator new(unsigned
long, std::nothrow_t const&)+0xc4>)
  48:   6813            ldr     r3, [r2, #0]
  4a:   2b00            cmp     r3, #0
  4c:   d104            bne.n   58 <operator new(unsigned long, std::nothrow_t
const&)+0x58>
  4e:   e021            b.n     94 <operator new(unsigned long, std::nothrow_t
const&)+0x94>
  50:   4a1c            ldr     r2, [pc, #112]  ; (c4 <operator new(unsigned
long, std::nothrow_t const&)+0xc4>)
  52:   6813            ldr     r3, [r2, #0]
  54:   2b00            cmp     r3, #0
  56:   d009            beq.n   6c <operator new(unsigned long, std::nothrow_t
const&)+0x6c>
  58:   2201            movs    r2, #1
  5a:   613a            str     r2, [r7, #16]
  5c:   f000 f834       bl      c8 <operator new(unsigned long, std::nothrow_t
const&)+0xc8>
  60:   6878            ldr     r0, [r7, #4]
  62:   f7ff fffe       bl      0 <malloc>
  66:   60b8            str     r0, [r7, #8]
  68:   2800            cmp     r0, #0
  6a:   d0f1            beq.n   50 <operator new(unsigned long, std::nothrow_t
const&)+0x50>
  6c:   1c38            adds    r0, r7, #0
  6e:   300c            adds    r0, #12
  70:   f7ff fffe       bl      0 <_Unwind_SjLj_Unregister>

BAD CODE BEGIN

  74:   46bd            mov     sp, r7

r7 is now the current stack pointer.

  76:   b010            add     sp, #64 ; 0x40

Current stack frame is free now, r7 points to obsolete stack frame.

  78:   68b8            ldr     r0, [r7, #8]

Here we read from the stack frame freed previously.  This is a disaster in
multi-threaded environments, because the exception code will use the stack of
an interrupted thread.

BAD CODE END

  7a:   bc3c            pop     {r2, r3, r4, r5}
  7c:   4690            mov     r8, r2
  7e:   4699            mov     r9, r3
  80:   46a2            mov     sl, r4
  82:   46ab            mov     fp, r5
  84:   bdf0            pop     {r4, r5, r6, r7, pc}
  86:   f7ff fffe       bl      0 <__cxa_begin_catch>
  8a:   f7ff fffe       bl      0 <__cxa_end_catch>
  8e:   683b            ldr     r3, [r7, #0]
  90:   60bb            str     r3, [r7, #8]
  92:   e7eb            b.n     6c <operator new(unsigned long, std::nothrow_t
const&)+0x6c>
  94:   2200            movs    r2, #0
  96:   60ba            str     r2, [r7, #8]
  98:   e7e8            b.n     6c <operator new(unsigned long, std::nothrow_t
const&)+0x6c>
  9a:   3f40            subs    r7, #64 ; 0x40
  9c:   69bb            ldr     r3, [r7, #24]
  9e:   6978            ldr     r0, [r7, #20]
  a0:   2b01            cmp     r3, #1
  a2:   d0f0            beq.n   86 <operator new(unsigned long, std::nothrow_t
const&)+0x86>
  a4:   1c5a            adds    r2, r3, #1
  a6:   d004            beq.n   b2 <operator new(unsigned long, std::nothrow_t
const&)+0xb2>
  a8:   2301            movs    r3, #1
  aa:   425b            negs    r3, r3
  ac:   613b            str     r3, [r7, #16]
  ae:   f7ff fffe       bl      0 <_Unwind_SjLj_Resume>
  b2:   613b            str     r3, [r7, #16]
  b4:   f7ff fffe       bl      0 <__cxa_call_unexpected>
        ...
  c0:   0000009a        .word   0x0000009a
  c4:   00000000        .word   0x00000000
  c8:   4718            bx      r3
  ca:   46c0            nop                     ; (mov r8, r8)
Comment 38 Jiangning Liu 2011-08-05 01:33:06 UTC
Hi Sebastian,

I'm fixing this bug, and will send out a formal patch to community soon.

Is it possible you send me your source code exposing the bug first, so I can verify your problem can really be fixed with my local fix?

Appreciate your help in advance!

BTW, you may refer to http://gcc.gnu.org/ml/gcc/2011-08/msg00000.html for discussion details.

-Jiangning


> -----Original Message-----
> From: sebastian.huber@embedded-brains.de [mailto:gcc-
> bugzilla@gcc.gnu.org]
> Sent: Thursday, August 04, 2011 8:32 PM
> To: Jiangning Liu
> Subject: [Bug rtl-optimization/38644] [4.4/4.5/4.6/4.7 Regression]
> Optimization flag -O1 -fschedule-insns2 causes wrong code
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644
>
> --- Comment #37 from Sebastian Huber <sebastian.huber@embedded-
> brains.de> 2011-08-04 12:30:29 UTC ---
> Is this problem also related to this bug (GCC 4.6.1 20110627) with
> comments
> inside:
>
> objdump -d -C /opt/rtems-4.11/lib/gcc/arm-
> rtems4.11/4.6.1/thumb/vfp/libstdc++.a
> | grep -A 94 new_opnt.o
>
> new_opnt.o:     file format elf32-littlearm
>
>
> Disassembly of section .text._ZnwmRKSt9nothrow_t:
>
> 00000000 <operator new(unsigned long, std::nothrow_t const&)>:
>    0:   b5f0            push    {r4, r5, r6, r7, lr}
>    2:   465f            mov     r7, fp
>    4:   4656            mov     r6, sl
>    6:   464d            mov     r5, r9
>    8:   4644            mov     r4, r8
>    a:   b4f0            push    {r4, r5, r6, r7}
>    c:   b090            sub     sp, #64 ; 0x40
>    e:   4b2a            ldr     r3, [pc, #168]  ; (b8 <operator
> new(unsigned
> long, std::nothrow_t const&)+0xb8>)
>   10:   af00            add     r7, sp, #0
>   12:   627b            str     r3, [r7, #36]   ; 0x24
>   14:   4b29            ldr     r3, [pc, #164]  ; (bc <operator
> new(unsigned
> long, std::nothrow_t const&)+0xbc>)
>   16:   62bb            str     r3, [r7, #40]   ; 0x28
>   18:   4b29            ldr     r3, [pc, #164]  ; (c0 <operator
> new(unsigned
> long, std::nothrow_t const&)+0xc0>)
>   1a:   6078            str     r0, [r7, #4]
>   1c:   2240            movs    r2, #64 ; 0x40
>   1e:   1c38            adds    r0, r7, #0
>   20:   19d2            adds    r2, r2, r7
>   22:   633b            str     r3, [r7, #48]   ; 0x30
>   24:   300c            adds    r0, #12
>   26:   466b            mov     r3, sp
>   28:   62fa            str     r2, [r7, #44]   ; 0x2c
>   2a:   637b            str     r3, [r7, #52]   ; 0x34
>   2c:   f7ff fffe       bl      0 <_Unwind_SjLj_Register>
>   30:   687a            ldr     r2, [r7, #4]
>   32:   2a00            cmp     r2, #0
>   34:   d101            bne.n   3a <operator new(unsigned long,
> std::nothrow_t
> const&)+0x3a>
>   36:   2301            movs    r3, #1
>   38:   607b            str     r3, [r7, #4]
>   3a:   6878            ldr     r0, [r7, #4]
>   3c:   f7ff fffe       bl      0 <malloc>
>   40:   6038            str     r0, [r7, #0]
>   42:   2800            cmp     r0, #0
>   44:   d123            bne.n   8e <operator new(unsigned long,
> std::nothrow_t
> const&)+0x8e>
>   46:   4a1f            ldr     r2, [pc, #124]  ; (c4 <operator
> new(unsigned
> long, std::nothrow_t const&)+0xc4>)
>   48:   6813            ldr     r3, [r2, #0]
>   4a:   2b00            cmp     r3, #0
>   4c:   d104            bne.n   58 <operator new(unsigned long,
> std::nothrow_t
> const&)+0x58>
>   4e:   e021            b.n     94 <operator new(unsigned long,
> std::nothrow_t
> const&)+0x94>
>   50:   4a1c            ldr     r2, [pc, #112]  ; (c4 <operator
> new(unsigned
> long, std::nothrow_t const&)+0xc4>)
>   52:   6813            ldr     r3, [r2, #0]
>   54:   2b00            cmp     r3, #0
>   56:   d009            beq.n   6c <operator new(unsigned long,
> std::nothrow_t
> const&)+0x6c>
>   58:   2201            movs    r2, #1
>   5a:   613a            str     r2, [r7, #16]
>   5c:   f000 f834       bl      c8 <operator new(unsigned long,
> std::nothrow_t
> const&)+0xc8>
>   60:   6878            ldr     r0, [r7, #4]
>   62:   f7ff fffe       bl      0 <malloc>
>   66:   60b8            str     r0, [r7, #8]
>   68:   2800            cmp     r0, #0
>   6a:   d0f1            beq.n   50 <operator new(unsigned long,
> std::nothrow_t
> const&)+0x50>
>   6c:   1c38            adds    r0, r7, #0
>   6e:   300c            adds    r0, #12
>   70:   f7ff fffe       bl      0 <_Unwind_SjLj_Unregister>
>
> BAD CODE BEGIN
>
>   74:   46bd            mov     sp, r7
>
> r7 is now the current stack pointer.
>
>   76:   b010            add     sp, #64 ; 0x40
>
> Current stack frame is free now, r7 points to obsolete stack frame.
>
>   78:   68b8            ldr     r0, [r7, #8]
>
> Here we read from the stack frame freed previously.  This is a disaster
> in
> multi-threaded environments, because the exception code will use the
> stack of
> an interrupted thread.
>
> BAD CODE END
>
>   7a:   bc3c            pop     {r2, r3, r4, r5}
>   7c:   4690            mov     r8, r2
>   7e:   4699            mov     r9, r3
>   80:   46a2            mov     sl, r4
>   82:   46ab            mov     fp, r5
>   84:   bdf0            pop     {r4, r5, r6, r7, pc}
>   86:   f7ff fffe       bl      0 <__cxa_begin_catch>
>   8a:   f7ff fffe       bl      0 <__cxa_end_catch>
>   8e:   683b            ldr     r3, [r7, #0]
>   90:   60bb            str     r3, [r7, #8]
>   92:   e7eb            b.n     6c <operator new(unsigned long,
> std::nothrow_t
> const&)+0x6c>
>   94:   2200            movs    r2, #0
>   96:   60ba            str     r2, [r7, #8]
>   98:   e7e8            b.n     6c <operator new(unsigned long,
> std::nothrow_t
> const&)+0x6c>
>   9a:   3f40            subs    r7, #64 ; 0x40
>   9c:   69bb            ldr     r3, [r7, #24]
>   9e:   6978            ldr     r0, [r7, #20]
>   a0:   2b01            cmp     r3, #1
>   a2:   d0f0            beq.n   86 <operator new(unsigned long,
> std::nothrow_t
> const&)+0x86>
>   a4:   1c5a            adds    r2, r3, #1
>   a6:   d004            beq.n   b2 <operator new(unsigned long,
> std::nothrow_t
> const&)+0xb2>
>   a8:   2301            movs    r3, #1
>   aa:   425b            negs    r3, r3
>   ac:   613b            str     r3, [r7, #16]
>   ae:   f7ff fffe       bl      0 <_Unwind_SjLj_Resume>
>   b2:   613b            str     r3, [r7, #16]
>   b4:   f7ff fffe       bl      0 <__cxa_call_unexpected>
>         ...
>   c0:   0000009a        .word   0x0000009a
>   c4:   00000000        .word   0x00000000
>   c8:   4718            bx      r3
>   ca:   46c0            nop                     ; (mov r8, r8)
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.


-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium.  Thank you.
Comment 39 Ramana Radhakrishnan 2011-08-05 03:57:14 UTC
On Fri, Aug 5, 2011 at 2:33 AM, jiangning.liu at arm dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644
>
> --- Comment #38 from Jiangning Liu <jiangning.liu at arm dot com> 2011-08-05 01:33:06 UTC ---
> Hi Sebastian,
>
> I'm fixing this bug, and will send out a formal patch to community soon.
>
> Is it possible you send me your source code exposing the bug first, so I can
> verify your problem can really be fixed with my local fix?

I think this is just from the libstdc++ sources - look at that file
from the build with --with-mode=thumb .

Ramana
Comment 40 Sebastian Huber 2011-08-05 06:49:00 UTC
(In reply to comment #39)
> On Fri, Aug 5, 2011 at 2:33 AM, jiangning.liu at arm dot com
[...]
> > Is it possible you send me your source code exposing the bug first, so I can
> > verify your problem can really be fixed with my local fix?
> 
> I think this is just from the libstdc++ sources - look at that file
> from the build with --with-mode=thumb .

Yes, this is from the libstdc++ sources (4.6.1 20110627, libstdc++-v3/libsupc++/new_opnt.cc).  You need a non-EABI ARM variant of GCC since this bug manifestation will only show up in the SJLJ version.
Comment 41 Jiangning Liu 2011-08-09 02:04:52 UTC
> Yes, this is from the libstdc++ sources (4.6.1 20110627,
> libstdc++-v3/libsupc++/new_opnt.cc).  You need a non-EABI ARM variant of GCC
> since this bug manifestation will only show up in the SJLJ version.

I tried and my local patch works on this case. As you can see like below, it is fixed!

        add     r0, r0, #12                                                                                           
        bl      _Unwind_SjLj_Unregister
        ldr     r0, [r7, #8]
        mov     sp, r7
        add     sp, sp, #68
        @ sp needed for prologue
        pop     {r2, r3, r4, r5}

Thanks,
-Jiangning
Comment 42 LYZ 2011-08-15 08:08:46 UTC
*** Bug 50081 has been marked as a duplicate of this bug. ***
Comment 43 Sebastian Huber 2011-09-06 07:45:29 UTC
How long will this middle to back end ping pong last until this bug is finally fixed?  It is now open since 2008.
Comment 44 Joel Sherrill 2011-09-09 13:27:54 UTC
Ping.. any chance of getting the proposed 4.6 fix merged?  Please.  

This is almost a 3 year old bug.  It would be nice to get it closed.
Comment 45 Steven Bosscher 2011-09-11 15:42:50 UTC
(In reply to comment #41)
> I tried and my local patch works on this case. As you can see like below, it is
> fixed!

So, where is this local patch? And why not attach it here or post it to gcc-patches, instead of keeping it local? ;)
Comment 46 Sebastian Huber 2011-09-12 08:42:59 UTC
I guess the local patch will look like this:

http://gcc.gnu.org/ml/gcc/2011-07/msg00459.html
Comment 47 Jeffrey A. Law 2011-09-12 15:18:44 UTC
On 09/12/2011 02:42 AM, sebastian.huber@embedded-brains.de wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644
>
> --- Comment #46 from Sebastian Huber<sebastian.huber@embedded-brains.de>  2011-09-12 08:42:59 UTC ---
> I guess the local patch will look like this:
>
> http://gcc.gnu.org/ml/gcc/2011-07/msg00459.html
A much simpler way to fix this is to emit a barrier just prior to 
mucking around with stack pointer in the epilogue.  That's how targets 
have dealt with this exact issue for a couple decades.

JEff
Comment 48 Richard Earnshaw 2011-09-12 15:31:51 UTC
On 12/09/11 16:18, law at redhat dot com wrote:

> A much simpler way to fix this is to emit a barrier just prior to 
> mucking around with stack pointer in the epilogue.  That's how targets 
> have dealt with this exact issue for a couple decades.

Simpler, but wrong.  The compiler should not be generating unsafe code
by default.  The problem is in the mid-end and expecting every port to
get this right in order to work-around a mid-end bug is just stupid
stupid stupid.

The mid end should not be scheduling around stack moves unless it has
been explicitly told it is safe to do this.  I don't understand why
there is so much resistance to fixing the problem properly.
Comment 49 Jeffrey A. Law 2011-09-12 15:33:56 UTC
On 09/12/2011 09:31 AM, rearnsha at arm dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644
>
> --- Comment #48 from Richard Earnshaw<rearnsha at arm dot com>  2011-09-12 15:31:51 UTC ---
> On 12/09/11 16:18, law at redhat dot com wrote:
>
>> A much simpler way to fix this is to emit a barrier just prior to
>> mucking around with stack pointer in the epilogue.  That's how targets
>> have dealt with this exact issue for a couple decades.
> Simpler, but wrong.  The compiler should not be generating unsafe code
> by default.  The problem is in the mid-end and expecting every port to
> get this right in order to work-around a mid-end bug is just stupid
> stupid stupid.
>
> The mid end should not be scheduling around stack moves unless it has
> been explicitly told it is safe to do this.  I don't understand why
> there is so much resistance to fixing the problem properly.
I don't disagree with you Richard, but we're at, what, 3 years on this 
bug...    That's absurd, particularly when there's a trivial, correct fix.

jeff
Comment 50 Steven Bosscher 2011-09-12 18:24:58 UTC
Perhaps someone can comment on, and test, Joern's patch here:
http://gcc.gnu.org/ml/gcc/2011-07/msg00461.html

@Richard E: That would be a middle-end fix, if it's correct, so perhaps you could do the honors?
Comment 51 rguenther@suse.de 2011-09-26 08:04:37 UTC
On Mon, 12 Sep 2011, rearnsha at arm dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38644
> 
> --- Comment #48 from Richard Earnshaw <rearnsha at arm dot com> 2011-09-12 15:31:51 UTC ---
> On 12/09/11 16:18, law at redhat dot com wrote:
> 
> > A much simpler way to fix this is to emit a barrier just prior to 
> > mucking around with stack pointer in the epilogue.  That's how targets 
> > have dealt with this exact issue for a couple decades.
> 
> Simpler, but wrong.  The compiler should not be generating unsafe code
> by default.  The problem is in the mid-end and expecting every port to
> get this right in order to work-around a mid-end bug is just stupid
> stupid stupid.
> 
> The mid end should not be scheduling around stack moves unless it has
> been explicitly told it is safe to do this.  I don't understand why
> there is so much resistance to fixing the problem properly.

The middle-end does not treat stack moves specially, they are just
memory accesses.  Extra dependences have to be modeled accordingly.
It's a hack to treat stack moves specially, not a proper fix.
Comment 52 Sebastian Huber 2011-10-15 08:48:10 UTC
In GCC 4.6.2 20111014 (prerelease) the problem is still not fixed and  "arm-eabi-gcc -march=armv5t -mthumb -O2" produces wrong code.  Please don't let it slip through the next release.
Comment 53 Sebastian Huber 2011-10-24 13:06:03 UTC
I tested the above patch proposed by Mikael Pettersson (from 2010-05-26, more than one year ago) with GCC 4.6 20111021.  It still fixes the test case provided by Dave Murphy (from 2008-12-27, nearly three years ago).  I run the GCC C and C++ testsuite with the arm-rtemseabi4.11 target and the results are identical with and without the patch.  Even if we have no perfect solution yet, it would be very kind if someone can check in an intermediate fix.
Comment 54 Sebastian Huber 2011-10-28 07:31:19 UTC
I tested with GCC 4.6.2 and the patch provided by Mikael Pettersson.  It works for -march=armv4t and -march=armv5t, but not for -march=armv5te:

--- test-armv5te.s      2011-10-28 09:22:24.627388063 +0200
+++ test-armv5t.s       2011-10-28 09:22:19.923643155 +0200
@@ -1,4 +1,4 @@
-       .arch armv5te
+       .arch armv5t
        .fpu softvfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
@@ -27,8 +27,8 @@
        mov     r1, r4
        mov     r2, #1
        bl      doStreamReadBlock
-       add     sp, sp, #8
        ldrb    r0, [r4]
+       add     sp, sp, #8
        @ sp needed for prologue
        pop     {r4, pc}
        .size   readStream, .-readStream

Command line:

arm-eabi-gcc -O2 -march=armv5t -mthumb -S test.c -o test-armv5t.s
arm-eabi-gcc -O2 -march=armv5te -mthumb -S test.c -o test-armv5te.s
Comment 55 Dave Murphy 2011-10-29 23:27:02 UTC
(In reply to comment #54)
> I tested with GCC 4.6.2 and the patch provided by Mikael Pettersson.  It works
> for -march=armv4t and -march=armv5t, but not for -march=armv5te:

For what it's worth I've been using Richard Earnshaw's patch from http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30282#c8 with my own gcc builds and it's working fine for all -march values.

There's also Joern's patch at http://gcc.gnu.org/ml/gcc/2011-07/msg00461.html which I haven't tested but looks like it should work.

I still don't understand why there seems to be so much resistance to Richard's suggestion that targets with redzones should explicitly enable this behaviour. How can it be a hack to treat stack moves specially? Isn't the stack generally a special register?
Comment 56 Jiangning Liu 2011-10-31 07:48:25 UTC
(In reply to comment #54)
> I tested with GCC 4.6.2 and the patch provided by Mikael Pettersson.  It works
> for -march=armv4t and -march=armv5t, but not for -march=armv5te:
> 

Sebastian,

Actually you may try this,

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index aed748c..8269c1a 100755
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -22273,6 +22273,8 @@ thumb1_expand_epilogue (void)
   gcc_assert (amount >= 0);
   if (amount)
     {
+      emit_insn (gen_blockage ());
+
       if (amount < 512)
        emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
                               GEN_INT (amount)));

Thanks,
-Jiangning
Comment 57 Mikael Pettersson 2011-10-31 08:32:09 UTC
(In reply to comment #56)
> (In reply to comment #54)
> > I tested with GCC 4.6.2 and the patch provided by Mikael Pettersson.  It works
> > for -march=armv4t and -march=armv5t, but not for -march=armv5te:
> > 
> 
> Sebastian,
> 
> Actually you may try this,
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index aed748c..8269c1a 100755
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -22273,6 +22273,8 @@ thumb1_expand_epilogue (void)
>    gcc_assert (amount >= 0);
>    if (amount)
>      {
> +      emit_insn (gen_blockage ());
> +
>        if (amount < 512)
>         emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
>                                GEN_INT (amount)));
> 
> Thanks,
> -Jiangning

Yeah, that should be even better than my patch.  Thanks.
Comment 58 Sebastian Huber 2011-10-31 10:45:43 UTC
I tested Jiangning Liu's latest patch.  With it GCC 4.6.2 produces valid code for -march=armv4t, -march=armv5t, -march=armv5te, -march=armv6, and -march=armv7-m.  GCC 4.6.2 produces valid code for -march-armv7-m also without the patch, for all other listed options the code is wrong.

Testsuite results:

http://gcc.gnu.org/ml/gcc-testresults/2011-10/msg03523.html
Comment 59 jye2 2011-11-04 16:50:11 UTC
Author: jye2
Date: Fri Nov  4 16:50:04 2011
New Revision: 180964

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180964
Log:
2011-11-04  Jiangning Liu  <jiangning.liu@arm.com>

	PR rtl-optimization/38644
	* config/arm/arm.c (thumb1_expand_epilogue): Add memory barrier
	for epilogue having stack adjustment.

	testcase:
	* gcc.target/arm/stack-red-zone.c: New.

Added:
    trunk/gcc/testsuite/gcc.target/arm/stack-red-zone.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/testsuite/ChangeLog
Comment 60 Sebastian Huber 2011-11-09 09:00:26 UTC
Jiangning Liu thank you very much for your update.

The target milestone is currently 4.4.7.  Are there plans to commit this fix the the 4.4, 4.5, and 4.6 branches?
Comment 61 Jiangning Liu 2011-11-16 09:47:01 UTC
Author: liujiangning
Date: Wed Nov 16 09:46:58 2011
New Revision: 181406

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181406
Log:
2011-11-16  Jiangning Liu  <jiangning.liu@arm.com>

        Backport r180964 from mainline
        2011-11-04  Jiangning Liu  <jiangning.liu@arm.com>

        PR rtl-optimization/38644
        * config/arm/arm.c (thumb1_expand_epilogue): Add memory barrier
        for epilogue having stack adjustment.

testsuite:

2011-11-16  Jiangning Liu  <jiangning.liu@arm.com>

        Backport r180964 from mainline
        2011-11-04  Jiangning Liu  <jiangning.liu@arm.com>

        PR rtl-optimization/38644
        * gcc.target/arm/stack-red-zone.c: New.

Added:
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.target/arm/stack-red-zone.c
Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
Comment 62 Ramana Radhakrishnan 2012-01-09 16:55:24 UTC
Author: ramana
Date: Mon Jan  9 16:55:16 2012
New Revision: 183019

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183019
Log:

2012-01-09  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	Backport from mainline
	2011-11-04  Jiangning Liu  <jiangning.liu@arm.com>

	PR rtl-optimization/38644
	* config/arm/arm.c (thumb1_expand_epilogue): Add memory barrier
	for epilogue having stack adjustment.

2012-01-09  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	Backport from mainline:
	2011-11-04  Jiangning Liu  <jiangning.liu@arm.com>

	PR rtl-optimization/38644
	* gcc.target/arm/stack-red-zone.c: New.



Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/arm/stack-red-zone.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/arm/arm.c
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Comment 63 Jakub Jelinek 2012-03-13 12:47:29 UTC
4.4 branch is being closed, moving to 4.5.4 target.
Comment 64 Richard Biener 2012-07-02 11:32:12 UTC
The 4.5 branch is being closed, adjusting target milestone.
Comment 65 Hagay 2012-07-31 16:26:15 UTC
Also saw the same problem on MIPS 
Compiling using 'gcc version 4.5.2 (Sourcery CodeBench Lite 2011.09-86)'

Produces the below assembly (mips16)

4c13      addiu   a0,19
6478      restore 64,ra,s0-s1
f3a6 dd50 sw      v0,13232(a1)
e820      jr      ra
8c40      lh      v0,0(a0)
6500      nop

You can see the 'lh' comes after 'restore' , so the problem exist on MIPS Sourcery as well.
Comment 66 Andrew Pinski 2012-07-31 16:49:54 UTC
(In reply to comment #65)
> Also saw the same problem on MIPS 
> Compiling using 'gcc version 4.5.2 (Sourcery CodeBench Lite 2011.09-86)'

You should report that issue to Mentor because it was fixed with:
2011-10-02  Richard Sandiford  <rdsandiford@googlemail.com>

        * config/mips/mips.c (mips_frame_barrier): New function.
        (mips_expand_prologue): Call it after allocating stack space.
        (mips_deallocate_stack): New function.
        (mips_expand_epilogue): Call mips_frame_barrier and
        mips_deallocate_stack.
Comment 67 Hagay 2012-07-31 17:34:02 UTC
Just reported to Mentor. Thanks
Comment 68 incrediball 2013-04-05 03:51:14 UTC
Is this problem resolved? The status is still set to "NEW" but "known to work" shows that it either has been resolved in v4.7.0 or that version is somehow not affected.

I spent the best part of this week trying to track this problem down. It was VERY hard to reproduce but I eventually found that the -fschedule-insns2 compile option caused the problem. The code that was affected is almost exactly the same as the test case in comment #1.
Comment 69 Andrew Pinski 2013-04-05 04:23:43 UTC
(In reply to comment #68)
> Is this problem resolved? The status is still set to "NEW" but "known to work"
> shows that it either has been resolved in v4.7.0 or that version is somehow not
> affected.

It was resolved with the patch in comment #59 and comment #62.  In GCC 4.6.0.
Comment 70 Sebastian Huber 2013-04-05 07:15:34 UTC
(In reply to comment #69)
> (In reply to comment #68)
> > Is this problem resolved? The status is still set to "NEW" but "known to work"
> > shows that it either has been resolved in v4.7.0 or that version is somehow not
> > affected.
> 
> It was resolved with the patch in comment #59 and comment #62.  In GCC 4.6.0.

It was fixed in GCC 4.6.3 and above.
Comment 71 Jackie Rosen 2014-02-16 10:01:30 UTC Comment hidden (spam)
Comment 72 Satish M 2016-03-14 08:56:59 UTC
This bug still exists in GCC 4.8.2 ARM. It can reproduced by adding one more argument in 'doStreamReadBlock' function in test case.

/x86_64-unknown-linux-gnu/bin/gcc -B /x86_64-unknown-linux-gnu/bin -msoft-float -marm -mcpu=cortex-a9 -march=armv7-a -mno-thumb-interwork -mlong-calls -mno-unaligned-access -O2 test.c

extern int doStreamReadBlock (int *, char *, int size, int, int);

char readStream (int *s)
{
       char c = 0;
       doStreamReadBlock (s, &c, 1, *s, 22);
       return c;
}

00000000 <readStream>:
   0:	e1a0c00d 	mov	ip, sp
   4:	e3a02000 	mov	r2, #0
   8:	e92dd800 	push	{fp, ip, lr, pc}
   c:	e24cb004 	sub	fp, ip, #4
  10:	e24dd008 	sub	sp, sp, #8
  14:	e24b100c 	sub	r1, fp, #12
  18:	e3a0c016 	mov	ip, #22
  1c:	e5612001 	strb	r2, [r1, #-1]!
  20:	e3a02001 	mov	r2, #1
  24:	e5903000 	ldr	r3, [r0]
  28:	e58dc000 	str	ip, [sp]
  2c:	e59fc00c 	ldr	ip, [pc, #12]	; 40 <readStream+0x40>
  30:	e12fff3c 	blx	ip
  34:	e24bd00c 	sub	sp, fp, #12        <---- Stack frame de-allocated
  38:	e55b000d 	ldrb	r0, [fp, #-13]     <---- Accessing stack.
  3c:	e89da800 	ldm	sp, {fp, sp, pc}
  40:	00000000 	andeq	r0, r0, r0
Comment 73 Richard Biener 2016-03-14 09:37:18 UTC
(In reply to Satish M from comment #72)
> This bug still exists in GCC 4.8.2 ARM. It can reproduced by adding one more
> argument in 'doStreamReadBlock' function in test case.
> 
> /x86_64-unknown-linux-gnu/bin/gcc -B /x86_64-unknown-linux-gnu/bin
> -msoft-float -marm -mcpu=cortex-a9 -march=armv7-a -mno-thumb-interwork
> -mlong-calls -mno-unaligned-access -O2 test.c
> 
> extern int doStreamReadBlock (int *, char *, int size, int, int);
> 
> char readStream (int *s)
> {
>        char c = 0;
>        doStreamReadBlock (s, &c, 1, *s, 22);
>        return c;
> }
> 
> 00000000 <readStream>:
>    0:	e1a0c00d 	mov	ip, sp
>    4:	e3a02000 	mov	r2, #0
>    8:	e92dd800 	push	{fp, ip, lr, pc}
>    c:	e24cb004 	sub	fp, ip, #4
>   10:	e24dd008 	sub	sp, sp, #8
>   14:	e24b100c 	sub	r1, fp, #12
>   18:	e3a0c016 	mov	ip, #22
>   1c:	e5612001 	strb	r2, [r1, #-1]!
>   20:	e3a02001 	mov	r2, #1
>   24:	e5903000 	ldr	r3, [r0]
>   28:	e58dc000 	str	ip, [sp]
>   2c:	e59fc00c 	ldr	ip, [pc, #12]	; 40 <readStream+0x40>
>   30:	e12fff3c 	blx	ip
>   34:	e24bd00c 	sub	sp, fp, #12        <---- Stack frame de-allocated
>   38:	e55b000d 	ldrb	r0, [fp, #-13]     <---- Accessing stack.
>   3c:	e89da800 	ldm	sp, {fp, sp, pc}
>   40:	00000000 	andeq	r0, r0, r0

Please open a new bugreport.
Comment 74 GCC Commits 2022-10-14 09:56:48 UTC
The master branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:e39b170695a161feba7401b7d21d824db9ee1f8f

commit r13-3296-ge39b170695a161feba7401b7d21d824db9ee1f8f
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Fri Oct 14 11:52:04 2022 +0200

    Fix PR target/107248
    
    This is the infamous PR rtl-optimization/38644 rearing its ugly head for
    leaf functions on SPARC more than a decade later...  Richard E.'s generic
    solution has never been implemented so let's do as other RISC back-ends did.
    
    gcc/
            PR target/107248
            * config/sparc/sparc.cc (sparc_expand_prologue): Emit a frame
            blockage for leaf functions.
            (sparc_flat_expand_prologue): Emit frame instead of full blockage.
            (sparc_expand_epilogue): Emit a frame blockage for leaf functions.
            (sparc_flat_expand_epilogue): Emit frame instead of full blockage.
Comment 75 GCC Commits 2022-10-14 09:57:45 UTC
The releases/gcc-12 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:a5a6598d5b1d29741993371310c0bb8ca57e190c

commit r12-8831-ga5a6598d5b1d29741993371310c0bb8ca57e190c
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Fri Oct 14 11:52:04 2022 +0200

    Fix PR target/107248
    
    This is the infamous PR rtl-optimization/38644 rearing its ugly head for
    leaf functions on SPARC more than a decade later...  Richard E.'s generic
    solution has never been implemented so let's do as other RISC back-ends did.
    
    gcc/
            PR target/107248
            * config/sparc/sparc.cc (sparc_expand_prologue): Emit a frame
            blockage for leaf functions.
            (sparc_flat_expand_prologue): Emit frame instead of full blockage.
            (sparc_expand_epilogue): Emit a frame blockage for leaf functions.
            (sparc_flat_expand_epilogue): Emit frame instead of full blockage.
Comment 76 GCC Commits 2022-10-14 09:58:42 UTC
The releases/gcc-11 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:3f4b65df625edae3e8829718af721ad2330b3f22

commit r11-10311-g3f4b65df625edae3e8829718af721ad2330b3f22
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Fri Oct 14 11:52:04 2022 +0200

    Fix PR target/107248
    
    This is the infamous PR rtl-optimization/38644 rearing its ugly head for
    leaf functions on SPARC more than a decade later...  Richard E.'s generic
    solution has never been implemented so let's do as other RISC back-ends did.
    
    gcc/
            PR target/107248
            * config/sparc/sparc.c (sparc_expand_prologue): Emit a frame
            blockage for leaf functions.
            (sparc_flat_expand_prologue): Emit frame instead of full blockage.
            (sparc_expand_epilogue): Emit a frame blockage for leaf functions.
            (sparc_flat_expand_epilogue): Emit frame instead of full blockage.
Comment 77 GCC Commits 2022-10-14 10:00:03 UTC
The releases/gcc-10 branch has been updated by Eric Botcazou <ebotcazou@gcc.gnu.org>:

https://gcc.gnu.org/g:d0ef37c35b7ff7324b4567652380f32079d46088

commit r10-11034-gd0ef37c35b7ff7324b4567652380f32079d46088
Author: Eric Botcazou <ebotcazou@adacore.com>
Date:   Fri Oct 14 11:52:04 2022 +0200

    Fix PR target/107248
    
    This is the infamous PR rtl-optimization/38644 rearing its ugly head for
    leaf functions on SPARC more than a decade later...  Richard E.'s generic
    solution has never been implemented so let's do as other RISC back-ends did.
    
    gcc/
            PR target/107248
            * config/sparc/sparc.c (sparc_expand_prologue): Emit a frame
            blockage for leaf functions.
            (sparc_flat_expand_prologue): Emit frame instead of full blockage.
            (sparc_expand_epilogue): Emit a frame blockage for leaf functions.
            (sparc_flat_expand_epilogue): Emit frame instead of full blockage.