Bug 28493

Summary: [4.1 Regression] Wrong address of stack object used for destructor call on PPC
Product: gcc Reporter: Aaron Graham <aaron>
Component: middle-end Assignee: Jason Merrill <jason>
Status: RESOLVED FIXED    
Severity: normal CC: dje, gcc-bugs, jakub, janis, mark, pinskia
Priority: P1 Keywords: EH, sjlj-eh, wrong-code
Version: 4.1.1   
Target Milestone: 4.1.2   
Host: Target: powerpc-linux-gnu with sjlj eh
Build: Known to work: 3.4.1 3.4.2 4.0.0 4.0.3 4.2.0
Known to fail: 4.1.0 4.1.1 Last reconfirmed: 2006-09-08 05:39:30

Description Aaron Graham 2006-07-26 15:01:27 UTC
---------- sample program ----------
struct Command {
  Command() {}
  virtual ~Command() {}
};

void tryfunc() {
  Command cmd;
  for (;;) { throw 1; }
}
---------- end sample program ----------


Disassembly of tryfunc():
  (notice at 58-5c, constructor is called on r1+8, but at
   88-90, destructor is called on r1+0)

00000000 <tryfunc()>:
   0:   94 21 ff 60     stwu    r1,-160(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   3d 20 00 00     lis     r9,0
                        a: R_PPC_ADDR16_HA      __gxx_personality_sj0
   c:   39 29 00 00     addi    r9,r9,0
                        e: R_PPC_ADDR16_LO      __gxx_personality_sj0
  10:   7d 80 00 26     mfcr    r12
  14:   91 21 00 30     stw     r9,48(r1)
  18:   3d 20 00 00     lis     r9,0
                        1a: R_PPC_ADDR16_HA     .gcc_except_table
  1c:   38 61 00 18     addi    r3,r1,24
  20:   90 01 00 a4     stw     r0,164(r1)
  24:   39 29 00 00     addi    r9,r9,0
                        26: R_PPC_ADDR16_LO     .gcc_except_table
  28:   80 01 00 00     lwz     r0,0(r1)
  2c:   91 21 00 34     stw     r9,52(r1)
  30:   3d 20 00 00     lis     r9,0
                        32: R_PPC_ADDR16_HA     .text+0x84
  34:   39 29 00 84     addi    r9,r9,132
                        36: R_PPC_ADDR16_LO     .text+0x84
  38:   90 01 00 40     stw     r0,64(r1)
  3c:   38 01 00 08     addi    r0,r1,8
  40:   90 01 00 38     stw     r0,56(r1)
  44:   91 81 00 54     stw     r12,84(r1)
  48:   91 21 00 3c     stw     r9,60(r1)
  4c:   bd c1 00 58     stmw    r14,88(r1)
  50:   90 21 00 44     stw     r1,68(r1)
  54:   48 00 00 01     bl      54 <tryfunc()+0x54>
                        54: R_PPC_REL24 _Unwind_SjLj_Register
  58:   38 61 00 08     addi    r3,r1,8
  5c:   48 00 00 01     bl      5c <tryfunc()+0x5c>
                        5c: R_PPC_REL24 Command::Command()
  60:   38 60 00 04     li      r3,4
  64:   48 00 00 01     bl      64 <tryfunc()+0x64>
                        64: R_PPC_REL24 __cxa_allocate_exception
  68:   38 00 00 01     li      r0,1
  6c:   3c 80 00 00     lis     r4,0
                        6e: R_PPC_ADDR16_HA     typeinfo for int
  70:   90 03 00 00     stw     r0,0(r3)
  74:   38 84 00 00     addi    r4,r4,0
                        76: R_PPC_ADDR16_LO     typeinfo for int
  78:   38 a0 00 00     li      r5,0
  7c:   90 01 00 1c     stw     r0,28(r1)
  80:   48 00 00 01     bl      80 <tryfunc()+0x80>
                        80: R_PPC_REL24 __cxa_throw
  84:   80 01 00 20     lwz     r0,32(r1)
  88:   7c 23 0b 78     mr      r3,r1
  8c:   90 01 00 4c     stw     r0,76(r1)
  90:   48 00 00 01     bl      90 <tryfunc()+0x90>
                        90: R_PPC_REL24 Command::~Command()
  94:   38 00 ff ff     li      r0,-1
  98:   80 61 00 4c     lwz     r3,76(r1)
  9c:   90 01 00 1c     stw     r0,28(r1)
  a0:   48 00 00 01     bl      a0 <tryfunc()+0xa0>
                        a0: R_PPC_REL24 _Unwind_SjLj_Resume


Program was compiled with the following command line options:
g++ -Os -msoft-float -fno-inline sample-program.cc -c

The -msoft-float and -Os aren't necessary to reproduce this problem,
but reduce clutter.

The optimization level doesn't matter.  Looking at a disassembly at
-O0 may shed more light on the problem:

Disassembly of tryfunc() at -O0 (all other CL arguments unchanged):
00000000 <tryfunc()>:
   0:   94 21 ff 50     stwu    r1,-176(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   7d 80 00 26     mfcr    r12
   c:   91 c1 00 68     stw     r14,104(r1)
  10:   91 e1 00 6c     stw     r15,108(r1)
  14:   92 01 00 70     stw     r16,112(r1)
  18:   92 21 00 74     stw     r17,116(r1)
  1c:   92 41 00 78     stw     r18,120(r1)
  20:   92 61 00 7c     stw     r19,124(r1)
  24:   92 81 00 80     stw     r20,128(r1)
  28:   92 a1 00 84     stw     r21,132(r1)
  2c:   92 c1 00 88     stw     r22,136(r1)
  30:   92 e1 00 8c     stw     r23,140(r1)
  34:   93 01 00 90     stw     r24,144(r1)
  38:   93 21 00 94     stw     r25,148(r1)
  3c:   93 41 00 98     stw     r26,152(r1)
  40:   93 61 00 9c     stw     r27,156(r1)
  44:   93 81 00 a0     stw     r28,160(r1)
  48:   93 a1 00 a4     stw     r29,164(r1)
  4c:   93 c1 00 a8     stw     r30,168(r1)
  50:   93 e1 00 ac     stw     r31,172(r1)
  54:   90 01 00 b4     stw     r0,180(r1)
  58:   91 81 00 64     stw     r12,100(r1)
  5c:   7c 3f 0b 78     mr      r31,r1
  60:   3d 20 00 00     lis     r9,0
                        62: R_PPC_ADDR16_HA     __gxx_personality_sj0
  64:   38 09 00 00     addi    r0,r9,0
                        66: R_PPC_ADDR16_LO     __gxx_personality_sj0
  68:   90 1f 00 30     stw     r0,48(r31)
  6c:   3d 20 00 00     lis     r9,0
                        6e: R_PPC_ADDR16_HA     .gcc_except_table
  70:   38 09 00 00     addi    r0,r9,0
                        72: R_PPC_ADDR16_LO     .gcc_except_table
  74:   90 1f 00 34     stw     r0,52(r31)
  78:   39 7f 00 38     addi    r11,r31,56
  7c:   38 1f 00 08     addi    r0,r31,8
  80:   90 0b 00 00     stw     r0,0(r11)
  84:   3d 20 00 00     lis     r9,0
                        86: R_PPC_ADDR16_HA     .text+0xec
  88:   38 09 00 ec     addi    r0,r9,236
                        8a: R_PPC_ADDR16_LO     .text+0xec
  8c:   90 0b 00 04     stw     r0,4(r11)
  90:   80 01 00 00     lwz     r0,0(r1)
  94:   90 0b 00 08     stw     r0,8(r11)
  98:   90 2b 00 0c     stw     r1,12(r11)
  9c:   38 1f 00 18     addi    r0,r31,24
  a0:   7c 03 03 78     mr      r3,r0
  a4:   48 00 00 01     bl      a4 <tryfunc()+0xa4>
                        a4: R_PPC_REL24 _Unwind_SjLj_Register
  a8:   38 1f 00 08     addi    r0,r31,8
  ac:   7c 03 03 78     mr      r3,r0
  b0:   48 00 00 01     bl      b0 <tryfunc()+0xb0>
                        b0: R_PPC_REL24 Command::Command()
  b4:   38 60 00 04     li      r3,4
  b8:   48 00 00 01     bl      b8 <tryfunc()+0xb8>
                        b8: R_PPC_REL24 __cxa_allocate_exception
  bc:   7c 60 1b 78     mr      r0,r3
  c0:   7c 0b 03 78     mr      r11,r0
  c4:   7d 69 5b 78     mr      r9,r11
  c8:   38 00 00 01     li      r0,1
  cc:   90 09 00 00     stw     r0,0(r9)
  d0:   7d 63 5b 78     mr      r3,r11
  d4:   3d 20 00 00     lis     r9,0
                        d6: R_PPC_ADDR16_HA     typeinfo for int
  d8:   38 00 00 01     li      r0,1
  dc:   90 1f 00 1c     stw     r0,28(r31)
  e0:   38 89 00 00     addi    r4,r9,0
                        e2: R_PPC_ADDR16_LO     typeinfo for int
  e4:   38 a0 00 00     li      r5,0
  e8:   48 00 00 01     bl      e8 <tryfunc()+0xe8>
                        e8: R_PPC_REL24 __cxa_throw
  ec:   3b ff ff f8     addi    r31,r31,-8
  f0:   80 1f 00 20     lwz     r0,32(r31)
  f4:   90 1f 00 50     stw     r0,80(r31)
  f8:   80 1f 00 50     lwz     r0,80(r31)
  fc:   90 1f 00 4c     stw     r0,76(r31)
 100:   38 1f 00 08     addi    r0,r31,8
 104:   7c 03 03 78     mr      r3,r0
 108:   48 00 00 01     bl      108 <tryfunc()+0x108>
                        108: R_PPC_REL24        Command::~Command()
 10c:   80 1f 00 4c     lwz     r0,76(r31)
 110:   90 1f 00 50     stw     r0,80(r31)
 114:   38 00 ff ff     li      r0,-1
 118:   90 1f 00 1c     stw     r0,28(r31)
 11c:   80 7f 00 50     lwz     r3,80(r31)
 120:   48 00 00 01     bl      120 <tryfunc()+0x120>
                        120: R_PPC_REL24        _Unwind_SjLj_Resume


I have not been able to reproduce this problem on compilers
targeted to x86.
Comment 1 Aaron Graham 2006-07-26 15:42:52 UTC
Actually, the for loop is unnecessary.  Here's a shorter version that displays the same problem:

struct Command {
  virtual ~Command() {}
};
void tryfunc() {
  Command cmd;
  throw 1;
}
Comment 2 Aaron Graham 2006-07-27 02:47:22 UTC
This bug appears to happen only when the compiler is built with SjLj exceptions.  When the compiler is built for dwarf2 exceptions, the output for this test case (and for my original problem area) is correct:

00000000 <tryfunc()>:
   0:   94 21 ff d0     stwu    r1,-48(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   38 61 00 08     addi    r3,r1,8
   c:   90 01 00 34     stw     r0,52(r1)
  10:   bf a1 00 24     stmw    r29,36(r1)
  14:   48 00 00 01     bl      14 <tryfunc()+0x14>
                        14: R_PPC_REL24 Command::Command()
  18:   38 60 00 04     li      r3,4
  1c:   48 00 00 01     bl      1c <tryfunc()+0x1c>
                        1c: R_PPC_REL24 __cxa_allocate_exception
  20:   38 00 00 01     li      r0,1
  24:   3c 80 00 00     lis     r4,0
                        26: R_PPC_ADDR16_HA     typeinfo for int
  28:   90 03 00 00     stw     r0,0(r3)
  2c:   38 84 00 00     addi    r4,r4,0
                        2e: R_PPC_ADDR16_LO     typeinfo for int
  30:   38 a0 00 00     li      r5,0
  34:   48 00 00 01     bl      34 <tryfunc()+0x34>
                        34: R_PPC_REL24 __cxa_throw
  38:   7c 7d 1b 78     mr      r29,r3
  3c:   38 61 00 08     addi    r3,r1,8
  40:   48 00 00 01     bl      40 <tryfunc()+0x40>
                        40: R_PPC_REL24 Command::~Command()
  44:   7f a3 eb 78     mr      r3,r29
  48:   48 00 00 01     bl      48 <tryfunc()+0x48>
                        48: R_PPC_REL24 _Unwind_Resume
Comment 3 Aaron Graham 2006-08-05 16:58:15 UTC
This may not be related to 19774 as I had originally thought.  This failure case is new as of 4.1.0.  GCC version 4.0.3 gets it right:

g++-4.0.3 -Os -msoft-float -mcpu=405 -c bug.cc -fno-inline -Wall -dA

00000000 <tryfunc()>:
   0:   3d 20 00 00     lis     r9,0
                        2: R_PPC_ADDR16_HA      __gxx_personality_sj0
   4:   94 21 ff 60     stwu    r1,-160(r1)
   8:   7c 08 02 a6     mflr    r0
   c:   39 29 00 00     addi    r9,r9,0
                        e: R_PPC_ADDR16_LO      __gxx_personality_sj0
  10:   91 21 00 30     stw     r9,48(r1)
  14:   3d 20 00 00     lis     r9,0
                        16: R_PPC_ADDR16_HA     .gcc_except_table
  18:   90 01 00 a4     stw     r0,164(r1)
  1c:   39 29 00 00     addi    r9,r9,0
                        1e: R_PPC_ADDR16_LO     .gcc_except_table
  20:   80 01 00 00     lwz     r0,0(r1)
  24:   7d 80 00 26     mfcr    r12
  28:   91 21 00 34     stw     r9,52(r1)
  2c:   3d 20 00 00     lis     r9,0
                        2e: R_PPC_ADDR16_HA     .text+0x84
  30:   39 29 00 84     addi    r9,r9,132
                        32: R_PPC_ADDR16_LO     .text+0x84
  34:   38 61 00 18     addi    r3,r1,24
  38:   90 01 00 40     stw     r0,64(r1)
  3c:   38 01 00 08     addi    r0,r1,8
  40:   90 01 00 38     stw     r0,56(r1)
  44:   91 81 00 54     stw     r12,84(r1)
  48:   91 21 00 3c     stw     r9,60(r1)
  4c:   bd c1 00 58     stmw    r14,88(r1)
  50:   90 21 00 44     stw     r1,68(r1)
  54:   48 00 00 01     bl      54 <tryfunc()+0x54>
                        54: R_PPC_REL24 _Unwind_SjLj_Register
  58:   38 61 00 08     addi    r3,r1,8
  5c:   48 00 00 01     bl      5c <tryfunc()+0x5c>
                        5c: R_PPC_REL24 Command::Command()
  60:   38 60 00 04     li      r3,4
  64:   48 00 00 01     bl      64 <tryfunc()+0x64>
                        64: R_PPC_REL24 __cxa_allocate_exception
  68:   38 00 00 01     li      r0,1
  6c:   90 03 00 00     stw     r0,0(r3)
  70:   3c 80 00 00     lis     r4,0
                        72: R_PPC_ADDR16_HA     typeinfo for int
  74:   90 01 00 1c     stw     r0,28(r1)
  78:   38 84 00 00     addi    r4,r4,0
                        7a: R_PPC_ADDR16_LO     typeinfo for int
  7c:   38 a0 00 00     li      r5,0
  80:   48 00 00 01     bl      80 <tryfunc()+0x80>
                        80: R_PPC_REL24 __cxa_throw
  84:   80 01 00 20     lwz     r0,32(r1)
  88:   38 61 00 08     addi    r3,r1,8
  8c:   90 01 00 4c     stw     r0,76(r1)
  90:   48 00 00 01     bl      90 <tryfunc()+0x90>
                        90: R_PPC_REL24 Command::~Command()
  94:   80 61 00 4c     lwz     r3,76(r1)
  98:   38 00 ff ff     li      r0,-1
  9c:   90 01 00 1c     stw     r0,28(r1)
  a0:   48 00 00 01     bl      a0 <tryfunc()+0xa0>
                        a0: R_PPC_REL24 _Unwind_SjLj_Resume

Additionally, I tested this case in 3.4.1, 3.4.2, and 4.0.0, and they all get it right as well.
Comment 4 Aaron Graham 2006-08-05 21:11:19 UTC
Actually, it turns out that gcc versions before the 4.1 series all get it wrong too, at -O0.  The bug gets masked when introducing optimization.  Here is the -O0 output from 4.0.3:

g++-4.0.3 -O0 -msoft-float -mcpu=405 -c bug.cc -fno-inline

00000000 <tryfunc()>:
   0:   94 21 ff 50     stwu    r1,-176(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   7d 80 00 26     mfcr    r12
   c:   91 c1 00 68     stw     r14,104(r1)
  10:   91 e1 00 6c     stw     r15,108(r1)
  14:   92 01 00 70     stw     r16,112(r1)
  18:   92 21 00 74     stw     r17,116(r1)
  1c:   92 41 00 78     stw     r18,120(r1)
  20:   92 61 00 7c     stw     r19,124(r1)
  24:   92 81 00 80     stw     r20,128(r1)
  28:   92 a1 00 84     stw     r21,132(r1)
  2c:   92 c1 00 88     stw     r22,136(r1)
  30:   92 e1 00 8c     stw     r23,140(r1)
  34:   93 01 00 90     stw     r24,144(r1)
  38:   93 21 00 94     stw     r25,148(r1)
  3c:   93 41 00 98     stw     r26,152(r1)
  40:   93 61 00 9c     stw     r27,156(r1)
  44:   93 81 00 a0     stw     r28,160(r1)
  48:   93 a1 00 a4     stw     r29,164(r1)
  4c:   93 c1 00 a8     stw     r30,168(r1)
  50:   93 e1 00 ac     stw     r31,172(r1)
  54:   90 01 00 b4     stw     r0,180(r1)
  58:   91 81 00 64     stw     r12,100(r1)
  5c:   7c 3f 0b 78     mr      r31,r1
  60:   3d 20 00 00     lis     r9,0
                        62: R_PPC_ADDR16_HA     __gxx_personality_sj0
  64:   38 09 00 00     addi    r0,r9,0
                        66: R_PPC_ADDR16_LO     __gxx_personality_sj0
  68:   90 1f 00 30     stw     r0,48(r31)
  6c:   3d 20 00 00     lis     r9,0
                        6e: R_PPC_ADDR16_HA     .gcc_except_table
  70:   38 09 00 00     addi    r0,r9,0
                        72: R_PPC_ADDR16_LO     .gcc_except_table
  74:   90 1f 00 34     stw     r0,52(r31)
  78:   39 7f 00 38     addi    r11,r31,56
  7c:   38 1f 00 08     addi    r0,r31,8
  80:   90 0b 00 00     stw     r0,0(r11)
  84:   3d 20 00 00     lis     r9,0
                        86: R_PPC_ADDR16_HA     .text+0xe8
  88:   38 09 00 e8     addi    r0,r9,232
                        8a: R_PPC_ADDR16_LO     .text+0xe8
  8c:   90 0b 00 04     stw     r0,4(r11)
  90:   80 01 00 00     lwz     r0,0(r1)
  94:   90 0b 00 08     stw     r0,8(r11)
  98:   90 2b 00 0c     stw     r1,12(r11)
  9c:   38 1f 00 18     addi    r0,r31,24
  a0:   7c 03 03 78     mr      r3,r0
  a4:   48 00 00 01     bl      a4 <tryfunc()+0xa4>
                        a4: R_PPC_REL24 _Unwind_SjLj_Register
  a8:   38 7f 00 08     addi    r3,r31,8
  ac:   48 00 00 01     bl      ac <tryfunc()+0xac>
                        ac: R_PPC_REL24 Command::Command()
  b0:   38 60 00 04     li      r3,4
  b4:   48 00 00 01     bl      b4 <tryfunc()+0xb4>
                        b4: R_PPC_REL24 __cxa_allocate_exception
  b8:   7c 60 1b 78     mr      r0,r3
  bc:   7c 0b 03 78     mr      r11,r0
  c0:   7d 69 5b 78     mr      r9,r11
  c4:   38 00 00 01     li      r0,1
  c8:   90 09 00 00     stw     r0,0(r9)
  cc:   7d 63 5b 78     mr      r3,r11
  d0:   3d 20 00 00     lis     r9,0
                        d2: R_PPC_ADDR16_HA     typeinfo for int
  d4:   38 00 00 01     li      r0,1
  d8:   90 1f 00 1c     stw     r0,28(r31)
  dc:   38 89 00 00     addi    r4,r9,0
                        de: R_PPC_ADDR16_LO     typeinfo for int
  e0:   38 a0 00 00     li      r5,0
  e4:   48 00 00 01     bl      e4 <tryfunc()+0xe4>
                        e4: R_PPC_REL24 __cxa_throw
  e8:   38 1f ff f8     addi    r0,r31,-8
  ec:   7c 1f 03 78     mr      r31,r0
  f0:   80 1f 00 20     lwz     r0,32(r31)
  f4:   90 1f 00 50     stw     r0,80(r31)
  f8:   80 1f 00 50     lwz     r0,80(r31)
  fc:   90 1f 00 4c     stw     r0,76(r31)
 100:   38 7f 00 08     addi    r3,r31,8
 104:   48 00 00 01     bl      104 <tryfunc()+0x104>
                        104: R_PPC_REL24        Command::~Command()
 108:   80 1f 00 4c     lwz     r0,76(r31)
 10c:   90 1f 00 50     stw     r0,80(r31)
 110:   38 00 ff ff     li      r0,-1
 114:   90 1f 00 1c     stw     r0,28(r31)
 118:   80 7f 00 50     lwz     r3,80(r31)
 11c:   48 00 00 01     bl      11c <tryfunc()+0x11c>

In summary: All gcc versions since 3.4.0 (inclusive) display this bug at -O0, but until 4.1.*, the bug didn't appear in optimized output.  In 4.1.0 and 4.1.1, the bug appears consistently at all optimization levels.
Comment 5 Janis Johnson 2006-08-08 20:35:02 UTC
David asked me to run a regression hunt on this, but I'm very confused about when the problem occurs, since some of the submitter's examples look just fine to me.  Here's what it looks like to me, based on the generated code provided:

submitter's description
  4.1.1 with -Os has bad call to destructor
  4.1.1 with -O0 has good call to destructor
comment #3
  4.0.3 with -Os has good call to destructor
  4.0.3 with -O0 has good call to destructor

Am I missing something?

I can reproduce the bad call with the 4.1-branch for powerpc-linux configured with --enable-sjlj-exceptions.  It uses the wrong address for -Os, but looks fine for -O[0123].
Comment 6 Janis Johnson 2006-08-08 20:49:25 UTC
Oops, I meant that for 4.1 powerpc-linux with sjlj exceptions, it passes for -O0 but fails for -O[s123].  I'm trying 4.0 now, then will back up if I see problems with 4.0.
Comment 7 Janis Johnson 2006-08-08 21:08:18 UTC
I don't get any failures with the 4.0-branch for powerpc-linux with sjlj exceptions.  Here's the executable test case I'm using for a regression hunt:

------------------------
extern "C" void abort (void);
void *pc, *pd;
                                                                                
struct Command {
  Command() { pc = (void *)this; }
  ~Command() { pd = (void *)this; }
};
                                                                                
void tryfunc() {
  Command cmd;
  throw 1;
}
                                                                                
int main()
{
  try { tryfunc(); }
  catch (int) { }
  if (pc != pd) abort ();
}
------------------------
Comment 8 Aaron Graham 2006-08-08 23:21:54 UTC
(In reply to comment #7)
> I don't get any failures with the 4.0-branch for powerpc-linux with sjlj
> exceptions.  Here's the executable test case I'm using for a regression hunt:

Janis,

Thank you for looking into this.  I built gcc 4.1.1 for powerpc-linux, and although I don't have the hardware in front of me right now to run the executable, I can see just by looking at the disassembly that the bug exists for -O0:

disassembly of tryfunc(), compiled with:
powerpc-linux-g++ -O0 -msoft-float -mcpu=405 -c bug.cc -fno-inline

00000000 <tryfunc()>:
   0:   94 21 ff 50     stwu    r1,-176(r1)
   4:   7c 08 02 a6     mflr    r0
   8:   7d 80 00 26     mfcr    r12
   c:   91 c1 00 68     stw     r14,104(r1)
  10:   91 e1 00 6c     stw     r15,108(r1)
  14:   92 01 00 70     stw     r16,112(r1)
  18:   92 21 00 74     stw     r17,116(r1)
  1c:   92 41 00 78     stw     r18,120(r1)
  20:   92 61 00 7c     stw     r19,124(r1)
  24:   92 81 00 80     stw     r20,128(r1)
  28:   92 a1 00 84     stw     r21,132(r1)
  2c:   92 c1 00 88     stw     r22,136(r1)
  30:   92 e1 00 8c     stw     r23,140(r1)
  34:   93 01 00 90     stw     r24,144(r1)
  38:   93 21 00 94     stw     r25,148(r1)
  3c:   93 41 00 98     stw     r26,152(r1)
  40:   93 61 00 9c     stw     r27,156(r1)
  44:   93 81 00 a0     stw     r28,160(r1)
  48:   93 a1 00 a4     stw     r29,164(r1)
  4c:   93 c1 00 a8     stw     r30,168(r1)
  50:   93 e1 00 ac     stw     r31,172(r1)
  54:   90 01 00 b4     stw     r0,180(r1)
  58:   91 81 00 64     stw     r12,100(r1)
  5c:   7c 3f 0b 78     mr      r31,r1
  60:   3d 20 00 00     lis     r9,0
                        62: R_PPC_ADDR16_HA     __gxx_personality_sj0
  64:   38 09 00 00     addi    r0,r9,0
                        66: R_PPC_ADDR16_LO     __gxx_personality_sj0
  68:   90 1f 00 30     stw     r0,48(r31)
  6c:   3d 20 00 00     lis     r9,0
                        6e: R_PPC_ADDR16_HA     .gcc_except_table
  70:   38 09 00 00     addi    r0,r9,0
                        72: R_PPC_ADDR16_LO     .gcc_except_table
  74:   90 1f 00 34     stw     r0,52(r31)
  78:   39 7f 00 38     addi    r11,r31,56
  7c:   38 1f 00 08     addi    r0,r31,8
  80:   90 0b 00 00     stw     r0,0(r11)
  84:   3d 20 00 00     lis     r9,0
                        86: R_PPC_ADDR16_HA     .text+0xec
  88:   38 09 00 ec     addi    r0,r9,236
                        8a: R_PPC_ADDR16_LO     .text+0xec
  8c:   90 0b 00 04     stw     r0,4(r11)
  90:   80 01 00 00     lwz     r0,0(r1)
  94:   90 0b 00 08     stw     r0,8(r11)
  98:   90 2b 00 0c     stw     r1,12(r11)
  9c:   38 1f 00 18     addi    r0,r31,24
  a0:   7c 03 03 78     mr      r3,r0
  a4:   48 00 00 01     bl      a4 <tryfunc()+0xa4>
                        a4: R_PPC_REL24 _Unwind_SjLj_Register
  a8:   38 1f 00 08     addi    r0,r31,8
  ac:   7c 03 03 78     mr      r3,r0
  b0:   48 00 00 01     bl      b0 <tryfunc()+0xb0>
                        b0: R_PPC_REL24 Command::Command()
  b4:   38 60 00 04     li      r3,4
  b8:   48 00 00 01     bl      b8 <tryfunc()+0xb8>
                        b8: R_PPC_REL24 __cxa_allocate_exception
  bc:   7c 60 1b 78     mr      r0,r3
  c0:   7c 0b 03 78     mr      r11,r0
  c4:   7d 69 5b 78     mr      r9,r11
  c8:   38 00 00 01     li      r0,1
  cc:   90 09 00 00     stw     r0,0(r9)
  d0:   7d 63 5b 78     mr      r3,r11
  d4:   3d 20 00 00     lis     r9,0
                        d6: R_PPC_ADDR16_HA     typeinfo for int
  d8:   38 00 00 01     li      r0,1
  dc:   90 1f 00 1c     stw     r0,28(r31)
  e0:   38 89 00 00     addi    r4,r9,0
                        e2: R_PPC_ADDR16_LO     typeinfo for int
  e4:   38 a0 00 00     li      r5,0
  e8:   48 00 00 01     bl      e8 <tryfunc()+0xe8>
                        e8: R_PPC_REL24 __cxa_throw
  ec:   3b ff ff f8     addi    r31,r31,-8
  f0:   80 1f 00 20     lwz     r0,32(r31)
  f4:   90 1f 00 50     stw     r0,80(r31)
  f8:   80 1f 00 50     lwz     r0,80(r31)
  fc:   90 1f 00 4c     stw     r0,76(r31)
 100:   38 1f 00 08     addi    r0,r31,8
 104:   7c 03 03 78     mr      r3,r0
 108:   48 00 00 01     bl      108 <tryfunc()+0x108>
                        108: R_PPC_REL24        Command::~Command()
 10c:   80 1f 00 4c     lwz     r0,76(r31)
 110:   90 1f 00 50     stw     r0,80(r31)
 114:   38 00 ff ff     li      r0,-1
 118:   90 1f 00 1c     stw     r0,28(r31)
 11c:   80 7f 00 50     lwz     r3,80(r31)
 120:   48 00 00 01     bl      120 <tryfunc()+0x120>
                        120: R_PPC_REL24        _Unwind_SjLj_Resume

At addresses 0xa8-0xb0, the constructor is called with r31+8.  At addresses 0x100-0x108, the destructor is also called with r31+8.  However, r31 is modified at address 0xec (it subtracts 8)!  Is this not what you're seeing?  The problem is easier to spot in optimized disassemblies because the extra subtraction has been optimized to coincide with the placing of the object's address into r3.

Your code is a little different from mine.  It was easier for me to reproduce the bug with a virtual ~Command() destructor and with -fno-inline on the command line.  Otherwise, the functions would get inlined or the object would get optimized away completely.
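
For anyone who wants a single self-checking file, the two test cases can be combined.  The following is only a sketch: it keeps the address comparison from comment #7, restores the virtual destructor from the original report, and uses GCC's noinline attribute in place of -fno-inline (my substitution, not something either test case does).  The names constructed_at, destroyed_at, and destructor_address_matches are made up for illustration.

```cpp
// Sketch only: address check from comment #7 plus the virtual destructor
// from the original report.  The noinline attribute stands in for
// -fno-inline, which is an assumption of this example.
#include <cstddef>

static void *constructed_at = NULL;  // address seen by the constructor
static void *destroyed_at = NULL;    // address seen by the destructor

struct Command {
  Command() __attribute__((noinline));
  virtual ~Command() __attribute__((noinline));
};

Command::Command() { constructed_at = static_cast<void *>(this); }
Command::~Command() { destroyed_at = static_cast<void *>(this); }

__attribute__((noinline)) void tryfunc() {
  Command cmd;
  throw 1;  // cleanup for cmd must run while unwinding
}

// Hypothetical helper: true when the cleanup passed the destructor the
// same object address that the constructor received.
bool destructor_address_matches() {
  constructed_at = NULL;
  destroyed_at = NULL;
  try { tryfunc(); }
  catch (int) { }
  return constructed_at != NULL && constructed_at == destroyed_at;
}
```

Calling destructor_address_matches() from main and aborting on a false result gives the same pass/fail behaviour as the test in comment #7, so it should be usable for an automated hunt.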

I spent a few hours on this problem on Saturday (Aug 5), doing my own little regression hunt.  I stopped the regression hunt at 3.4.0, because before that, the exception handling code starts to look entirely different.  As I said in Comment #4, all versions from 3.4.0 display this problem at -O0, but only 4.1 and above display it at -O[s123] as well.

I'll retry the 4.0 series again as soon as I can.

Since some of my previous comments are confusing, I'll reiterate that Comments #2 and #3 describe correct output, from gcc 4.1.1 with dwarf2-eh and from gcc 4.0.3 -Os respectively.  Comments #0 and #4 describe buggy compiler output.  So you were correct when you said that "some of the submitter's examples look just fine".

Thanks again for helping out.  I'll be working on this problem as time permits.
Comment 9 Janis Johnson 2006-08-09 00:13:07 UTC
Aaron, I had not noticed that the stack pointer is modified in some of the code that I had thought looked correct.  My example works correctly with -O0 for powerpc-linux with sjlj exceptions for 4.0 and 4.1 branches, but I see now that it gets the wrong code from a cross compiler for powerpc-wrs-vxworks for both the 4.0 and 4.1 branches.  I'll investigate that further.  In the meantime, my automated regression hunt for the start of failures with optimization is almost done.
Comment 10 Janis Johnson 2006-08-09 00:21:53 UTC
A regression hunt using the testcase from comment #7 compiled with -O1, with a powerpc-linux compiler configured with --enable-sjlj-exceptions, identified the following patch:

    http://gcc.gnu.org/viewcvs?view=rev&rev=101348
 
    r101348 | jakub | 2005-06-27 07:41:16 +0000 (Mon, 27 Jun 2005)

I'm not sure how relevant this is, given that the same testcase looks as if it would fail with -O0 before and after that change for powerpc-wrs-vxworks.

Aaron, I have a setup that makes it very easy for me to run automated regression hunts when I start with an automated test that clearly passes or fails.  I don't yet have such a test for the -O0 case for powerpc-wrs-vxworks.
Comment 11 David Edelsohn 2006-08-09 00:39:20 UTC
Let's consider this PR as only the -O1 and above bug that has been confirmed and regression hunted.  Another PR can be opened for the -O0 bug that does not appear to be as general -- it may be a problem with WRS running initializers or initializing the frame tables.
Comment 12 Aaron Graham 2006-08-09 01:30:57 UTC
(In reply to comment #11)
[...]
> it may be a problem with WRS running initializers or
> initializing the frame tables.

Both of the gcc builds I'm testing with are cross compilers (host i686-pc-linux-gnu):

$ powerpc-linux-gcc -v
Using built-in specs.
Target: powerpc-linux
Configured with: ../gcc-4.1.1/configure --target=powerpc-linux --prefix=/content/opt --with-gnu-as --with-gnu-ld --disable-shared --disable-libssp --enable-languages=c,c++ --enable-libstdcxx-allocator=mt --enable-sjlj-exceptions
Thread model: posix
gcc version 4.1.1

$ powerpc-wrs-vxworks-gcc -v
Using built-in specs.
Target: powerpc-wrs-vxworks
Configured with: ../gcc-4.1.1/configure --target=powerpc-wrs-vxworks --prefix=/opt/vxppc --with-headers=/home/agraham/gnu/vxh --with-gnu-as --with-gnu-ld --disable-shared --disable-libssp --enable-languages=c,c++ --enable-libstdcxx-allocator=mt --enable-sjlj-exceptions
Thread model: vxworks
gcc version 4.1.1

Using both of the above compilers, the disassembly of tryfunc() looks _exactly_ the same for both targets when compiling my test case with the following command line switches:
[...]-g++ -O0 -msoft-float -mcpu=405 -c bug.cc

...and both appear to be buggy.  I can attach those disassemblies, if anyone wants confirmation of this assertion.

Perhaps this isn't a WRS issue so much as a cross-compiler issue.
Comment 13 Aaron Graham 2006-08-11 02:42:43 UTC
The problem goes away (at least in this case) at optimization levels -O[s123] (but remains at -O0) when compiling with -fstack-protector.  Of course, that's not really an acceptable workaround for most people affected by this problem.
Comment 14 Aaron Graham 2006-08-14 18:16:47 UTC
The following patch to the 4.1.1 release code appears to fix the problem.  Though I have not been able to convince myself that this is the CORRECT solution to the problem (and am doubtful that it is), testing this fix with a very large and complex application has been very successful.

--- old/gcc/config/rs6000/rs6000.c	2006-08-10 11:26:48.000000000 -0400
+++ new/gcc/config/rs6000/rs6000.c	2006-08-10 22:46:55.000000000 -0400
@@ -18993,7 +18993,7 @@
   HOST_WIDE_INT offset;
 
   if (from == HARD_FRAME_POINTER_REGNUM && to == STACK_POINTER_REGNUM)
-    offset = info->push_p ? 0 : -info->total_size;
+    offset = info->push_p ? 8 : -info->total_size;
   else if (from == FRAME_POINTER_REGNUM && to == STACK_POINTER_REGNUM)
     {
       offset = info->push_p ? 0 : -info->total_size;
Comment 15 Jason Merrill 2006-09-08 07:14:33 UTC
The bug is in expand_builtin_setjmp_receiver:

  /* Now put in the code to restore the frame pointer, and argument
     pointer, if needed.  */
[...]
  emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);

This is wrong for PPC, and indeed any target with non-zero STARTING_FRAME_OFFSET.  A casual glance at the virtual register elimination code in function.c makes it clear that these two are not always equal.  We could fix this by changing it to

  emit_move_insn (virtual_stack_vars_rtx,
                  plus_constant (hard_frame_pointer_rtx,
                                 STARTING_FRAME_OFFSET));

but is there any reason not to just do

  emit_move_insn (frame_pointer_rtx, hard_frame_pointer_rtx);

?
Comment 16 Jason Merrill 2006-09-08 07:37:56 UTC
The line in question dates back to when __builtin_setjmp was first added in 1996.
Comment 17 Jason Merrill 2006-09-08 22:37:10 UTC
Hmm, it seems things are a bit more complicated than I thought.  Without my change to expand_builtin_setjmp_receiver, Janis's test passes at -O0 and fails at -O1; the adjustment of r31 at -O0 is actually correct.

However, with my change, it fails at -O0 and passes at -O1.  It seems that there's a problem in the sjlj unwinder.  Continuing to investigate...
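
The statement that the r31 adjustment at -O0 is correct can be checked by hand against the listing in comment #8.  The toy model below is one reading of that listing, not GCC code: it assumes the unwinder reloads r31 from the first jump-buffer slot before branching to the landing pad, and every name and the sample address are hypothetical.

```cpp
// Toy model of the -O0 listing in comment #8.  No GCC internals are
// called; the constants are read off the disassembly and the function
// names are invented for this sketch.
#include <cstdint>

const std::uint32_t kFrameOffset = 8;   // value added at 0x7c before the store at 0x80
const std::uint32_t kObjectOffset = 8;  // cmd is passed as r31+8 at 0xa8 and 0x100

// 0x7c-0x80: the prologue stores r31+8 into the first jump-buffer slot.
std::uint32_t saved_slot(std::uint32_t r31) { return r31 + kFrameOffset; }

// 0xec: the landing pad subtracts 8 from r31, which (by assumption) was
// just reloaded from that slot.
std::uint32_t receiver_r31(std::uint32_t slot) { return slot - kFrameOffset; }

// Address handed to the constructor or destructor for a given r31.
std::uint32_t object_address(std::uint32_t r31) { return r31 + kObjectOffset; }

// What the destructor would receive if the subtraction at 0xec were missing.
std::uint32_t object_address_without_adjust(std::uint32_t slot) {
  return slot + kObjectOffset;
}
```

Under that assumption the subtraction at 0xec restores the original r31, so the constructor and destructor see the same address; without it the destructor would be off by 8.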
Comment 18 Jason Merrill 2006-09-08 22:52:05 UTC
Janis: the part of the -fstack-protector patch that seems most plausible as the cause of this problem was the change to expand_function_end to call sjlj_emit_function_exit_after at a different point.  But that section of the patch was reverted in revision 101673.  Did that not fix the test?  I'm wondering if the current breakage is from a later change.

Comment 19 Jason Merrill 2006-09-09 07:16:42 UTC
Yep, after merging the 101673 change back in, the compiler works up until 101467, at which point Jakub's ppc sfp change seems to break the testcase (at -O1).
Comment 20 Jason Merrill 2006-09-12 18:02:51 UTC
Subject: Bug 28493

Author: jason
Date: Tue Sep 12 18:02:36 2006
New Revision: 116900

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116900
Log:
        PR middle-end/28493
        * builtins.c (expand_builtin_setjmp_receiver): Clobber
        hard_frame_pointer_rtx after using it to update the frame pointer.

Added:
    trunk/gcc/testsuite/g++.dg/eh/unwind1.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtins.c

Comment 21 Jason Merrill 2006-09-14 23:13:43 UTC
Subject: Bug 28493

Author: jason
Date: Thu Sep 14 23:13:30 2006
New Revision: 116955

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116955
Log:
        PR middle-end/28493
        * builtins.c (expand_builtin_setjmp_receiver): Clobber
        hard_frame_pointer_rtx after using it to update the frame pointer.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/eh/unwind1.C
      - copied unchanged from r116900, trunk/gcc/testsuite/g++.dg/eh/unwind1.C
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/builtins.c

Comment 22 Andrew Pinski 2006-09-15 04:37:04 UTC
Fixed.