GCC 2.95.2 for AIX handles 'long long' incorrectly

Sat Mar 11 13:44:00 GMT 2000

On Fri, 10 Mar 2000, Franz Sirl wrote:
>At 02:38 09.03.00, Geoff Keating wrote:
>
>>Wayne Scott <wscott@ichips.intel.com> writes:
>>
>> > Attached is a simple testcase of a function that returns a 'long long'
>> >
>> > GCC version 2.95.2 produced incorrect results when the optimizer is
>> > enabled.
>> >
>> > # this is an IBM 397 with 1G of memory
>> > $ uname -a
>> > AIX pdxcs566 1 4 000111179400
>>
>>This appears to be fixed in the newppc-branch and so will be fixed in
>>the next release.  I'm not sure what patch fixed it or whether it's
>>fixed in the mainline CVS.
>
>The patch that fixed it is Richard Henderson's stupid.c-die-die-die patch. I 
>tried parts of the patch on the gcc-2_95-branch and this is the one that 
>fixes the testcase:
>
>Index: stmt.c
>===================================================================
>RCS file: /cvs/gcc/egcs/gcc/stmt.c,v
>retrieving revision 1.71.4.4
>diff -u -p -r1.71.4.4 stmt.c
>--- stmt.c      2000/01/07 22:41:20     1.71.4.4
>+++ stmt.c      2000/03/10 11:56:01
>@@ -2521,7 +2521,8 @@ expand_value_return (val)
>  #endif
>        emit_move_insn (return_reg, val);
>      }
>-  if (GET_CODE (return_reg) == REG
>+  if (obey_regdecls
>+      && GET_CODE (return_reg) == REG
>        && REGNO (return_reg) < FIRST_PSEUDO_REGISTER)
>      emit_insn (gen_rtx_USE (VOIDmode, return_reg));
>    /* Handle calls that return values in multiple non-contiguous locations.
>
>Somehow the USEs confuse other parts of the compiler, but unfortunately the 
>patch breaks the bootstrap of the gcc-2_95-branch :-(, so I'm probably 
>missing something here.
>
>The question now is how do we want to proceed here? Should stupid.c die in 
>the gcc-2_95-branch too? Or should I try to find another solution, possibly 
>based on the above patch and other parts of the stupid.c patch? Any 
>ideas/hints?

Ugh, actually the bug is something else entirely, and Richard's
stupid.c-diediedie patch just hides the real problem in
emit-rtl.c:operand_subword(). What happens is that during jump_optimize the
compiler gets confused about the high and low words of a reg:DI if it contains
a hardreg. If the generation of USEs is suppressed, as it is without stupid.c,
the compiler uses pseudos instead, which merely hides the bug.

I removed this optimization:

Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/emit-rtl.c,v
retrieving revision 1.59.4.4
diff -u -p -r1.59.4.4 emit-rtl.c

--- emit-rtl.c  1999/08/11 07:28:52     1.59.4.4
+++ emit-rtl.c  2000/03/11 20:47:08
@@ -1256,19 +1256,8 @@ operand_subword (op, i, validate_address
          && (! HARD_REGNO_MODE_OK (REGNO (op), word_mode)
              || ! HARD_REGNO_MODE_OK (REGNO (op) + i, word_mode)))
        return 0;
-      else if (REGNO (op) >= FIRST_PSEUDO_REGISTER
-              || (REG_FUNCTION_VALUE_P (op)
-                  && rtx_equal_function_value_matters)
-              /* We want to keep the stack, frame, and arg pointers
-                 special.  */
-              || op == frame_pointer_rtx
-#if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
-              || op == arg_pointer_rtx
-#endif
-              || op == stack_pointer_rtx)
-       return gen_rtx_SUBREG (word_mode, op, i);
       else
-       return gen_rtx_REG (word_mode, REGNO (op) + i);
+       return gen_rtx_SUBREG (word_mode, op, i);
     }
   else if (GET_CODE (op) == SUBREG)
     return gen_rtx_SUBREG (word_mode, SUBREG_REG (op), i + SUBREG_WORD (op));


and everything is OK afterwards. The compiler bootstraps and passes the
testsuite without regressions on powerpc-linux-gnu.

I used Wayne's example slightly modified:

#include <stdlib.h>   /* abort, exit */

typedef unsigned long long uint64;
const uint64 bigconst = 1ULL << 34;

int a = 1;

static
uint64 getmask(void)
{
    if (a)
      return bigconst;
    else
      return 0;
}

int main(void)
{
    uint64 f = getmask();
    if (f != bigconst) abort ();
    exit (0);
}

Without my patch this was the offending RTL after jump:

(insn 51 53 52 (parallel[
            (set (subreg:SI (reg:DI 95) 0)
                (and:SI (subreg:SI (reg:DI 89) 0)
                    (reg:SI 3 r3)))
            (clobber (scratch:CC))
        ] ) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:DI 89)
        (expr_list:REG_NO_CONFLICT (reg/i:DI 3 r3)
            (nil))))

(insn 52 51 55 (parallel[
            (set (subreg:SI (reg:DI 95) 1)
                (and:SI (subreg:SI (reg:DI 89) 1)
                    (reg:SI 4 r4)))
            (clobber (scratch:CC))
        ] ) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:DI 89)
        (expr_list:REG_NO_CONFLICT (reg/i:DI 3 r3)
            (nil))))

(insn 55 52 57 (set (reg:DI 95)
        (reg:DI 95)) -1 (nil)
    (insn_list:REG_RETVAL 53 (expr_list:REG_EQUAL (and:DI (reg:DI 89)
                (reg/i:DI 3 r3))
            (nil))))

(insn 57 55 59 (set (reg:DI 89)
        (reg:DI 95)) -1 (nil)
    (nil))

(insn 59 57 34 (set (reg/i:DI 3 r3)
        (reg:DI 89)) -1 (nil)
    (nil))

(insn 34 59 0 (use (reg/i:DI 3 r3)) -1 (nil)
    (nil))

Note that the subreg index 0 in (and:SI (subreg:SI (reg:DI 89) 0) corresponds
to the index passed to operand_subword. It seems this index always means
0==lowpart and 1==highpart, contrary to the comment before
operand_subword()... For a hardreg on PPC, by contrast, rN+0 is the high part
and rN+1 is the low part.

Now with my patch the RTL looks like this:

(insn 52 54 53 (parallel[
            (set (subreg:SI (reg:DI 96) 0)
                (and:SI (subreg:SI (reg:DI 88) 0)
                    (subreg:SI (reg:DI 90) 0)))
            (clobber (scratch:CC))
        ] ) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:DI 88)
        (expr_list:REG_NO_CONFLICT (reg:DI 90)
            (nil))))

(insn 53 52 56 (parallel[
            (set (subreg:SI (reg:DI 96) 1)
                (and:SI (subreg:SI (reg:DI 88) 1)
                    (subreg:SI (reg:DI 90) 1)))
            (clobber (scratch:CC))
        ] ) -1 (nil)
    (expr_list:REG_NO_CONFLICT (reg:DI 88)
        (expr_list:REG_NO_CONFLICT (reg:DI 90)
            (nil))))

(insn 56 53 58 (set (reg:DI 96)
        (reg:DI 96)) -1 (nil)
    (insn_list:REG_RETVAL 54 (expr_list:REG_EQUAL (and:DI (reg:DI 88)
                (reg:DI 90))
            (nil))))

(insn 58 56 0 (set (reg/i:DI 3 r3)
        (reg:DI 96)) -1 (nil)
    (nil))

I did play with other solutions involving WORDS_BIG_ENDIAN, but I couldn't get
them to work. What happens now is that operand_subword returns (subreg:SI
(reg/i:DI 3 r3) 0)/(subreg:SI (reg/i:DI 3 r3) 1) instead of the wrong (reg:SI 3
r3)/(reg:SI 4 r4).

A backtrace:

Breakpoint 2, operand_subword (op=0x1034b0b8, i=0, validate_address=1,
mode=DImode)     at ../../../gcc295/gcc/emit-rtl.c:1260
1260            return gen_rtx_SUBREG (word_mode, op, i);
(gdb) p debug_rtx(op)

(reg/i:DI 3 r3)
$19 = void
(gdb) l
1255          if (REGNO (op) < FIRST_PSEUDO_REGISTER
1256              && (! HARD_REGNO_MODE_OK (REGNO (op), word_mode)
1257                  || ! HARD_REGNO_MODE_OK (REGNO (op) + i, word_mode)))
1258            return 0;
1259          else
1260            return gen_rtx_SUBREG (word_mode, op, i);
1261        }
1262      else if (GET_CODE (op) == SUBREG)
1263        return gen_rtx_SUBREG (word_mode, SUBREG_REG (op), i + SUBREG_WORD (op));
1264      else if (GET_CODE (op) == CONCAT)
(gdb) bt
#0  operand_subword (op=0x1034b0b8, i=0, validate_address=1, mode=DImode) at ../../../gcc295/gcc/emit-rtl.c:1260
#1  0x100c4500 in operand_subword_force (op=0x1034b0b8, i=0, mode=DImode) at ../../../gcc295/gcc/emit-rtl.c:1525
#2  0x1009da84 in expand_binop (mode=DImode, binoptab=0x10355d10, op0=0x10348de8, op1=0x1034b0b8, target=0x10349160,
    unsignedp=0, methods=OPTAB_LIB_WIDEN) at ../../../gcc295/gcc/optabs.c:1002
#3  0x10096114 in expand_and (op0=0x1034b0b8, op1=0x10348de8, target=0x10348de8) at ../../../gcc295/gcc/expmed.c:4051
#4  0x101229c8 in jump_optimize_1 (f=0x1034ae78, cross_jump=0, noop_moves=0, after_regscan=1, mark_labels_only=0)
    at ../../../gcc295/gcc/jump.c:1175
#5  0x10120a0c in jump_optimize (f=0x1034ae78, cross_jump=0, noop_moves=0, after_regscan=1)
    at ../../../gcc295/gcc/jump.c:143
#6  0x10006bdc in rest_of_compilation (decl=0x1035d620) at ../../../gcc295/gcc/toplev.c:3845
#7  0x1029bc60 in finish_function (nested=0) at ../../../gcc295/gcc/c-decl.c:7269
#8  0x102841d0 in yyparse () at c-parse.y:313
#9  0x10005420 in compile_file (name=0x7ffffdab "ret64.i") at ../../../gcc295/gcc/toplev.c:3267
#10 0x1000a730 in main (argc=11, argv=0x7ffffc74) at ../../../gcc295/gcc/toplev.c:5444
#11 0xff0a7dc in Letext () at ../sysdeps/powerpc/elf/libc-start.c:106
(gdb) fin
Run till exit from #0  operand_subword (op=0x1034b0b8, i=0, validate_address=1, mode=DImode)
    at ../../../gcc295/gcc/emit-rtl.c:1260
0x100c4500 in operand_subword_force (op=0x1034b0b8, i=0, mode=DImode) at ../../../gcc295/gcc/emit-rtl.c:1525
1525      rtx result = operand_subword (op, i, 1, mode);
Value returned is $20 = 0x10349190
(gdb) p debug_rtx(0x10349190)

(subreg:SI (reg/i:DI 3 r3) 0)
$21 = void

Does anyone have an idea for another/better solution?

Franz.