Bug 40987 - ifcvt sometimes goes wrong for integer modes > 64bit
Summary: ifcvt sometimes goes wrong for integer modes > 64bit
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.3.4
: P3 normal
Target Milestone: 4.9.0
Assignee: Not yet assigned to anyone
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch, wrong-code
Depends on:
Blocks:
 
Reported: 2009-08-06 14:06 UTC by Thomas Philipp
Modified: 2023-06-02 01:02 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2009-09-26 10:18:29


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Philipp 2009-08-06 14:06:25 UTC
The following test fails when I disable all optimizations but if-conversion:

#include <stdio.h>

long long func(long arg) {
   long long val = 0;
   if (arg < 0) {
      val = 0xffffffff80000000ull;
   }
   return val;
}

int main() {
   long long result = func(-1);
   if (result != 0xffffffff80000000ull) {
      printf("failed.\n");
      return 1;
   } 
   printf("passed.\n");
   return 0;
}

The return value of func is 0x80000000ull where it should be 0xffffffff80000000ull. It works correctly when I don't do any optimization or with more optimization (-O1, -O2, -O3).

The compile command line is the following:
gcc test.c -O1 -fno-defer-pop -fno-delayed-branch -fno-guess-branch-probability -fno-cprop-registers -fno-loop-optimize -fno-if-conversion2 -fno-tree-ccp -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse -fno-tree-ter 
-fno-tree-lrs -fno-tree-sra -fno-tree-copyrename -fno-tree-fre -fno-tree-ch -fno-unit-at-a-time -fno-merge-constants -fno-omit-frame-pointer -o test

there are no compiler warnings, even with -Wall

I use gcc 4.3.4, configure options:
 ../gcc-4.3.4/configure --disable-shared --disable-nls --enable-languages=c++ --with-gnu-as --with-gnu-ld --enable-__cxa_atexit
Comment 1 Richard Biener 2009-08-06 14:57:57 UTC
"Doctor it hurts when I do this!" ...

well, I suggest you don't use this strange combination of options.  In particular
disabling ccp is known to cause interesting side-effects sometimes.

Keeping open in case someone wants to track down what happens here.
Comment 2 Thomas Philipp 2009-08-06 15:26:04 UTC
I can also build it like this and it fails:
gcc test.c -O1 -fno-tree-dce -o test
Comment 3 Mikael Pettersson 2009-08-06 15:58:52 UTC
Reproduced on i686-linux with the following compiler versions and minimal sets of options:

gcc-4.3-20090802: -O1 -fno-tree-dce
gcc-4.4-20090804: -O1 -fno-tree-ccp -fno-tree-dominator-opts
gcc-4.5-20090730: -O1

Note that 4.5 needed no -fno-... options to create the failure.
Comment 4 Thomas Philipp 2009-08-06 19:25:09 UTC
By the way, I see the same failure also with gcc 4.1.2 and 4.2 without any fno- options.

I thought it would be a good idea to hunt the problem down to a single optimization mechanism, which seems to be if-conversion. I guess on 4.3 and 4.4 some of the tree optimizations change the code in a way that the problem is not visible - but it is most likely there anyway and might be triggered by other test cases.
Comment 5 Steven Bosscher 2009-08-06 22:12:33 UTC
Instead of guessing -- why not inspect the dumps you get with -fdump-tree-all -fdump-rtl-all and see if someone can figure out which pass is the first to screw up the test case?
Comment 6 Steven Bosscher 2009-08-06 22:13:05 UTC
If it turns out to be the RTL if-conversion pass, assign this bug to me please.
Comment 7 Thomas Philipp 2009-08-07 07:37:52 UTC
This is how function func looks after the if conversion (test.c.144r.ce1):

(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 2 4 3 2 test.c:3 (set (reg/v:SI 63 [ arg ])
        (mem/c/i:SI (reg/f:SI 16 argp) [0 arg+0 S4 A32])) 41 {*movsi_1} (expr_list:REG_EQUIV (mem/c/i:SI (reg/f:SI 16 argp) [0 arg+0 S4 A32])
        (nil)))

(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)

(insn 7 3 34 2 test.c:5 (set (reg:CCGOC 17 flags)
        (compare:CCGOC (reg/v:SI 63 [ arg ])
            (const_int 0 [0x0]))) 0 {*cmpsi_ccno_1} (nil))

(insn 34 7 35 2 test.c:5 (parallel [
            (set (reg:DI 65)
                (sign_extend:DI (reg/v:SI 63 [ arg ])))
            (clobber (reg:CC 17 flags))
            (clobber (scratch:SI))
        ]) 87 {*extendsidi2_1} (nil))

(insn 35 34 36 2 test.c:5 (parallel [
            (set (reg/v:DI 60 [ val ])
                (lshiftrt:DI (reg:DI 65)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) 356 {*lshrdi3_1} (nil))

(insn 36 35 24 2 test.c:5 (parallel [
            (set (reg/v:DI 60 [ val ])
                (ashift:DI (reg/v:DI 60 [ val ])
                    (const_int 31 [0x1f])))
            (clobber (reg:CC 17 flags))
        ]) 319 {*ashldi3_1} (nil))

(insn 24 36 30 2 test.c:9 (set (reg/i:DI 0 ax)
        (reg/v:DI 60 [ val ])) 63 {*movdi_2} (nil))

(insn 30 24 0 2 test.c:9 (use (reg/i:DI 0 ax)) -1 (nil))

This already looks wrong to me, but I am not completely sure. Could somebody have a look who understands this RTL code better than me?

The last version before the if conversion looks like this (test.c.141r.cse1):
(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 2 4 3 2 test.c:3 (set (reg/v:SI 63 [ arg ])
        (mem/c/i:SI (reg/f:SI 16 argp) [0 arg+0 S4 A32])) 41 {*movsi_1} (expr_list:REG_EQUIV (mem/c/i:SI (reg/f:SI 16 argp) [0 arg+0 S4 A32])
        (nil)))

(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)

(insn 7 3 8 2 test.c:5 (set (reg:CCGOC 17 flags)
        (compare:CCGOC (reg/v:SI 63 [ arg ])
            (const_int 0 [0x0]))) 0 {*cmpsi_ccno_1} (nil))

(jump_insn 8 7 9 2 test.c:5 (set (pc)
        (if_then_else (lt (reg:CCGOC 17 flags)
                (const_int 0 [0x0]))
            (label_ref 13)
            (pc))) 401 {*jcc_1} (nil))

(note 9 8 10 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 10 9 13 3 test.c:5 (set (reg/v:DI 60 [ val ])
        (const_int 0 [0x0])) 63 {*movdi_2} (nil))

(code_label 13 10 14 4 2 "" [1 uses])

(note 14 13 16 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 16 14 17 4 test.c:6 (set (reg/v:DI 60 [ val ])
        (const_int -2147483648 [0x80000000])) 63 {*movdi_2} (nil))

(code_label 17 16 18 5 3 "" [0 uses])

(note 18 17 20 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

(insn 20 18 24 5 test.c:8 (set (reg:DI 62 [ <result> ])
        (reg/v:DI 60 [ val ])) 63 {*movdi_2} (nil))

(insn 24 20 30 5 test.c:9 (set (reg/i:DI 0 ax)
        (reg/v:DI 60 [ val ])) 63 {*movdi_2} (nil))

(insn 30 24 0 5 test.c:9 (use (reg/i:DI 0 ax)) -1 (nil))

Comment 8 Richard Biener 2009-08-07 12:10:29 UTC
Seems to be a HWI32 issue as I cannot reproduce the fail on x86_64 with -m32.
Which would indeed hint at a RTL if-conversion problem (I guess it uses
scc in a wrong way).
Comment 9 Mikael Pettersson 2009-09-19 16:57:27 UTC
Seems like an if-conversion bug, in particular noce_try_store_flag_constants() appears to break on HWI32 platforms when long long literals are involved.

In this test case, noce_t_s_f_c() is invoked with an IF where a (the false value) and b (the true value) are both DImode CONST_INTs. a is zero. b is stored as 0x80000000, which is truncated but sign-extends to the correct value.

noce_t_s_f_c() then computes a subtraction and some log2() on these truncated values, and selects the second code generation template "x = (test != 0) << 3;".
The code becomes (from the ce1 dump file):

(insn 27 8 28 2 pr40987.c:4 (parallel [
            (set (reg:DI 62)
                (sign_extend:DI (reg/v:SI 60 [ arg ])))
            (clobber (reg:CC 17 flags))
            (clobber (scratch:SI))
        ]) 90 {*extendsidi2_1} (nil))

(insn 28 27 29 2 pr40987.c:4 (parallel [
            (set (reg/v:DI 58 [ val ])
                (lshiftrt:DI (reg:DI 62)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) 394 {*lshrdi3_1} (nil))

# since this is a logical downshift, it produces 0 or 1 from the sign bit

(insn 29 28 17 2 pr40987.c:4 (parallel [
            (set (reg/v:DI 58 [ val ])
                (ashift:DI (reg/v:DI 58 [ val ])
                    (const_int 31 [0x1f])))
            (clobber (reg:CC 17 flags))
        ]) 356 {*ashldi3_1} (nil))

# which is shifted up to produce 0 or 0x80000000, the high 32 bits are lost

All gcc-4.x releases have this bug. It does not trigger in gcc-3.4.6. There it seems that the true/false values are swapped compared to 4.x, and a test near the top of noce_t_s_f_c() causes it to bail out and not perform the conversion. (The test looks the same in 4.x, but the IF-operands are swapped so it does not bail.)

I don't know if this code can be fixed to handle longer-than-HWI literals. The simplest solution might be to just detect the situation and bail out.
Comment 10 Mikael Pettersson 2009-09-24 14:38:51 UTC
Patch posted for review:
http://gcc.gnu.org/ml/gcc-patches/2009-09/msg01655.html
Comment 11 Andrew Pinski 2012-01-21 21:27:18 UTC
This bug is now hard to hit on the trunk as i?86 requires host wide word as 64bit.
Comment 12 Andrew Pinski 2021-07-19 23:50:28 UTC
(In reply to Andrew Pinski from comment #11)
> This bug is now hard to hit on the trunk as i?86 requires host wide word as
> 64bit.

HWINT is now required to be 64bit for all targets even.
So this might happen if there is some TImode but then the patch might be best thing really.
Comment 13 Andrew Pinski 2023-06-02 00:56:08 UTC
Here is a x86_64-linux-gnu testcase:
```
#include <stdio.h>

typedef __int128_t mytype;
#define value (((__int128_t)(((unsigned long long)__LONG_LONG_MAX__)+1)) | (((__int128_t)0xffffffffffffffffull)<<64))

mytype func(long arg) __attribute__((noinline,noclone));
mytype func(long arg) {
   mytype val = 0;
   if (arg < 0) {
    // __int128_t t = 0x8000000000000000ull | (((__int128_t)0xffffffffffffffff)<<64);
      val = value;
   }
   return val;
}

int main() {
   mytype result = func(-1);
   if (result != value) {
      printf("failed.\n");
      return 1;
   } 
   printf("passed.\n");
   return 0;
}
```

It failed until GCC 4.9.0
Comment 14 Andrew Pinski 2023-06-02 01:02:09 UTC
Looks like this was fixed with r0-127324-ge15eb172b0dd (aka PR 59545).
Basically a signed integer overflow inside the compiler was causing the wrong code happening.