Bug 115631 - [15 Regression] GCN: [-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c -O2 execution test
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 15.0
Importance: P3 normal
Target Milestone: 15.0
Assignee: Not yet assigned to anyone
URL:
Keywords: testsuite-fail, wrong-code
Depends on:
Blocks:
 
Reported: 2024-06-25 07:52 UTC by Thomas Schwinge
Modified: 2024-06-25 22:55 UTC
CC List: 2 users

See Also:
Host:
Target: GCN
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
A patch for a bug seen on arm*-*-* (732 bytes, patch)
2024-06-25 14:10 UTC, Richard Sandiford

Description Thomas Schwinge 2024-06-25 07:52:11 UTC
With commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7 "Add a late-combine pass [PR106594]", GCN target testing (tested with '-march=gfx908') regresses for both C and C++:

    @@ -191300,7 +191300,7 @@ PASS: c-c++-common/torture/builtin-arith-overflow-6.c   -O0  (test for excess er
    PASS: c-c++-common/torture/builtin-arith-overflow-6.c   -O0  execution test
    UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -O1
    PASS: c-c++-common/torture/builtin-arith-overflow-6.c   -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c   -O2  execution test
    UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -O3 -g
    UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -Os

    spawn -ignore SIGHUP [...]/build-gcc/gcc/gcn-run ./builtin-arith-overflow-6.exe
    GCN Kernel Aborted
    Kernel aborted
    FAIL: c-c++-common/torture/builtin-arith-overflow-6.c   -O2  execution test

With '-fno-late-combine-instructions', it's back to PASS.

The diff between good ('-fno-late-combine-instructions') and bad ('-flate-combine-instructions') is large, both for 'builtin-arith-overflow-6.s' and for the '-fdump-rtl-all' dumps, so I'm not able to directly pinpoint one specific issue.

I do, however, observe a number of instances like the following (good vs. bad):

    [...]
            s_mov_b64       exec, -1
    [...]
    -       s_mov_b32       s12, 0
    -       v_writelane_b32 v0, s12, 0
            s_mov_b64       exec, 1
    +       v_mov_b32       v0, 0
            flat_store_dword        v[18:19], v0
    [...]

Might that "move across 'exec'" be in error?
Comment 1 Andrew Stubbs 2024-06-25 08:04:36 UTC
Previously, it wrote 0 to s12 (a scalar register) and then moved that zero into lane zero of v0 (a vector register).

Now it writes the 0 directly to v0, with all lanes except lane zero masked off.

These should be identical (unless s12 was also live).

The problem must be elsewhere.
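
The equivalence argued above can be checked with a toy lane model. This is only a sketch, not GCN semantics taken from the ISA manual: it assumes a 64-lane wavefront, that 'v_writelane_b32' updates exactly one lane, and that 'v_mov_b32' and 'flat_store_dword' only touch lanes enabled in 'exec'. All Python names are hypothetical.

```python
LANES = 64  # wavefront width assumed for this toy model


def v_writelane(vreg, scalar, lane):
    """Copy one scalar value into a single lane; other lanes are untouched."""
    out = list(vreg)
    out[lane] = scalar
    return out


def v_mov_imm(vreg, imm, exec_mask):
    """Write an immediate into every lane whose bit is set in exec_mask."""
    return [imm if (exec_mask >> i) & 1 else old for i, old in enumerate(vreg)]


def flat_store(vreg, exec_mask):
    """Return {lane: value} for the lanes that would actually be stored."""
    return {i: v for i, v in enumerate(vreg) if (exec_mask >> i) & 1}


def good_sequence(v0):
    # s_mov_b32 s12, 0 ; v_writelane_b32 v0, s12, 0 ; s_mov_b64 exec, 1 ; store
    s12 = 0
    v0 = v_writelane(v0, s12, 0)
    exec_mask = 1
    return v0, flat_store(v0, exec_mask)


def bad_sequence(v0):
    # s_mov_b64 exec, 1 ; v_mov_b32 v0, 0 ; store
    exec_mask = 1
    v0 = v_mov_imm(v0, 0, exec_mask)
    return v0, flat_store(v0, exec_mask)


# Arbitrary, distinct initial lane contents so accidental clobbers would show.
initial = list(range(100, 100 + LANES))
same = good_sequence(initial) == bad_sequence(initial)
```

Under these assumptions both sequences leave v0 in the same state and store the same value from lane zero. The model does not track s12 itself, which is exactly the "unless s12 was also live" caveat.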
Comment 2 Richard Sandiford 2024-06-25 08:23:30 UTC
I suppose for issues like this, it would be useful to have a debug counter to bisect on.  I'll post a patch for doing that today, but I'm afraid I'll be relying on someone with gcn access to actually do the bisection.
Comment 3 Richard Sandiford 2024-06-25 12:01:26 UTC
I've now pushed a debug counter for late_combine.  Sorry to ask, but could you bisect on N in -fdbg-cnt=late_combine:N to see which transformation is causing the problem?
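
A minimal sketch of the bisection requested above, in case it helps. It assumes the test passes for N = 0, fails for some upper bound, and that failure is monotone in N (not guaranteed in practice). The 'passes' callback is hypothetical: in real use it would rebuild the test with '-O2 -fdbg-cnt=late_combine:N', run it under gcn-run, and report whether it passed.

```python
def first_bad_count(passes, upper):
    """Return the smallest N in [1, upper] for which passes(N) is False.

    Assumes passes(0) is True, passes(upper) is False, and that once the
    test fails for some N it keeps failing for every larger N.
    """
    lo, hi = 0, upper  # invariant: passes(lo) is True, passes(hi) is False
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(mid):
            lo = mid
        else:
            hi = mid
    return hi


# Stand-in probe for demonstration only: pretend the 37th late-combine
# transformation is the one that breaks the test.
def fake_probe(n):
    return n < 37


culprit = first_bad_count(fake_probe, 1000)
```

The returned N identifies the first transformation whose inclusion flips the result; comparing the dumps for N-1 and N should then isolate it.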
Comment 4 Richard Sandiford 2024-06-25 14:10:50 UTC
Created attachment 58513
A patch for a bug seen on arm*-*-*

Also, could you check whether the attached patch makes any difference?  It fixes a problem seen on arm*-*-*, and I notice GCN also defines cannot_copy_insn_p.