Bug 113206 - [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 14.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2024-01-02 19:08 UTC by Patrick O'Neill
Modified: 2024-01-04 16:44 UTC

See Also:
Host:
Target: riscv
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Patrick O'Neill 2024-01-02 19:08:33 UTC
Testcase:
signed char e;
short f = 8;
signed d;
int(g)(int o, int r) { return o & (o ^ -1) < 0 ? o : o - r; }
#pragma pack(1)
struct {
  short h;
  unsigned : 18;
  short i;
  long j;
  int k;
  char l;
  long m;
  int n;
} a, b, s, c, *p = &b, *u = &s, q = {1};
void t() {
  *p = a;
  for (; e > -7; e = g(e, 8))
    ;
  q = *u = c;
  for (; d - 3; d = 3)
    ;
}
int main() {
  t();
  if (f == 8)
    return 0;
  else
    return 1;
}

Commands:
rv64gcv:
> /scratch/tc-testing/tc-jan-1-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gcv -O3 red.c -o user-config.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-dec-22-trunk/build-rv64gcv/bin/qemu-riscv64 user-config.out
> echo $?
1

rv64gc:
> /scratch/tc-testing/tc-jan-1-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gc -O3 red.c -o user-config.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-dec-22-trunk/build-rv64gcv/bin/qemu-riscv64 user-config.out
> echo $?
0

Nothing in the testcase writes to f, so it should still be 8 after t() returns.
Comment 1 JuzheZhong 2024-01-02 23:20:36 UTC
Are you using the latest upstream GCC?

I tried it, but couldn't reproduce the issue.
Comment 2 Patrick O'Neill 2024-01-02 23:26:38 UTC
(In reply to JuzheZhong from comment #1)
> Are you using the latest upstream GCC?
> 
> I tried it, but couldn't reproduce the issue.

I tested with r14-6884-g046cea56fd1. Since it seems to be overwriting unrelated variables, we might be dealing with a situation similar to PR112929, where the failure was challenging to reproduce.

I'll rebuild with tip-of-tree and let you know if it passes/fails.
Comment 3 JuzheZhong 2024-01-03 01:15:35 UTC
OK, I can see the bug in the assembly.
It's odd that I can't reproduce the runtime failure in the simulator.

I will fix it soon.
Comment 4 GCC Commits 2024-01-04 00:33:22 UTC
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:4a0a8dc1b88408222b88e10278017189f6144602

commit r14-6902-g4a0a8dc1b88408222b88e10278017189f6144602
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Thu Jan 4 06:38:43 2024 +0800

    RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]
    
    As in PR113206 and PR113209, the bug happens in the following situation:
    
            li      a4,32
            ...
            vsetvli zero,a4,e8,m8,ta,ma
            ...
            slliw   a4,a3,24
            sraiw   a4,a4,24
            bge     a3,a1,.L8
            sb      a4,%lo(e)(a0)
            vsetvli zero,a4,e8,m8,ta,ma  --> a4 holds a polluted value, not the expected "32".
            ...
    .L7:
            j       .L7 ---> infinite loop.
    
    The root cause is that the infinite loop confuses the earliest computation and lets
    earliest fusion happen in an unexpected place.
    
    Disable earliest fusion on blocks that belong to an infinite loop to fix this bug, since
    applying earliest LCM fusion to infinite loops seems quite complicated and we don't see
    any benefit.
    
    Note that disabling earliest fusion on infinite loops doesn't hurt vsetvli performance;
    instead, it improves codegen in some cases.
    
    Tested on both RV32 and RV64 with no regressions.
    
            PR target/113206
            PR target/113209
    
    gcc/ChangeLog:
    
            * config/riscv/riscv-vsetvl.cc (invalid_opt_bb_p): New function.
            (pre_vsetvl::compute_lcm_local_properties): Disable earliest fusion on
            blocks belong to infinite loop.
            (pre_vsetvl::emit_vsetvl): Remove fake edges.
            * config/riscv/t-riscv: Add a new include file.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adapt test.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-1.c: Robustify test.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-2.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-5.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-1.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-2.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-4.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-5.c: Ditto.
            * gcc.target/riscv/rvv/autovec/pr113206-1.c: New test.
            * gcc.target/riscv/rvv/autovec/pr113206-2.c: New test.
            * gcc.target/riscv/rvv/autovec/pr113209.c: New test.
Comment 5 JuzheZhong 2024-01-04 00:47:40 UTC
PR113206 and PR113209 have the same root cause, and I have fixed both of them.

Could you test SPEC 527/549 again with the latest upstream GCC to see whether it
fixes the failures in SPEC?
Comment 6 Patrick O'Neill 2024-01-04 01:36:58 UTC
Confirmed fixed, thanks for the quick fix!
I've kicked off a SPEC run.
Comment 7 Patrick O'Neill 2024-01-04 02:32:16 UTC
527 still fails on zvl128b. I'll let the rest of SPEC run overnight and let you know the status of 549 once it finishes.
Comment 8 JuzheZhong 2024-01-04 03:46:22 UTC
It seems that we still haven't located the real cause of the SPEC failures you saw.
Do you have any other ideas for locating the real problem?

Li Pan couldn't locate the problem either.
Comment 9 Patrick O'Neill 2024-01-04 16:44:43 UTC
(In reply to JuzheZhong from comment #8)
> It seems that we still haven't located the real cause of the SPEC failures you saw.
> Do you have any other ideas for locating the real problem?
> 
> Li Pan couldn't locate the problem either.

Using 4a0a8dc1b88408222b88e10278017189f6144602, the SPEC run failed on:
zvl128b (all runtime failures):
527.cam4 (Runtime)
531.deepsjeng (Runtime)
521.wrf (Runtime)
523.xalancbmk (Runtime)

zvl256b:
507.cactuBSSN (Runtime)
521.wrf (Build)
527.cam4 (Runtime)
531.deepsjeng (Runtime)
549.fotonik3d (Runtime)

With that info I think the next steps are:
1. Triage the zvl256b 521.wrf build failure
2. Bisect the newly-failing testcases
3. Finish triaging the remaining testcases the fuzzer found
4. Attempt to manually reduce cam4 for zvl128b (since it seems to have the fastest build+runtime)
5. Attempt to manually reduce the other failures.