Testcase:
signed char e;
short f = 8;
signed d;
int(g)(int o, int r) { return o & (o ^ -1) < 0 ? o : o - r; }
#pragma pack(1)
struct {
  short h;
  unsigned : 18;
  short i;
  long j;
  int k;
  char l;
  long m;
  int n;
} a, b, s, c, *p = &b, *u = &s, q = {1};
void t() {
  *p = a;
  for (; e > -7; e = g(e, 8))
    ;
  q = *u = c;
  for (; d - 3; d = 3)
    ;
}
int main() {
  t();
  if (f == 8)
    return 0;
  else
    return 1;
}

Commands:
rv64gcv:
> /scratch/tc-testing/tc-jan-1-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gcv -O3 red.c -o user-config.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-dec-22-trunk/build-rv64gcv/bin/qemu-riscv64 user-config.out
> echo $?
1

rv64gc:
> /scratch/tc-testing/tc-jan-1-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gc -O3 red.c -o user-config.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-dec-22-trunk/build-rv64gcv/bin/qemu-riscv64 user-config.out
> echo $?
0

Nothing touches f, so it should still be 8 after the function.
Are you using the latest upstream GCC? I tried it, but couldn't reproduce the issue.
(In reply to JuzheZhong from comment #1)
> Are you using the latest upstream GCC?
>
> I tried it, but couldn't reproduce the issue.

I tested with r14-6884-g046cea56fd1. Since it seems to be overwriting unrelated variables, we might be dealing with a situation similar to PR112929, which was challenging to reproduce. I'll rebuild with tip-of-tree and let you know whether it passes or fails.
OK, I can see the bug in the assembly. It's odd that I can't reproduce the runtime FAIL in the simulator. I will fix it soon.
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:4a0a8dc1b88408222b88e10278017189f6144602

commit r14-6902-g4a0a8dc1b88408222b88e10278017189f6144602
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Thu Jan 4 06:38:43 2024 +0800

    RISC-V: Fix bug of earliest fusion for infinite loop[VSETVL PASS]

    As PR113206 and PR113209, the bugs happens on the following situation:

            li      a4,32
            ...
            vsetvli zero,a4,e8,m8,ta,ma
            ...
            slliw   a4,a3,24
            sraiw   a4,a4,24
            bge     a3,a1,.L8
            sb      a4,%lo(e)(a0)
            vsetvli zero,a4,e8,m8,ta,ma  --> a4 is polluted value not the expected "32".
            ...
    .L7:
            j       .L7 ---> infinite loop.

    The root cause is that infinite loop confuse earliest computation and let earliest fusion
    happens on unexpected place.

    Disable blocks that belong to infinite loop to fix this bug since applying ealiest LCM fusion
    on infinite loop seems quite complicated and we don't see any benefits.

    Note that disabling earliest fusion on infinite loops doesn't hurt the vsetvli performance,
    instead, it does improve codegen of some cases.

    Tested on both RV32 and RV64 no regression.

            PR target/113206
            PR target/113209

    gcc/ChangeLog:

            * config/riscv/riscv-vsetvl.cc (invalid_opt_bb_p): New function.
            (pre_vsetvl::compute_lcm_local_properties): Disable earliest fusion on blocks belong to infinite loop.
            (pre_vsetvl::emit_vsetvl): Remove fake edges.
            * config/riscv/t-riscv: Add a new include file.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adapt test.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-1.c: Robostify test.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-2.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_call-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-5.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-1.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-2.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-3.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-4.c: Ditto.
            * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-5.c: Ditto.
            * gcc.target/riscv/rvv/autovec/pr113206-1.c: New test.
            * gcc.target/riscv/rvv/autovec/pr113206-2.c: New test.
            * gcc.target/riscv/rvv/autovec/pr113209.c: New test.
PR113206 and PR113209 share the same root cause, and I have fixed both of them. Could you rerun SPEC 527/549 with the latest upstream GCC to see whether it fixes the SPEC failures?
Confirmed fixed, thanks for the quick fix! I've kicked off a spec run.
527 still fails on zvl128. I'll let the rest of spec run overnight and let you know the status of 549 once it finishes.
It seems that we still haven't located the real cause of the SPEC failures you saw. Do you have any other ideas for locating it? Li Pan hasn't been able to locate the problem either.
(In reply to JuzheZhong from comment #8)
> It seems that we still haven't located the real cause of the SPEC failures you saw.
> Do you have any other ideas for locating it?
>
> Li Pan hasn't been able to locate the problem either.

Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on:

zvl128b (All runtime fails):
527.cam4 (Runtime)
531.deepsjeng (Runtime)
521.wrf (Runtime)
523.xalancbmk (Runtime)

zvl256b:
507.cactuBSSN (Runtime)
521.wrf (Build)
527.cam4 (Runtime)
531.deepsjeng (Runtime)
549.fotonik3d (Runtime)

With that info I think the next steps are:
1. Triage the zvl256b 521.wrf build failure
2. Bisect the newly-failing testcases
3. Finish triaging the remaining testcases the fuzzer found
4. Attempt to manually reduce cam4 for zvl128b (since it seems to have the fastest build+runtime)
5. Attempt to manually reduce other fails.