Summary: | [12/13/14/15 Regression] wrong code at -O3 on x86_64-linux-gnu since r12-2097-g9f34b780b0461e | ||
---|---|---|---|
Product: | gcc | Reporter: | Zhendong Su <zhendong.su> |
Component: | tree-optimization | Assignee: | Richard Biener <rguenth> |
Status: | ASSIGNED --- | ||
Severity: | normal | CC: | sjames |
Priority: | P2 | Keywords: | wrong-code |
Version: | 14.0 | ||
Target Milestone: | 12.5 | ||
See Also: | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112281 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115347 | ||
Host: | | Target: | x86_64-*-* |
Build: | | Known to work: | |
Known to fail: | | Last reconfirmed: | 2023-12-05 00:00:00 |
Description
Zhendong Su
2023-12-05 08:49:25 UTC
t.c:9:12: optimized: Loop nest 1 distributed: split to 2 loops and 0 library calls.

-fno-tree-loop-distribution fixes this.

GCC 11 distributes the inner loop, but only splits out the assignment to 'j' there, while with 12+ we distribute away the store to e[h].c. Looks somewhat familiar to the other "aggregate copy" case where we have no evolution in the inner loop. The critical dependence is

    (Data Dep:
    #(Data Ref:
    #  bb: 5
    #  stmt: *i.1_1 = e[h.7_24];
    #  ref: e[h.7_24];
    #  base_object: e;
    #  Access function 0: {h.7_20, +, 1}_1
    #)
    #(Data Ref:
    #  bb: 5
    #  stmt: e[h.7_24].c = 1;
    #  ref: e[h.7_24].c;
    #  base_object: e;
    #  Access function 0: 32
    #  Access function 1: {h.7_20, +, 1}_1
    #)
      access_fn_A: {h.7_20, +, 1}_1
      access_fn_B: {h.7_20, +, 1}_1

     (subscript
      iterations_that_access_an_element_twice_in_A: [0]
      last_conflict: scev_not_known
      iterations_that_access_an_element_twice_in_B: [0]
      last_conflict: scev_not_known
      (Subscript distance: 0 ))
      loop nest: (1 2 )
      distance_vector: 0 0
      direction_vector: = =
    )

which is then running into

    /* If the overlap is exact preserve stmt order. */
    else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0),
                                  DDR_NB_LOOPS (ddr)))
      ;

so maybe that special-casing is indeed incorrect, and the special-casing I added

    /* When then dependence distance of the innermost common
       loop of the DRs is zero we have a conflict. */
    auto l1 = gimple_bb (DR_STMT (dr1))->loop_father;
    auto l2 = gimple_bb (DR_STMT (dr2))->loop_father;
    int idx = index_in_loop_nest (find_common_loop (l1, l2)->num,
                                  DDR_LOOP_NEST (ddr));
    if (DDR_DIST_VECT (ddr, 0)[idx] == 0)
      this_dir = 2;

should instead be handled to somehow handle more generally the situation that two refs conflict in a loop where the refs do not evolve.

Thanks for these testcases btw.

> Thanks for these testcases btw.
Happy to be of help, Richard.
And thanks to you folks for the incredible work and dedication!
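
For readers following the analysis, here is a minimal sketch of why a zero dependence distance does not make it safe to split two statements into separate loops when neither reference evolves in the inner loop. This is not the testcase from this report. The names e, h and c follow the dump above, while the inner index k, the local `copy`, the loop bounds and both function names are hypothetical, and the second function is a hand-written stand-in for what an unsafe distribution of the inner loop would compute.

```c
/* Hypothetical illustration only; not the reporter's testcase. */
struct S { int a; int c; };

/* Original statement order: each inner iteration first copies e[h],
   then stores to e[h].c.  Neither reference depends on k, so the copy
   in iteration k+1 reads the store made in iteration k. */
int fused(void)
{
  struct S e[2] = {{0, 0}, {0, 0}};
  struct S copy = {0, 0};
  for (int h = 0; h < 2; h++)
    for (int k = 0; k < 2; k++)
      {
        copy = e[h];   /* aggregate read of e[h] */
        e[h].c = 1;    /* store into the same element */
      }
  return copy.c;
}

/* Inner loop split by hand into two loops: all copies run before any
   store, so the copy never observes the store for the current h. */
int distributed(void)
{
  struct S e[2] = {{0, 0}, {0, 0}};
  struct S copy = {0, 0};
  for (int h = 0; h < 2; h++)
    {
      for (int k = 0; k < 2; k++)
        copy = e[h];
      for (int k = 0; k < 2; k++)
        e[h].c = 1;
    }
  return copy.c;
}
```

With these assumed bounds, fused() returns 1 and distributed() returns 0, so the two forms are not equivalent even though the subscript distance between the two references is 0 in both loops of the nest.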
bisected to r12-2097-g9f34b780b0461e

GCC 12.4 is being released, retargeting bugs to GCC 12.5.