It appears to be a regression in 9. Gcc-8.3 works fine. Bisection points to g:8f70fdc31a7b0099e7322d0aba94830fb08f4c88 $ gcc-trunk -v gcc version 10.0.1 20200310 (experimental) [master revision cc5c935937d:88ecfc48953:3654d49d0ff651b2a78401bc2430428711e7d2eb] (GCC) $ gcc-trunk abc.c ; ./a.out 0 0 0 0 0 0 0 0 $ gcc-trunk -O3 abc.c ; ./a.out 0 0 0 4 0 0 0 0 $ gcc-8 -O3 abc.c ; ./a.out 0 0 0 0 0 0 0 0 $ cat abc.c int printf(const char *, ...); int *a; char b, f; static char c; short d[1][8][1]; int **e = &a; short *g = &d[0][3][0]; unsigned char **h; int i; int main() { unsigned char *j = &b; int k[] = {0, 0, 0, 4, 0, 0}; if (e) { unsigned char **l = &j; c = 2; for (; c >= 0; c--) { **l = f; *g = k[c + 3]; k[c + 1] = 0; } } else { unsigned char **m = &j; for (;;) h = m; } for (; i < 8; i++) printf("%d\n", d[0][i][0]); }
Somewhat reduced testcase: unsigned char b, f; short d[1][8][1], *g = &d[0][3][0]; int main () { int k[] = { 0, 0, 0, 4, 0, 0 }; for (int c = 2; c >= 0; c--) { b = f; *g = k[c + 3]; k[c + 1] = 0; } for (int i = 0; i < 8; i++) if (d[0][i][0] != 0) __builtin_abort (); return 0; }
Seems it is the ldist pass, which fails to figure out that k[c+3] load in the loop might alias with the k[c+1] = 0; store and moves all the 3 stores into a memset after the loop.
I'll have a look. Note the bisection point is a correctness fix only possibly resulting in less optimization.
Hm, it computes a dependence distance of two and in the end sorts partitions in the wrong order from a bogus partition dependence edge. The odd thing is that for for (int c = 0; c <= 2; c++) { b = f; *g = k[c + 3]; k[c + 1] = 0; } we compute a distance of minus two (two + DDR_REVERSED_P) but in both cases we need to use the same partition ordering, memset after the partition containing the k[c+3] load but the partition dependence code from the DDRs appearant different direction handles both cases differently. Something is missing here. Not sure what - Bin, any idea?
Thanks for CCing, I will have a look this WE.
GCC 9.3.0 has been released, adjusting target milestone.
Patch at https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542038.html It's a latent bug exposed by the mentioned alias analysis change, however: unsigned char b, f; short d[1][8][1], *g = &d[0][3][0]; int main () { int k[] = { 0, 0, 0, 4, 0, 0 }; for (int c = 2; c >= 0; c--) { b = f; *g = k[c + 3]; k[c + 1] = 0; } for (int i = 0; i < 8; i++) if (d[0][i][0] != 0) __builtin_abort (); return 0; } We can't tell no-alias info for pairs <f, *g> and <b, *g>. Is this expected or should be improved? Thanks
The master branch has been updated by Bin Cheng <amker@gcc.gnu.org>: https://gcc.gnu.org/g:e4e9a59105a81cdd6c1328b0a5ed9fe4cc82840e commit r10-7184-ge4e9a59105a81cdd6c1328b0a5ed9fe4cc82840e Author: Bin Cheng <bin.cheng@linux.alibaba.com> Date: Mon Mar 16 11:09:14 2020 +0800 Update post order number for merged SCC. Function loop_distribution::break_alias_scc_partitions needs to compute SCC with runtime alias edges skipped. As a result, partitions could be re-assigned larger post order number than SCC's precedent partition and distributed before the precedent one. This fixes the issue by updating the merged partition to the minimal post order in SCC. gcc/ PR tree-optimization/94125 * tree-loop-distribution.c (loop_distribution::break_alias_scc_partitions): Update post order number for merged scc. gcc/testsuite/ PR tree-optimization/94125 * gcc.dg/tree-ssa/pr94125.c: New test.
On Sun, 15 Mar 2020, amker at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94125 > > --- Comment #7 from bin cheng <amker at gcc dot gnu.org> --- > Patch at https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542038.html > It's a latent bug exposed by the mentioned alias analysis change, however: > > > unsigned char b, f; > short d[1][8][1], *g = &d[0][3][0]; > > int > main () > { > int k[] = { 0, 0, 0, 4, 0, 0 }; > for (int c = 2; c >= 0; c--) > { > b = f; > *g = k[c + 3]; > k[c + 1] = 0; > } > for (int i = 0; i < 8; i++) > if (d[0][i][0] != 0) > __builtin_abort (); > return 0; > } > > We can't tell no-alias info for pairs <f, *g> and <b, *g>. Is this expected or > should be improved? This is expected when g is not static [const] (we should discover it as const) or when not using LTO (which should as well promote the variable to const short *). Adding a restrict qualifier to g should also solve it.
Thanks Bin, fixed on trunk sofar.
(In reply to Richard Biener from comment #10) > Thanks Bin, fixed on trunk sofar. Hmm, if it's fine, I will backport this to GCC9. Thanks
On Wed, 18 Mar 2020, amker at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94125 > > --- Comment #11 from bin cheng <amker at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #10) > > Thanks Bin, fixed on trunk sofar. > > Hmm, if it's fine, I will backport this to GCC9. I think so.
The releases/gcc-9 branch has been updated by Bin Cheng <amker@gcc.gnu.org>: https://gcc.gnu.org/g:95c969e58f7905b14d3f2889cf41595eb2c13cbb commit r9-8411-g95c969e58f7905b14d3f2889cf41595eb2c13cbb Author: Bin Cheng <bin.cheng@linux.alibaba.com> Date: Tue Mar 24 17:40:21 2020 +0800 backport PR94125: Update post order number for merged SCC. Function loop_distribution::break_alias_scc_partitions needs to compute SCC with runtime alias edges skipped. As a result, partitions could be re-assigned larger post order number than SCC's precedent partition and distributed before the precedent one. This fixes the issue by updating the merged partition to the minimal post order in SCC. Backport from mainline. PR tree-optimization/94125 * tree-loop-distribution.c (loop_distribution::break_alias_scc_partitions): Update post order number for merged scc. * gcc.dg/tree-ssa/pr94125.c: New test.
Fixed.