Created attachment 34403 [details] 1.c 2.c 1.c.028t.esra 2.c.028t.esra 1.c.030t.fre1 2.c.030t.fre1 1.s 2.s compile_gcc.sh Files 1.c and 2.c in attach are equivalent from C/C++ standpoint: $ diff 1.c 2.c 3c3,5 < return *(*q = ++*p); --- > ++*p; > *q = *p; > return **p; Compiled with gcc-5.0.0, disassembled with objdump (GNU Binutils) 2.24 (see full script compile_gcc.sh in attach). For 1.c gcc generates better code than for 2.c (compare 1.s vs 2.s in attach). I looked at GIMPLE optimization dumps (-fdump-tree-all) and found that up to *.029t.ealias pass dumps differ insignificantly (attached *.028t.esra dumps as they are shorter), but .030t.fre1 pass fails to reduce intermediate variable '_9' in the second case (attached *.030t.fre dumps). Looks like a bug in full redundancy elimination.
The sources are different in that 2.c dereferences *p one more time after *q is stored to while 1.c dereferences *q for the return value. Thus an equivalent 2.c would be int f (int ** p, int ** q) { ++*p; *q = *p; return **q; } it is true that we can still optimize the original 2.c but only because *p and *q are equivalent (but it is probably not worthwhile the compile-time cost handling this). That is, we have to assume that p == q and thus the store to *q invalidates the previously load *p. I also think we have a duplicate report for this somewhere. Simpler testcase (which is probably that in the duplicate): int foo (int *p, int *q) { *p = 1; *q = 1; return *p; }
> we have to assume that p == q and thus the store to *q invalidates the previously load *p I see. It seemed to me from GIMPLE dumps that both cases are equally easy to optimize, perhaps I'm missing something.
On Wed, 14 Jan 2015, skvadrik at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64541 > > --- Comment #2 from Ulya <skvadrik at gmail dot com> --- > > we have to assume that p == q and thus the store to *q invalidates the previously load *p > > I see. > It seemed to me from GIMPLE dumps that both cases are equally easy to optimize, > perhaps I'm missing something. 1: _6 = *p_2(D); *q_7(D) = _6; _9 = *q_7(D); _10 = *_9; it's easy to see that the load _9 = *q_7(D) results in _6. 2: *p_2(D) = _4; _6 = *p_2(D); *q_7(D) = _6; _9 = *p_2(D); _10 = *_9; not so much here for the load _9 = *p_2(D) as I explained (the store *q_7(D) = _6 aliases it)
Ah! Now I see the problem, thanks.