Bug 64541 - FRE pass optimization failure
Summary: FRE pass optimization failure
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 5.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2015-01-08 16:06 UTC by Ulya
Modified: 2021-12-15 21:43 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
1.c 2.c 1.c.028t.esra 2.c.028t.esra 1.c.030t.fre1 2.c.030t.fre1 1.s 2.s compile_gcc.sh (1.67 KB, application/zip)
2015-01-08 16:06 UTC, Ulya
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ulya 2015-01-08 16:06:02 UTC
Created attachment 34403 [details]
1.c 2.c 1.c.028t.esra 2.c.028t.esra 1.c.030t.fre1 2.c.030t.fre1 1.s 2.s compile_gcc.sh

Files 1.c and 2.c in attach are equivalent from C/C++ standpoint:

$ diff 1.c 2.c
3c3,5
<       return *(*q = ++*p);
---
>       ++*p;
>       *q = *p;
>       return **p;

Compiled with gcc-5.0.0, disassembled with objdump (GNU Binutils) 2.24
(see full script compile_gcc.sh in attach).

For 1.c gcc generates better code than for 2.c
(compare 1.s vs 2.s in attach).

I looked at GIMPLE optimization dumps (-fdump-tree-all) and found that
up to *.029t.ealias pass dumps differ insignificantly (attached 
*.028t.esra dumps as they are shorter), but .030t.fre1 pass fails to
reduce intermediate variable '_9' in the second case (attached *.030t.fre
dumps).

Looks like a bug in full redundancy elimination.
Comment 1 Richard Biener 2015-01-14 11:34:04 UTC
The sources are different in that 2.c dereferences *p one more time after
*q is stored to while 1.c dereferences *q for the return value.  Thus an
equivalent 2.c would be

int f (int ** p, int ** q)
{
        ++*p;
        *q = *p;
        return **q;
}

it is true that we can still optimize the original 2.c but only because
*p and *q are equivalent (but it is probably not worthwhile the compile-time cost 
handling this). That is, we have to assume that p == q and thus the store to *q invalidates the previously load *p.

I also think we have a duplicate report for this somewhere.

Simpler testcase (which is probably that in the duplicate):

int foo (int *p, int *q)
{
  *p = 1;
  *q = 1;
  return *p;
}
Comment 2 Ulya 2015-01-14 11:56:43 UTC
> we have to assume that p == q and thus the store to *q invalidates the previously load *p

I see.
It seemed to me from GIMPLE dumps that both cases are equally easy to optimize, perhaps I'm missing something.
Comment 3 rguenther@suse.de 2015-01-14 12:00:02 UTC
On Wed, 14 Jan 2015, skvadrik at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64541
> 
> --- Comment #2 from Ulya <skvadrik at gmail dot com> ---
> > we have to assume that p == q and thus the store to *q invalidates the previously load *p
> 
> I see.
> It seemed to me from GIMPLE dumps that both cases are equally easy to optimize,
> perhaps I'm missing something.

1:

  _6 = *p_2(D);
  *q_7(D) = _6;
  _9 = *q_7(D);
  _10 = *_9;

it's easy to see that the load _9 = *q_7(D) results in _6.

2:

  *p_2(D) = _4;
  _6 = *p_2(D);
  *q_7(D) = _6;
  _9 = *p_2(D);
  _10 = *_9;

not so much here for the load _9 = *p_2(D) as I explained
(the store *q_7(D) = _6 aliases it)
Comment 4 Ulya 2015-01-14 12:03:00 UTC
Ah! Now I see the problem, thanks.