This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
A question about redundant PHI expression stmt
- From: "Jiangning Liu" <jiangning dot liu at arm dot com>
- To: <gcc at gcc dot gnu dot org>, <wschmidt at gcc dot gnu dot org>
- Date: Fri, 24 Feb 2012 16:07:28 +0800
- Subject: A question about redundant PHI expression stmt
Hi,
For the small case below, there are some redundant PHI expression stmt
generated, and finally cause back-end generates redundant copy instructions
due to some reasons around IRA.
int *l, *r, *g;
void test_func(int n)
{
int i;
static int j;
static int pos, direction, direction_pre;
pos = 0;
direction = 1;
for ( i = 0; i < n; i++ )
{
direction_pre = direction;
for ( j = 0; j <= 400; j++ )
{
if ( g[pos] == 200 )
break;
if ( direction == 0 )
pos = l[pos];
else
pos = r[pos];
if ( pos == -1 )
{
if ( direction_pre != direction )
break;
pos = 0;
direction = !direction;
}
}
f(g[pos]);
}
}
I know PR39976 has something to do with this case, and check-in r182140
caused big degradation on the real benchmark, but I'm still confusing around
this issue.
The procedure is like this,
Loop store motion generates code below,
<bb 6>:
D.4084_17 = l.4_13 + D.4077_70;
pos.5_18 = *D.4084_17;
pos_lsm.34_103 = pos.5_18;
goto <bb 8>;
<bb 7>:
D.4088_23 = r.6_19 + D.4077_70;
pos.7_24 = *D.4088_23;
pos_lsm.34_104 = pos.7_24;
<bb 8>:
# prephitmp.29_89 = PHI <pos.5_18(6), pos.7_24(7)>
# pos_lsm.34_53 = PHI <pos_lsm.34_103(6), pos_lsm.34_104(7)>
pos.2_25 = prephitmp.29_89;
if (pos.2_25 == -1)
goto <bb 9>;
else
goto <bb 20>;
And then, copy propagation transforms block 8 it into
<bb 8>:
# prephitmp.29_89 = PHI <pos.5_18(11), pos.7_24(12)>
# pos_lsm.34_53 = PHI <pos.5_18(11), pos.7_24(12)>
...
And then, these two duplicated PHI stmts have been kept until expand pass.
In dom2, a stmt like below
# pos_lsm.34_54 = PHI <pos_lsm.34_53(13), 0(16)>
is transformed into
# pos_lsm.34_54 = PHI <prephitmp.29_89(13), 0(16)>
just because they are equal.
Unfortunately, this transformation changed back-end behavior to generate
redundant copy instructions and hurt performance. This case is from a real
benchmark and hurt performance a lot.
Does the fix in r182140 intend to totally clean up this kind of redundancy?
Where should we get it fixed?
Thanks,
-Jiangning