This is the mail archive of the
mailing list for the GCC project.
Re: A question about redundant PHI expression stmt
- From: "William J. Schmidt" <wschmidt at linux dot vnet dot ibm dot com>
- To: Jiangning Liu <jiangning dot liu at arm dot com>
- Cc: gcc at gcc dot gnu dot org, wschmidt at gcc dot gnu dot org
- Date: Mon, 27 Feb 2012 09:31:33 -0600
- Subject: Re: A question about redundant PHI expression stmt
- References: <000401ccf2cb$5c44d700$14ce8500$@email@example.com>
On Fri, 2012-02-24 at 16:07 +0800, Jiangning Liu wrote:
> For the small case below, some redundant PHI expression stmts are
> generated, which eventually cause the back end to generate redundant copy
> instructions, for reasons related to IRA.
> int *l, *r, *g;
> void test_func(int n)
> int i;
> static int j;
> static int pos, direction, direction_pre;
> pos = 0;
> direction = 1;
> for ( i = 0; i < n; i++ )
> direction_pre = direction;
> for ( j = 0; j <= 400; j++ )
> if ( g[pos] == 200 )
> if ( direction == 0 )
> pos = l[pos];
> pos = r[pos];
> if ( pos == -1 )
> if ( direction_pre != direction )
> pos = 0;
> direction = !direction;
> I know PR39976 has something to do with this case, and check-in r182140
> caused a big degradation on the real benchmark, but I'm still confused
> about this issue.
> The procedure is like this,
> Loop store motion generates code below,
> <bb 6>:
> D.4084_17 = l.4_13 + D.4077_70;
> pos.5_18 = *D.4084_17;
> pos_lsm.34_103 = pos.5_18;
> goto <bb 8>;
> <bb 7>:
> D.4088_23 = r.6_19 + D.4077_70;
> pos.7_24 = *D.4088_23;
> pos_lsm.34_104 = pos.7_24;
> <bb 8>:
> # prephitmp.29_89 = PHI <pos.5_18(6), pos.7_24(7)>
> # pos_lsm.34_53 = PHI <pos_lsm.34_103(6), pos_lsm.34_104(7)>
> pos.2_25 = prephitmp.29_89;
> if (pos.2_25 == -1)
> goto <bb 9>;
> goto <bb 20>;
> And then, copy propagation transforms block 8 into
> <bb 8>:
> # prephitmp.29_89 = PHI <pos.5_18(11), pos.7_24(12)>
> # pos_lsm.34_53 = PHI <pos.5_18(11), pos.7_24(12)>
> And then, these two duplicated PHI stmts are kept until the expand pass.
> In dom2, a stmt like below
> # pos_lsm.34_54 = PHI <pos_lsm.34_53(13), 0(16)>
> is transformed into
> # pos_lsm.34_54 = PHI <prephitmp.29_89(13), 0(16)>
> just because they are equal.
> Unfortunately, this transformation changed the back-end behavior so that
> it generates redundant copy instructions. This case is from a real
> benchmark, and it hurts performance a lot.
> Is the fix in r182140 intended to totally clean up this kind of redundancy?
> Where should we get it fixed?
Hi, sorry not to have responded sooner -- I just now got some time to
look at this.
I compiled your code with -O3 for revisions 182139 and 182140 of GCC
trunk, and found no difference between the code produced by the middle
end for the two versions. So something else has apparently come along
since then that helped produce the problematic code generation for you.
Either that or I need to use different compile flags; you didn't specify
what you used.
The fix in r182140 does just what you saw in dom2: identifies duplicate
PHIs in the same block and ensures only one of them is used. This
actually avoids inserting extra blocks during expand in certain loop
cases. I am not sure why you are getting redundant copies as a result,
but it sounds from your comments like IRA didn't coalesce a register
copy or something like that. You may want to bisect revisions on the
trunk to see where your bad code generation started to occur to get a
better handle on what happened.
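To make the duplicate-PHI idea concrete, here is a minimal toy sketch in C.
It is not the code from r182140 and does not use GCC's internal data
structures; the struct toy_phi type, the toy_* function names, and the
use of plain integers as SSA names are all assumptions made for
illustration. It only shows the principle: two PHIs in the same block
that select the same argument on every incoming edge compute the same
value, so later uses can be redirected to the first one.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ARGS 4

/* Toy PHI node (hypothetical, not GCC's representation):
   result = PHI <args[0], ..., args[nargs-1]>.
   SSA names are modelled as small integers; args[i] is the value
   arriving on the block's i-th incoming edge. */
struct toy_phi {
    int result;
    int nargs;
    int args[MAX_ARGS];
};

/* Two PHIs in the same block are duplicates if they pick the same
   argument on every incoming edge. */
bool toy_phi_equal(const struct toy_phi *a, const struct toy_phi *b)
{
    assert(a->nargs <= MAX_ARGS && b->nargs <= MAX_ARGS);
    if (a->nargs != b->nargs)
        return false;
    for (int i = 0; i < a->nargs; i++)
        if (a->args[i] != b->args[i])
            return false;
    return true;
}

/* For each PHI in the block, store in leader[i] the result name of the
   first PHI with identical arguments (its own result if there is no
   earlier match).  Returns the number of duplicate PHIs found. */
int toy_find_phi_leaders(const struct toy_phi *phis, int n, int *leader)
{
    int dups = 0;
    for (int i = 0; i < n; i++) {
        leader[i] = phis[i].result;
        for (int j = 0; j < i; j++) {
            if (toy_phi_equal(&phis[j], &phis[i])) {
                leader[i] = phis[j].result;
                dups++;
                break;
            }
        }
    }
    return dups;
}
```

Modelling block 8 from your dump with prephitmp.29_89 as name 89 and
pos_lsm.34_53 as name 53, both with arguments {18, 24}, the sketch
reports 89 as the leader of both PHIs. That corresponds to the
replacement of pos_lsm.34_53 by prephitmp.29_89 that you saw in dom2.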
As Richard said, the dom pass is likely to be removed someday, whenever
someone can get around to it. My redundant-phi band-aid in dom would go
away then as well, but presumably an extra pass of PRE would replace it,
and redundant PHIs would still be removed.