This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
RE: A question about redundant PHI expression stmt
- From: "Jiangning Liu" <jiangning dot liu at arm dot com>
- To: "Jiangning Liu" <Jiangning dot Liu at arm dot com>, "'William J. Schmidt'" <wschmidt at linux dot vnet dot ibm dot com>
- Cc: <gcc at gcc dot gnu dot org>, <wschmidt at gcc dot gnu dot org>
- Date: Tue, 28 Feb 2012 16:33:24 +0800
- Subject: RE: A question about redundant PHI expression stmt
- References: <000401ccf2cb$5c44d700$14ce8500$@liu@arm.com> <1330356693.23790.43.camel@gnopaine> <000b01ccf5c7$cbcd7b00$63687100$@liu@arm.com> <000d01ccf5ef$f64f1af0$e2ed50d0$@liu@arm.com>
> -----Original Message-----
> From: Jiangning Liu
> Sent: Tuesday, February 28, 2012 4:07 PM
> To: Jiangning Liu; 'William J. Schmidt'
> Cc: gcc@gcc.gnu.org; wschmidt@gcc.gnu.org
> Subject: RE: A question about redundant PHI expression stmt
>
>
>
> > -----Original Message-----
> > From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf
> Of
> > Jiangning Liu
> > Sent: Tuesday, February 28, 2012 11:19 AM
> > To: 'William J. Schmidt'
> > Cc: gcc@gcc.gnu.org; wschmidt@gcc.gnu.org
> > Subject: RE: A question about redundant PHI expression stmt
> >
> >
> >
> > > -----Original Message-----
> > > From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On
> Behalf
> > Of
> > > William J. Schmidt
> > > Sent: Monday, February 27, 2012 11:32 PM
> > > To: Jiangning Liu
> > > Cc: gcc@gcc.gnu.org; wschmidt@gcc.gnu.org
> > > Subject: Re: A question about redundant PHI expression stmt
> > >
> > > On Fri, 2012-02-24 at 16:07 +0800, Jiangning Liu wrote:
> > > > Hi,
> > > >
> > > > For the small case below, there are some redundant PHI expression
> > > stmt
> > > > generated, and finally cause back-end generates redundant copy
> > > instructions
> > > > due to some reasons around IRA.
> > > >
> > > > int *l, *r, *g;
> > > > void test_func(int n)
> > > > {
> > > > int i;
> > > > static int j;
> > > > static int pos, direction, direction_pre;
> > > >
> > > > pos = 0;
> > > > direction = 1;
> > > >
> > > > for ( i = 0; i < n; i++ )
> > > > {
> > > > direction_pre = direction;
> > > >
> > > > for ( j = 0; j <= 400; j++ )
> > > > {
> > > >
> > > > if ( g[pos] == 200 )
> > > > break;
> > > >
> > > > if ( direction == 0 )
> > > > pos = l[pos];
> > > > else
> > > > pos = r[pos];
> > > >
> > > > if ( pos == -1 )
> > > > {
> > > > if ( direction_pre != direction )
> > > > break;
> > > >
> > > > pos = 0;
> > > > direction = !direction;
> > > > }
> > > >
> > > > }
> > > >
> > > > f(g[pos]);
> > > > }
> > > > }
> > > >
> > > > I know PR39976 has something to do with this case, and check-in
> > > r182140
> > > > caused big degradation on the real benchmark, but I'm still
> > confusing
> > > around
> > > > this issue.
> > > >
> > > > The procedure is like this,
> > > >
> > > > Loop store motion generates code below,
> > > >
> > > > <bb 6>:
> > > > D.4084_17 = l.4_13 + D.4077_70;
> > > > pos.5_18 = *D.4084_17;
> > > > pos_lsm.34_103 = pos.5_18;
> > > > goto <bb 8>;
> > > >
> > > > <bb 7>:
> > > > D.4088_23 = r.6_19 + D.4077_70;
> > > > pos.7_24 = *D.4088_23;
> > > > pos_lsm.34_104 = pos.7_24;
> > > >
> > > > <bb 8>:
> > > > # prephitmp.29_89 = PHI <pos.5_18(6), pos.7_24(7)>
> > > > # pos_lsm.34_53 = PHI <pos_lsm.34_103(6), pos_lsm.34_104(7)>
> > > > pos.2_25 = prephitmp.29_89;
> > > > if (pos.2_25 == -1)
> > > > goto <bb 9>;
> > > > else
> > > > goto <bb 20>;
> > > >
> > > > And then, copy propagation transforms block 8 it into
> > > >
> > > > <bb 8>:
> > > > # prephitmp.29_89 = PHI <pos.5_18(11), pos.7_24(12)>
> > > > # pos_lsm.34_53 = PHI <pos.5_18(11), pos.7_24(12)>
> > > > ...
> > > >
> > > > And then, these two duplicated PHI stmts have been kept until
> > expand
> > > pass.
> > > >
> > > > In dom2, a stmt like below
> > > >
> > > > # pos_lsm.34_54 = PHI <pos_lsm.34_53(13), 0(16)>
> > > >
> > > > is transformed into
> > > >
> > > > # pos_lsm.34_54 = PHI <prephitmp.29_89(13), 0(16)>
> > > >
> > > > just because they are equal.
> > > >
> > > > Unfortunately, this transformation changed back-end behavior to
> > > generate
> > > > redundant copy instructions and hurt performance. This case is
> from
> > a
> > > real
> > > > benchmark and hurt performance a lot.
> > > >
> > > > Does the fix in r182140 intend to totally clean up this kind of
> > > redundancy?
> > > > Where should we get it fixed?
> > > >
> > >
> > > Hi, sorry not to have responded sooner -- I just now got some time
> to
> > > look at this.
> > >
> > > I compiled your code with -O3 for revisions 182139 and 182140 of
> GCC
> > > trunk, and found no difference between the code produced by the
> > middle
> > > end for the two versions. So something else has apparently come
> > along
> > > since then that helped produce the problematic code generation for
> > you.
> > > Either that or I need to use different compile flags; you didn't
> > > specify
> > > what you used.
> > >
> > > The fix in r182140 does just what you saw in dom2: identifies
> > > duplicate
> > > PHIs in the same block and ensures only one of them is used. This
> > > actually avoids inserting extra blocks during expand in certain
> loop
> > > cases. I am not sure why you are getting redundant copies as a
> > result,
> > > but it sounds from your comments like IRA didn't coalesce a
> register
> > > copy or something like that. You may want to bisect revisions on
> the
> > > trunk to see where your bad code generation started to occur to get
> a
> > > better handle on what happened.
> > >
> > > As Richard said, the dom pass is likely to be removed someday,
> > whenever
> > > someone can get around to it. My redundant-phi band-aid in dom
> would
> > > go
> > > away then as well, but presumably an extra pass of PRE would
> replace
> > it,
> > > and redundant PHIs would still be removed.
> > >
> >
> > Bill,
> >
> > Thanks a lot for your explanation!
> >
> > The bug could be exposed with -O2 if you apply the check-in r184435
> > made by Richard.
> >
> > BTW, at present is there anybody working on the NEW pass replacing
> dom?
> > If no, in short term, it would be good to still get it fixed just
> > because it is hurting benchmark. If Yes, may I know what that NEW
> pass
> > will be focusing on?
> >
>
> With dump enabled, I do see a PHI copy is identified by
> tree_ssa_dominator_optimize as below,
>
> Optimizing block #12
>
> LKUP STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
> prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
> 2>>> STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
> prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
> LKUP STMT pos_lsm.34_53 = PHI <pos.5_18, pos.7_24>
> pos_lsm.34_53 = PHI <pos.5_18(10), pos.7_24(11)>
> FIND: prephitmp.29_89
> 0>>> COPY pos_lsm.34_53 = prephitmp.29_89
> <<<< STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
> prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
> Optimizing statement if (prephitmp.29_89 == -1)
> LKUP STMT prephitmp.29_89 eq_expr -1
> if (prephitmp.29_89 == -1)
>
> So does it mean phicprop (eliminate_degenerate_phis) needs to be
> improved as well?
>
One more clue I find is, after dom2, prephitmp.29_89 isn't really propagated and pos_lsm.34_53 is still being used by block 15 as below, although block 15 is dominated by block 12.
<bb 15>:
# direction_lsm.33_121 = PHI <direction_lsm.33_51(14)>
# j_lsm.32_122 = PHI <j_lsm.32_52(14)>
# pos_lsm.34_123 = PHI <pos_lsm.34_53(14)>
Is this the root cause that redundant PHI stmt "pos_lsm.34_53 = PHI <pos.5_18(10), pos.7_24(11)>" can't be removed?
> > Thanks,
> > -Jiangning
> >
> > > Thanks,
> > > Bill
> > >
> > > > Thanks,
> > > > -Jiangning
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
>
>