This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Do we need to do a loop invariant motion after loop interchange ?
On Fri, Nov 22, 2019 at 3:19 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On November 22, 2019 6:51:38 AM GMT+01:00, Li Jia He <helijia@linux.ibm.com> wrote:
> >
> >
> >On 2019/11/21 8:10 PM, Richard Biener wrote:
> >> On Thu, Nov 21, 2019 at 10:22 AM Li Jia He <helijia@linux.ibm.com>
> >wrote:
> >>>
> >>> Hi,
> >>>
> >>> I found for the follow code:
> >>>
> >>> #define N 256
> >>> int a[N][N][N], b[N][N][N];
> >>> int d[N][N], c[N][N];
> >>> void __attribute__((noinline))
> >>> double_reduc (int n)
> >>> {
> >>> for (int k = 0; k < n; k++)
> >>> {
> >>> for (int l = 0; l < n; l++)
> >>> {
> >>> c[k][l] = 0;
> >>> for (int m = 0; m < n; m++)
> >>> c[k][l] += a[k][m][l] * d[k][m] + b[k][m][l] * d[k][m];
> >>> }
> >>> }
> >>> }
> >>>
> >>> I dumped the file after loop interchange and got the following
> >information:
> >>>
> >>> <bb 3> [local count: 118111600]:
> >>> # m_46 = PHI <0(7), m_45(11)>
> >>> # ivtmp_44 = PHI <_42(7), ivtmp_43(11)>
> >>> _39 = _49 + 1;
> >>>
> >>> <bb 4> [local count: 955630224]:
> >>> # l_48 = PHI <0(3), l_47(12)>
> >>> # ivtmp_41 = PHI <_39(3), ivtmp_40(12)>
> >>> c_I_I_lsm.5_18 = c[k_28][l_48];
> >>> c_I_I_lsm.5_53 = m_46 != 0 ? c_I_I_lsm.5_18 : 0;
> >>> _2 = a[k_28][m_46][l_48];
> >>> _3 = d[k_28][m_46];
> >>> _4 = _2 * _3;
> >>> _5 = b[k_28][m_46][l_48];
> >>> _6 = _3 * _5;
> >>> _7 = _4 + _6;
> >>> _8 = _7 + c_I_I_lsm.5_53;
> >>> c[k_28][l_48] = _8;
> >>> l_47 = l_48 + 1;
> >>> ivtmp_40 = ivtmp_41 - 1;
> >>> if (ivtmp_40 != 0)
> >>> goto <bb 12>; [89.00%]
> >>> else
> >>> goto <bb 5>; [11.00%]
> >>>
> >>> we can see '_3 = d[k_28][m_46];' is a loop invariant.
> >>> Do we need to add a loop invariant motion pass after the loop
> >interchange?
> >>
> >> There is one at the end of the loop pipeline.
> >
> >Hi,
> >
> >The one at the end of the loop pipeline may miss some optimization
> >opportunities. If we vectorize the above code (a.c.158t.vect), we
> >can get information similar to the following:
> >
> >bb 3:
> > # m_46 = PHI <0(7), m_45(11)> // loop m, outer loop
> > if (_59 <= 2)
> > goto bb 20;
> > else
> > goto bb 15;
> >
> >bb 15:
> > _89 = d[k_28][m_46];
> > vect_cst__90 = {_89, _89, _89, _89};
> >
> >bb 4:
> > # l_48 = PHI <l_47(12), 0(15)> // loop l, inner loop
> > vect__6.23_100 = vect_cst__99 * vect__5.22_98;
> > if (ivtmp_110 < bnd.8_1)
> > goto bb 12;
> > else
> > goto bb 17;
> >
> >bb 20:
> >bb 18:
> > _27 = d[k_28][m_46];
> >if (ivtmp_12 != 0)
> > goto bb 19;
> > else
> > goto bb 21;
> >
> >Vectorization will do some conversions in this case. We can see
> >‘ _89 = d[k_28][m_46];’ and ‘_27 = d[k_28][m_46];’ are loop invariant
> >relative to loop l. We can move ‘d[k_28][m_46]’ to the front of
> >‘if (_59 <= 2)’ to get rid of loading data from memory in both
> >branches.
> >
> >The one at at the end of the loop pipeline can't handle this situation.
> >If we move d[k_28][m_46] from loop l to loop m before doing
> >vectorization, we can get rid of this situation.
>
> But we can't run every pass after every other. With multiple passes having ordering issues is inevitable.
>
> Now - interchange could trigger a region based invariant motion just for the nest it interchanged. But that doesn't exist right now.
With data reference/dependence information in the pass, I think it
could be quite straightforward. Didn't realize that we need it
before.
Thanks,
bin
>
> Richard.