Do we need to do a loop invariant motion after loop interchange ?
Li Jia He
helijia@linux.ibm.com
Fri Nov 22 05:51:00 GMT 2019
On 2019/11/21 8:10 PM, Richard Biener wrote:
> On Thu, Nov 21, 2019 at 10:22 AM Li Jia He <helijia@linux.ibm.com> wrote:
>>
>> Hi,
>>
>> I found for the follow code:
>>
>> #define N 256
>> int a[N][N][N], b[N][N][N];
>> int d[N][N], c[N][N];
>> void __attribute__((noinline))
>> double_reduc (int n)
>> {
>> for (int k = 0; k < n; k++)
>> {
>> for (int l = 0; l < n; l++)
>> {
>> c[k][l] = 0;
>> for (int m = 0; m < n; m++)
>> c[k][l] += a[k][m][l] * d[k][m] + b[k][m][l] * d[k][m];
>> }
>> }
>> }
>>
>> I dumped the file after loop interchange and got the following information:
>>
>> <bb 3> [local count: 118111600]:
>> # m_46 = PHI <0(7), m_45(11)>
>> # ivtmp_44 = PHI <_42(7), ivtmp_43(11)>
>> _39 = _49 + 1;
>>
>> <bb 4> [local count: 955630224]:
>> # l_48 = PHI <0(3), l_47(12)>
>> # ivtmp_41 = PHI <_39(3), ivtmp_40(12)>
>> c_I_I_lsm.5_18 = c[k_28][l_48];
>> c_I_I_lsm.5_53 = m_46 != 0 ? c_I_I_lsm.5_18 : 0;
>> _2 = a[k_28][m_46][l_48];
>> _3 = d[k_28][m_46];
>> _4 = _2 * _3;
>> _5 = b[k_28][m_46][l_48];
>> _6 = _3 * _5;
>> _7 = _4 + _6;
>> _8 = _7 + c_I_I_lsm.5_53;
>> c[k_28][l_48] = _8;
>> l_47 = l_48 + 1;
>> ivtmp_40 = ivtmp_41 - 1;
>> if (ivtmp_40 != 0)
>> goto <bb 12>; [89.00%]
>> else
>> goto <bb 5>; [11.00%]
>>
>> we can see '_3 = d[k_28][m_46];' is a loop invariant.
>> Do we need to add a loop invariant motion pass after the loop interchange?
>
> There is one at the end of the loop pipeline.
Hi,
The one at the end of the loop pipeline may miss some optimization
opportunities. If we vectorize the above code (a.c.158t.vect), we
can get information similar to the following:
bb 3:
# m_46 = PHI <0(7), m_45(11)> // loop m, outer loop
if (_59 <= 2)
goto bb 20;
else
goto bb 15;
bb 15:
_89 = d[k_28][m_46];
vect_cst__90 = {_89, _89, _89, _89};
bb 4:
# l_48 = PHI <l_47(12), 0(15)> // loop l, inner loop
vect__6.23_100 = vect_cst__99 * vect__5.22_98;
if (ivtmp_110 < bnd.8_1)
goto bb 12;
else
goto bb 17;
bb 20:
bb 18:
_27 = d[k_28][m_46];
if (ivtmp_12 != 0)
goto bb 19;
else
goto bb 21;
Vectorization will do some conversions in this case. We can see
â _89 = d[k_28][m_46];â and â_27 = d[k_28][m_46];â are loop invariant
relative to loop l. We can move âd[k_28][m_46]â to the front of
âif (_59 <= 2)â to get rid of loading data from memory in both branches.
The one at at the end of the loop pipeline can't handle this situation.
If we move d[k_28][m_46] from loop l to loop m before doing
vectorization, we can get rid of this situation.
--
BR,
Lijia He
>
>> BR,
>> Lijia He
>>
More information about the Gcc
mailing list