More aggressive threading causing loop-interchange-9.c regression
Aldy Hernandez
aldyh@redhat.com
Tue Sep 7 11:49:59 GMT 2021
Hi folks.
I have a pending patch for the path solver that pulls in global ranges
when available (the stuff in SSA_NAME_RANGE_INFO). In doing so, I have
run into a regression I was hoping the loop experts could pontificate on.
The regression comes from the simple_reduc_1() function in
tree-ssa/loop-interchange-9.c, and it involves the following path:
=========== BB 5 ============
Imports: n_10(D) j_19
Exports: n_10(D) j_13 j_19
j_13 : j_19(I)
n_10(D) int [1, 257]
j_19 int [0, 256]
Relational : (j_13 > j_19)
<bb 5> [local count: 118111600]:
# sum_21 = PHI <sum_14(4), sum_11(3)>
c[j_19] = sum_21;
j_13 = j_19 + 1;
if (n_10(D) > j_13)
goto <bb 9>; [89.00%]
else
goto <bb 6>; [11.00%]
j_13 : int [1, 257]
5->9 (T) n_10(D) : int [2, 257]
5->9 (T) j_13 : int [1, 256]
5->9 (T) j_19 : int [0, 255]
5->6 (F) n_10(D) : int [1, 257]
5->6 (F) j_13 : int [1, 257]
5->6 (F) j_19 : int [0, 256]
=========== BB 9 ============
n_10(D) int [2, 257]
j_13 int [1, 256]
Relational : (n_10(D) > j_19)
Relational : (n_10(D) > j_13)
<bb 9> [local count: 105119324]:
goto <bb 3>; [100.00%]
=========== BB 3 ============
Imports: n_10(D)
Exports: n_10(D)
n_10(D) int [1, +INF]
<bb 3> [local count: 118111600]:
# j_19 = PHI <j_13(9), 0(7)>
sum_11 = c[j_19];
if (n_10(D) > 0)
goto <bb 8>; [89.00%]
else
goto <bb 5>; [11.00%]
With knowledge from previous passes (SSA_NAME_RANGE_INFO), we know that
the loop cannot legally iterate outside the size of c[256]. So, j_13
lies within [1, 257] and n_10 is [2, +INF] at the end of the path. This
allows us to thread BB3 to BB8. Doing so, causes the test to fail here:
/* { dg-final { scan-tree-dump-times "Loop_pair<outer:., inner:.> is
interchanged" 1 "linterchange" } } */
All the blocks lie within the same loop, and the path passes all the
tests in path_profitable_p().
Is there something about the shape of this path that should make it
invalid in the backward threader, or should the test be marked with
-fdisable-tree-threadN (or something else entirely)? Note that the
backward threader is allowed to peel through loop headers.
I am attaching the gimple dump as well as the graphviz output for easier
analysis.
Thanks.
Aldy
-------------- next part --------------
__attribute__((noinline))
void simple_reduc_1 (int n)
{
int i;
int sum;
int j;
int _1;
int _2;
int _3;
<bb 2> [local count: 14598063]:
if (n_10(D) > 0)
goto <bb 7>; [89.00%]
else
goto <bb 6>; [11.00%]
<bb 7> [local count: 12992276]:
<bb 3> [local count: 118111600]:
# j_19 = PHI <j_13(9), 0(7)>
sum_11 = c[j_19];
if (n_10(D) > 0)
goto <bb 8>; [89.00%]
else
goto <bb 5>; [11.00%]
<bb 8> [local count: 105119324]:
<bb 4> [local count: 955630225]:
# sum_20 = PHI <sum_14(10), sum_11(8)>
# i_22 = PHI <i_15(10), 0(8)>
_1 = a[i_22][j_19];
_2 = b[i_22][j_19];
_3 = _1 * _2;
sum_14 = _3 + sum_20;
i_15 = i_22 + 1;
if (n_10(D) > i_15)
goto <bb 10>; [89.00%]
else
goto <bb 5>; [11.00%]
<bb 10> [local count: 850510901]:
goto <bb 4>; [100.00%]
<bb 5> [local count: 118111600]:
# sum_21 = PHI <sum_14(4), sum_11(3)>
c[j_19] = sum_21;
j_13 = j_19 + 1;
if (n_10(D) > j_13)
goto <bb 9>; [89.00%]
else
goto <bb 6>; [11.00%]
<bb 9> [local count: 105119324]:
goto <bb 3>; [100.00%]
<bb 6> [local count: 14598063]:
return;
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph.ps
Type: application/postscript
Size: 22338 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc/attachments/20210907/ea6ed978/attachment-0001.ps>
More information about the Gcc
mailing list