[PATCH, PR68953] Fix pdr accesses order
Tom de Vries
Tom_deVries@mentor.com
Fri Apr 8 07:03:00 GMT 2016
Hi,
this patch fixes wrong-code PR68953, a graphite 6 regression.
I.
Consider test.c:
...
int yu[4][1] = { { 1 }, { 2 }, { 3 }, { 4 } };
int
main (void)
{
int zh, ro;
for (zh = 0; zh < 2; ++zh)
for (ro = 0; ro < 3; ++ro)
yu[ro][0] = yu[zh + 1][0];
return yu[0][0];
}
...
The program is supposed to return 2, but when compiling with -O1
-floop-nest-optimize, the program returns 3 instead.
The problem is that graphite rewrites this loop nest:
...
for (zh = 0; zh < 2; ++zh)
for (ro = 0; ro < 3; ++ro)
yu[ro][0] = yu[zh + 1][0];
...
into this loop nest, which has different semantics:
...
for (ro = 0; ro < 3; ++ro)
for (zh = 0; zh < 2; ++zh)
yu[ro][0] = yu[zh + 1][0];
...
II.
Now, consider test2.c, the equivalent of test.c without the extra array
dimension on yu:
...
int yu[4] = { 1, 2, 3, 4 };
int
main (void)
{
int zh, ro;
for (zh = 0; zh < 2; ++zh)
for (ro = 0; ro < 3; ++ro)
yu[ro] = yu[zh + 1];
return yu[0];
}
...
In this case, when compiling with -O1 -floop-nest-optimize, the graphite
transformation doesn't happen, and the program returns 2.
III.
So, what is the difference in graphite analysis that causes different
conclusions?
For test2.c, we find the data references:
...
data references (
reads: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= 0 and i1 <= 1 and i2 <= 2
and i2 >= 0 }
must_writes: { S_4[i1, i2] -> [1, i2] : i2 >= 0 and i2 <= 2 and i1 >=
0 and i1 <= 1 }
may_writes: { }
)
...
For test.c, we find these data references:
...
data references (
reads: { }
must_writes: { S_4[i1, 0] -> [1, 0, 0] : i1 >= 0 and i1 <= 1 }
may_writes: { }
)
...
In other words, there are no reads in the program, which is of course
incorrect.
Focusing on the reads, we find for test2.c:
...
Adding read to depedence graph: pdr_0 (read
in gimple stmt: _9 = yu[_8];
data accesses: { S_4[i1, i2] -> [1, 1 + i1] }
subscript sizes: { [1, i1] : i1 >= 0 and i1 <= 3 }
)
Reads depedence graph: { S_4[i1, i2] -> [1, 1 + i1] : i1 >= 0 and i1 <=
1 and i2 <= 2 and i2 >= 0 }
...
And for test.c:
...
Adding read to depedence graph: pdr_0 (read
in gimple stmt: _9 = yu[_8][0];
data accesses: { S_4[i1, i2] -> [1, 0, 1 + i1] }
subscript sizes: { [1, i1, 0] : i1 >= 0 and i1 <= 3 }
)
Reads depedence graph: { }
...
IV.
AFAICT, the data accesses and subscript sizes are parsed as follows:
...
data accesses:
{
S_4[i1, i2]
/* [alias set, last subscript, first subscript] */
-> [1, 0, 1 + i1]
}
subscript sizes:
{
/* [alias set range, first subscript range, last subscript range] */
[1, i1, 0]
: i1 >= 0 and i1 <= 3
}
...
In add_pdr_constraints, we do an intersection of the data accesses and
the subscript_sizes:
...
/* Constrain pdr->accesses with pdr->subscript_sizes and pbb->domain.*/
static isl_map *
add_pdr_constraints (poly_dr_p pdr, poly_bb_p pbb)
{
isl_map *x
= isl_map_intersect_range (isl_map_copy (pdr->accesses),
isl_set_copy (pdr->subscript_sizes));
x = isl_map_coalesce (x);
return constrain_domain (x, isl_set_copy (pbb->domain));
}
...
The output of the intersection is:
...
x: { S_4[-1, i2] -> [1, 0, 0] }
...
AFAICT, we're intersecting
'[1, 0, 1 + i1]'
with
'[1, i1, 0]'
and end up with
'[1, 0, 0]'.
In other words, we're intersecting:
- the last subscript access function with the first subscript range,
and
- the first subscript access function with the last subscript range.
This is the point from where things go wrong.
V.
I'm not really sure how this is supposed to be fixed. I imagine that we
should do one of 3:
1. we change the order in the access functions
2. we change the order in the subscript_sizes
3. we keep the orders as they are, but don't intersect them directly
but do an order inversion before.
I've picked 1, since that was the easiest for me to implement (but I'm
not sure if by doing so, I've broken any hardcoded graphite assumptions).
Bootstrapped and regtested on x86_64.
OK for stage4?
Thanks,
- Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-pdr-accesses-order.patch
Type: text/x-patch
Size: 1611 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20160408/ae065916/attachment.bin>
More information about the Gcc-patches
mailing list