[Bug tree-optimization/50955] [4.7 Regression] IVopts incorrectly rewrite the address of a global memory access into a local form.
rguenther at suse dot de
gcc-bugzilla@gcc.gnu.org
Wed Dec 18 09:42:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955
--- Comment #18 from rguenther at suse dot de <rguenther at suse dot de> ---
"amker.cheng at gmail dot com" <gcc-bugzilla@gcc.gnu.org> wrote:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955
>
>bin.cheng <amker.cheng at gmail dot com> changed:
>
> What |Removed |Added
>----------------------------------------------------------------------------
> CC| |amker.cheng at gmail dot com
>
>--- Comment #17 from bin.cheng <amker.cheng at gmail dot com> ---
>Hi Richard,
>I am having difficulty in understanding cases if this PR.
>For the reported case with two loops:
>
> for( y=0; y<4; y++, pDst += dstStep ) {
> for( x=y+1; x<4; x++ ) {
> s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
> pDst[x] = (unsigned char)s;
> }
>
> pDst[y] = p3;
> }
>
>The dump for statement 'pDst[y] = p3;' before IVOPT is like:
>
><bb 4>:
>Invalid sum of incoming frequencies 1667, should be 278
> y.2_64 = (sizetype) y_89;
> D.6421_65 = pDst_88 + y.2_64;
> *D.6421_65 = p3_37;
> pDst_69 = pDst_88 + pretmp.21_118;
> ivtmp.35_116 = ivtmp.35_87 - 1;
> if (ivtmp.35_116 != 0)
> goto <bb 18>;
> else
> goto <bb 19>;
>
>
>IVOPT chooses candidate 15:
>candidate 15
> depends on 3
> var_before ivtmp.154
> var_after ivtmp.154
> incremented before exit test
> type unsigned int
> base (unsigned int) pDst_39(D) - (unsigned int) &p1
> step (unsigned int) (pretmp.21_118 + 1)
>for use 1:
>use 1
> address
> in statement *D.6421_65 = p3_37;
>
> at position *D.6421_65
> type unsigned char *
> base pDst_39(D)
> step pretmp.21_118 + 1
> base object (void *) pDst_39(D)
> related candidates
>
>After rewriting, the dump is like:
>
><bb 4>:
>Invalid sum of incoming frequencies 1667, should be 278
> MEM[symbol: p1, index: ivtmp.154_200, offset: 0B] = p3_37;
> pDst_69 = pDst_88 + pretmp.21_118;
> ivtmp.149_218 = ivtmp.149_249 - 1;
> ivtmp.154_190 = ivtmp.154_200 + D.6617_250;
> if (x_40 != 4)
> goto <bb 18>;
> else
> goto <bb 19>;
>
>Eventually, the storing to TMR[p1,ivtmp,0] is considered local and
>deleted.
>
>BUT, for your reduced case:
>
> p3 = (unsigned char)(((signed int)p1[1] + (signed int)p2[1]
> + (signed int)p1[0] +(signed int)p1[0] + 2 ) >> 2 );
>
> for( x=y+1; x<4; x++ ) {
> s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
> pDst[x] = (unsigned char)s;
> }
>
> pDst[y] = p3;
>
>It is about the the TMR in below dump (before IVOPT):
>
><bb 6>:
> # vect_pp1.30_166 = PHI <vect_pp1.30_167(16), vect_pp1.33_165(5)>
> # vect_pp1.37_176 = PHI <vect_pp1.37_177(16), vect_pp1.40_175(5)>
> # vect_pp1.46_194 = PHI <vect_pp1.46_195(16), vect_pp1.49_193(5)>
> # vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
> # ivtmp.64_225 = PHI <ivtmp.64_226(16), 0(5)>
> ...
> MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
> vect_pp1.30_167 = vect_pp1.30_166 + 8;
> vect_pp1.37_177 = vect_pp1.37_176 + 8;
> vect_pp1.46_195 = vect_pp1.46_194 + 8;
> vect_p.60_224 = vect_p.60_223 + 8;
> ivtmp.64_226 = ivtmp.64_225 + 1;
> if (ivtmp.64_226 < bnd.27_128)
> goto <bb 16>;
> else
> goto <bb 7>;
>
>Your patch prevents IVOPT from choosing cand 4:
>candidate 4 (important)
> var_before ivtmp.110
> var_after ivtmp.110
> incremented before exit test
> type unsigned int
> base (unsigned int) (&p1 + 8)
> step 8
> base object (void *) &p1
>for use 3:
>use 3
> generic
> in statement vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
>
> at position
> type vector(8) unsigned char *
> base batmp.61_221 + 1
> step 8
> base object (void *) batmp.61_221
> is a biv
> related candidates
>
>To prevent IVOPT from rewriting into:
>
><bb 6>:
> # ivtmp.107_150 = PHI <ivtmp.107_256(16), 0(5)>
> # ivtmp.110_241 = PHI <ivtmp.110_146(16), ivtmp.110_132(5)>
> D.6585_133 = (unsigned int) batmp.61_221;
> p1.131_277 = (unsigned int) &p1;
> D.6587_278 = D.6585_133 - p1.131_277;
> D.6588_279 = D.6587_278 + ivtmp.110_241;
> D.6589_280 = D.6588_279 + 4294967289;
> D.6590_281 = (vector(8) unsigned char *) D.6589_280;
> vect_p.60_223 = D.6590_281;
> ...
> MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
> ivtmp.107_256 = ivtmp.107_150 + 1;
> ivtmp.110_146 = ivtmp.110_241 + 8;
> if (ivtmp.107_256 < bnd.27_128)
> goto <bb 16>;
> else
> goto <bb 7>;
>
>Thus prevents IVOPT from generating candidate 15 in outer loop.
>(Expressing
>use 3 by cand 4 itself is good, right?)
>
>
>-------------------------------
>But,
>It seems because the check:
>
> if (address_p)
> {
> /* Do not try to express address of an object with computation based
> on address of a different object. This may cause problems in rtl
> level alias analysis (that does not expect this to be happening,
> as this is illegal in C), and would be unlikely to be useful
> anyway. */
> if (use->iv->base_object
> && cand->iv->base_object
> && !operand_equal_p (use->iv->base_object, cand->iv->base_object, 0))
> return infinite_cost;
>
>failed because cand(15)->iv->base_object == NULL. For the reported
>case, it's
>not about an iv use appearing in memory reference while not marked as
>address_p, and can be fixed by revise the existing check condition, is
>it true?
No, even expressing an address this way is broken as for example dependence
analysis via scev can get confused about the actual base object.
IIRC previously we already avoided the mem-use case and I had to generalize it
to also avoid addresses.
>PS, sorry for replying to a fixed PR, I found it's kind of impossible
>to fix
>PR52272 without fully understanding this one.
More information about the Gcc-bugs
mailing list