This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [fix] PR39300: make PRE not disturb vectorizer
Hi,
On Tue, 21 Jul 2009, Sebastian Pop wrote:
> Michael, could you also check whether your patch fixes or not the
> following PR: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31756 This PR
> has a reduced Fortran testcase.
The patch doesn't change the situation occurring here, which is not about
loop-carried dependencies, but rather about SCEV/data-ref not being able
to do something sensible with this:
loop:
# j_2 = PHI <1(5), j_51(7)>
D.1592_38 = (integer(kind=8)) j_2;
D.1593_39 = D.1592_38 * stride.2_7;
D.1595_41 = pretmp.23_22 + D.1593_39;
D.1599_48 = (*b_47(D))[D.1595_41];
(*a_49(D))[D.1595_41] = D.1599_48;
j_51 = j_2 + 1;
if (j_2 == pretmp.21_44)
goto <bb 8>;
else
goto <bb 7>;
In the dumps I do see scev doing something sane, namely:
------
(instantiate_scev
(instantiate_below = 5)
(evolution_loop = 2)
(chrec = {stride.2_7 + pretmp.23_22, +, stride.2_7}_2)
(res = {stride.2_7 + pretmp.23_22, +, stride.2_7}_2))
------
So, it was able to determine base/step of this induction variable nicely,
but then data-ref seems to fail to use it sensibly:
------
(compute_affine_dependence
(stmt_a = D.1599_48 = (*b_47(D))[D.1595_41];)
(stmt_b = (*a_49(D))[D.1595_41] = D.1599_48;)
)
pr31756.f:4: note: not vectorized: data ref analysis failed
D.1599_48 = (*b_47(D))[D.1595_41];
------
Turning PRE off or on doesn't change the picture; the only difference is
that the above induction variable becomes even more complicated without PRE:
(instantiate_scev
(instantiate_below = 5)
(evolution_loop = 2)
(chrec = {((integer(kind=8)) i_1 + offset.3_18) + stride.2_7, +, stride.2_7}_2)
(res = {((integer(kind=8)) i_1 + offset.3_18) + stride.2_7, +, stride.2_7}_2))
So PRE is actually doing something useful here; it doesn't prevent
vectorization any more than the limitations of data-ref already do.
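To make the chrec notation in the dumps concrete, here is a minimal sketch
in C of what {base, +, step}_2 means for the index computed in the loop
above. The struct and function names are hypothetical (they are not GCC
internals), and the sample values are arbitrary; the only thing taken from
the dump is that j starts at 1 and the index is pretmp + j * stride.

```c
/* Hypothetical stand-in for an affine chrec {base, +, step}_loop.
   This is an illustration only, not GCC's internal representation. */
struct affine_chrec {
    long base;   /* value in the first iteration of the loop */
    long step;   /* amount added per iteration */
};

/* Value of the chrec in iteration n, counting the first iteration as 0. */
long chrec_at(struct affine_chrec c, long n)
{
    return c.base + n * c.step;
}

/* The index as the loop body computes it:
   D.1595 = pretmp + (long) j * stride, with j = 1, 2, 3, ... */
long index_in_loop(long pretmp, long stride, long j)
{
    return pretmp + j * stride;
}
```

With base = stride + pretmp and step = stride, chrec_at(c, n) agrees with
index_in_loop(pretmp, stride, n + 1) for every iteration n, which is
exactly the base/step pair that instantiate_scev reports.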
> Could you also please look at
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244
> that has again a PRE vs. vectorization problem and some performance
> bonus linked to it ;-)
PRE no longer creates a problem here. But a multitude of other problems
remain before we can vectorize the
do j=0,Ng2
G(i,j) = Ginteg( -D1/2,-D2/2, D1/2,D2/2, i*D1,j*D2 )
end do
loop (assuming that this is what you meant):
* ginteg needs to be inlined
* ginteg calls vprim, which also needs to be inlined
* vprim needs sqrtf/logf. The latter is vectorizable only with libcalls
to ACML or SVML on x86-64.
* when inlined, ginteg will expose control flow, which needs to be
if-converted
* loop-invariant loads of d1/d2 stay in the loop, but they only do so
because of the non-inlined calls above (which, as far as the compiler
knows, could change the global variables each time), hence the loads
can't be hoisted out. With everything inlined, this loop-invariant load
would not be a problem anymore, if aliasing works correctly (so that it
determines that d1 and g.data don't alias, which it meanwhile should, as
G is allocated)
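The last point can be sketched in C. This is an illustration under stated
assumptions, not the Fortran testcase: d1 stands in for the global D1, and
opaque_scale is a hypothetical callee that we pretend the compiler cannot
see into (it is defined here only so that the sketch is self-contained).

```c
/* Global read on every iteration; stand-in for D1 in the testcase. */
double d1 = 2.0;

/* Hypothetical callee. Treat it as if its body were in another file:
   for all the compiler knows, it could write to d1. */
double opaque_scale(double x)
{
    return x * 0.5;
}

/* With an opaque call in the loop, d1 has to be reloaded each
   iteration, because the previous call might have changed it. */
void fill_with_calls(double *g, int n)
{
    for (int j = 0; j < n; j++)
        g[j] = opaque_scale(j * d1);
}

/* What the loop can look like once the callee is inlined and g is
   known not to alias d1: the load is hoisted (done by hand here). */
void fill_hoisted(double *g, int n)
{
    const double d1_local = d1;
    for (int j = 0; j < n; j++)
        g[j] = (j * d1_local) * 0.5;
}
```

Both functions fill g with the same values as long as the callee really
leaves d1 alone; the difference is only in what the compiler is allowed to
assume.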
So, no news on this front, unfortunately. Though someone was working on
the decl things in the Fortran frontend, which would help inlining above.
Ciao,
Michael.