[PATCH PR81740] Enforce dependence check for outer loop vectorization

Richard Sandiford richard.sandiford@arm.com
Mon Apr 1 17:26:00 GMT 2019


Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Mar 26, 2019 at 1:56 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>> Based on the "complete unrolling" view, if we number statements as
>> (i, n), where i is the outer loop iteration and n is a statement number
>> in the (completely unrolled) loop body, then the original scalar code
>> executes in lexicographical order while for the vector loop:
>>
>> (1) (i,n) executes before (i+ix,n+nx) for all ix>=0, nx>=1, regardless of VF
>> (2) (i,n) executes before (i+ix,n-nx) for all ix>=VF, nx>=0
>>       (well, nx unrestricted, but only nx>=0 is useful given (1))
>>
>> So for any kind of dependence between (i,n) and (i+ix,n-nx), ix>=1, nx>=0
>> we need to restrict VF to at most ix so that (2) ensures the right order.
>> This means that the unnormalised distances of interest are:
>>
>> - (ix, -nx), ix>=1, nx>=0
>> - (-ix, nx), ix>=1, nx>=0
>>
>> But the second gets normalised to the first, which is actually useful
>> in this case :-).
>>
>> In terms of the existing code, I think that means we want to change
>> the handling of nested statements (only) to:
>>
>> - ignore DDR_REVERSED_P (ddr)
>> - restrict the main dist > 0 case to when the inner distance is <= 0.
>>
>> This should have the side effect of allowing outer-loop vectorisation for:
>>
>> void __attribute__ ((noipa))
>> f (int a[][N], int b[restrict])
>> {
>>   for (int i = N - 1; i-- > 0; )
>>     for (int j = 0; j < N - 1; ++j)
>>       a[j + 1][i] = a[j][i + 1] + b[i];
>> }
>>
>> At the moment we reject this, but AFAICT it should be OK.
>> (We do allow it for s/i + 1/i/, since then the outer distance is 0.)
>
> Can you file an enhancement request so we don't forget?

OK, for the record it's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89908
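
FWIW, the ordering argument quoted above can be checked with a small
stand-alone simulation.  This is only a sketch under stated assumptions,
not anything taken from the vectoriser: N = 8, the helper names run and
mismatch, and the row offsets w and r are all made up for illustration.
It models outer-loop vectorisation as "execute each inner statement for
all VF lanes (loads, then stores) before the next statement", which is
the order described by (1) and (2).  With w = 1, r = 0 the body is the
example loop (distance (1, +1)); with w = 0, r = 1 the inner distance
becomes -1, which is the case that has to limit VF.

```c
#include <assert.h>
#include <string.h>

#define N 8   /* illustrative size, not from the original mail */

/* Hypothetical helper: run  a[j + w][i] = a[j + r][i + 1] + b[i]  over the
   example's loop nest, taking the outer loop in groups of VF iterations.
   Within a group, each inner statement is executed for every lane (all
   loads, then all stores) before the next statement.  VF == 1 gives the
   original scalar order.  */
static void
run (int a[][N], const int *b, int vf, int w, int r)
{
  assert (vf >= 1 && vf <= N);
  assert ((w == 0 || w == 1) && (r == 0 || r == 1));
  for (int i0 = N - 2; i0 >= 0; i0 -= vf)
    {
      int lanes = i0 + 1 < vf ? i0 + 1 : vf;  /* shorter final group */
      for (int j = 0; j < N - 1; ++j)
        {
          int tmp[N];
          for (int l = 0; l < lanes; ++l)
            tmp[l] = a[j + r][i0 - l + 1] + b[i0 - l];
          for (int l = 0; l < lanes; ++l)
            a[j + w][i0 - l] = tmp[l];
        }
    }
}

/* Hypothetical helper: return nonzero if VF-wide execution produces a
   different array from the scalar (VF == 1) execution.  */
static int
mismatch (int vf, int w, int r)
{
  int s[N][N], v[N][N], b[N];
  for (int y = 0; y < N; ++y)
    {
      b[y] = y + 1;
      for (int x = 0; x < N; ++x)
        s[y][x] = v[y][x] = y * N + x;
    }
  run (s, b, 1, w, r);   /* scalar reference */
  run (v, b, vf, w, r);  /* simulated vector order */
  return memcmp (s, v, sizeof s) != 0;
}
```

Under this model, mismatch (vf, 1, 0) is 0 for any vf from 1 to N, in
line with the claim that the example should be OK to vectorise, while
mismatch (2, 0, 1) is nonzero because lane i+1 reads a row that lane i
has not yet written.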


