This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Fix PR48052: loop not vectorized if index is "unsigned int"


On Wed, May 6, 2015 at 7:02 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, May 4, 2015 at 9:47 PM, Abderrazek Zaafrani
> <az.zaafrani@gmail.com> wrote:
>> This is an old thread and we are still running into similar issues:
>> Code is not being vectorized on 64-bit target due to scev not being
>> able to optimally analyze overflow condition.
>>
>> While the original test case shown here seems to work now, it does not
>> work if the start value is not a constant and the loop index variable
>> is of unsigned type: Ex
>>
>> void loop2( double const * __restrict__ x_in, double * __restrict__
>> x_out, double const * __restrict__ c, unsigned int N, unsigned int
>> start) {
>>  for(unsigned int i=start; i!=N; ++i)
>>    x_out[i] = c[i]*x_in[i];
>> }
>>
>> Here is our unit test:
>>
>> int foo(int* A, int* C, unsigned start, unsigned B)
>> {
>>   int s = 0;
>>   for (unsigned k = start; k < start + B; k++)
>>     s += A[k] * C[k];
>>   return s;
>> }
>>
>> Our unit test case is extracted from a matrix multiply of a
>> two-dimensional array and all loops are blocked by hand by a factor of
>> B. Even though a bit modified, above loop corresponds to the innermost
>> loop of the blocked matrix multiply.
>>
>> We worked on a patch to solve the problem (see attachment).
>> The attached patch passed bootstrap and make check on x86_64-linux.
>> Ok for trunk?
>
> Apart from coding style / API issues the case you handle is very special
> (IVs with step 1 only?!) I believe it is also wrong - the assumption that
> if there is a symbolic or constant expression for the number of iterations
> a BIV will not wrap is not true.  niter analysis can very well compute
> the number of iterations for a loop with wrapping IVs.  For your unit test
> this only works because of the special-casing of step 1 IVs.
I happen to be looking into a similar issue right now.  scev_probably_wraps_p
and thus chrec_convert_1 should be improved using niter information.
Actually, all the information (including the wrap behavior) has already been
computed in tree-ssa-loop-niter.c.  We just need to find a way to use
it.
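
To illustrate the point quoted above, that an iteration count can be
computed even when the IV wraps, here is a minimal standalone sketch.  It
is not GCC code: the helper names are made up for illustration, and an
8-bit index is assumed only to keep the numbers small.  For a loop of the
form "for (i = start; i != n; ++i)" the count is (n - start) modulo 2^8,
whether or not i wraps on the way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: run "for (i = start; i != n; ++i)" with an 8-bit
   unsigned index, count the iterations, and record whether the index
   wrapped around from 255 to 0 along the way.  */
static unsigned count_iterations(uint8_t start, uint8_t n, int *wrapped)
{
    unsigned count = 0;
    *wrapped = 0;
    for (uint8_t i = start; i != n; ++i) {
        if (i == UINT8_MAX)
            *wrapped = 1;   /* the following ++i wraps to 0 */
        count++;
    }
    return count;
}

/* Closed form for the same loop: (n - start) modulo 2^8.  */
static unsigned closed_form(uint8_t start, uint8_t n)
{
    return (uint8_t)(n - start);
}

/* The closed form agrees with the loop in both the wrapping and the
   non-wrapping case.  */
static void self_check(void)
{
    int wrapped;
    assert(count_iterations(250, 5, &wrapped) == closed_form(250, 5));
    assert(wrapped);
    assert(count_iterations(5, 250, &wrapped) == closed_form(5, 250));
    assert(!wrapped);
}
```

So a known symbolic iteration count does not by itself show that the IV
stays in range; the wrap information has to be tracked separately.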

>
> Technically it might be more interesting to compute wrapping of IVs
> during niter analysis in some more generic way (we have iv->no_overflow
> computed by simple_iv, but that is rather not useful here).

For reference, iv->no_overflow is computed in simple_iv as below:
      tmp = analyze_scalar_evolution (use_loop, ev);
      ev = resolve_mixers (use_loop, tmp);

      if (folded_casts && tmp != ev)
        *folded_casts = true;

It's inaccurate because calling resolve_mixers doesn't mean the resulting
scev will wrap.  resolve_mixers could have just done exactly the same
transformation as instantiate_parameters.  Also,
chrec_convert_aggressive is incomplete and needs to be revised too.
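
As a rough illustration of why the conversion matters on 64-bit targets
(a standalone sketch, not GCC code; the function names are hypothetical):
widening a 32-bit unsigned index to a 64-bit offset only commutes with the
increment when the 32-bit value does not wrap.  This is the condition that
has to be proven before the converted index can be treated as a simple
64-bit evolution {start, +, 1}.

```c
#include <assert.h>
#include <stdint.h>

/* Offset as the source loop computes it: the 32-bit index is advanced
   with wraparound first, then zero-extended to 64 bits.  */
static uint64_t offset_narrow_then_widen(uint32_t start, uint32_t k)
{
    return (uint64_t)(uint32_t)(start + k);
}

/* Offset a 64-bit affine evolution {start, +, 1} would yield at
   iteration k: widen first, then add without wrapping at 32 bits.  */
static uint64_t offset_widen_then_add(uint32_t start, uint32_t k)
{
    return (uint64_t)start + (uint64_t)k;
}

/* The two forms agree only while start + k fits in 32 bits.  */
static void self_check(void)
{
    assert(offset_narrow_then_widen(10u, 5u)
           == offset_widen_then_add(10u, 5u));
    assert(offset_narrow_then_widen(0xFFFFFFFEu, 3u)
           != offset_widen_then_add(0xFFFFFFFEu, 3u));
}
```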

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> Abderrazek Zaafrani

