This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Fix PR48052: loop not vectorized if index is "unsigned int"


Richard,

I agree that the code handles a very special case, but this case is
common enough that it limits the vectorizer significantly. The special
case is: loops with an unsigned index, a non-constant start value, and
step 1. We have code for a matrix multiply (loops blocked by hand),
taken from an industry benchmark, that is not being vectorized because
of the overflow issue. Note that if we relax the step 1 assumption, we
will probably not be able to prove non-overflow. Note also that the
general case, such as a constant start value, is already working fine.
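
To make the special case concrete, here is a minimal sketch (not taken
from the patch; the helper name count_step1 and its parameters are
hypothetical). It assumes a loop with an unsigned index, a non-constant
start, step 1, and a `<` exit test. Either start + len wraps, in which
case the body never runs, or it does not wrap and the index stays below
the bound, so the index itself never passes UINT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.  Runs the step-1 loop
   "for (k = start; k < start + len; k++)" and returns the number of
   iterations executed.  *wrapped is set to 1 if the index ever reached
   UINT_MAX inside the body, i.e. if the following k++ would wrap.  */
static unsigned
count_step1 (unsigned start, unsigned len, int *wrapped)
{
  unsigned count = 0;

  assert (wrapped != NULL);
  *wrapped = 0;

  /* start + len is computed modulo 2^32 (unsigned arithmetic).  */
  for (unsigned k = start; k < start + len; k++)
    {
      if (k == UINT_MAX)
        *wrapped = 1;
      count++;
    }
  return count;
}
```

For example, count_step1 (UINT_MAX - 3, 3, &w) runs three iterations,
while count_step1 (UINT_MAX - 3, 8, &w) runs none because the bound
wraps around to 4; in both cases w stays 0.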

I am not too sure about the incorrectness mentioned below for the
cases we are handling. loop->nb_iterations holds a symbolic expression
for the number of iterations (our special case falls into the
symbolic-expression category). In the loops I experimented with that
fall under our limited scope, either this symbolic expression holds
the exact number of iterations for the loop, without overflow, or the
scev_not_known flag is set. Maybe you can share an example if you have
one in mind.
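
For reference, the general concern (an exact iteration count for a loop
whose index wraps) can be sketched with a `!=` exit test. This is an
illustration only, not a claim about what the patch accepts: the helper
name trip_count_ne is hypothetical, and whether niter analysis reports
exactly the expression n - start for this loop is an assumption. In
unsigned arithmetic the loop below runs (n - start) mod 2^32 times,
even when start > n and the index wraps through UINT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.  Runs
   "for (i = start; i != n; ++i)" and returns the observed trip count.
   *wrapped is set to 1 if the index reached UINT_MAX inside the body,
   so that the following ++i wrapped to 0.  */
static unsigned
trip_count_ne (unsigned start, unsigned n, int *wrapped)
{
  unsigned count = 0;

  assert (wrapped != NULL);
  *wrapped = 0;

  for (unsigned i = start; i != n; ++i)
    {
      if (i == UINT_MAX)
        *wrapped = 1;
      count++;
    }
  return count;
}
```

With start = UINT_MAX - 2 and n = 3, the index takes the values
UINT_MAX - 2, UINT_MAX - 1, UINT_MAX, 0, 1, 2: six iterations, which
equals n - start computed in unsigned arithmetic, and the index wraps
along the way.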

The suggestion to improve niter analysis and the iv->no_overflow flag,
and to move what we are trying to do here into that code where existing
information can be reused, is good and we may look into it.

Abderrazek

On Wed, May 6, 2015 at 6:02 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, May 4, 2015 at 9:47 PM, Abderrazek Zaafrani
> <az.zaafrani@gmail.com> wrote:
>> This is an old thread and we are still running into similar issues:
>> Code is not being vectorized on 64-bit target due to scev not being
>> able to optimally analyze overflow condition.
>>
>> While the original test case shown here seems to work now, it does not
>> work if the start value is not a constant and the loop index variable
>> is of unsigned type: Ex
>>
>> void loop2( double const * __restrict__ x_in, double * __restrict__
>> x_out, double const * __restrict__ c, unsigned int N, unsigned int
>> start) {
>>  for(unsigned int i=start; i!=N; ++i)
>>    x_out[i] = c[i]*x_in[i];
>> }
>>
>> Here is our unit test:
>>
>> int foo(int* A, int* Bv, unsigned start, unsigned B)
>> {
>>   int s = 0;
>>   for (unsigned k = start; k < start+B; k++)
>>     s += A[k] * Bv[k];
>>   return s;
>> }
>>
>> Our unit test case is extracted from a matrix multiply of a
>> two-dimensional array and all loops are blocked by hand by a factor of
>> B. Even though a bit modified, above loop corresponds to the innermost
>> loop of the blocked matrix multiply.
>>
>> We worked on a patch to solve the problem (see attachment).
>> The attached patch passed bootstrap and make check on x86_64-linux.
>> Ok for trunk?
>
> Apart from coding style / API issues the case you handle is very special
> (IVs with step 1 only?!) I believe it is also wrong - the assumption that
> if there is a symbolic or constant expression for the number of iterations
> a BIV will not wrap is not true.  niter analysis can very well compute
> the number of iterations for a loop with wrapping IVs.  For your unit test
> this only works because of the special-casing of step 1 IVs.
>
> Technically it might be more interesting to compute wrapping of IVs
> during niter analysis in some more generic way (we have iv->no_overflow
> computed by simple_iv, but that is rather not useful here).
>
> Richard.
>
>> Thanks,
>> Abderrazek Zaafrani

