This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Fix PR48052: loop not vectorized if index is "unsigned int"
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Abderrazek Zaafrani <az dot zaafrani at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Sebastian Pop <sebpop at gmail dot com>
- Date: Wed, 6 May 2015 13:02:15 +0200
- Subject: Re: Fix PR48052: loop not vectorized if index is "unsigned int"
- References: <CAGrkkCATyT28OgKzXpSbAY=5=NTZKpp1p60wA8BBdYohgjCY-w at mail dot gmail dot com>
On Mon, May 4, 2015 at 9:47 PM, Abderrazek Zaafrani
<az.zaafrani@gmail.com> wrote:
> This is an old thread, and we are still running into similar issues:
> code is not vectorized on 64-bit targets because scev cannot
> precisely analyze the overflow condition.
>
> While the original test case shown here seems to work now, it does not
> work if the start value is not a constant and the loop index variable
> is of unsigned type. For example:
>
> void loop2(double const * __restrict__ x_in, double * __restrict__ x_out,
>            double const * __restrict__ c, unsigned int N, unsigned int start)
> {
>   for (unsigned int i = start; i != N; ++i)
>     x_out[i] = c[i] * x_in[i];
> }
>
> Here is our unit test:
>
> /* BS is the blocking factor. */
> int foo(int* A, int* B, unsigned start, unsigned BS)
> {
>   int s = 0;
>   for (unsigned k = start; k < start + BS; k++)
>     s += A[k] * B[k];
>   return s;
> }
>
> Our unit test case is extracted from a matrix multiply of a
> two-dimensional array in which all loops are blocked by hand by a factor
> of BS. Although slightly modified, the loop above corresponds to the
> innermost loop of the blocked matrix multiply.
>
> We worked on a patch to solve the problem (see attachment).
> The attached patch passed bootstrap and make check on x86_64-linux.
> Ok for trunk?
Apart from coding style / API issues, the case you handle is very special
(IVs with step 1 only?!). I believe it is also wrong: the assumption that
a BIV will not wrap whenever there is a symbolic or constant expression
for the number of iterations is not true. niter analysis can very well
compute the number of iterations for a loop with wrapping IVs. For your
unit test this only works because of the special-casing of step 1 IVs.
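To illustrate the point about wrapping IVs, here is a minimal standalone
sketch (not GCC code). It assumes an 8-bit unsigned IV so the wrap is easy
to see; the function names are made up for this example. The loop has a
simple closed-form iteration count even when the IV passes through zero,
so the existence of such an expression says nothing about wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Count iterations of `for (i = start; i != n; ++i)` by running it.
   The IV is 8 bits wide, so ++i wraps from 255 to 0 (well defined for
   unsigned types).  */
static unsigned count_by_running(uint8_t start, uint8_t n)
{
  unsigned count = 0;
  for (uint8_t i = start; i != n; ++i)
    ++count;
  return count;
}

/* Closed form for the same loop: (n - start) modulo 2^8.  This is valid
   whether or not the IV wraps.  */
static unsigned count_closed_form(uint8_t start, uint8_t n)
{
  return (uint8_t)(n - start);
}

/* The IV wraps exactly when it starts above the exit value.  */
static int iv_wraps(uint8_t start, uint8_t n)
{
  return start > n;
}

/* Hypothetical self-check: a wrapping loop still has a known count.  */
static void check_wrapping_example(void)
{
  assert(iv_wraps(250, 5));
  assert(count_by_running(250, 5) == count_closed_form(250, 5));
}
```

With start = 250 and n = 5 the IV takes the values 250..255 and then 0..4,
and both functions agree on the count although the IV wrapped.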
Technically it might be more interesting to compute wrapping of IVs
during niter analysis in some more generic way (we have iv->no_overflow
computed by simple_iv, but that is not really useful here).
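As a rough sketch of what a step-independent check could look like (an
illustration only, not GCC's API; iv_does_not_wrap and its parameters are
hypothetical names): given the base, the step and the number of increments
of a 32-bit unsigned IV, evaluate the last value in wider arithmetic and
compare it against the type's maximum.

```c
#include <stdint.h>

/* Hypothetical helper: return 1 if a 32-bit unsigned IV that starts at
   BASE and is incremented by STEP a total of NITER times never exceeds
   UINT32_MAX, i.e. never wraps.  The last value is computed in 64 bits;
   base + step * niter is at most 2^64 - 2^32, so the wider computation
   itself cannot overflow.  */
static int iv_does_not_wrap(uint32_t base, uint32_t step, uint32_t niter)
{
  uint64_t last = (uint64_t)base + (uint64_t)step * niter;
  return last <= UINT32_MAX;
}
```

This assumes base, step and niter are known values; in the compiler they
would be symbolic, so the comparison would have to be proven rather than
evaluated.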
Richard.
> Thanks,
> Abderrazek Zaafrani