This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Question about vectorization limit
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Toon Moene <toon at moene dot org>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, Dehao Chen <dehao at google dot com>, GCC Development <gcc at gcc dot gnu dot org>
- Date: Fri, 31 May 2013 15:41:31 +0200
- Subject: Re: Question about vectorization limit
- References: <CAO2gOZX7_-08m_+AEybF0RwG=8Y_qPG_+wjmgsq6ymVWTr3=Vw at mail dot gmail dot com> <CAFiYyc3ehiZeyrXUb+wgj_tBi7WqmeHNQdFK9vDinnMWYHYswA at mail dot gmail dot com> <51A8A3EF dot 3010508 at moene dot org>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
>       SUBROUTINE XYZ(A, B, N)
>       DIMENSION A(N), B(N)
>       DO I = 1, N
>          IF (A(I) > 0.0) THEN
>             A(I) = B(I) / A(I)
>          ELSE
>             A(I) = B(I)
>          ENDIF
>       ENDDO
>       END
Well, in this case (with -Ofast) the problem is simply that ifcvt, or an
earlier pass, did a poor job of moving the load from B(I) above the
conditional. If we ignore exceptions, that should be possible, since both
branches read the same memory location.
The store to A(I) is already moved out of the conditional by cselim.
If you rewrite the above into:
      SUBROUTINE XYZ(A, B, N)
      DIMENSION A(N), B(N)
      DO I = 1, N
         C = B(I)
         IF (A(I) > 0.0) THEN
            A(I) = C / A(I)
         ELSE
            A(I) = C
         ENDIF
      ENDDO
      END
then it is vectorized just fine. Even if this load motion isn't
performed, the loop should still be vectorizable using masked loads.
See http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00202.html
though we probably just want better infrastructure for that.
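
For readers more at home in C, the two loop shapes discussed above can be
sketched roughly as follows. This is only an illustrative analogue, not
output from the compiler: the function names are made up for this example,
`restrict` stands in for Fortran's no-aliasing guarantee on dummy arguments,
and xyz_select is a hand-written approximation of what an if-converted loop
looks like, assuming (as with -Ofast) that a division whose result is
discarded may be executed unconditionally.

```c
#include <stddef.h>

/* Original shape: each branch performs its own load from b[i]. */
void xyz_branchy(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0.0f)
            a[i] = b[i] / a[i];
        else
            a[i] = b[i];
    }
}

/* Hand-hoisted shape: b[i] is loaded once, before the conditional. */
void xyz_hoisted(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float c = b[i];
        if (a[i] > 0.0f)
            a[i] = c / a[i];
        else
            a[i] = c;
    }
}

/* Approximate if-converted shape: compute both candidates, then select.
   The division runs even when a[i] <= 0, which is only acceptable if
   floating-point exceptions are ignored. */
void xyz_select(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float c = b[i];
        float q = c / a[i];
        a[i] = (a[i] > 0.0f) ? q : c;
    }
}
```

All three functions store the same values for inputs where the
unconditional division does not trap; the difference lies only in how
easily the loop body can be turned into straight-line code for the
vectorizer.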
Jakub