This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/36099] [4.4 Regression] early loop unrolling pass prevents vectorization, SLP doesn't do its job



------- Comment #5 from dominiq at lps dot ens dot fr  2008-05-13 15:27 -------
I just noticed today that the vectorization of the variant induct.v2.f90
depends on the -m64 flag:

[ibook-dhum] source/dir_indu% gfc -m64 -O3 -ffast-math -funroll-loops -ftree-vectorizer-verbose=2 indu.v2.f90
...
indu.v2.f90:2322: note: not vectorized: unsupported use in stmt.
indu.v2.f90:2245: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2244: note: vectorizing stmts using SLP.
indu.v2.f90:2244: note: LOOP VECTORIZED.
indu.v2.f90:2146: note: not vectorized: unsupported use in stmt.
indu.v2.f90:2069: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2068: note: vectorizing stmts using SLP.
indu.v2.f90:2068: note: LOOP VECTORIZED.
indu.v2.f90:1976: note: not vectorized: complicated access pattern.
indu.v2.f90:1875: note: vectorized 2 loops in function.

indu.v2.f90:1816: note: not vectorized: unsupported use in stmt.
indu.v2.f90:1771: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1770: note: vectorizing stmts using SLP.
indu.v2.f90:1770: note: LOOP VECTORIZED.
indu.v2.f90:1682: note: not vectorized: unsupported use in stmt.
indu.v2.f90:1633: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1632: note: vectorizing stmts using SLP.
indu.v2.f90:1632: note: LOOP VECTORIZED.
indu.v2.f90:1543: note: not vectorized: complicated access pattern.
indu.v2.f90:1441: note: vectorized 2 loops in function.
...
[ibook-dhum] source/dir_indu% gfc -O3 -ffast-math -funroll-loops -ftree-vectorizer-verbose=2 indu.v2.f90
...
indu.v2.f90:2334: note: LOOP VECTORIZED.
indu.v2.f90:2245: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2244: note: vectorizing stmts using SLP.
indu.v2.f90:2244: note: LOOP VECTORIZED.
indu.v2.f90:2158: note: LOOP VECTORIZED.
indu.v2.f90:2069: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2068: note: vectorizing stmts using SLP.
indu.v2.f90:2068: note: LOOP VECTORIZED.
indu.v2.f90:1976: note: not vectorized: complicated access pattern.
indu.v2.f90:1875: note: vectorized 4 loops in function.

indu.v2.f90:1825: note: LOOP VECTORIZED.
indu.v2.f90:1771: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1770: note: vectorizing stmts using SLP.
indu.v2.f90:1770: note: LOOP VECTORIZED.
indu.v2.f90:1691: note: LOOP VECTORIZED.
indu.v2.f90:1633: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1632: note: vectorizing stmts using SLP.
indu.v2.f90:1632: note: LOOP VECTORIZED.
indu.v2.f90:1543: note: not vectorized: complicated access pattern.
indu.v2.f90:1441: note: vectorized 4 loops in function.
...

The nested loop at line 1691, which is vectorized without -m64 but not with it, is:

...
          do j = 1, 9
              c_vector(3) = 0.5_longreal * h_coil * z1gauss(j)
!
!       rotate coil vector into the global coordinate system and translate it
!
              rot_c_vector(1) = rot_i_vector(1) + rotate_coil(1,3) * c_vector(3)
              rot_c_vector(2) = rot_i_vector(2) + rotate_coil(2,3) * c_vector(3)
              rot_c_vector(3) = rot_i_vector(3) + rotate_coil(3,3) * c_vector(3)
!
              do k = 1, 9                ! <==== line 1691
!
!       rotate quad vector into the global coordinate system
!
                  rot_q_vector(1) = rot_q1_vector(k,1) - rot_c_vector(1)
                  rot_q_vector(2) = rot_q1_vector(k,2) - rot_c_vector(2)
                  rot_q_vector(3) = rot_q1_vector(k,3) - rot_c_vector(3)

!
!       compute and add in quadrature term
!
                  numerator = dotp * w1gauss(j) * w2gauss(k)
                  dotp2=rot_q_vector(1)*rot_q_vector(1)+rot_q_vector(2)*rot_q_vector(2)+    &
                        rot_q_vector(3)*rot_q_vector(3)
                  denominator = sqrt(dotp2)
                  l12_lower = l12_lower + numerator/denominator
              end do
          end do
...
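
For anyone who wants to experiment with this loop outside the full benchmark, the excerpt above can be wrapped into a standalone file along the following lines. This is only a sketch: the module, function and program names, the array shapes, the definition of longreal, and every input value are assumptions made for illustration and are not taken from indu.v2.f90. It has not been checked whether this reduced form shows the same -m64 dependence, and the compiler may inline or constant-fold the call because the driver sits in the same file.

```fortran
! Hypothetical standalone wrapper around the loop nest quoted above.
! Names, array shapes, the longreal kind and all input data are assumed.
module quadrature_sketch
    implicit none
    integer, parameter :: longreal = selected_real_kind(15, 90)
contains
    function l12_sketch(h_coil, dotp, z1gauss, w1gauss, w2gauss, &
                        rot_i_vector, rotate_coil, rot_q1_vector) result(l12_lower)
        real (kind = longreal), intent(in) :: h_coil, dotp
        real (kind = longreal), intent(in) :: z1gauss(9), w1gauss(9), w2gauss(9)
        real (kind = longreal), intent(in) :: rot_i_vector(3), rotate_coil(3,3)
        real (kind = longreal), intent(in) :: rot_q1_vector(9,3)
        real (kind = longreal) :: l12_lower
        real (kind = longreal) :: c_vector(3), rot_c_vector(3), rot_q_vector(3)
        real (kind = longreal) :: numerator, denominator, dotp2
        integer :: j, k

        l12_lower = 0.0_longreal
        do j = 1, 9
            ! only component 3 of c_vector is used, as in the excerpt
            c_vector(3) = 0.5_longreal * h_coil * z1gauss(j)
            rot_c_vector(1) = rot_i_vector(1) + rotate_coil(1,3) * c_vector(3)
            rot_c_vector(2) = rot_i_vector(2) + rotate_coil(2,3) * c_vector(3)
            rot_c_vector(3) = rot_i_vector(3) + rotate_coil(3,3) * c_vector(3)
            do k = 1, 9            ! the inner loop reported at line 1691
                rot_q_vector(1) = rot_q1_vector(k,1) - rot_c_vector(1)
                rot_q_vector(2) = rot_q1_vector(k,2) - rot_c_vector(2)
                rot_q_vector(3) = rot_q1_vector(k,3) - rot_c_vector(3)
                numerator = dotp * w1gauss(j) * w2gauss(k)
                dotp2 = rot_q_vector(1)*rot_q_vector(1) + rot_q_vector(2)*rot_q_vector(2) + &
                        rot_q_vector(3)*rot_q_vector(3)
                denominator = sqrt(dotp2)
                l12_lower = l12_lower + numerator/denominator
            end do
        end do
    end function l12_sketch
end module quadrature_sketch

program run_sketch
    use quadrature_sketch
    implicit none
    real (kind = longreal) :: z1gauss(9), w1gauss(9), w2gauss(9)
    real (kind = longreal) :: rot_i_vector(3), rotate_coil(3,3), rot_q1_vector(9,3)
    integer :: k

    ! Placeholder inputs, chosen only so that the denominator is never zero
    z1gauss = 0.1_longreal
    w1gauss = 1.0_longreal
    w2gauss = 1.0_longreal
    rot_i_vector = 0.0_longreal
    rotate_coil = 0.0_longreal
    do k = 1, 3
        rotate_coil(k,k) = 1.0_longreal
    end do
    do k = 1, 9
        rot_q1_vector(k,:) = real(k, longreal)
    end do

    print *, l12_sketch(1.0_longreal, 1.0_longreal, z1gauss, w1gauss, w2gauss, &
                        rot_i_vector, rotate_coil, rot_q1_vector)
end program run_sketch
```

Compiling this file with the same options as above (-O3 -ffast-math -funroll-loops -ftree-vectorizer-verbose=2, with and without -m64) would show whether the inner loop is reported as vectorized in each mode.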


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36099

