This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug fortran/83017] DO CONCURRENT not parallelizing


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
If I "fix" GCC to consider the loop you annotate parallel:

    do concurrent (i = 1:nsplit)
        pi(i) = sum(compute( low(i), high(i) ))
    end do

then the loop is estimated to run only 4 iterations, and with 2 threads
and MIN_PER_THREAD 100 (an arbitrary define) we run into

      if (!flag_loop_parallelize_all
          && !oacc_kernels_p
          && ((estimated != -1
               && estimated <= (HOST_WIDE_INT) n_threads * MIN_PER_THREAD)
              /* Do not bother with loops in cold areas.  */
              || optimize_loop_nest_for_size_p (loop)))
        continue;

(estimated is 4, which is at or below n_threads * MIN_PER_THREAD = 2 * 100 = 200,
so the loop is skipped).  With -floop-parallelize-all I then get:

> ./f951 -quiet t.f90 -Ofast -ftree-parallelize-loops=2 -fdump-tree-parloops-details -floop-parallelize-all -fopt-info-loop
t.f90:28:0: note: loop with 5 iterations completely unrolled (header execution count 375)
t.f90:26:0: note: loop with 5 iterations completely unrolled (header execution count 1500)
t.f90:38:0: note: loop with 5 iterations completely unrolled (header execution count 1500)
t.f90:18:0: note: loop with 4 iterations completely unrolled (header execution count 375)
t.f90:15:0: note: loop with 5 iterations completely unrolled (header execution count 375)
t.f90:26:0: note: parallelizing outer loop 3
t.f90:24:0: note: basic block vectorized
t.f90:41:0: note: basic block vectorized
t.f90:41:0: note: basic block vectorized

yay.
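
For readers who want to see why a 4-iteration loop is rejected, the check quoted
above can be modelled as a small stand-alone C function. This is only a sketch,
not GCC's code: skip_loop and its parameter names are hypothetical, the boolean
arguments stand in for flag_loop_parallelize_all, oacc_kernels_p and
optimize_loop_nest_for_size_p (loop), and MIN_PER_THREAD is assumed to be 100 as
stated in the comment.

```c
#include <stdbool.h>

/* Assumption: same value as the "arbitrary define" mentioned above.  */
#define MIN_PER_THREAD 100

/* Hypothetical model of the quoted condition.  Returns true when the
   loop would be skipped (the "continue" branch).  ESTIMATED is the
   estimated iteration count, or -1 when no estimate is available.  */
static bool
skip_loop (long estimated, long n_threads, bool parallelize_all,
           bool oacc_kernels, bool optimize_for_size)
{
  /* Either override disables the profitability check entirely.  */
  if (parallelize_all || oacc_kernels)
    return false;

  /* Skip when too few iterations per thread, or when the loop nest
     is optimized for size (a cold area).  */
  return (estimated != -1 && estimated <= n_threads * MIN_PER_THREAD)
         || optimize_for_size;
}
```

Under these assumptions skip_loop (4, 2, false, false, false) is true, which
matches the loop being skipped, while passing true for parallelize_all (the
-floop-parallelize-all case) makes it false.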
