This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/46032] openmp inhibits loop vectorization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46032

Feng Chen <fchen0000 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fchen0000 at gmail dot com

--- Comment #7 from Feng Chen <fchen0000 at gmail dot com> 2012-07-06 16:17:28 UTC ---
Any update on this? With gcc 4.6.2 I sometimes see loops get slower after
adding omp, even for large nx*ny, e.g.,

#pragma omp parallel for
for(int iy=0; iy<ny; iy++) {
  for(int ix=0; ix<nx; ix++) {
    dest[(size_t)iy*nx + ix] = src[(size_t)iy*nx + ix] * 2;
  }
}

Sometimes gcc won't vectorize the inner loop; I have to move it into an inline
function to force vectorization.  Even then, the performance is only marginally
better.
PS: I split the loop this way because I had noticed earlier that omp parallel
inhibits auto-vectorization; I forget which gcc version I was using ...
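
For reference, the inline-function workaround looks roughly like the sketch
below. The element type (float), the restrict qualifiers, and the names
scale_row and scale_image are assumptions made for illustration; they are not
taken from the actual code.

```c
#include <stddef.h>

/* Hypothetical helper: the inner loop moved into its own inline function.
   restrict is an assumption that the source and destination rows never overlap. */
static inline void scale_row(float *restrict d, const float *restrict s, int nx)
{
    for (int ix = 0; ix < nx; ix++)
        d[ix] = s[ix] * 2;
}

/* Outer loop over rows is parallelized with OpenMP; each iteration hands
   one contiguous row to the helper. Element type float is an assumption. */
void scale_image(float *dest, const float *src, int nx, int ny)
{
    #pragma omp parallel for
    for (int iy = 0; iy < ny; iy++)
        scale_row(dest + (size_t)iy * nx, src + (size_t)iy * nx, nx);
}
```

Building with something like "gcc -std=c99 -O3 -fopenmp -ftree-vectorizer-verbose=2"
should report whether the loop in scale_row was vectorized; the result may
differ between gcc versions.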

In my experience Graphite did improve the scalability of openmp programs, so
the fix (with tests) is important ...

(In reply to comment #6)
> Good. But if Graphite breaks it, let's add Sebastian in CC..

