This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/46032] openmp inhibits loop vectorization
- From: "fchen0000 at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 06 Jul 2012 16:17:28 +0000
- Subject: [Bug tree-optimization/46032] openmp inhibits loop vectorization
- Auto-submitted: auto-generated
- References: <bug-46032-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46032
Feng Chen <fchen0000 at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fchen0000 at gmail dot com
--- Comment #7 from Feng Chen <fchen0000 at gmail dot com> 2012-07-06 16:17:28 UTC ---
Any update on this? With gcc 4.6.2 I sometimes see loops get slower after
adding omp, even for large nx*ny, e.g.,
    #pragma omp parallel for
    for(int iy=0; iy<ny; iy++) {
        for(int ix=0; ix<nx; ix++) {
            dest[(size_t)iy*nx + ix] = src[(size_t)iy*nx + ix] * 2;
        }
    }
Sometimes gcc won't vectorize the inner loop, and I have to move it into an
inline function to force vectorization. Even then, the performance is only
marginally better.
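
For reference, the inline-function workaround might look roughly like the sketch below. The names scale_row and scale_image and the float element type are assumptions made for illustration; the original snippet does not show the types of dest and src or any function names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (name and float element type are assumptions):
 * doubles one row of nx elements. Keeping the inner loop in its own
 * function gives the vectorizer a simple loop outside the OpenMP region. */
static inline void scale_row(float *dest, const float *src, int nx)
{
    for (int ix = 0; ix < nx; ix++) {
        dest[ix] = src[ix] * 2;
    }
}

/* Hypothetical wrapper: rows are distributed across OpenMP threads,
 * and each thread calls the helper for the rows it owns. */
void scale_image(float *dest, const float *src, int nx, int ny)
{
    assert(dest != NULL && src != NULL);

    #pragma omp parallel for
    for (int iy = 0; iy < ny; iy++) {
        /* Same indexing as the original loop: row offset is iy*nx. */
        scale_row(dest + (size_t)iy * nx, src + (size_t)iy * nx, nx);
    }
}
```

Building with something like "gcc -std=c99 -O3 -fopenmp -ftree-vectorizer-verbose=2" (or -fopt-info-vec on newer releases) should report whether the loop in the helper was vectorized; whether it actually is depends on the gcc version and target.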
PS: I split the loop into two nested loops because I noticed previously that
omp parallel inhibits auto-vectorization; I forgot which gcc version I used ...
In my experience, Graphite did improve the scalability of openmp programs, so
the fix (with tests) is important ...
(In reply to comment #6)
> Good. But if Graphite breaks it, let's add Sebastian in CC.