This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: gomp slowness


I'm not sure what the OpenMP spec says about the default data scope (I
was too lazy to read through it), but it seems that the examples from
http://kallipolis.com/openmp/2.html assume default(private), while GCC's
GOMP defaults to shared.  In your case,

  #pragma omp parallel for shared(A, row, col)
    for (i = k+1; i<SIZE; i++) {
      for (j = k+1; j<SIZE; j++) {
          A[i][j] = A[i][j] - row[i] * col[j];
      }
    }

'#pragma omp for' makes 'i' private implicitly (it couldn't be
otherwise), but 'j' is still shared, so all threads race on the same
inner loop counter.  I just tried your original case: not only is it
slow, it also produces different results with and without OpenMP (just
try printing any element of 'A').  Adding 'private(j)' (or declaring
'j' inside the outer loop) will fix the case.
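
For reference, here is a minimal sketch of the fixed loop next to a
serial version.  The value of SIZE, the function names, and the
self-check helper are my own assumptions for illustration; they are not
taken from your program.  The only change that matters is 'private(j)'
on the pragma.

```c
#include <assert.h>

#define SIZE 64   /* assumed value; the original post does not show it */

/* Serial reference: update the submatrix below and to the right of k. */
static void update_serial(double A[SIZE][SIZE], const double *row,
                          const double *col, int k)
{
    assert(k >= 0 && k < SIZE);
    for (int i = k + 1; i < SIZE; i++)
        for (int j = k + 1; j < SIZE; j++)
            A[i][j] = A[i][j] - row[i] * col[j];
}

/* Same loop with the fix: 'j' is private, so each thread has its own
 * inner loop counter.  Without -fopenmp the pragma is ignored and this
 * is just the serial loop. */
static void update_fixed(double A[SIZE][SIZE], const double *row,
                         const double *col, int k)
{
    int i, j;

    assert(k >= 0 && k < SIZE);
    #pragma omp parallel for shared(A, row, col) private(j)
    for (i = k + 1; i < SIZE; i++) {
        for (j = k + 1; j < SIZE; j++) {
            A[i][j] = A[i][j] - row[i] * col[j];
        }
    }
}

/* Hypothetical self-check: returns 1 if both versions agree exactly. */
static int fixed_matches_serial(int k)
{
    static double A1[SIZE][SIZE], A2[SIZE][SIZE], row[SIZE], col[SIZE];

    /* Fill both matrices and both vectors with identical test data. */
    for (int i = 0; i < SIZE; i++) {
        row[i] = i + 1;
        col[i] = 0.5 * i;
        for (int j = 0; j < SIZE; j++)
            A1[i][j] = A2[i][j] = (double)(i * SIZE + j);
    }

    update_serial(A1, row, col, k);
    update_fixed(A2, row, col, k);

    for (int i = 0; i < SIZE; i++)
        for (int j = 0; j < SIZE; j++)
            if (A1[i][j] != A2[i][j])
                return 0;
    return 1;
}
```

Each element of 'A' is written by exactly one iteration of the outer
loop, so with 'j' private the parallel result should match the serial
one exactly.  Declaring the counter in the loop header, as
update_serial does with 'for (int j = ...)', has the same effect
without the extra clause.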

It would be nice if someone would post measurements for the fixed
case.  My machine has only HT (Hyper-Threading), and I still see a
slowdown for this example (though it runs much faster than before the
fix).


-- 
   Tomash Brechko

