[Bug libgomp/40852] New: parallel sort run time increases ~10 fold when vector size gets over ~4*10^9

jaffe at broad dot mit dot edu gcc-bugzilla@gcc.gnu.org
Fri Jul 24 20:15:00 GMT 2009


A parallel sort becomes ~10 times slower when the vector size increases from
4*10^9 to 5*10^9. The threshold may be exactly 2^32, but this was not checked.
The example below uses a vector of ints, but the same behavior is observed
with a vector of long longs.
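
For reference, 2^32 = 4,294,967,296 lies between the two tested sizes. The
sketch below only illustrates that arithmetic: where each size falls relative
to 2^32, and what the size would become if it were ever narrowed to a 32-bit
unsigned value. Both helper names are hypothetical and do not come from
libstdc++ or libgomp, and nothing here shows that such a narrowing actually
happens in the library.

```cpp
#include <cstdint>

// Hypothetical helper: true when n does not fit in a 32-bit unsigned integer.
bool exceeds_u32(std::uint64_t n)
{
    return n >= (1ULL << 32);
}

// Hypothetical helper: the value n would have after narrowing to 32 bits,
// that is, n mod 2^32.
std::uint32_t narrowed_to_u32(std::uint64_t n)
{
    return static_cast<std::uint32_t>(n);
}
```

With these, exceeds_u32(4000000000ULL) is false and exceeds_u32(5000000000ULL)
is true; narrowed_to_u32(5000000000ULL) is 705032704, since
5000000000 - 4294967296 = 705032704.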

To reproduce (sort_test.cc is below):

0. Adjust 'processors' in sort_test.cc to suit the machine.
1. g++ -O3 -fopenmp sort_test.cc -lgomp
2. ./a.out

output:

58 seconds used in sort [for vector of size 4,000,000,000]
667 seconds used in sort [for vector of size 5,000,000,000]

gcc version information:

crd4% gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.4.1/configure
--with-gmp=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1
--with-mpfr=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1
--prefix=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1
Thread model: posix
gcc version 4.4.1 (GCC)

We first observed the problem under gcc 4.3.3.

hardware info:

crd4% uname -a
Linux crd4 2.6.16.54-0.2.5-smp #1 SMP Mon Jan 21 13:29:51 UTC 2008 x86_64
x86_64 x86_64 GNU/Linux

This is a 32-processor machine with 256 GB of memory, but I don't think the
problem is specific to this architecture.

sort_test.cc:

#include <algorithm>
#include <iostream>
#include <omp.h>
#include <time.h>
#include <vector>
using namespace std;
int main( )
{    // Sort vectors of 4*10^9 and 5*10^9 entries and time each sort.
     for ( long long  m = 4; m <= 5; m++ )
     {    const long long entries = m * (long long) 1000000000;
          const int processors = 32;   // adjust per step 0 above
          vector<int> x(entries);
          // Fill the vector with unsorted values.
          for ( long long i = 0; i < entries; i++ )
               x[i] = (i*i) % 123456789;
          // Time only the sort, with one-second resolution.
          time_t clock1, clock2; time( &clock1 );
          omp_set_num_threads(processors);
          sort( x.begin( ), x.end( ) );
          time( &clock2 );
          cout << clock2 - clock1 << " seconds used in sort" << endl;    }    }
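
The report notes that the exact threshold was not checked. One way to bracket
it would be to time the sort at caller-chosen sizes on either side of 2^32.
The following is a minimal sketch of such helpers, not part of the original
test case: make_input and timed_sort_seconds are hypothetical names, the code
assumes a C++11 compiler for std::chrono, and the fill reduces i modulo
123456789 before squaring so the product cannot overflow (a deliberate
variation on the fill in sort_test.cc). It has not been run at the sizes in
question.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: build an unsorted vector of the requested size.
// Reducing i first keeps r*r below 2^64, so the arithmetic is well defined.
std::vector<int> make_input(std::size_t entries)
{
    std::vector<int> x(entries);
    for (std::size_t i = 0; i < entries; i++) {
        const unsigned long long r = i % 123456789ULL;
        x[i] = static_cast<int>((r * r) % 123456789ULL);
    }
    return x;
}

// Hypothetical helper: sort x in place and return the elapsed wall-clock
// time of the sort alone, in seconds.
double timed_sort_seconds(std::vector<int>& x)
{
    const auto start = std::chrono::steady_clock::now();
    std::sort(x.begin(), x.end());
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling these with entries set to 4294967295, 4294967296 and 4294967297 in
turn, built the same way as sort_test.cc, would show whether the run time
changes at 2^32 itself. Each run needs roughly 16 GB for the vector alone.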


-- 
           Summary: parallel sort run time increases ~10 fold when vector
                    size gets over ~4*10^9
           Product: gcc
           Version: 4.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jaffe at broad dot mit dot edu
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40852


