libstdc++ and openmp problem with GCC4.4.0 port to interix
Robert Oeffner
robert@oeffner.net
Wed Sep 30 12:13:00 GMT 2009
Hi,
Still trying to solve the problems porting libstdc++ in GCC 4.4.0 to Interix, I notice that when I compile the test code below with the flags -fopenmp -D_GLIBCXX_PARALLEL -O3 it works as expected, at a decent speed. But if I omit the -D_GLIBCXX_PARALLEL define, the weird behaviour with high CPU kernel times and slow execution occurs whenever more than one thread is requested in the program.
Are there any libstdc++ gurus out there who know what these symptoms might mean?
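One way to put numbers on the kernel/user split, separate from the quoted test program further down, would be to read the process CPU times around the string::clear() loop with POSIX getrusage(). What follows is only a minimal sketch under assumptions: that getrusage() with RUSAGE_SELF is available on Interix and reports a meaningful system time there (not verified), and that the names CpuTimes, cpu_times_now and time_clear_loop are hypothetical helpers invented for this illustration, not part of GCC or of the test program.

```cpp
#include <sys/time.h>
#include <sys/resource.h>
#include <string>

// Hypothetical helper type (not from the original post): CPU time
// consumed by the whole process, split into user and system (kernel)
// seconds.
struct CpuTimes {
    double user;
    double sys;
};

// Convert a timeval (seconds + microseconds) to seconds.
static double to_seconds(const timeval& tv) {
    return tv.tv_sec + tv.tv_usec / 1e6;
}

// Snapshot of the CPU time used so far by this process (all threads).
// Returns zeros if getrusage() fails.
static CpuTimes cpu_times_now() {
    rusage ru;
    CpuTimes t = {0.0, 0.0};
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        t.user = to_seconds(ru.ru_utime);
        t.sys = to_seconds(ru.ru_stime);
    }
    return t;
}

// Run the same kind of string::clear() loop as the test program and
// return the user/system CPU time consumed while it ran.
static CpuTimes time_clear_loop(long outer, int inner) {
    std::string pairlbl("");
    CpuTimes before = cpu_times_now();
    for (long m = 0; m < outer; ++m)
        for (int j = 1; j <= inner; ++j)
            pairlbl.clear();
    CpuTimes after = cpu_times_now();
    CpuTimes delta = {after.user - before.user, after.sys - before.sys};
    return delta;
}
```

Calling time_clear_loop(50000, 2000) after the OpenMP parallel region and printing the two fields would give the user and system seconds for the loop. Because RUSAGE_SELF covers every thread in the process, any CPU time burned by the otherwise idle OpenMP threads would be included in the totals, which may itself help narrow down where the kernel time comes from.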
Many thanks,
Rob
----- Original Message -----
From: "Robert Oeffner" <robert@oeffner.net>
To: <gcc-help@gcc.gnu.org>
Sent: Saturday, September 26, 2009 11:20 AM
Subject: libstdc++ and openmp problem with GCC4.4.0 port to interix
> Hi,
>
> Probably a long shot but I wonder if anyone would have a useful tip on a
> problem porting gcc4.4.0 to interix (a BSD-like OS running on top of the
> Windows kernel).
>
> As libgomp in GCC so far isn't targeting interix I have made some changes to
> libgomp in my copy of the GCC 4.4.0 distribution. A new source file was
> created, gcc-4.4.0/libgomp/config/posix/interix/proc.c, which is templated
> on the existing gcc-4.4.0/libgomp/config/posix/proc.c and
> gcc-4.4.0/libgomp/config/posix/mingw32/proc.c in the distribution (see
> http://www.oeffner.net/stuff/gcc-4.4.0_interix_changes.zip or
> http://www.suacommunity.com/forum/tm.aspx?m=16600 ). With this file and
> modifications to GCC configuration files in the distribution I can bootstrap
> GCC 4.4.0 to build gcc and g++ compilers on interix.
>
> The port produces fast code for single-threaded programs. However,
> there's a major problem with OpenMP. It has something to do with libstdc++,
> which tends to go into overdrive when you ask OpenMP to create more than
> one thread for the compiled program. Calling string::clear() from
> libstdc++ then somehow hogs the CPU with high kernel times and runs orders
> of magnitude slower. The code below demonstrates the problem. It runs fast
> when using just one thread but abysmally slowly when two or more threads
> are present, even though the loop doing the work is actually single
> threaded and the other threads remain idle.
> Windows Task Manager shows that execution time is roughly 50% kernel and
> 50% user time whenever you run more than one thread. Invoked with a single
> thread, execution time is spent entirely in user mode.
>
> As far as I know, locking and releasing data objects is done by the OS on
> behalf of a program's request, and it's done in kernel mode. Are there
> situations where libstdc++ may be confused about idle threads in a program
> and then make unnecessary requests for locking and releasing data objects?
>
> If there is anyone who has a suggestion on what causes these symptoms in my
> GCC port that would be greatly appreciated.
>
> Many thanks,
>
> Rob
>
>
> #include <ctime>
> #include <iostream>
> #include <string>
> #include <omp.h>
>
> using namespace std;
>
> const long lmax = 50000;
>
> int main()
> {
>   int nthreads = 1;
>   cout << "Enter number of OpenMP threads to create: ";
>   cin >> nthreads;
>   omp_set_num_threads(nthreads);
>
>   // Start the thread team; only one thread prints the team size.
>   #pragma omp parallel
>   {
>     #pragma omp single
>     cout << "Doing string stuff with " << omp_get_num_threads()
>          << " thread(s)" << endl;
>   }
>
>   time_t start, now;
>   time(&start);
>
>   string pairlbl("");
>
>   // Single-threaded work loop; the extra OpenMP threads stay idle.
>   for (long m = 0; m < lmax; m++)
>   {
>     if ((m % (lmax / 20)) == 0)
>       cout << "m = " << m << endl;
>
>     for (int j = 1; j <= 2000; j++)
>     {
>       pairlbl.clear();
>     }
>   }
>
>   time(&now);
>   cout << "\ntime= " << difftime(now, start) << " sec\n";
>
>   return 0;
> }