gcc 4.9.0 and cilkplus high kernel cpu usage?

Stefan Ruppert sr@myarm.com
Mon Apr 28 09:03:00 GMT 2014


Hi,

In the last few days I wanted to test the cilkplus feature of gcc 4.9. 
The standard fibonacci example works fine here. But my program has a 
high kernel cpu usage and is slower than the non-cilkplus 
(single-threaded) version.

My program calculates the minimum distance route between the cities 
passed on the command line (all in Germany). It builds up a complete 
tree where the root is the start of the route and each leaf is a 
possible end of the route.
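
For readers who want a concrete picture of such a tree, here is a minimal 
single-threaded sketch in plain C++. It is not the actual myroute source: 
the names SearchResult, explore and shortest_route, and the use of a 
distance matrix, are assumptions made for illustration only. With the 
start city fixed, ten cities give 9! = 362880 leaves, which matches the 
"leaves" count printed in the runs in this mail.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical result type: leaves visited and shortest total distance found.
struct SearchResult {
    std::size_t leaves = 0;
    double best = std::numeric_limits<double>::infinity();
};

// Depth-first walk over the complete route tree. order[0..depth-1] is the
// route so far; every city not yet placed becomes a child of this node.
void explore(std::vector<int>& order, std::size_t depth, double length,
             const std::vector<std::vector<double>>& dist, SearchResult& out) {
    if (depth == order.size()) {            // leaf: every city is placed
        ++out.leaves;
        if (length < out.best) out.best = length;
        return;
    }
    for (std::size_t i = depth; i < order.size(); ++i) {
        std::swap(order[depth], order[i]);  // choose the next city
        explore(order, depth + 1,
                length + dist[order[depth - 1]][order[depth]], dist, out);
        std::swap(order[depth], order[i]);  // restore for the next sibling
    }
}

// City 0 is the root of the tree, i.e. the fixed start of the route.
SearchResult shortest_route(const std::vector<std::vector<double>>& dist) {
    std::vector<int> order(dist.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = static_cast<int>(i);
    SearchResult out;
    if (!order.empty()) explore(order, 1, 0.0, dist, out);
    return out;
}
```

In a sketch like this, a parallel version would typically hand each 
first-level subtree to a separate task and combine the per-task results 
at the end. Whether myroute does it that way is not stated here.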

A route with 10 cities needs about 2.1 seconds in the non-cilk version. 
The cilk version, which spawns 4 cilk tasks, needs about 2.5 seconds:

Non-cilk (single-threaded) version:
$ time ./myroute  65830 60306 55130 Sörgenloch 25849 65439 52388 Berlin 
München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus- Kreis 
--[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> 
Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München 
--[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> 
Westertilli--| total distance is 1303.77km.

real	0m2.118s
user	0m2.040s
sys	0m0.032s


Cilk version with 4 worker threads:
$ time ./myroute -c 65830 60306 55130 Sörgenloch 25849 65439 52388 
Berlin München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus- Kreis 
--[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> 
Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München 
--[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> 
Westertilli--| total distance is 1303.77km.

real	0m2.564s
user	0m3.972s
sys	0m4.468s

I also found that when I set the number of workers to 2, I get a 
slightly faster response time than the non-cilk version:

Cilk version with 2 worker threads:
$ time ./myroute -c 65830 60306 55130 Sörgenloch 25849 65439 52388 
Berlin München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus- Kreis 
--[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> 
Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München 
--[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> 
Westertilli--| total distance is 1303.77km.

real	0m2.045s
user	0m2.452s
sys	0m0.988s

Any idea why the kernel cpu usage is so high?

Regards,
Stefan

PS: Here is my config:
I built gcc 4.9 from source with the following options:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/devel/build/gcc-4.9.0/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.9.0/configure 
--prefix=/opt/devel/build/gcc-4.9.0 --with-system-zlib 
--with-gmp=/opt/devel/build/gcc-4.9.0 
--with-mpfr=/opt/devel/build/gcc-4.9.0 
--with-cloog=/opt/devel/build/gcc-4.9.0 
--with-mpc=/opt/devel/build/gcc-4.9.0 --with-tune=generic 
--enable-languages=c,c++ --enable-multilib --with-multilib-list=m32,m64
Thread model: posix
gcc version 4.9.0 (GCC)

$ uname -a
Linux myarm 3.5.0-34-generic #55-Ubuntu SMP Thu Jun 6 20:18:19 UTC 2013 
x86_64 x86_64 x86_64 GNU/Linux


