This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug other/60828] New: Compile time speedups when using tcmalloc


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60828

            Bug ID: 60828
           Summary: Compile time speedups when using tcmalloc
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: trippels at gcc dot gnu.org

There are noticeable compile-time speedups when gcc is linked with
tcmalloc. The effect shows up mostly for C++ programs; plain C projects
show little difference.
Here are the compile times for Firefox on my 4-core machine:

Firefox -O3:
glibc malloc:
 2806.82s user 126.92s system 349% cpu 13:58.37 total    0% speedup
tcmalloc:
 2707.31s user 129.93s system 358% cpu 13:10.61 total  5.7% speedup
jemalloc:
 2708.30s user 175.53s system 354% cpu 13:34.29 total  2.9% speedup

Firefox -flto=4 -O3: 
glibc malloc:
 3241.66s user 155.71s system 316% cpu 17:54.13 total    0% speedup
tcmalloc:
 3140.43s user 164.22s system 323% cpu 17:01.13 total  4.9% speedup
jemalloc:
 3155.74s user 226.63s system 320% cpu 17:35.51 total  1.7% speedup

A simpler example is tramp3d-v4:
glibc malloc:
 % time g++ -w -O3 -march=native tramp3d-v4.cpp
 22.30s user 0.34s system 97% cpu 23.301 total
tcmalloc:
 ~ % time g++ -w -O3 -march=native tramp3d-v4.cpp
 21.36s user 0.30s system 99% cpu 21.659 total    (~7% speedup)
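
The speedup percentages quoted above appear to be derived from the
wall-clock ("total") times, with glibc malloc as the baseline. A minimal
sketch of that arithmetic follows; the helper names `wall_seconds` and
`speedup_percent` are hypothetical and only illustrate the calculation:

```python
def wall_seconds(total: str) -> float:
    """Convert a `time` total such as '13:58.37' or '23.301' to seconds."""
    seconds = 0.0
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def speedup_percent(baseline: str, candidate: str) -> float:
    """Relative reduction in wall-clock time versus the baseline, in percent."""
    base = wall_seconds(baseline)
    return (base - wall_seconds(candidate)) / base * 100.0


# Wall-clock totals copied from the measurements in this report.
runs = {
    "Firefox -O3": ("13:58.37", {"tcmalloc": "13:10.61", "jemalloc": "13:34.29"}),
    "Firefox -flto=4 -O3": ("17:54.13", {"tcmalloc": "17:01.13", "jemalloc": "17:35.51"}),
    "tramp3d-v4": ("23.301", {"tcmalloc": "21.659"}),
}

for name, (glibc_total, others) in runs.items():
    for allocator, total in others.items():
        print(f"{name}: {allocator} {speedup_percent(glibc_total, total):.1f}% speedup")
```

Note that the gain in wall-clock time is larger than the gain in user time
alone (for Firefox -O3, 2806.82s vs. 2707.31s user is roughly 3.5%).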

tcmalloc's built-in heap profiler shows the following (number of allocated
megabytes; this includes space that has since been deallocated):

markus@x4 ~ % pprof --alloc_space --text /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1 /tmp/mybin.hprof_4474.0010.heap
Using local file /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1.
Using local file /tmp/mybin.hprof_4474.0010.heap.
Total: 34.3 MB
     7.7  22.6%  22.6%      7.8  22.6% c_common_nodes_and_builtins [clone .cold.171]
     5.7  16.7%  39.3%      5.7  16.7% tree_ssa_lim
     4.3  12.5%  51.8%     10.8  31.5% cpp_classify_number
     3.8  11.1%  62.9%      5.2  15.1% do_endif [clone .lto_priv.2364]
     2.6   7.5%  70.4%      2.6   7.5% _cpp_pop_context
     2.6   7.5%  77.8%      2.6   7.5% cgraph_add_node_removal_hook
     2.2   6.5%  84.3%      2.2   6.5% __gmp_default_allocate
     1.7   5.1%  89.4%      1.7   5.1% rtx_moveable_p [clone .isra.7] [clone .lto_priv.5842]
     1.5   4.2%  93.6%      1.7   5.1% add_exit_phis [clone .lto_priv.5880]
     0.7   2.1%  95.7%      0.7   2.1% ix86_target_macros_internal [clone .lto_priv.7319]
     0.3   0.9%  96.6%      0.3   0.9% init_alias_vars [clone .lto_priv.9038]
     0.3   0.8%  97.4%      0.3   0.8% gimple_fold_builtin
...

And the total number of allocated objects (including those since deallocated):
Total: 619253 objects
  290259  46.9%  46.9%   290259  46.9% __gmp_default_allocate
   89866  14.5%  61.4%    89866  14.5% rtx_moveable_p [clone .isra.7] [clone .lto_priv.5842]
   74190  12.0%  73.4%   107769  17.4% cpp_classify_number
   66198  10.7%  84.1%    66243  10.7% do_endif [clone .lto_priv.2364]
   44778   7.2%  91.3%    44778   7.2% _cpp_pop_context
   20931   3.4%  94.7%    20939   3.4% simplify_plus_minus [clone .lto_priv.5851]
    8642   1.4%  96.1%    11749   1.9% expand_asm_operands [clone .lto_priv.6838]
    5665   0.9%  97.0%     5801   0.9% c_common_nodes_and_builtins [clone .cold.171]
    4659   0.8%  97.7%     4659   0.8% merge_classes [clone .part.41] [clone .lto_priv.3432]
    3659   0.6%  98.3%     3773   0.6% init_alias_vars [clone .lto_priv.9038]
    2541   0.4%  98.7%     2541   0.4% tree_ssa_lim

