Bug 100299 - [11 Regression] cc1plus taking all RAM in EVRP
Summary: [11 Regression] cc1plus taking all RAM in EVRP
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 11.1.0
Importance: P2 normal
Target Milestone: 11.2
Assignee: Andrew Macleod
URL:
Keywords: compile-time-hog, memory-hog
Depends on:
Blocks:
 
Reported: 2021-04-28 00:29 UTC by Vincent
Modified: 2021-07-14 22:17 UTC
CC List: 7 users

See Also:
Host:
Target:
Build:
Known to work: 12.0
Known to fail: 11.1.0
Last reconfirmed: 2021-04-28 00:00:00


Attachments
ii file to reproduce the issue. gunzip first. (311.12 KB, application/gzip)
2021-04-28 00:29 UTC, Vincent

Description Vincent 2021-04-28 00:29:06 UTC
Created attachment 50692
ii file to reproduce the issue. gunzip first.

While compiling a relatively large file (the .ii file is ~2 MB), g++ at -O3 aborts after allocating all the RAM (128 GB!) in a few steps within a few seconds.

It compiles just fine with -O2 (which takes 5 times longer than -O3 takes to abort).

To my utter surprise, it compiles with -O2 -fgcse-after-reload -fipa-cp-clone -floop-interchange -floop-unroll-and-jam -fpeel-loops -fpredictive-commoning -fsplit-loops -fsplit-paths -ftree-loop-distribution -ftree-loop-vectorize -ftree-partial-pre -ftree-slp-vectorize -funswitch-loops -fvect-cost-model -fvect-cost-model=dynamic -fversion-loops-for-strides

However, the 11.1 man page states that these options are equivalent to -O3. Excerpt from the man page:

-O3 Optimize yet more.  -O3 turns on all optimizations specified by -O2 and also turns on the following optimization flags:

           -fgcse-after-reload -fipa-cp-clone -floop-interchange -floop-unroll-and-jam -fpeel-loops -fpredictive-commoning -fsplit-loops
           -fsplit-paths -ftree-loop-distribution -ftree-loop-vectorize -ftree-partial-pre -ftree-slp-vectorize -funswitch-loops
           -fvect-cost-model -fvect-cost-model=dynamic -fversion-loops-for-strides

The man page must be incorrect; -O3 must enable some other option as well.

I am on x86_64-linux-gnu (Ubuntu 20.04) - but I am fairly sure it is not platform-dependent.

gcc is configured using 

./configure -v --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --prefix=/usr/local/gcc-11.1 --enable-checking=release --enable-languages=c,c++ --disable-multilib --program-suffix=-11.1

The complete command line that triggers the bug:

g++-11.1 -std=c++20 -O3 -c test.ii

The error:

g++-11.1: fatal error: Killed signal terminated program cc1plus
compilation terminated.

gzipped test.ii attached to this bug report.

Previous versions of gcc do not exhibit the bug, but they do compile very slowly compared to the -O0 option, or compared to clang.
Comment 1 Richard Biener 2021-04-28 06:20:33 UTC
Confirmed.  Ranger is the culprit (pressed ctrl-C at ~10GB):

#0  0x00007ffff68c61f7 in __memset_avx2_unaligned_erms () from /lib64/libc.so.6
#1  0x00000000015bbba6 in ssa_block_ranges::ssa_block_ranges (this=0x6f8d9d0, 
    t=0x7ffff6593c78, allocator=<optimized out>)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range-cache.cc:147
#2  0x00000000015bc1cd in block_range_cache::get_block_ranges (
    this=this@entry=0x258b8d0, name=name@entry=0x7fffd9602dc8)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range-cache.cc:262
#3  0x00000000015bc209 in block_range_cache::set_bb_range (
    this=this@entry=0x258b8d0, name=name@entry=0x7fffd9602dc8, 
    bb=bb@entry=0x7fffd4527b60, r=...)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range-cache.cc:286
#4  0x00000000015bcb10 in ranger_cache::fill_block_cache (
    this=this@entry=0x258b780, name=name@entry=0x7fffd9602dc8, 
    bb=bb@entry=0x7fffd4527b60, def_bb=0x7fffd45279c0)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range-cache.cc:1023
#5  0x00000000015bd148 in ranger_cache::block_range (
    this=this@entry=0x258b780, r=..., bb=bb@entry=0x7fffd4527b60, 
    name=name@entry=0x7fffd9602dc8, calc=calc@entry=true)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range-cache.cc:842
#6  0x00000000015b5033 in gimple_ranger::range_on_entry (this=0x258b770, 
    r=..., bb=0x7fffd4527b60, name=0x7fffd9602dc8)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range.cc:992
#7  0x00000000015b57a9 in gimple_ranger::range_of_expr (this=0x258b770, r=..., 
    expr=0x7fffd9602dc8, stmt=<optimized out>)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-range.cc:963
#8  0x0000000000f5f7e2 in range_query::value_of_expr (this=0x258b770, 
    name=0x7fffd9602dc8, stmt=<optimized out>)
    at /home/rguenther/src/gcc-11-branch/gcc/value-query.cc:86
#9  0x00000000015c3ed2 in hybrid_folder::value_of_expr (this=0x7fffffffd8f0, 
    op=0x7fffd9602dc8, stmt=0x7ffff327b428)
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-ssa-evrp.c:235
#10 0x0000000000e3db0c in substitute_and_fold_engine::replace_uses_in (
    this=0x7fffffffd8f0, stmt=stmt@entry=0x7ffff327b428)
    at /home/rguenther/src/gcc-11-branch/gcc/tree-ssa-propagate.c:871
#11 0x0000000000e3de15 in substitute_and_fold_dom_walker::before_dom_children (
    this=0x7fffffffd880, bb=0x7fffd4527b60)
    at /home/rguenther/src/gcc-11-branch/gcc/tree-ssa-propagate.c:1141
#12 0x0000000001593db8 in dom_walker::walk (this=0x7fffffffd880, 
    bb=0x7fffd4527b60) at /home/rguenther/src/gcc-11-branch/gcc/domwalk.c:309
#13 0x0000000000e3d336 in substitute_and_fold_engine::substitute_and_fold (
    this=this@entry=0x7fffffffd8f0, block=block@entry=0x0)
    at /home/rguenther/src/gcc-11-branch/gcc/tree-ssa-propagate.c:1283
#14 0x00000000015c3b47 in execute_early_vrp ()
    at /home/rguenther/src/gcc-11-branch/gcc/gimple-ssa-evrp.c:349
#15 0x0000000000c2566d in execute_one_pass (pass=0x2465d20)
    at /home/rguenther/src/gcc-11-branch/gcc/passes.c:2567

finishing frame #12 results in OOM.
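
The innermost frames are in the ssa_block_ranges constructor, which zero-fills a per-name table. As a rough back-of-envelope sketch of why that can exhaust memory, assume (this is not confirmed in the report) one pointer-sized slot per basic block for every SSA name that gets a cache entry. The function name and the counts below are hypothetical, not measured from the attached test case:

```cpp
#include <cstdint>

// Hypothetical estimate: one slot per (SSA name, basic block) pair,
// allocated and zeroed up front whether or not the slot is ever used.
constexpr std::uint64_t dense_cache_bytes(std::uint64_t names_cached,
                                          std::uint64_t basic_blocks,
                                          std::uint64_t slot_bytes = 8) {
    return names_cached * basic_blocks * slot_bytes;
}

// Illustrative numbers only: 100,000 names over 100,000 blocks is already
// 80 GB of mostly empty slots.
static_assert(dense_cache_bytes(100000, 100000) == 80000000000ULL,
              "quadratic growth in function size");
```

The point of the sketch is the quadratic growth: doubling the size of a function roughly doubles both factors.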
Comment 2 Jason Merrill 2021-05-10 16:46:56 UTC
Changing component, then.
Comment 3 Andrew Macleod 2021-06-07 22:13:21 UTC
typo in the changelog.
fixed.

commit 9858cd1a6827ee7a928318acb5e86389f79b4012 (HEAD -> master, origin/master, origin/HEAD)
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Mon Jun 7 13:18:55 2021 -0400

    Implement a sparse bitmap representation for Rangers on-entry cache.
    
    Use a sparse representation for the on entry cache, and utilize it when
    the number of basic blocks in the function exceeds param_evrp_sparse_threshold.
    
            PR tree-optimization/PR100299
            * gimple-range-cache.cc (class sbr_sparse_bitmap): New.
            (sbr_sparse_bitmap::sbr_sparse_bitmap): New.
            (sbr_sparse_bitmap::bitmap_set_quad): New.
            (sbr_sparse_bitmap::bitmap_get_quad): New.
            (sbr_sparse_bitmap::set_bb_range): New.
            (sbr_sparse_bitmap::get_bb_range): New.
            (sbr_sparse_bitmap::bb_range_p): New.
            (block_range_cache::block_range_cache): initialize bitmap obstack.
            (block_range_cache::~block_range_cache): Destruct obstack.
            (block_range_cache::set_bb_range): Decide when to utilze the
            sparse on entry cache.
            * gimple-range-cache.h (block_range_cache): Add bitmap obstack.
            * params.opt (-param=evrp-sparse-threshold): New.
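
A minimal sketch of the idea in the commit message above: pick a sparse on-entry cache when the number of basic blocks exceeds a threshold, and a dense one otherwise. All type and function names here (OnEntryCache, DenseCache, SparseCache, make_cache) are hypothetical stand-ins, not the classes in gimple-range-cache.cc, and a hash map stands in for GCC's sparse bitmap:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a cached range; 0 means "nothing recorded".
using RangeId = int;

struct OnEntryCache {
    virtual ~OnEntryCache() = default;
    virtual void set(std::size_t bb, RangeId r) = 0;
    virtual RangeId get(std::size_t bb) const = 0;
};

// Dense: one slot per basic block, allocated and zeroed up front.
struct DenseCache : OnEntryCache {
    explicit DenseCache(std::size_t num_bbs) : slots(num_bbs, 0) {}
    void set(std::size_t bb, RangeId r) override {
        assert(bb < slots.size());
        slots[bb] = r;
    }
    RangeId get(std::size_t bb) const override {
        assert(bb < slots.size());
        return slots[bb];
    }
    std::vector<RangeId> slots;
};

// Sparse: storage exists only for blocks that were actually set.
struct SparseCache : OnEntryCache {
    void set(std::size_t bb, RangeId r) override { slots[bb] = r; }
    RangeId get(std::size_t bb) const override {
        auto it = slots.find(bb);
        return it == slots.end() ? 0 : it->second;
    }
    std::unordered_map<std::size_t, RangeId> slots;
};

// Use the sparse form only when the block count exceeds the threshold.
std::unique_ptr<OnEntryCache> make_cache(std::size_t num_bbs,
                                         std::size_t sparse_threshold) {
    if (num_bbs > sparse_threshold)
        return std::make_unique<SparseCache>();
    return std::make_unique<DenseCache>(num_bbs);
}
```

With this shape, a function with a million blocks pays only for the entries it touches, while small functions keep the cheaper direct indexing.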

commit 5ad089a3c946aec655436fa3b0b50d6574b78197
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Mon Jun 7 13:12:01 2021 -0400

    Implement multi-bit aligned accessors for sparse bitmap.
    
    Provide set/get routines to allow sparse bitmaps to be treated as an array
    of multiple bit values. Only chunk sizes that are powers of 2 are supported.
    
            * bitmap.c (bitmap_set_aligned_chunk): New.
            (bitmap_get_aligned_chunk): New.
            (test_aligned_chunk): New.
            (bitmap_c_tests): Call test_aligned_chunk.
            * bitmap.h (bitmap_set_aligned_chunk, bitmap_get_aligned_chunk): New
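
As an illustration of treating a sparse bitmap as an array of multi-bit values, here is a minimal sketch. It does not use GCC's bitmap type or the signatures of bitmap_set_aligned_chunk / bitmap_get_aligned_chunk; SparseBits, set_chunk and get_chunk are hypothetical names, and the bitmap is modelled as a hash map of 64-bit words. Because the chunk size is a power of 2 no larger than the word size, a chunk never straddles two words:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical sparse bitmap: only 64-bit words with a nonzero bit are stored.
struct SparseBits {
    std::unordered_map<std::uint64_t, std::uint64_t> words;
};

// Chunk sizes must be powers of 2 and fit in one word.
inline bool valid_chunk_size(unsigned bits) {
    return bits != 0 && bits <= 64 && (bits & (bits - 1)) == 0;
}

inline std::uint64_t chunk_mask(unsigned bits) {
    return bits == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Store `value` as the chunk-th element of an array of `bits`-wide fields.
void set_chunk(SparseBits &b, std::uint64_t chunk, unsigned bits,
               std::uint64_t value) {
    assert(valid_chunk_size(bits));
    assert((value & ~chunk_mask(bits)) == 0);
    const std::uint64_t bit = chunk * bits;
    const std::uint64_t word = bit / 64;
    const unsigned shift = static_cast<unsigned>(bit % 64);
    std::uint64_t w = 0;
    auto it = b.words.find(word);
    if (it != b.words.end())
        w = it->second;
    w = (w & ~(chunk_mask(bits) << shift)) | (value << shift);
    if (w == 0)
        b.words.erase(word);  // keep the representation sparse
    else
        b.words[word] = w;
}

// Read the chunk-th field; fields in absent words read as 0.
std::uint64_t get_chunk(const SparseBits &b, std::uint64_t chunk,
                        unsigned bits) {
    assert(valid_chunk_size(bits));
    const std::uint64_t bit = chunk * bits;
    auto it = b.words.find(bit / 64);
    if (it == b.words.end())
        return 0;
    return (it->second >> (bit % 64)) & chunk_mask(bits);
}
```

For example, set_chunk(b, 1000000, 4, 0xA) stores a 4-bit value at index 1000000 while allocating a single word.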
Comment 4 Richard Biener 2021-06-08 08:15:54 UTC
On trunk.  Keeping open for eventual backporting.
Comment 5 GCC Commits 2021-07-14 21:58:28 UTC
The releases/gcc-11 branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:52f0aa4dee8401ef3958dbf789780b0ee877beab

commit r11-8746-g52f0aa4dee8401ef3958dbf789780b0ee877beab
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Mon Jun 7 13:18:55 2021 -0400

    Implement a sparse bitmap representation for Rangers on-entry cache.
    
    Use a sparse representation for the on entry cache, and utilize it when
    the number of basic blocks in the function exceeds param_evrp_sparse_threshold.
    
            PR tree-optimization/100299
            * gimple-range-cache.cc (class sbr_sparse_bitmap): New.
            (sbr_sparse_bitmap::sbr_sparse_bitmap): New.
            (sbr_sparse_bitmap::bitmap_set_quad): New.
            (sbr_sparse_bitmap::bitmap_get_quad): New.
            (sbr_sparse_bitmap::set_bb_range): New.
            (sbr_sparse_bitmap::get_bb_range): New.
            (sbr_sparse_bitmap::bb_range_p): New.
            (block_range_cache::block_range_cache): initialize bitmap obstack.
            (block_range_cache::~block_range_cache): Destruct obstack.
            (block_range_cache::set_bb_range): Decide when to utilze the
            sparse on entry cache.
            * gimple-range-cache.h (block_range_cache): Add bitmap obstack.
            * params.opt (-param=evrp-sparse-threshold): New.
    
    (cherry picked from commit 9858cd1a6827ee7a928318acb5e86389f79b4012)
Comment 6 Andrew Macleod 2021-07-14 22:17:41 UTC
fixed.