Bug 93532 - RISCV g++ hangs with optimization >= -O2
Summary: RISCV g++ hangs with optimization >= -O2
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jim Wilson
Keywords: compile-time-hog
Depends on:
Reported: 2020-02-01 00:23 UTC by Giulio Benetti
Modified: 2020-02-08 22:09 UTC

See Also:
Target: riscv*-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2020-02-01 00:00:00

.ii of file where gcc hangs on building (49.85 KB, text/plain)
2020-02-03 16:24 UTC, Giulio Benetti
testcase that reproduces for me (45.57 KB, text/plain)
2020-02-04 05:38 UTC, Jim Wilson
untested patch to fix the problem (483 bytes, patch)
2020-02-06 19:23 UTC, Jim Wilson

Description Giulio Benetti 2020-02-01 00:23:16 UTC
Buildroot hangs while building the package "bullet" when compiling with g++ at optimization level -O2 or higher. Building with -O1 works around the problem. Building with or without debug symbols makes no difference.

Here is the build hang log:
Comment 1 Giulio Benetti 2020-02-01 00:26:00 UTC
The same behaviour is observed when building lmbench.
Here is the log:
Comment 2 Giulio Benetti 2020-02-01 00:28:21 UTC
Sorry for the noise. This also happens when compiling C files. :-)
Comment 3 Andrew Pinski 2020-02-01 00:30:40 UTC
Can you read https://gcc.gnu.org/bugs/ and provide the needed information?
Comment 4 Giulio Benetti 2020-02-03 16:24:58 UTC
Created attachment 47769
.ii of file where gcc hangs on building

This is the preprocessed source (.ii) of the file on which gcc hangs while building.
Comment 5 Giulio Benetti 2020-02-03 16:27:34 UTC
Here is the exact command line used to compile the .cpp file:
/home/giuliobenetti/br_reproduce/9a405ec6fabfa306c14a671a5e09359ac623c25b/output/host/bin/riscv32-linux-g++ --sysroot=/home/giuliobenetti/br_reproduce/9a405ec6fabfa306c14a671a5e09359ac623c25b/output/host/riscv32-buildroot-linux-gnu/sysroot  -DBT_USE_EGL -DBulletCollision_EXPORTS -DNO_OPENGL3 -DUSE_GRAPHICAL_BENCHMARK -I/home/giuliobenetti/br_reproduce/9a405ec6fabfa306c14a671a5e09359ac623c25b/output/build/bullet-2.89/src  -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O2   -save-temps  -DNDEBUG -fPIC   -o CMakeFiles/BulletCollision.dir/NarrowPhaseCollision/btPolyhedralContactClipping.o -c /home/giuliobenetti/br_reproduce/9a405ec6fabfa306c14a671a5e09359ac623c25b/output/build/bullet-2.89/src/BulletCollision/NarrowPhaseCollision/btPolyhedralContactClipping.cpp

Then it sits there forever.
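For anyone trying to reproduce this, a hang like the one reported above can be told apart from a merely slow compile by running the command under a time limit. The following is a minimal sketch, assuming Python 3 on a POSIX build host; the helper name `run_with_limit` and any time limit you pick are illustrative and not part of the original report.

```python
import os
import signal
import subprocess


def run_with_limit(cmd, limit_s):
    """Run cmd (a list of arguments).

    Returns the exit code, or None if the command was still running after
    limit_s seconds and had to be killed.
    """
    # start_new_session=True makes the child a process-group leader, so the
    # compiler driver and any cc1/cc1plus it spawns can be killed together.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=limit_s)
    except subprocess.TimeoutExpired:
        # The group id equals the child's pid because it leads a new session.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return None
```

Passing the full g++ command line from this comment as the `cmd` list, with a generous limit, would return None on an affected toolchain and the normal exit code otherwise.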
Comment 6 Giulio Benetti 2020-02-03 16:30:32 UTC
This is how the riscv32 gcc was configured:
Using built-in specs.
Target: riscv32-buildroot-linux-gnu
Configured with: ./configure --prefix=/opt/br-riscv32-glibc-2019.11 --sysconfdir=/opt/br-riscv32-glibc-2019.11/etc --enable-static --target=riscv32-buildroot-linux-gnu --with-sysroot=/opt/br-riscv32-glibc-2019.11/riscv32-buildroot-linux-gnu/sysroot --enable-__cxa_atexit --with-gnu-ld --disable-libssp --disable-multilib --disable-decimal-float --with-gmp=/opt/br-riscv32-glibc-2019.11 --with-mpc=/opt/br-riscv32-glibc-2019.11 --with-mpfr=/opt/br-riscv32-glibc-2019.11 --with-pkgversion='Buildroot 2019.11' --with-bugurl=http://bugs.buildroot.net/ --disable-libquadmath --enable-tls --enable-threads --without-isl --without-cloog --with-arch=rv32imafd --with-abi=ilp32d --enable-languages=c,c++ --with-build-time-tools=/opt/br-riscv32-glibc-2019.11/riscv32-buildroot-linux-gnu/bin --enable-shared --disable-libgomp
Thread model: posix
gcc version 8.3.0 (Buildroot 2019.11)
Comment 7 Jakub Jelinek 2020-02-03 17:08:20 UTC
Can't reproduce with either 8.3.1 20200111 or current trunk; it compiles pretty much instantly (cross-compiler from x86_64-linux to riscv32-linux).
Comment 8 Giulio Benetti 2020-02-03 17:14:53 UTC
Would you mind using the official Buildroot script to reproduce?
Here is the procedure:

# git clone git://git.busybox.net/buildroot
# wget https://git.busybox.net/buildroot-test/tree/utils/br-reproduce-build

- set BASE_GIT=... to your buildroot path in br-reproduce-build, then:
# chmod a+x br-reproduce-build
# ./br-reproduce-build 9a405ec6fabfa306c14a671a5e09359ac623c25b

and wait until it hangs. Otherwise I think it will be difficult to reproduce.
Is that OK for you?
Comment 9 Jim Wilson 2020-02-04 05:37:52 UTC
I tried the buildroot instructions.  They didn't work on an Ubuntu 16.04 server machine: there is a 'python3 pip3 -q docwriter' command that hangs.  I also discovered that the script isn't restartable; it runs rm -rf on the build directory and exits with an error.  I did get it to work on my Ubuntu 18.04 laptop, and it does hang, but for me it is the btBoxBoxDetector.cpp file that hangs, not btPolyhedralContactClipping.cpp.  I was able to reproduce this with a gcc-8.3.0 build using the -O2 -fPIC -fstack-protector-strong options to compile the file.  It does not reproduce using the top of the gcc-8-branch svn tree, suggesting that either it is already fixed or it is a memory corruption problem that is hard to reproduce.

Using gdb to attach to the gcc-8.3.0 compiler, I see that it is looping in lra, but I haven't tried to debug that yet.

#0  0x0000000000705e7b in bitmap_find_bit (bit=42321, bit@entry=330, 
    head=0x376ae88) at ../../gcc-8.3.0/gcc/bitmap.c:539
#1  bitmap_set_bit (head=0x376ae88, bit=bit@entry=42321)
    at ../../gcc-8.3.0/gcc/bitmap.c:600
#2  0x000000000099b95f in mark_regno_dead (regno=42321, mode=<optimized out>, 
    point=<optimized out>) at ../../gcc-8.3.0/gcc/lra-lives.c:362
#3  0x000000000099c9c4 in process_bb_lives (dead_insn_p=false, 
    curr_point=@0x7ffc9a90cccc: 181876, bb=<basic_block 0x7f8e439c50d0 (38)>)
    at ../../gcc-8.3.0/gcc/lra-lives.c:842
#4  lra_create_live_ranges_1 (all_p=all_p@entry=true, 
    at ../../gcc-8.3.0/gcc/lra-lives.c:1337
#5  0x000000000099e7c0 in lra_create_live_ranges (all_p=all_p@entry=true, 
    at ../../gcc-8.3.0/gcc/lra-lives.c:1406
#6  0x0000000000982d0c in lra (f=<optimized out>)
    at ../../gcc-8.3.0/gcc/lra.c:2473
#7  0x000000000093fa32 in do_reload () at ../../gcc-8.3.0/gcc/ira.c:5465
#8  (anonymous namespace)::pass_reload::execute (this=<optimized out>)
    at ../../gcc-8.3.0/gcc/ira.c:5649
Comment 10 Jim Wilson 2020-02-04 05:38:56 UTC
Created attachment 47774
testcase that reproduces for me

compile with -O2 -fPIC -fstack-protector-strong
Comment 11 Jim Wilson 2020-02-04 18:12:52 UTC
I'm able to reproduce with the gcc-8-branch now.  Maybe I made a mistake with my earlier build.  Anyway, it looks like it goes wrong here in the reload dump:

      Creating newreg=1856, assigning class NO_REGS to save r1856
  434: fa0:SF=call [`sqrtf'] argc:0
      REG_UNUSED fa0:SF
      REG_CALL_DECL `sqrtf'
    Add reg<-save after:
 2446: r114:SF#0=r1856:DF

    Add save<-reg after:
 2445: r1856:DF=r114:SF#0


then later we appear to end up in a loop generating secondary reloads that need secondary reloads themselves, and so forth.  The instruction above looks funny, trying to use a subreg to convert DFmode to SFmode.  I don't think we should be generating that.

So it looks like a caller save problem.  If I add -fno-caller-saves the compile finishes.  It appears that we need a definition for HARD_REGNO_CALLER_SAVE_MODE because the default definition can't work here.  The comment in sparc.h for HARD_REGNO_CALLER_SAVE_MODE looks relevant.  The same definition may work for RISC-V.  Looks like the MIPS port does it the same way too.
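To make the diagnosis above concrete, here is a toy model of the mode choice, written in Python rather than as GCC code. It assumes the default picks the widest mode a floating-point hard register can hold (DFmode on rv32imafd), which is what the `r1856:DF` save of the SFmode value r114 in the dump suggests, and that an override in the spirit of the sparc.h definition returns the pseudo's own mode when it is known. Every name and size below is an illustrative stand-in, not a GCC identifier.

```python
# Sizes in bytes of the two floating-point modes in this toy model.
MODE_SIZE = {"SF": 4, "DF": 8}


def widest_mode_for_fpr():
    """Stand-in for the assumed default: the widest mode one FPR can hold."""
    return max(MODE_SIZE, key=MODE_SIZE.get)


def default_save_mode(pseudo_mode):
    # Ignores the pseudo's mode, so an SF value gets saved as DF.
    return widest_mode_for_fpr()


def overridden_save_mode(pseudo_mode):
    # Use the pseudo's own mode when known; otherwise fall back to the default.
    return pseudo_mode if pseudo_mode is not None else widest_mode_for_fpr()


def is_paradoxical(pseudo_mode, save_mode):
    """True when the save is wider than the value, so accessing the value
    in the save mode needs a subreg wider than the register it wraps."""
    return MODE_SIZE[save_mode] > MODE_SIZE[pseudo_mode]
```

In this model, `default_save_mode("SF")` yields "DF" and the resulting save is paradoxical, while `overridden_save_mode("SF")` yields "SF" and is not.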
Comment 12 Jim Wilson 2020-02-06 19:20:50 UTC
A bisection on mainline between the gcc-8 and gcc-9 releases shows that this testcase was fixed by a combine patch for PR87600 that stops combining hard regs with pseudos to reduce register pressure.  The commentary refers to ira and lra problems.  A combine patch won't be as safe as a RISC-V backend patch though.

I tried testing the riscv HARD_REGNO_CALLER_SAVE_MODE patch with buildroot but it turns out that it is downloading a pre-built compiler instead of building one.  So dropping in the patch doesn't do anything.  I will have to figure out what is going on there.

Trying the riscv patch with mainline on the testcase, I see that I get better rematerialization without the confusing subregs, and I also get smaller stack frames since we now save SFmode to the stack instead of DFmode.  Otherwise, I don't see any significant changes to the code.

I tried a make check with the riscv patch on mainline, and got an unexpected g++ testsuite failure, so I will have to look into that.
Comment 13 Jim Wilson 2020-02-06 19:23:06 UTC
Created attachment 47794
untested patch to fix the problem
Comment 14 Giulio Benetti 2020-02-08 13:47:27 UTC
Hi Jim,

Thanks for providing this patch; it fixes the problem.
Comment 15 Giulio Benetti 2020-02-08 16:06:33 UTC
I mark this bug as resolved by:
https://gcc.gnu.org/bugzilla/attachment.cgi?id=47794
Comment 16 Andrew Pinski 2020-02-08 16:11:31 UTC
(In reply to Giulio Benetti from comment #15)
> I mark this bug as resolved by:
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47794

The patch has not been applied to the sources yet.
Comment 17 Giulio Benetti 2020-02-08 16:19:32 UTC
(In reply to Andrew Pinski from comment #16)
> (In reply to Giulio Benetti from comment #15)
> > I mark this bug as resolved by:
> > https://gcc.gnu.org/bugzilla/attachment.cgi?id=47794
> The patch has not been applied to the sources yet.

Oops, sorry, I'm not very used to bugzilla/gcc.

Thanks again for providing that patch.
Comment 18 CVS Commits 2020-02-08 22:01:46 UTC
The master branch has been updated by Jim Wilson <wilson@gcc.gnu.org>:


commit r10-6528-gb780f68e025b2cf5631183e199ebf672ea463af6
Author: Jim Wilson <jimw@sifive.com>
Date:   Sat Feb 8 13:57:36 2020 -0800

    RISC-V: Improve caller-save code generation.
    Avoid paradoxical subregs when caller save.  This reduces stack frame size
    due to smaller loads and stores, and more frequent rematerialization.
    	PR target/93532
    	* config/riscv/riscv.h (HARD_REGNO_CALLER_SAVE_MODE): Define.
Comment 19 Jim Wilson 2020-02-08 22:07:13 UTC
Patch applied to mainline.  This is just a minor optimization for gcc-10, as a combiner patch between gcc-8 and gcc-9 reduces register pressure enough to prevent the hang.  Hence there is no real need for the patch in gcc-9.  The patch might be useful in gcc-8, but the problem is hard to reproduce, buildroot is the only project that ran into it, and they can always add the patch to their tree, so it is not clear that we really need it on the gcc-8 branch.
Comment 20 Jim Wilson 2020-02-08 22:09:38 UTC
Thanks for confirming that it solves the buildroot build problem.

My gcc mainline g++ test failure turned out to be a thread-related issue with qemu cross testing.  The testcase always works on hardware, but fails maybe 10-20% of the time when run under qemu.  RISC-V qemu is known to still have a few bugs in this area, though they might already be fixed in newer qemu versions than the one I have.
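An intermittent failure at that rate is easy to miss in a single test run. As a rough illustration only, the sketch below computes how likely a failure is to show up over repeated runs, under the assumption that runs are independent and fail with a fixed probability; the function names are hypothetical and the 10% and 20% figures are just the estimates quoted above.

```python
def prob_any_failure(p_fail, runs):
    """Probability that at least one of `runs` independent runs fails."""
    return 1.0 - (1.0 - p_fail) ** runs


def runs_needed(p_fail, confidence):
    """Smallest number of runs for which at least one failure is seen
    with probability >= confidence."""
    if not (0.0 < p_fail <= 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p_fail must be in (0, 1] and confidence in (0, 1)")
    runs = 1
    while prob_any_failure(p_fail, runs) < confidence:
        runs += 1
    return runs
```

Under these assumptions, `runs_needed(0.1, 0.95)` gives 29 and `runs_needed(0.2, 0.95)` gives 14, so a single clean run under qemu says little about whether such a test is stable.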