On sh4-unknown-linux-gnu, gcc5/6 fails to compile libiberty/regex.c with -O2 -fpic:

libiberty/regex.c: In function 'byte_re_match_2_internal':
libiberty/regex.c:7486:1: error: insn does not satisfy its constraints:
(insn 14303 10571 14304 388 (set (reg:SI 2 r2)
        (sign_extend:SI (mem:QI (plus:SI (reg/f:SI 6 r6 [orig:794 D.8765 ] [794])
                    (const_int 5 [0x5])) [0 MEM[(unsigned char *)_1000 + 5B]+0 S1 A8]))) libiberty/regex.c:7109 232 {*extendqisi2_compact_mem_disp}
     (nil))
libiberty/regex.c:7486:1: internal compiler error: in extract_constrain_insn, at recog.c:2232
0x86d443f _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../ORIG/trunk/gcc/rtl-error.c:110
0x86d4475 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../ORIG/trunk/gcc/rtl-error.c:121
0x86a0835 extract_constrain_insn(rtx_insn*)
        ../../ORIG/trunk/gcc/recog.c:2232
0x86a3b29 copyprop_hardreg_forward_1
        ../../ORIG/trunk/gcc/regcprop.c:788
0x86a4877 execute
        ../../ORIG/trunk/gcc/regcprop.c:1283

There is no problem with gcc-4.9.x.
Created attachment 35661 [details]
reduced test case

FYI, it doesn't fail with -O2 -fpic -mlra.
There was this idea of doing some sort of pre-reg-alloc or special case handling for R0 in an SH specific RTL pass before regular RA -- see PR 64785. One option could be to have a simple version of that pass for GCC 5 to fix this PR (and maybe several others of this kind). For GCC 6 we could probably just add a -mno-lra option and make -mlra the default setting.
(In reply to Oleg Endo from comment #2)

Defaulting to -mlra might be reasonable for gcc 6.
For gcc 5, I thought of a patch for prepare_move_operands like

diff --git a/config/sh/sh.c b/config/sh/sh.c
index 1cf6ed0..b855d70 100644
--- a/config/sh/sh.c
+++ b/config/sh/sh.c
@@ -1789,9 +1789,8 @@ prepare_move_operands (rtx operands[], machine_mode mode)
          target/55212.
          We split possible load/store to two move insns via r0 so as to
          shorten R0 live range.  It will make some codes worse but will
-         win on avarage for LRA.  */
-      else if (sh_lra_p ()
-               && TARGET_SH1 && ! TARGET_SH2A
+         win on avarage.  */
+      else if (TARGET_SH1 && ! TARGET_SH2A
                && (mode == QImode || mode == HImode)
                && ((REG_P (operands[0]) && MEM_P (operands[1]))
                    || (REG_P (operands[1]) && MEM_P (operands[0]))))

which would be the simplest form of preallocating r0 for this limited case, though I'm afraid that it's still too invasive for the release branch.
(In reply to Kazumoto Kojima from comment #1)
> Created attachment 35661 [details]
> reduced test case
>
> FYI, it doesn't fail with -O2 -fpic -mlra.

Somehow the reduced test case seems to work OK on sh-elf even without -mlra.

sh-elf-gcc -v
Using built-in specs.
COLLECT_GCC=sh-elf-gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/sh-elf/6.0.0/lto-wrapper
Target: sh-elf
Configured with: ../gcc-trunk/configure --target=sh-elf --prefix=/usr/local --enable-languages=c,c++ --enable-multilib --enable-libssp --disable-nls --disable-werror --enable-lto --with-newlib --with-system-zlib
Thread model: single
gcc version 6.0.0 20150521 (experimental) (GCC)

Maybe I'm missing some option?
Created attachment 35673 [details]
original test case

My sh-elf compiler

COLLECT_GCC=../xsh-elf-combined/build/gcc/xgcc
Target: sh-unknown-elf
Configured with: /exp/ldroot/dodes/xsh-elf-combined/combined/configure --target=sh-unknown-elf --disable-libssp --disable-libgomp --disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --enable-languages=c,c++ --disable-shared --with-newlib --disable-nls --prefix=/exp/ldroot/dodes/xsh-elf-combined/install --with-headers=yes --disable-gdbtk --with-mpfr=/opt2/i686-pc-linux-gnu --with-gmp=/opt2/i686-pc-linux-gnu --without-libgloss
Thread model: single
gcc version 6.0.0 20150531 (experimental) (GCC)

can reproduce it with -O2 -fpic -m4 -ml, though the bug looks a bit fragile, like other RA-related bugs. I've attached the unreduced test case. The sh-elf compiler ICEs for this test case even with only -O2 here.
(In reply to Kazumoto Kojima from comment #3)
> (In reply to Oleg Endo from comment #2)
>
> Defaulting -mlra might be reasonable for gcc 6.
> For gcc 5, I thought the patch for prepare_move_operands like
>
> diff --git a/config/sh/sh.c b/config/sh/sh.c
> index 1cf6ed0..b855d70 100644
> --- a/config/sh/sh.c
> +++ b/config/sh/sh.c
> @@ -1789,9 +1789,8 @@ prepare_move_operands (rtx operands[], machine_mode mode)
>           target/55212.
>           We split possible load/store to two move insns via r0 so as to
>           shorten R0 live range.  It will make some codes worse but will
> -         win on avarage for LRA.  */
> -      else if (sh_lra_p ()
> -               && TARGET_SH1 && ! TARGET_SH2A
> +         win on avarage.  */
> +      else if (TARGET_SH1 && ! TARGET_SH2A
>                 && (mode == QImode || mode == HImode)
>                 && ((REG_P (operands[0]) && MEM_P (operands[1]))
>                     || (REG_P (operands[1]) && MEM_P (operands[0]))))
>
> which would be a simplest form of the preallocating r0 for this limited case,
> though I'm afraid that it's still too invasive for the release branch.

There could be some negative side effects with the patch above, because it forces the R0 usage quite early (at RTL expansion).

I'd like to give the RTL pass a try, although it's probably even more invasive than the patch above.
(In reply to Oleg Endo from comment #6)
> There could be some negative side effects with the patch above, because it
> forces the R0 usage quite early (at RTL expansion).

Yes, the comment part says it and it was discussed in #c83 and #c82 of PR55212. This patch will be a 'micro-degradation', though it wins CSiBE for the code size at least, even without LRA.

> I'd like to give the RTL pass a try, although it's probably even more
> invasive than the patch above.

It sounds better and OK for trunk. Also yes, it looks too invasive for the release branch.
(In reply to Kazumoto Kojima from comment #7)
> (In reply to Oleg Endo from comment #6)
> > There could be some negative side effects with the patch above, because it
> > forces the R0 usage quite early (at RTL expansion).
>
> Yes, the comment part says it and it was discussed in #c83
> and #c82 of PR55212. This patch will be a 'micro-degradation',
> though it wins CSiBE for the code size at least, even without LRA.

Ah, right. If it fixes the problem, then I think it's the only option we have for the release branch.
*** Bug 66395 has been marked as a duplicate of this bug. ***
(In reply to Oleg Endo from comment #8)
> Ah, right. If it fixes the problem, then I think it's the only option we
> have for the release branch.

It can be used as a last resort. It depends on whether various RTL passes take care of hard registers or not. I'd like to wait for your pre-allocating pass. It's surely cleaner than that somewhat unreliable artifact in prepare_move_operands. Perhaps both are almost equally invasive for the release branch. If the pass is simple enough and gives good results, it may be a safer bet, especially while we are in transition to LRA.
Any news on this issue? The sh4 buildds in Debian are currently building a snapshot as of 2015-06-13 (r224454), let's see how far it gets. Adrian
(In reply to John Paul Adrian Glaubitz from comment #11)
> Any news on this issue? The sh4 buildds in Debian are currently building a
> snapshot as of 2015-06-13 (r224454), let's see how far it gets.

It will take a while to develop the R0 pre-allocating RTL pass as mentioned in c#10. Once this has been done and it's been stabilized it can be backported to the GCC 5 release branch -- assuming that it will fix the R0 reload problems.
(In reply to Kazumoto Kojima from comment #5)
> Created attachment 35673 [details]
> original test case
>
> My sh-elf compiler
>
> COLLECT_GCC=../xsh-elf-combined/build/gcc/xgcc
> Target: sh-unknown-elf
> Configured with: /exp/ldroot/dodes/xsh-elf-combined/combined/configure
> --target=sh-unknown-elf --disable-libssp --disable-libgomp
> --disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --enable-languages=c,c++
> --disable-shared --with-newlib --disable-nls
> --prefix=/exp/ldroot/dodes/xsh-elf-combined/install --with-headers=yes
> --disable-gdbtk --with-mpfr=/opt2/i686-pc-linux-gnu
> --with-gmp=/opt2/i686-pc-linux-gnu --without-libgloss
> Thread model: single
> gcc version 6.0.0 20150531 (experimental) (GCC)
>
> can reproduce it with -O2 -fpic -m4 -ml, though the bug looks a bit
> fragile like as other RA related bugs. I've attached unreduced test
> case. sh-elf compiler ICEs for this test case even with -O2 only here.

Confirmed with
gcc version 6.0.0 20150617 (experimental) (GCC)
It seems the problem is adjacent insns that need R0:

(insn 10503 2627 2628 402 (set (reg:SI 2424)
        (sign_extend:SI (mem:QI (plus:SI (reg/v/f:SI 243 [ p2 ])
                    (const_int 2 [0x2])) [0 MEM[(unsigned char *)p2_97 + 2B]+0 S1 A8]))) ../../ORIG/trunk/libiberty/regex.c:7109 232 {*extendqisi2_compact_mem_disp}
     (nil))
(note 2628 10503 10505 402 NOTE_INSN_DELETED)
(insn 10505 2628 2629 402 (set (reg:SI 2425)
        (sign_extend:SI (mem:QI (plus:SI (reg/f:SI 774 [ D.8751 ])
                    (const_int 5 [0x5])) [0 MEM[(unsigned char *)_1000 + 5B]+0 S1 A8]))) ../../ORIG/trunk/libiberty/regex.c:7109 232 {*extendqisi2_compact_mem_disp}
     (nil))
(note 2629 10505 2631 402 NOTE_INSN_DELETED)
(note 2631 2629 10507 402 NOTE_INSN_DELETED)
(insn 10507 2631 2633 402 (set (reg:SI 147 t)
        (eq:SI (and:SI (reg:SI 2424)
                (reg:SI 2425))
            (const_int 0 [0]))) ../../ORIG/trunk/libiberty/regex.c:7109 1 {tstsi_t}
     (expr_list:REG_DEAD (reg:SI 2425)
        (expr_list:REG_DEAD (reg:SI 2424)
            (nil))))

In insn 10503 reg 2424 will be allocated to R0 and will remain live until insn 10507. Reload then fails to insert a move insn R0 -> other reg after insn 10503, and R0 allocation for insn 10505 becomes impossible.

I already observed this issue a while ago. I think this condition can be improved using some sort of linear register allocator for R0-insns.
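To make the conflict above concrete, here is a minimal C sketch of the kind of linear scan that could detect it. This is not GCC code and not the proposed RTL pass: `struct r0_def` and `count_r0_conflicts` are hypothetical names, and each R0-constrained insn is reduced to two assumed integers, the position where its result is defined and the position of its last use. The sketch also assumes that a copy "R0 -> other reg" is inserted whenever a conflict is found, so R0 is free again for the next def.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an insn whose result must be produced in R0. */
struct r0_def {
    int def_pos;   /* position of the defining insn within the block */
    int last_use;  /* position of the last insn that reads the result */
};

/* Walk the R0 defs in program order and count those that start while
   the previous R0 value is still live.  Each counted conflict is a
   place where a copy "R0 -> other reg" would be needed right after
   the earlier def.  DEFS must be sorted by strictly increasing def_pos. */
size_t count_r0_conflicts(const struct r0_def *defs, size_t n)
{
    size_t conflicts = 0;
    int r0_live_until = -1;  /* no value in R0 before the first def */

    for (size_t i = 0; i < n; i++) {
        assert(defs[i].last_use >= defs[i].def_pos);
        assert(i == 0 || defs[i].def_pos > defs[i - 1].def_pos);

        /* The previous R0 value is still needed after this def. */
        if (defs[i].def_pos < r0_live_until)
            conflicts++;

        /* Assume the old value was copied away; R0 now holds this def. */
        r0_live_until = defs[i].last_use;
    }
    return conflicts;
}
```

If the three insns from the dump are numbered 0 (insn 10503), 1 (insn 10505) and 2 (insn 10507), the defs are `{0, 2}` and `{1, 2}`, and the function reports one conflict: the second load needs R0 while reg 2424 is still live in it.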
GCC 5.2 is being released, adjusting target milestone to 5.3.
Created attachment 36167 [details] C source code
I had a go at cross compiling the Linux kernel for sh, and got something similar with gcc 5.1.1 dated 20150618

$ sh-linux-gnu-gcc -c -O2 bug224.c
drivers/hwmon/w83627ehf.c: In function ‘show_caseopen’:
drivers/hwmon/w83627ehf.c:1774:1: error: insn does not satisfy its constraints:
(insn 135 111 136 2 (set:SI (reg:SI 1 r1)
        (sign_extend:SI (mem:QI (plus:SI (reg/f:SI 8 r8 [175])
                    (const_int 12 [0xc])) [0 MEM[(struct sensor_device_attribute_2 *)attr_6(D)].index+0 S1 A32]))) drivers/hwmon/w83627ehf.c:1772 231 {*extendqisi2_compact_mem_disp}
     (nil))
drivers/hwmon/w83627ehf.c:1774:1: internal compiler error: in extract_constrain_insn, at recog.c:2246

The problem seems to go away when I add the flag -mlra, so I have a workaround.

Test case attached.
GCC 5.3 is being released, adjusting target milestone.
(In reply to David Binderman from comment #17)
> I had a go at cross compiling Linux kernel for sh, and got something similar
> with gcc 5.1.1 dated 20150618

With recent gcc trunk on x86_64, I get

$ ~/gcc/results/bin/gcc -c -O3 -march=native gcc.target/i386/pr70300.c
gcc.target/i386/pr70300.c: In function ‘bar’:
gcc.target/i386/pr70300.c:25:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 127 98 128 2 (set (reg:V4SF 53 xmm16 [125])
        (vec_select:V4SF (vec_concat:V8SF (reg:V4SF 53 xmm16 [125])
                (reg:V4SF 53 xmm16 [125]))
            (parallel [
                    (const_int 0 [0])
                    (const_int 4 [0x4])
                    (const_int 1 [0x1])
                    (const_int 5 [0x5])
                ]))) gcc.target/i386/pr70300.c:21 2460 {vec_interleave_lowv4sf}
     (nil))
gcc.target/i386/pr70300.c:25:1: internal compiler error: in extract_constrain_insn, at recog.c:2190
0xb7d207 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../src/trunk/gcc/rtl-error.c:108
0xb7d23f _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        ../../src/trunk/gcc/rtl-error.c:119
0xb3b39e extract_constrain_insn(rtx_insn*)
        ../../src/trunk/gcc/recog.c:2190
0xb47120 copyprop_hardreg_forward_1
        ../../src/trunk/gcc/regcprop.c:774
0xb47f91 execute
        ../../src/trunk/gcc/regcprop.c:1280
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$

But if I remove the -march=native, all goes well:

$ ~/gcc/results/bin/gcc -c -O3 gcc.target/i386/pr70300.c
$

Exact CPU model seems to be

model name : AMD Phenom(tm) II X4 970 Processor
(In reply to David Binderman from comment #19)
> (In reply to David Binderman from comment #17)
> > I had a go at cross compiling Linux kernel for sh, and got something similar
> > with gcc 5.1.1 dated 20150618
>
> With recent gcc trunk on x86_64, I get
>

This doesn't look related to the SH issue here. Please file a new PR for this issue.
(In reply to Oleg Endo from comment #20)
> (In reply to David Binderman from comment #19)
> > (In reply to David Binderman from comment #17)
> > > I had a go at cross compiling Linux kernel for sh, and got something similar
> > > with gcc 5.1.1 dated 20150618
> >
> > With recent gcc trunk on x86_64, I get
> >
>
> This doesn't look related to the SH issue here. Please file a new PR for
> this issue.

Done - #70367.
GCC 5.4 is being released, adjusting target milestone.
Problem seems to have gone away with gcc version 6.1.1, dated 20160621
(In reply to David Binderman from comment #23)
> Problem seems to have gone away with gcc version 6.1.1, dated 20160621

Thanks for your report. I've confirmed that the testcases don't fail with the head of the 5/6/7 branches on sh4-unknown-linux-gnu. I think the issue is now latent on those branches, though. Register allocation is inherently unstable, and we could see the issue again whenever the old RA allocates R0 in the way comment #14 describes.
GCC 6 branch is being closed
The GCC 7 branch is being closed, re-targeting to GCC 8.4.
GCC 8.4.0 has been released, adjusting target milestone.
GCC 8 branch is being closed.
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
GCC 9 branch is being closed
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
GCC 10 branch is being closed.
GCC 11 branch is being closed.
Latent or possibly fixed. Unclear.
GCC 12 branch is being closed.
Per c#24 and c#34.