I found a "pcrel too far" bug in gcc-4.4.3 and gcc-4.3.4 on sh-elf. I have code that reproduces it, but I could not reduce it further, sorry.

$ gcc -O2 -fPIC -DPIC -c gong_1424_debug_1.c
/tmp/ccjdeDlE.s: Assembler messages:
/tmp/ccjdeDlE.s:714: Error: pcrel too far

The error does not occur when I compile without optimization.

$ gcc-4.4 -v
Using built-in specs.
Target: sh4-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.2-9' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-multilib-list=m4,m4-nofpu --with-cpu=sh4 --enable-checking=release --build=sh4-linux-gnu --host=sh4-linux-gnu --target=sh4-linux-gnu
Thread model: posix
gcc version 4.4.3 20100108 (prerelease) (Debian 4.4.2-9)

$ gcc-4.3 -v
Using built-in specs.
Target: sh4-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.3.4-6+sh4' --with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 --program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --with-multilib-list=m4,m4-nofpu --with-cpu=sh4 --enable-checking=release --build=sh4-linux-gnu --host=sh4-linux-gnu --target=sh4-linux-gnu
Thread model: posix
gcc version 4.3.4 (Debian 4.3.4-6)
Created attachment 19687 [details] The source code that can reproduce a problem.
I've confirmed that the test case also fails on 4.5.0 and doesn't on 4.2.4.
Hello, I had a similar problem a while ago, but was never able to reproduce it on trunk. It was a phasing problem between branch_shortening from sh_reorg and the delayed branch scheduler, which would change a bf (2 bytes) into a bf/s plus an instruction (4 bytes), thus breaking the offsets of surrounding branches. -fno-delayed-branch is the workaround.
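A minimal sketch of the range arithmetic behind "pcrel too far", assuming the usual SH encoding of bt/bf (a signed 8-bit displacement counted in 2-byte units, relative to the branch address plus 4). The function names are hypothetical and only illustrate why 2 extra bytes inserted after branch lengths were computed can push a target out of reach; this is not code from sh.c.

```c
#include <stdbool.h>

/* Assumption: SH bt/bf hold a signed 8-bit displacement in units of
   2 bytes, relative to the address of the branch plus 4.  */
enum { COND_DISP_MIN = -128, COND_DISP_MAX = 127 };

/* Return true if a short conditional branch located at branch_addr can
   reach target_addr.  Both values are byte offsets in the function.  */
bool cond_branch_in_range(long branch_addr, long target_addr)
{
    long delta = target_addr - (branch_addr + 4);

    if (delta % 2 != 0)
        return false;           /* SH instructions are 2-byte aligned */

    long disp = delta / 2;
    return disp >= COND_DISP_MIN && disp <= COND_DISP_MAX;
}

/* Hypothetical helper: where a forward target ends up once `grown`
   bytes are added between the branch and the target, for example when
   a 2-byte bf becomes a 4-byte bf/s plus delay-slot instruction.  */
long target_after_growth(long target_addr, long grown)
{
    return target_addr + grown;
}
```

With a branch at offset 0, a target at offset 258 is the farthest forward target that still fits; `target_after_growth(258, 2)` gives 260, which no longer does.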
Created attachment 19689 [details] Conservative fix. Conservatively increase length of undelayed conditional branches to prevent a problem with the ds scheduler inserting an instruction in the slot.
(In reply to comment #4)
> Conservatively increase length of undelayed conditional branches to prevent a
> problem with the ds scheduler inserting an instruction in the slot.

Looks fine. A very minor nit: the JUMP_P and JUMP_TABLE_DATA_P macros can be used for the first 3 lines of the if-condition.
(In reply to comment #5)
> (In reply to comment #4)
> > Conservatively increase length of undelayed conditional branches to prevent a
> > problem with the ds scheduler inserting an instruction in the slot.
>
> Looks fine. A very minor nit, JUMP_P and JUMP_TABLE_DATA_P macro
> can be used for the first 3 lines of the if-condition.

Thanks. I don't think I can use JUMP_TABLE_DATA_P, since this is a != test and JUMP_TABLE_DATA_P includes JUMP_P. Anyway, OK for trunk? (I just need to fix the date in the ChangeLog.) Regtesting done.
(In reply to comment #6)
> Anyway, OK for trunk ? (just need to fix the date in the ChangeLog). regtesting
> done.

OK. And the patch is pre-approved for branches too after one week or so.

BTW, I mean JUMP_P(x) && !JUMP_TABLE_DATA_P(x):

   a && !(a && (b || c))
== a && (!a || !(b || c))
== (a && !a) || (a && !(b || c))
== 0 || (a && !(b || c))
== a && !b && !c
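The identity above can be checked exhaustively over the eight input combinations. In this sketch, a stands for JUMP_P(x), and b and c presumably stand for the two pattern-code tests inside JUMP_TABLE_DATA_P (an assumption; the thread does not spell them out). The function names are made up for the illustration.

```c
#include <stdbool.h>

/* JUMP_P(x) && !JUMP_TABLE_DATA_P(x), with JUMP_TABLE_DATA_P modelled
   as (a && (b || c)).  */
bool with_macros(bool a, bool b, bool c)
{
    return a && !(a && (b || c));
}

/* The open-coded form the derivation ends with.  */
bool open_coded(bool a, bool b, bool c)
{
    return a && !b && !c;
}

/* Count the input combinations on which the two forms disagree.  */
int count_mismatches(void)
{
    int mismatches = 0;

    for (int bits = 0; bits < 8; bits++) {
        bool a = (bits & 1) != 0;
        bool b = (bits & 2) != 0;
        bool c = (bits & 4) != 0;

        if (with_macros(a, b, c) != open_coded(a, b, c))
            mismatches++;
    }
    return mismatches;
}
```

`count_mismatches()` returning 0 means the two forms agree on every input, so the macro form is a safe replacement for the open-coded test.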
Created attachment 19690 [details] and cleanup with JUMP_TABLE_DATA_P
(In reply to comment #7)
> (In reply to comment #6)
> > Anyway, OK for trunk ? (just need to fix the date in the ChangeLog). regtesting
> > done.
>
> OK. And the patch is pre-approved for branches too after one week or so.
>
> BTW, I mean JUMP_P(x) && !JUMP_TABLE_DATA_P(x):

Sorry, I didn't read you correctly. So I took the opportunity to clean up every other occurrence of the same idiom in the file. OK?
(In reply to comment #9) > So I took the opportunity to cleanup every other > occurrences of the same idioms in the file. OK ? OK. Thanks!
Subject: Bug 42841 Author: chrbr Date: Tue Jan 26 07:20:27 2010 New Revision: 156229 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156229 Log: fix PR target/42841 Modified: trunk/gcc/ChangeLog trunk/gcc/config/sh/sh.c
Subject: Bug 42841 Author: chrbr Date: Tue Jan 26 07:21:57 2010 New Revision: 156230 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156230 Log: fix PR target/42841 Modified: branches/gcc-4_4-branch/gcc/ChangeLog branches/gcc-4_4-branch/gcc/config/sh/sh.c
Subject: Bug 42841 Author: chrbr Date: Tue Jan 26 07:28:05 2010 New Revision: 156231 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156231 Log: fix PR target/42841 Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/config/sh/sh.c
fixed in 4.5, 4.3 and 4.4
(In reply to comment #10)
> (In reply to comment #9)
> > So I took the opportunity to cleanup every other
> > occurrences of the same idioms in the file. OK ?
>
> OK. Thanks!

Thanks for your patch! I confirmed that the problem in another program was also fixed by this patch.
I've got some new libstdc++-v3 testsuite failures with the patch on my nightly sh4-linux tester:

Running /exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp ...
FAIL: 23_containers/deque/requirements/exception/basic.cc (test for excess errors)
WARNING: 23_containers/deque/requirements/exception/basic.cc compilation failed to produce executable
FAIL: 23_containers/deque/requirements/exception/propagation_consistent.cc (test for excess errors)
WARNING: 23_containers/deque/requirements/exception/propagation_consistent.cc compilation failed to produce executable
FAIL: 30_threads/packaged_task/members/get_future.cc execution test
FAIL: 30_threads/shared_future/members/get.cc execution test

The first failure is

/tmp/ccl5TCl4.s: Assembler messages:
/tmp/ccl5TCl4.s:43070: Error: undefined symbol `.L3394' in operation

FAIL: 23_containers/deque/requirements/exception/basic.cc (test for excess errors)

The last 2 failures result from unaligned accesses. I saw

Sending SIGBUS to "get_future.exe" due to unaligned access (PC 296554a8 PR 2965549a)
Sending SIGBUS to "get.exe" due to unaligned access (PC 296554a8 PR 2965549a)

on the target machine. After reverting the first hunk of the patch, these errors go away. Christian, could you please revert or disable the first hunk of the patch temporarily? Sorry I didn't catch this earlier.
Strange, I didn't see that, not even the undefined symbol in the assembler. OK, I'll disable the fix until this is clarified. Let me recheck on the silicon; I will let you know. -c

(In reply to comment #16)
> With reverting the first hunk of the patch, these errors go away.
> Christian, could you please revert or disable the first hunk
> of patches temporarily? Sorry I didn't catch this earlier.
Subject: Bug 42841 Author: chrbr Date: Wed Jan 27 13:24:40 2010 New Revision: 156282 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156282 Log: temporarily revert fix for PR target/42841 Modified: trunk/gcc/ChangeLog trunk/gcc/config/sh/sh.c
To make sure we are in the same testing/configuration environment, could you please send me the preprocessed file for 23_containers/deque/requirements/exception/propagation_consistent.cc, as well as the compilation line in libstdc++.log that you used? Many thanks, Christian
Created attachment 19729 [details] A test case "cc1plus -std=gnu++0x -O2 propagation_consistent.ii" produces problematic code here.
This one is marked as unsupported in my sh-superh-elf log, but I can now reproduce it on sh4-linux (even though I had rebuilt a whole distribution without seeing it :O). Anyway, I'm investigating. I'm reopening the bug and will revert on the branches as well if I don't find a quick solution. Regards

(In reply to comment #20)
> Created an attachment (id=19729)
> A test case
>
> "cc1plus -std=gnu++0x -O2 propagation_consistent.ii" produces
> a problematic code here.
Hmm, this looks like a latent bug. The CP happens to be inserted before a compact_jump, which enables a further redirect-jump optimisation. I don't think it is directly related to the fix, but let's work on it a little more.

So just before dbr we have:

        jump_insn -> 2586
        a constant pool
L2586:
        jump_insn -> 3394
L3394:
        ...

Then in reorg_redirect_jump we redirect the jump over the CP and call delete_related_insn, so the code between the CP and the jump becomes dead, and we have:

        jump_insn -> 3394
        a constant pool
L3394:
        ...

But the label L2586 is used in the exception table, and thus remains undefined.

Now my question: how can the exception table refer to a region delimited by deleted labels? It should be built after dbr, shouldn't it?
I agree with you that there is a latent problem. It seems that sh_reorg inserts a CP with a new jump at the landing pad for the exception in the basic.cc and propagation_consistent.cc cases. This confuses EH processing because, unfortunately, the labels for landing pads are defined and recorded very early and used later to output the DW2 frame data. A simple workaround may be not to insert a jump+CP at a possible position for the landing pad. The patch

--- ORIG/trunk/gcc/config/sh/sh.c 2010-01-26 18:33:47.000000000 +0900
+++ trunk/gcc/config/sh/sh.c 2010-01-29 09:56:50.000000000 +0900
@@ -4641,6 +4641,9 @@ find_barrier (int num_mova, rtx mova, rt
      a jump makes it more likely that the bra delay slot will be filled. */
   while (NOTE_P (from) || JUMP_P (from)
+         || (flag_exceptions
+             && CALL_P (from)
+             && find_reg_note (from, REG_EH_REGION, NULL_RTX))
          || LABEL_P (from))
     from = PREV_INSN (from);

fixes the failures for basic.cc and propagation_consistent.cc, though it doesn't fix the execution failures for get_future.cc and get.cc.
Created attachment 19747 [details] fixed removal of landing pad label rtx The landing_pad label rtx was created and recorded in tree_inline (duplicate_eh_regions). Seems that reorg_redirect_jump or delete_insn should check for it before deciding it can be removed. I'm testing this patch that does this.
By the way, FYI, to try to explain the differences between your results and mine for sh4-linux: my build is configured with --enable-target-optspace, so all my runtime build tests are run with -Os, not -O2 like yours, which could make a huge difference in CP layout. I'll rerun with -O2 over the weekend. Cheers
I'm afraid the unaligned access SIGBUS regression is another latent bug just exposed by the fix for the original PR :-(

What happens is that the GOT loading sequence is broken by a constant pool. We end up emitting:

        mov.l   .L542,r12       (X)
        bra     .L516
        nop
.L542:
        .align 2
        .long   _GLOBAL_OFFSET_TABLE_
        ...
.L516:
        mova    .L545,r0        (Y) !!!!!
        add     r0,r12
.L545:
        .long   _GLOBAL_OFFSET_TABLE_

The reason is that the second mova instruction is unluckily now out of range by 2 bytes (which could happen in any other situation, even without this patch).

IMHO we should forbid the duplication of a _GLOBAL_OFFSET_TABLE_ loading constant while in an UNSPEC_MOVA sequence. We should probably reduce si_limit in find_barrier when, in PIC, a

(set (reg:SI 0 r0) (unspec:SI [ (const:SI (unspec:SI [ (symbol_ref ("*_GLOBAL_OFFSET_TABLE_"))

is met and the next insn is

(set (reg:SI 12 r12) (const:SI (unspec:SI [ (symbol_ref ("*_GLOBAL_OFFSET_TABLE_")

I'm experimenting with a couple of different solutions in this direction. This PR was a really interesting bug finder!
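A rough sketch of the reach of a mova, assuming the standard SH encoding: an unsigned 8-bit displacement counted in 4-byte units, added to the instruction address rounded down to a multiple of 4, plus 4. The function name is hypothetical. It only illustrates how tight the window is when a constant pool is placed near the GOT loading pair; it is not taken from sh.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mova @(disp,PC),R0 computes
   (insn_addr & ~3) + 4 + disp * 4, with disp in 0..255.  */
bool mova_in_range(uint32_t insn_addr, uint32_t target_addr)
{
    uint32_t base = (insn_addr & ~(uint32_t)3) + 4;

    /* The target must be 4-byte aligned and must not lie before the
       base address: the displacement is unsigned.  */
    if ((target_addr & 3) != 0 || target_addr < base)
        return false;

    return (target_addr - base) / 4 <= 255;
}
```

For a mova at 0x1000 the farthest reachable label is at 0x1400; anything beyond that, or any label that is not 4-byte aligned, cannot be encoded.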
Created attachment 19792 [details] A patch Indeed! I've tested the attached patch and confirmed that it doesn't regress with the top level "make -k check" for all languages except ada on sh4-linux.
Hello Kaj, thanks for your proposal. But I'm wondering whether preventing the scheduling of the mov.l and mova instructions isn't overkill? (sh_reorg comes after the scheduler, but even if it didn't, it should be OK to move instructions up. The R0 live range between the add and the load is another, more general problem.) Am I missing something? We only want to prevent the CP from being inserted between those 2 instructions; it's not necessary to have more blockages. I'm working on something that tracks the GOT loading access during the find_barrier walk and then reverts at the end to the latest safe place. It is OK on the example, but the full Linux distribution rebuild and validation is still ongoing.
I think these blockages are not overkill. GOTaddr2picreg is used only in the prologue and for non-PIC TLS initial exec accesses. The former occurs at most once per function and never in an inner loop. The latter wouldn't occur so frequently, and the initial exec access is a heavy sequence of instructions in the first place.

> We only want to avoid the CP to be inserted between those 2 instructions,
> it's not necessary to have more blockages. I'm working on something that
> tracks the GOT loading access during the find_barrier walk and then revert
> back at the end to the latest safe place. OK on the example but the full
> linux distrib rebuild and validation is still ongoing.

Of course, it's OK if it passes all the usual tests.
Created attachment 19794 [details] patch to fix GOT access load with constant pool Patch under validation.
Looks smart and clean! One minor nit: I guess that the occurrences of gbr and GBR in the ChangeLog and comments should be replaced with GOT to avoid confusion with the GBR register of the SH CPU. When you propose it to the list, could you please separate the third hunk, which is for the original PR42841, as an independent patch? Also, don't forget to update the copyright years in the first one.
> Looks smart and clean! One minor nit, I guess that the occurence of
> gbr and GBR in ChangeLog and comments should be replaced with GOT to
> avoid confusion with GBR register of SH CPU.

Thanks for catching this error in the comment. I meant GP of course, which is even preferable to GOT (which is what we load, not what we compute).

(In reply to comment #31)
> When you propose it to the list, could you please separate the third
> hunk which is for the original PR42841 as an independant patch. Also
> don't forget to update the copyright years in the first one.

OK, it was also my intention to submit the 3rd hunk (the one that fixes the jump to the landing pad around the constant table, right?) as a separate patch, as it will require the approval of a middle-end maintainer. If it cannot go into trunk before the 4.5 freeze, I suggest that you commit your workaround (comment #23) so as not to block on the regression. Then we can revert it once the proper patch is discussed and accepted. (I'm a little bit late for that, sorry.)
Your fix of the middle end looks plausible, but I think the target shouldn't generate a CP at the EH landing pad anyway. I'll commit the hunk below anyway after your patch for the PIC problem is installed.

@@ -4654,6 +4654,13 @@ find_barrier (int num_mova, rtx mova, rt
   if (last_got)
     from = PREV_INSN (last_got);

+  /* Don't insert the constant pool table at the position which
+     may be the landing pad. */
+  if (flag_exceptions
+      && CALL_P (from)
+      && find_reg_note (from, REG_EH_REGION, NULL_RTX))
+    from = PREV_INSN (from);
+
   /* Walk back to be just before any jump or label.
      Putting it before a label reduces the number of times the branch
      around the constant pool table will be hit. Putting it before
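As a toy model of what this hunk does, the sketch below replays the same walk-back on a plain array instead of GCC's insn chain. Everything here (`toy_insn`, `insn_kind`, `choose_pool_anchor`) is hypothetical and stands in for the real rtx accessors; it only mirrors the control flow of the hunk, under the assumption that the pool is emitted after the insn the walk stops on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the insn classes tested in the hunk.  */
enum insn_kind { INSN_OTHER, INSN_NOTE, INSN_JUMP, INSN_LABEL, INSN_CALL };

struct toy_insn {
    enum insn_kind kind;
    bool has_eh_region_note;    /* models a REG_EH_REGION note */
};

/* Starting at index `from`, return the index of the insn after which
   the constant pool would be placed.  */
size_t choose_pool_anchor(const struct toy_insn *insns, size_t from,
                          bool flag_exceptions)
{
    /* Step back over a call that may throw, so the pool does not land
       where the landing pad may be.  */
    if (flag_exceptions
        && from > 0
        && insns[from].kind == INSN_CALL
        && insns[from].has_eh_region_note)
        from--;

    /* Walk back to be just before any note, jump or label.  */
    while (from > 0
           && (insns[from].kind == INSN_NOTE
               || insns[from].kind == INSN_JUMP
               || insns[from].kind == INSN_LABEL))
        from--;

    return from;
}
```

On a sequence ending in a label followed by a throwing call, the anchor moves back past both when exceptions are enabled, and stays on the call otherwise.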
(In reply to comment #33)
> Your fix of the middle end looks plausible but I think the target
> shouldn't generate a CP at the eh landing pad anyway. I'll commit
> the hunk below anyway after your patch for pic problem is installed.

OK. I didn't check the code quality difference between the middle-end fix and yours. Since there is no fallthru to the landing pad, and locality with the upcoming exception region is not important (if we suppose that the exception handler is not on the critical path), I was expecting, on the contrary, that the landing pad was a good place for the constant pool.
(In reply to comment #34) > I was expecting that the landing pad was > a good place for the constant pool on the contrary. I thought so too. But on second thought, it'd be a bit surprising for the non CP world and may cause similar problems. We should be defensive in this regard, I think.
(In reply to comment #33)
> Your fix of the middle end looks plausible but I think the target
> shouldn't generate a CP at the eh landing pad anyway. I'll commit
> the hunk below anyway after your patch for pic problem is installed.

Done, you can commit your workaround.
*** Bug 43744 has been marked as a duplicate of this bug. ***