Tested with gcc-4.5.0-RC-20100406.tar.bz2 Reduced testcase: /*********************************************************/ #include <stdio.h> #include <stdint.h> char do_reverse_endian = 0; # define bswap_32(x) \ ((((x) & 0xff000000) >> 24) | \ (((x) & 0x00ff0000) >> 8) | \ (((x) & 0x0000ff00) << 8) | \ (((x) & 0x000000ff) << 24)) #define EGET(X) \ (__extension__ ({ \ uint64_t __res; \ if (!do_reverse_endian) { __res = (X); \ } else if (sizeof(X) == 4) { __res = bswap_32((X)); \ } \ __res; \ })) void __attribute__((noinline)) X(char **phdr, char **data, int *phoff) { *phdr = *data + EGET(*phoff); } int main() { char *phdr; char *data = (char *)0x40164000; int phoff = 0x34; X(&phdr, &data, &phoff); printf("got %p (expecting 0x40164034)\n", phdr); return 0; } /*********************************************************/ # gcc -Os -o test test.c # ./test got 0x74164000 (expecting 0x40164034)
Questions: 1. Does compiling with -march=armv5tel make any difference? 2. Does gcc-4.4.3 work? 3. How was your gcc-4.5.0 configured?
Compiling this with gcc-4.5.0-RC-20100406 -march=armv7-a -Os I get: X: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. ldr r3, .L4 ldr r2, [r2, #0] ldr r1, [r1, #0] ldrb r3, [r3, #0] @ zero_extendqisi2 cmp r3, #0 moveq r3, r2, asr #31 rev r2, r2 movne r3, #0 .L3: add r2, r1, r2 str r2, [r0, #0] bx lr .L5: .align 2 .L4: .word .LANCHOR0 ... .LANCHOR0 = . + 0 .type do_reverse_endian, %object .size do_reverse_endian, 1 do_reverse_endian: .space 1 which looks completely bogus. It's unconditionally rev:ing *phoff in r2 and conditionally (based on do_reverse_endian) munging r3 which is dead. I can't test this however as my HW is armv5tel.
I can see the same failure on 4.5 branch with the testcase. The flags I used on the command line were -mcpu=cortex-a8 -Os . cheers Ramana
It appears as though there's a latent bug in arm_ccfsm_state_machine. If you mark this correctly as being "predicable" which the rev insn is - the bug goes away. If arm_final_prescan_insn sees something of the following form, it goes ahead and assumes it can convert the rev in this pattern into a condexec and removes the branch. (jump_insn:TI 12 15 19 /tmp/bitswap.c:23 (set (pc) (if_then_else (eq (reg:CC 24 cc) (const_int 0 [0x0])) (label_ref:SI 23) (pc))) 228 {*arm_cond_branch} (expr_list:REG_DEAD (reg:CC 24 cc) (expr_list:REG_BR_PROB (const_int 3900 [0xf3c]) (nil))) -> 23) (note 19 12 21 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn:TI 21 19 22 /tmp/bitswap.c:23 (set (reg:SI 2 r2 [orig:141 D.4313 ] [141]) (bswap:SI (reg:SI 2 r2 [152]))) 347 {arm_rev} (nil)) (insn:TI 22 21 23 /tmp/bitswap.c:23 (set (reg/v:DI 2 r2 [orig:133 __res ] [133]) (zero_extend:DI (reg:SI 2 r2 [orig:141 D.4313 ] [141]))) 138 {*arm_zero_extendsidi2} (expr_list:REG_UNUSED (reg:SI 3 r3) (nil))) The patch below which I'm testing, fixes the immediate problem with rev as below. Longer term we ought to remove the ccfsm_state_machine once we fix the problem with conditional calls. Index: arm.md =================================================================== --- arm.md (revision 158138) +++ arm.md (working copy) @@ -11201,8 +11201,9 @@ [(set (match_operand:SI 0 "s_register_operand" "=r") (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))] "TARGET_EITHER && arm_arch6" - "rev\t%0, %1" - [(set (attr "length") + "rev%?\t%0, %1" + [(set_attr "predicable" "yes") + (set (attr "length") (if_then_else (eq_attr "is_thumb" "yes") (const_int 2) (const_int 4)))] Ramana
Actually strike out the last patch - that's just wrong for Thumb1. This is what I am testing currently. Index: arm.md =================================================================== --- arm.md (revision 158138) +++ arm.md (working copy) @@ -11200,12 +11200,18 @@ (define_insn "arm_rev" [(set (match_operand:SI 0 "s_register_operand" "=r") (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))] - "TARGET_EITHER && arm_arch6" + "TARGET_32BIT && arm_arch6" + "rev%?\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "length" "4")] +) + +(define_insn "thumb1_rev" + [(set (match_operand:SI 0 "s_register_operand" "=l") + (bswap:SI (match_operand:SI 1 "s_register_operand" "l")))] + "TARGET_THUMB1 && arm_arch6" "rev\t%0, %1" - [(set (attr "length") - (if_then_else (eq_attr "is_thumb" "yes") - (const_int 2) - (const_int 4)))] + [(set_attr "length" "2")] ) (define_expand "arm_legacy_rev"
(In reply to comment #1) > 2. Does gcc-4.4.3 work? Yes, gcc-4.4.3 works (it just does not use 'rev' instruction). So it is a regression in 4.5. Thanks for a very fast response and analysis of the issue.
Patch submitted here. http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html
(In reply to comment #7) > Patch submitted here. > > http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html Thank you. I have been testing it for two days already. It really helps (in the sense that it is apparently better to have this fix than not to have). I have bootstrapped the hard vfp system successfully and did not notice any other problems so far. Btw, miscompilation (of all the same package) also happens with -O2 optimization settings in some other place, but I did not try to investigate where exactly it fails. But I understand that it is just a workaround for the problem which happens somewhere in the upper layer? If REV instruction did not actually support conditional execution, then the fix would require actually finding the real cause.
Subject: Re: [4.5/4.6 Regression] Invalid code when building gentoo pax-utils-0.1.19 with -Os optimizations On Mon, 2010-04-12 at 09:34 +0000, siarhei dot siamashka at gmail dot com wrote: > > ------- Comment #8 from siarhei dot siamashka at gmail dot com 2010-04-12 09:34 ------- > (In reply to comment #7) > > Patch submitted here. > > > > http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html > > Thank you. I have been testing it for two days already. > > It really helps (in the sense that it is apparently better to have this fix > than not to have). I have bootstrapped the hard vfp system successfully and did > not notice any other problems so far. Btw, miscompilation (of all the same > package) also happens with -O2 optimization settings in some other place, but I > did not try to investigate where exactly it fails. > > But I understand that it is just a workaround for the problem which happens > somewhere in the upper layer? If REV instruction did not actually support > conditional execution, then the fix would require actually finding the real > cause. There were 2 issues, one is that REV isn't marked as being allowed conditional and the other is that the CCFSM state machine appears to go wrong in this particular case which is something I need to investigate when I have some more free time. (Note this is ARM backend specific only.) Thus the fix isn't a work-around because it introduces predicated support for something like the REV instruction which is the right thing to do here.
Maybe I'm too impatient, but is there anything that prevents this patch from getting committed to SVN?
Ping :)
Updated the summary to better describe the problem (which is distro independent). The fact that this bug breaks pax-utils tool, which is a vital part of gentoo packaging system, thus rendering the system unusable is probably not so interesting in gcc bugzilla context :)
Subject: Bug 43698 Author: ramana Date: Thu Jul 22 08:30:36 2010 New Revision: 162404 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162404 Log: Fix PR target/43698 2010-07-22 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/43698 * config/arm/arm.md: Split arm_rev into *arm_rev and *thumb1_rev. Set *arm_rev to be predicable. 2010-07-22 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/43698 * gcc.target/arm/pr43698.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/pr43698.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.md trunk/gcc/testsuite/ChangeLog
Thanks, this final variant of fix seems to work fine. Can this patch be backported to 4.5 branch and released with gcc 4.5.1 too? As I see it, the risk should be minimal because current gcc 4.5 branch is so broken on armv6/armv7 because of this bug, that it simply can't become any worse. As recently discovered in MeeGo [1], this bug has a high chance of breaking just about any program which does endian byteswapping. The list of broken packages includes 'dbus' and 'utils-linux-ng' to name a few, but surely there are more. 1. http://bugs.meego.com/show_bug.cgi?id=3936
Patch can be backported and tested. But since 4.5 is frozen right now, needs RM permission. Adding RM to CC. cheers Ramana
Subject: Re: [4.5/4.6 Regression] Wrong use of ARMv6 REV instruction for endian bytewapping with -Os or -O2 optimizations On Fri, 23 Jul 2010, ramana at gcc dot gnu dot org wrote: > ------- Comment #15 from ramana at gcc dot gnu dot org 2010-07-23 12:21 ------- > Patch can be backported and tested. But since 4.5 is frozen right now, needs RM > permission. > > Adding RM to CC. Please wait until after the 4.5.1 release. Richard.
Subject: Bug 43698 Author: ramana Date: Fri Jul 30 22:35:40 2010 New Revision: 162725 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162725 Log: Backport fix for PR target/43698. 2010-07-30 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Backport from mainline. 2010-07-22 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/43698 * config/arm/arm.md: Split arm_rev into *arm_rev and *thumb1_rev. Set *arm_rev to be predicable. 2010-07-30 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Backport from mainline 2010-07-22 Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> PR target/43698 * gcc.target/arm/pr43698.c: New test. Added: branches/gcc-4_5-branch/gcc/testsuite/gcc.target/arm/pr43698.c - copied unchanged from r162404, trunk/gcc/testsuite/gcc.target/arm/pr43698.c Modified: branches/gcc-4_5-branch/ (props changed) branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/config/arm/arm.md branches/gcc-4_5-branch/gcc/testsuite/ChangeLog Propchange: branches/gcc-4_5-branch/ ('svn:mergeinfo' added)
And hence fixed. Thanks for allowing the backport. Ramana