Bug 43698 - [4.5/4.6 Regression] Wrong use of ARMv6 REV instruction for endian bytewapping with -Os or -O2 optimizations
[4.5/4.6 Regression] Wrong use of ARMv6 REV instruction for endian bytewappin...
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: target
4.5.0
: P2 normal
: 4.5.1
Assigned To: Ramana Radhakrishnan
: wrong-code
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-04-09 00:05 UTC by Siarhei Siamashka
Modified: 2010-07-30 22:38 UTC (History)
7 users (show)

See Also:
Host: armv7l-unknown-linux-gnueabi
Target: armv7l-unknown-linux-gnueabi,
Build: armv7l-unknown-linux-gnueabi
Known to work: 4.4.3
Known to fail: 4.5.0
Last reconfirmed: 2010-04-09 07:38:25


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Siarhei Siamashka 2010-04-09 00:05:16 UTC
Tested with gcc-4.5.0-RC-20100406.tar.bz2

Reduced testcase:

/*********************************************************/
#include <stdio.h>
#include <stdint.h>

char do_reverse_endian = 0;

#  define bswap_32(x) \
                        ((((x) & 0xff000000) >> 24) | \
                         (((x) & 0x00ff0000) >>  8) | \
                         (((x) & 0x0000ff00) <<  8) | \
                         (((x) & 0x000000ff) << 24))

#define EGET(X) \
        (__extension__ ({ \
                uint64_t __res; \
                if (!do_reverse_endian) {    __res = (X); \
                } else if (sizeof(X) == 4) { __res = bswap_32((X)); \
                } \
                __res; \
        }))

void __attribute__((noinline)) X(char **phdr, char **data, int *phoff)
{
    *phdr = *data + EGET(*phoff);
}

int main()
{
    char *phdr;
    char *data = (char *)0x40164000;
    int phoff = 0x34;
    X(&phdr, &data, &phoff);
    printf("got %p (expecting 0x40164034)\n", phdr);
    return 0;
}
/*********************************************************/

# gcc -Os -o test test.c
# ./test
got 0x74164000 (expecting 0x40164034)
Comment 1 Mikael Pettersson 2010-04-09 05:49:07 UTC
Questions:

1. Does compiling with -march=armv5tel make any difference?
2. Does gcc-4.4.3 work?
3. How was your gcc-4.5.0 configured?
Comment 2 Mikael Pettersson 2010-04-09 06:57:06 UTC
Compiling this with gcc-4.5.0-RC-20100406 -march=armv7-a -Os I get:

X:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L4
        ldr     r2, [r2, #0]
        ldr     r1, [r1, #0]
        ldrb    r3, [r3, #0]    @ zero_extendqisi2
        cmp     r3, #0
        moveq   r3, r2, asr #31
        rev     r2, r2
        movne   r3, #0
.L3:
        add     r2, r1, r2
        str     r2, [r0, #0]
        bx      lr
.L5:
        .align  2
.L4:
        .word   .LANCHOR0
...
.LANCHOR0 = . + 0
        .type   do_reverse_endian, %object
        .size   do_reverse_endian, 1
do_reverse_endian:
        .space  1

which looks completely bogus. It's unconditionally rev:ing *phoff in r2 and conditionally (based on do_reverse_endian) munging r3 which is dead.

I can't test this however as my HW is armv5tel.
Comment 3 Ramana Radhakrishnan 2010-04-09 07:02:25 UTC
I can see the same failure on 4.5 branch with the testcase. The flags I used on the command line were -mcpu=cortex-a8 -Os . 

cheers
Ramana
Comment 4 Ramana Radhakrishnan 2010-04-09 07:38:25 UTC
It appears as though there's a latent bug in arm_ccfsm_state_machine. If you mark this correctly as being "predicable" which the rev insn is - the bug goes away. 

If arm_final_prescan_insn sees something of the following form, it goes ahead and assumes it can convert the rev in this pattern into a condexec and removes the branch. 


(jump_insn:TI 12 15 19 /tmp/bitswap.c:23 (set (pc)
        (if_then_else (eq (reg:CC 24 cc)
                (const_int 0 [0x0]))
            (label_ref:SI 23)
            (pc))) 228 {*arm_cond_branch} (expr_list:REG_DEAD (reg:CC 24 cc)
        (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
            (nil)))
 -> 23)

(note 19 12 21 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn:TI 21 19 22 /tmp/bitswap.c:23 (set (reg:SI 2 r2 [orig:141 D.4313 ] [141])
        (bswap:SI (reg:SI 2 r2 [152]))) 347 {arm_rev} (nil))

(insn:TI 22 21 23 /tmp/bitswap.c:23 (set (reg/v:DI 2 r2 [orig:133 __res ] [133])
        (zero_extend:DI (reg:SI 2 r2 [orig:141 D.4313 ] [141]))) 138 {*arm_zero_extendsidi2} (expr_list:REG_UNUSED (reg:SI 3 r3)
        (nil)))


The patch below which I'm testing, fixes the immediate problem with rev as below. Longer term we ought to remove the ccfsm_state_machine once we fix the problem with conditional calls. 

Index: arm.md
===================================================================
--- arm.md      (revision 158138)
+++ arm.md      (working copy)
@@ -11201,8 +11201,9 @@
   [(set (match_operand:SI 0 "s_register_operand" "=r")
        (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
   "TARGET_EITHER && arm_arch6"
-  "rev\t%0, %1"
-  [(set (attr "length")
+  "rev%?\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set (attr "length")
         (if_then_else (eq_attr "is_thumb" "yes")
                      (const_int 2)
                      (const_int 4)))]



Ramana











Comment 5 Ramana Radhakrishnan 2010-04-09 07:48:57 UTC
Actually strike out the last patch - that's just wrong for Thumb1. This is what I am testing currently.


Index: arm.md
===================================================================
--- arm.md      (revision 158138)
+++ arm.md      (working copy)
@@ -11200,12 +11200,18 @@
 (define_insn "arm_rev"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
        (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-  "TARGET_EITHER && arm_arch6"
+  "TARGET_32BIT && arm_arch6"
+  "rev%?\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set_attr "length" "4")]
+)
+
+(define_insn "thumb1_rev"
+  [(set (match_operand:SI 0 "s_register_operand" "=l")
+       (bswap:SI (match_operand:SI 1 "s_register_operand" "l")))]
+  "TARGET_THUMB1 && arm_arch6"
   "rev\t%0, %1"
-  [(set (attr "length")
-        (if_then_else (eq_attr "is_thumb" "yes")
-                     (const_int 2)
-                     (const_int 4)))]
+  [(set_attr "length" "2")]
 )
 
 (define_expand "arm_legacy_rev"
Comment 6 Siarhei Siamashka 2010-04-09 08:04:52 UTC
(In reply to comment #1)
> 2. Does gcc-4.4.3 work?

Yes, gcc-4.4.3 works (it just does not use 'rev' instruction). So it is a regression in 4.5. Thanks for a very fast response and analysis of the issue.
Comment 7 Ramana Radhakrishnan 2010-04-12 08:38:07 UTC
Patch submitted here. 

http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html
Comment 8 Siarhei Siamashka 2010-04-12 09:34:11 UTC
(In reply to comment #7)
> Patch submitted here. 
> 
> http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html

Thank you. I have been testing it for two days already.

It really helps (in the sense that it is apparently better to have this fix than not to have). I have bootstrapped the hard vfp system successfully and did not notice any other problems so far. Btw, miscompilation (of all the same package) also happens with -O2 optimization settings in some other place, but I did not try to investigate where exactly it fails.

But I understand that it is just a workaround for the problem which happens somewhere in the upper layer? If REV instruction did not actually support conditional execution, then the fix would require actually finding the real cause.
Comment 9 ramana.radhakrishnan@arm.com 2010-04-12 09:51:25 UTC
Subject: Re:  [4.5/4.6 Regression] Invalid code when
	building gentoo pax-utils-0.1.19 with -Os optimizations


On Mon, 2010-04-12 at 09:34 +0000, siarhei dot siamashka at gmail dot
com wrote:
> 
> ------- Comment #8 from siarhei dot siamashka at gmail dot com  2010-04-12 09:34 -------
> (In reply to comment #7)
> > Patch submitted here. 
> > 
> > http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00401.html
> 
> Thank you. I have been testing it for two days already.
> 
> It really helps (in the sense that it is apparently better to have this fix
> than not to have). I have bootstrapped the hard vfp system successfully and did
> not notice any other problems so far. Btw, miscompilation (of all the same
> package) also happens with -O2 optimization settings in some other place, but I
> did not try to investigate where exactly it fails.
> 
> But I understand that it is just a workaround for the problem which happens
> somewhere in the upper layer? If REV instruction did not actually support
> conditional execution, then the fix would require actually finding the real
> cause.

There were 2 issues, one is that REV isn't marked as being allowed
conditional and the other is that the CCFSM state machine appears to go
wrong in this particular case which is something I need to investigate
when I have some more free time. (Note this is ARM backend specific
only.)

Thus the fix isn't a work-around because it introduces predicated
support for something like the REV instruction which is the right thing
to do here.




Comment 10 Siarhei Siamashka 2010-05-17 18:48:16 UTC
Maybe I'm too impatient, but is there anything that prevents this patch from getting committed to SVN?
Comment 11 Raúl Porcel 2010-07-03 16:05:51 UTC
Ping :)
Comment 12 Siarhei Siamashka 2010-07-19 13:54:07 UTC
Updated the summary to better describe the problem (which is distro independent).

The fact that this bug breaks pax-utils tool, which is a vital part of gentoo packaging system, thus rendering the system unusable is probably not so interesting in gcc bugzilla context :)
Comment 13 Ramana Radhakrishnan 2010-07-22 08:31:00 UTC
Subject: Bug 43698

Author: ramana
Date: Thu Jul 22 08:30:36 2010
New Revision: 162404

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162404
Log:
Fix PR target/43698

2010-07-22  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	PR target/43698
	* config/arm/arm.md: Split arm_rev into *arm_rev
	and *thumb1_rev. Set *arm_rev to be predicable.

2010-07-22  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	PR target/43698
	* gcc.target/arm/pr43698.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/arm/pr43698.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.md
    trunk/gcc/testsuite/ChangeLog

Comment 14 Siarhei Siamashka 2010-07-22 20:54:33 UTC
Thanks, this final variant of fix seems to work fine. Can this patch be backported to 4.5 branch and released with gcc 4.5.1 too?

As I see it, the risk should be minimal because current gcc 4.5 branch is so broken on armv6/armv7 because of this bug, that it simply can't become any worse. 

As recently discovered in MeeGo [1], this bug has a high chance of breaking just about any program which does endian byteswapping. The list of broken packages includes 'dbus' and 'utils-linux-ng' to name a few, but surely there are more.

1. http://bugs.meego.com/show_bug.cgi?id=3936
Comment 15 Ramana Radhakrishnan 2010-07-23 12:21:51 UTC
Patch can be backported and tested. But since 4.5 is frozen right now, needs RM permission. 

Adding RM to CC.

cheers
Ramana
Comment 16 rguenther@suse.de 2010-07-23 12:27:41 UTC
Subject: Re:  [4.5/4.6 Regression] Wrong use of ARMv6 REV
 instruction for endian bytewapping with -Os or -O2 optimizations

On Fri, 23 Jul 2010, ramana at gcc dot gnu dot org wrote:

> ------- Comment #15 from ramana at gcc dot gnu dot org  2010-07-23 12:21 -------
> Patch can be backported and tested. But since 4.5 is frozen right now, needs RM
> permission. 
> 
> Adding RM to CC.

Please wait until after the 4.5.1 release.

Richard.
Comment 17 Ramana Radhakrishnan 2010-07-30 22:36:05 UTC
Subject: Bug 43698

Author: ramana
Date: Fri Jul 30 22:35:40 2010
New Revision: 162725

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162725
Log:
Backport fix for PR target/43698.

2010-07-30  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	Backport from mainline.
	2010-07-22  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

 	PR target/43698
	* config/arm/arm.md: Split arm_rev into *arm_rev and *thumb1_rev.
	Set *arm_rev to be predicable.

2010-07-30  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	Backport from mainline
	2010-07-22  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

	PR target/43698
	* gcc.target/arm/pr43698.c: New test.


Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/arm/pr43698.c
      - copied unchanged from r162404, trunk/gcc/testsuite/gcc.target/arm/pr43698.c
Modified:
    branches/gcc-4_5-branch/   (props changed)
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/arm/arm.md
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog

Propchange: branches/gcc-4_5-branch/
            ('svn:mergeinfo' added)


Comment 18 Ramana Radhakrishnan 2010-07-30 22:38:53 UTC
And hence fixed. Thanks for allowing the backport.

Ramana