Bug 39429 - compiler create bad asm codes.
Summary: compiler create bad asm codes.
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.3.3
: P2 normal
Target Milestone: 4.5.0
Assignee: Not yet assigned to anyone
Keywords: wrong-code
Depends on:
Reported: 2009-03-11 07:24 UTC by Ryos Suzuki
Modified: 2015-03-12 11:27 UTC (History)
8 users (show)

See Also:
Host: arm-linux, i386-linux, i386-freebsd
Target: arm-linux
Known to work:
Known to fail:
Last reconfirmed: 2009-03-16 22:53:12

test sources (6.67 KB, application/octet-stream)
2009-03-12 01:19 UTC, Ryos Suzuki
reduced test case in plain C (226 bytes, text/plain)
2009-07-12 11:29 UTC, Mikael Pettersson
fix arith_adjacentmem LDM splitting code (381 bytes, patch)
2009-07-13 13:07 UTC, Mikael Pettersson
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryos Suzuki 2009-03-11 07:24:43 UTC
the Compiler creates bad asm codes from following source.

    if ((mapsize - size) < 16*1024)
        return false;

    return true;

        .align  2
        .global _ZN10ArmGccTest12useOffScreenEv
        .type   _ZN10ArmGccTest12useOffScreenEv, %function
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r0, [r0, #1080]
        ldr     r3, [r0, #1084]
        sub     r0, r0, r3
        cmp     r0, #16384
        movcc   r0, #0
        movcs   r0, #1
        bx      lr

r0 is assigned for "this". however r0 is rewritten to use "mapsize", so r3 is broken.  You can see this problem by following simple c++ source.

class ArmGccTest {
        unsigned int value001;

        unsigned int value109;
        unsigned int value10a;
        unsigned int value10b;
        unsigned int value10c;
        unsigned int value10d;

        unsigned int mapsize;
        unsigned int size;

        unsigned int value10e;

        bool useOffScreen();
Comment 1 Ryos Suzuki 2009-03-12 01:19:44 UTC
Created attachment 17445 [details]
test sources

this is test sources.
Comment 2 Richard Earnshaw 2009-03-16 22:53:12 UTC
Confirmed.  This is a bug in the arith_adjacent_mem pattern that only triggers when the offset to the memory from the base pointer exceeds the range of a simple add instruction (ie more than 1024 bytes).  In that case we fall back to emitting two ldr instructions, but fail to consider the case when the first load overwrites the base address.
Comment 3 Mikael Pettersson 2009-07-11 20:20:37 UTC
It seems that cpu type and tuning options make a difference here. If I compile with -mcpu and -mtune referring to a cpu that does not imply FL_LDSCHED, such as arm740t, then I get the broken code that clobbers r0 before loading r3. Changing cpu and tune types to a cpu that does imply FL_LDSCHED, such as arm8 or xscale, then r3 is loaded before r0 is clobbered and the sub becomes an rsb.
Comment 4 Mikael Pettersson 2009-07-12 11:29:53 UTC
Created attachment 18179 [details]
reduced test case in plain C
Comment 5 Ramana Radhakrishnan 2009-07-12 20:51:39 UTC
(In reply to comment #4)
> Created an attachment (id=18179) [edit]
> reduced test case in plain C

What options did you use  ?  Did you use -O2 , -O3 or -Os  with the testcase you've added here ? I don't see the problem with 4.5.0 trunk 149479 with either -mcpu=arm740t or with arm7tdmi.

Comment 6 Mikael Pettersson 2009-07-12 21:21:44 UTC
(In reply to comment #5)
> What options did you use  ?  Did you use -O2 , -O3 or -Os  with the testcase
> you've added here ? I don't see the problem with 4.5.0 trunk 149479 with either
> -mcpu=arm740t or with arm7tdmi.

Either -O2 or -Os plus -mcpu=arm740t will trigger it in gcc-4.3.4 and gcc-4.4.1. After prepping a patch for 4.4.1 I noticed that I couldn't trigger it in 4.5; I'm currently bisecting 4.5 to identify what changed it there.
Comment 7 Mikael Pettersson 2009-07-12 23:58:35 UTC
Revision 146451 on 4.5 changed it from generating broken code to generating not-so-broken code. That's completely unexpected since that revision is a enable-bootstrap-with-c++ thing which isn't supposed to change any behaviour.
Comment 8 Mikael Pettersson 2009-07-13 13:05:53 UTC
Mystery solved. Buried in revision 146451, which should just fix enum conversions for C++ compatibility, is the following bug fix:

--- trunk/gcc/config/arm/arm.c  2009/04/20 19:30:55     146450
+++ trunk/gcc/config/arm/arm.c  2009/04/20 19:35:00     146451
@@ -7408,7 +7410,7 @@
       /* Don't accept any offset that will require multiple
         instructions to handle, since this would cause the
         arith_adjacentmem pattern to output an overlong sequence.  */
-      if (!const_ok_for_op (PLUS, val0) || !const_ok_for_op (PLUS, val1))
+      if (!const_ok_for_op (val0, PLUS) || !const_ok_for_op (val1, PLUS))
        return 0;
       /* Don't allow an eliminable register: register elimination can make

The parameters to const_ok_for_op had been swapped, causing this if statement to not reject offsets that are awkward for ARM. Combined with a non-FL_LDSCHED cpu type this enabled arith_adjacentmem for a bad offset, which forced it to split the LDM into two LDRs, and that code fails to order the LDRs to avoid clobbering the shared base register. With the above patch arith_adjacentmem will not trigger for bad offsets, avoiding the broken LDM splitting code.

This patch is needed also for the 4.4 and 4.3 branches, and I've checked that it fixes this test case there too.

It seems that there is a bit of redundancy between the adjacent_mem_locations test and the arith_adjacentmem pattern. Both check const_ok_for_arm on the offset and the negated offset. The first attempts to reject bad offsets, while the second attempts to handle them. I'm not sure, but I _think_ that the code in arith_adjacentmem to split an LDM into two LDRs is now dead (after the bug fix above). However, just in case it isn't, I'm attaching a patch to correct it.

Unrelated to this PR, buried in revision 146451 is another bug fix:

--- trunk/gcc/config/arm/arm.c  2009/04/20 19:30:55     146450
+++ trunk/gcc/config/arm/arm.c  2009/04/20 19:35:00     146451
@@ -5465,7 +5465,7 @@
       return true;
     case ABS:
-      if (GET_MODE_CLASS (mode == MODE_FLOAT))
+      if (GET_MODE_CLASS (mode) == MODE_FLOAT)
          if (TARGET_HARD_FLOAT && (mode == SFmode || mode == DFmode))

This one is also needed in 4.4, but not in 4.3.
Comment 9 Mikael Pettersson 2009-07-13 13:07:31 UTC
Created attachment 18186 [details]
fix arith_adjacentmem LDM splitting code
Comment 10 Ramana Radhakrishnan 2015-03-12 11:27:46 UTC
Fixed in 4.5.0 but not earlier.