Bug 9760 - [arm] Combine cannot do its job because immediate operand is used instead of register
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.3
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2003-02-19 17:06 UTC by Arpad Beszedes
Modified: 2021-07-26 07:29 UTC
CC List: 5 users

See Also:
Host:
Target: arm-*-elf
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-02-10 11:02:55


Attachments
combine-imm.tar.gz (675 bytes, application/x-gzip )
2003-05-21 15:17 UTC, Arpad Beszedes

Description Arpad Beszedes 2003-02-19 17:06:00 UTC
On the ARM target, combine cannot merge a shift insn and an arithmetic insn into a single insn when the second operand of the arithmetic insn is an immediate, because the combined ARM instruction accepts only a register for that source operand.

Release:
gcc version 3.3 20030217 (prerelease)

Environment:
BUILD & HOST: Linux 2.4.20 i686 unknown
TARGET: arm-unknown-elf

How-To-Repeat:
gcc -Os -S 01.c

// 01.c

void func(char c, int t)
{
  ;
}
void foo(int u)
{
  func ( 8, (u >> 24) & 0xffL );
  func ( 8, (u >> 16) & 0xffL );
  func ( 8, (u >> 8) & 0xffL );
}
Comment 1 Dara Hazeghi 2003-05-26 19:49:38 UTC
Hello,

with gcc 3.3 I get the following code:

        .file   "01.i"
        .text
        .align  2
        .global func
        .type   func, %function
func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        @ lr needed for prologue
        mov     pc, lr
        .size   func, .-func
        .align  2
        .global foo
        .type   foo, %function
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov     ip, sp
        stmfd   sp!, {r4, fp, ip, lr, pc}
        mov     r1, r0, lsr #24
        sub     fp, ip, #4
        mov     r4, r0
        mov     r0, #8
        bl      func
        mov     r1, r4, asr #16
        and     r1, r1, #255
        mov     r0, #8
        bl      func
        mov     r4, r4, asr #8
        and     r4, r4, #255
        mov     r1, r4
        mov     r0, #8
        ldmea   fp, {r4, fp, sp, lr}
        b       func
        .size   foo, .-foo
        .ident  "GCC: (GNU) 3.3 20030508 (prerelease)"

with mainline, I get this slightly different code:

        .file   "01.i"
        .text
        .align  2
        .global func
        .type   func, %function
func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        @ lr needed for prologue
        mov     pc, lr
        .size   func, .-func
        .align  2
        .global foo
        .type   foo, %function
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov     ip, sp
        stmfd   sp!, {r4, fp, ip, lr, pc}
        mov     r1, r0, lsr #24
        sub     fp, ip, #4
        mov     r4, r0
        mov     r0, #8
        bl      func
        mov     r1, r4, lsr #16
        and     r1, r1, #255
        mov     r0, #8
        bl      func
        mov     r4, r4, lsr #8
        and     r1, r4, #255
        mov     r0, #8
        ldmea   fp, {r4, fp, sp, lr}
        b       func
        .size   foo, .-foo
        .ident  "GCC: (GNU) 3.4 20030508 (experimental)"

Is this an improvement from the code you saw (sorry, I don't know arm asm, and there are too many changes from the .s file you provided just to do a diff)? Thanks,

Dara
Comment 2 Andrew Pinski 2003-05-26 20:09:17 UTC
See Dara's question.
Comment 3 Gábor Lóki 2003-05-30 12:23:30 UTC
Hello,

>Is this an improvement from the code you saw (sorry, I don't know arm asm, and
>there are too many changes from the .s file you provided just to do a diff)?

No, the problem is still present on branch and mainline (see the differences
in the attached examples).

The problematic code is this:

foo:
...
  mov     r1, r4, asr #16
  and     r1, r1, #255
  mov     r0, #8
  bl      func
  mov     r4, r4, asr #8
  and     r4, r4, #255
...

Expected code (as shown in the attached example):

foo:
...
  mov     r5, #255
  and     r1, r5, r4, asr #16
  mov     r0, #8
  bl      func
  and     r4, r5, r4, asr #8
...


Regards,
  Gabor Loki
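
To make the equivalence of the two sequences concrete, here is a minimal C sketch that models them. The helper names (asr, two_insn_form, one_insn_form) are hypothetical and only illustrate the register semantics; they are not taken from the attached example or from GCC.

```c
#include <stdint.h>

/* Model of the ARM "asr #n" operand shift (assumes 0 <= n <= 31). */
static uint32_t asr(uint32_t x, unsigned n)
{
    /* Shift the bits down, then fill the vacated top bits with the sign bit. */
    uint32_t shifted = x >> n;
    uint32_t sign_fill = (x & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u;
    return shifted | sign_fill;
}

/* mov r1, r4, asr #n ; and r1, r1, #255 */
static uint32_t two_insn_form(uint32_t r4, unsigned n)
{
    uint32_t r1 = asr(r4, n);
    r1 = r1 & 255u;
    return r1;
}

/* mov r5, #255 (done once, outside) ; and r1, r5, r4, asr #n */
static uint32_t one_insn_form(uint32_t r4, unsigned n)
{
    const uint32_t r5 = 255u;   /* the mask held in a register */
    return r5 & asr(r4, n);
}
```

Both forms compute the same value; the second only needs the mask to live in a register so that the shift can ride along as the shifted operand of the and.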
Comment 4 Andrew Pinski 2003-06-02 01:18:46 UTC
According to the submitter the problem still exists.
Comment 5 Andrew Pinski 2004-08-20 06:42:57 UTC
Note: to reproduce this on mainline you have to remove the definition of the function func. With the definition visible, func is effectively a pure function, so the calls to it are removed and the rest of foo becomes dead code.
Comment 6 Ramana Radhakrishnan 2009-03-12 18:45:28 UTC
With mainline today, GCC produces:

        stmfd   sp!, {r4, lr}
        mov     r1, r0, lsr #24
        mov     r4, r0
        mov     r0, #8
        bl      func
        mov     r1, r4, lsr #16
        and     r1, r1, #255
        mov     r0, #8
        bl      func
        mov     r1, r4, lsr #8
        and     r1, r1, #255
        mov     r0, #8
        ldmfd   sp!, {r4, lr}
        b       func

The problem still exists. This can't be a problem with the combiner, because combine stops at function call boundaries.


Comment 7 Steven Bosscher 2010-02-08 16:11:04 UTC
New test case is with func defined extern, as already mentioned in comment #5:

extern void func(char c, int t);

void foo(int u)
{
  func ( 8, (u >> 24) & 0xffL );
  func ( 8, (u >> 16) & 0xffL );
  func ( 8, (u >> 8) & 0xffL );
}

Trunk today (r156595) produces this:

        stmfd   sp!, {r4, lr}
        mov     r4, r0
        mov     r1, r4, lsr #24
        mov     r0, #8
        bl      func
        mov     r1, r4, lsr #16
        mov     r0, #8
        and     r1, r1, #255
        bl      func
        mov     r1, r4, lsr #8
        mov     r0, #8
        and     r1, r1, #255
        ldmfd   sp!, {r4, lr}
        b       func

Can someone please explain what the expected code is?
Comment 8 Richard Earnshaw 2010-02-08 16:30:17 UTC
Subject: Re: [arm] Combine cannot do its job because immediate operand is used instead of register


On Mon, 2010-02-08 at 16:11 +0000, steven at gcc dot gnu dot org wrote:
> Can someone please explain what the expected code is?

Something like

        stmfd   sp!, {r3, r4, r5, lr}
        mov     r4, r0
        mov     r5, #255
        mov     r1, r4, lsr #24
        mov     r0, #8
        bl      func
        and     r1, r5, r4, lsr #16
        mov     r0, #8
        bl      func
        mov     r1, r5, r4, lsr #8
        mov     r0, #8
        ldmfd   sp!, {r3, r4, r5, lr}
        b       func

By putting 255 into a register we add one instruction but remove two, so
we save one overall.  This is one of the rare cases where an immediate is
actually more expensive on ARM than using a register.
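
The instruction count above can be checked with a small sketch. The function names are hypothetical, and the model counts only the shift and mask instructions; it ignores register pressure and the extra registers saved in the prologue.

```c
/* Immediate mask: each shift-and-mask needs
   "mov rd, rm, lsr #k" followed by "and rd, rd, #255". */
static unsigned cost_immediate_mask(unsigned n)
{
    return 2u * n;
}

/* Register mask: one "mov r5, #255" up front, then a single
   "and rd, r5, rm, lsr #k" per shift-and-mask. */
static unsigned cost_register_mask(unsigned n)
{
    return n + 1u;
}
```

In the test case only two of the three calls need the mask (lsr #24 already leaves just eight bits), so n is 2: four instructions with the immediate versus three with the register. With a single masked shift the two forms cost the same.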

Comment 9 Richard Earnshaw 2010-02-08 16:31:20 UTC
Subject: Re: [arm] Combine cannot do its job because immediate operand is used instead of register


On Mon, 2010-02-08 at 16:30 +0000, rearnsha at arm dot com wrote:
>         mov     r1, r5, r4, lsr #8
> 

Should, of course, be AND.


Comment 10 Andrew Pinski 2021-07-26 07:29:31 UTC
Still happens on the trunk.
Note that armv7-a produces reasonable code, though, because it uses ubfx (introduced with ARMv6T2, so available on armv7-a):
        push    {r4, lr}
        mov     r4, r0
        movs    r0, #8
        lsrs    r1, r4, #24
        bl      func
        ubfx    r1, r4, #16, #8
        movs    r0, #8
        bl      func
        ubfx    r1, r4, #8, #8
        movs    r0, #8
        pop     {r4, lr}
        b       func
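
For readers unfamiliar with ubfx, the following C sketch models what the instruction computes and why it can replace the shift-and-mask pair. The function names are hypothetical, and the model assumes 1 <= width <= 32 and lsb + width <= 32.

```c
#include <stdint.h>

/* Model of "ubfx rd, rn, #lsb, #width": extract `width` bits of rn
   starting at bit `lsb`, zero-extended into the result. */
static uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width)
{
    /* Avoid shifting by 32, which is undefined in C. */
    uint32_t mask = (width == 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (rn >> lsb) & mask;
}

/* The expression from the test case, written with an unsigned shift.
   For k <= 24 the low eight bits are the same as with an arithmetic
   shift, so the masked result is unchanged. */
static uint32_t shift_and_mask(int u, unsigned k)
{
    return ((uint32_t)u >> k) & 0xffu;
}
```

Under these assumptions, ubfx(x, 16, 8) and ubfx(x, 8, 8) give the same values as the second and third arguments computed in foo, which is why a single instruction suffices on armv7-a.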