Combine cannot merge a shift insn and an arithmetic insn into one insn on the ARM target, because an immediate is used as the second operand of the arithmetic insn, and the combined ARM instruction can use only a register for that source operand.

Release:
gcc version 3.3 20030217 (prerelease)

Environment:
BUILD & HOST: Linux 2.4.20 i686 unknown
TARGET: arm-unknown-elf

How-To-Repeat:
gcc -Os -S 01.c

// 01.c
void func(char c, int t)
{
  ;
}

void foo(int u)
{
  func ( 8, (u >> 24) & 0xffL );
  func ( 8, (u >> 16) & 0xffL );
  func ( 8, (u >> 8) & 0xffL );
}
Hello,

with gcc 3.3 I get the following code:

        .file "01.i"
        .text
        .align 2
        .global func
        .type func, %function
func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        @ lr needed for prologue
        mov pc, lr
        .size func, .-func
        .align 2
        .global foo
        .type foo, %function
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov ip, sp
        stmfd sp!, {r4, fp, ip, lr, pc}
        mov r1, r0, lsr #24
        sub fp, ip, #4
        mov r4, r0
        mov r0, #8
        bl func
        mov r1, r4, asr #16
        and r1, r1, #255
        mov r0, #8
        bl func
        mov r4, r4, asr #8
        and r4, r4, #255
        mov r1, r4
        mov r0, #8
        ldmea fp, {r4, fp, sp, lr}
        b func
        .size foo, .-foo
        .ident "GCC: (GNU) 3.3 20030508 (prerelease)"

With mainline, I get this slightly different code:

        .file "01.i"
        .text
        .align 2
        .global func
        .type func, %function
func:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        @ lr needed for prologue
        mov pc, lr
        .size func, .-func
        .align 2
        .global foo
        .type foo, %function
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        mov ip, sp
        stmfd sp!, {r4, fp, ip, lr, pc}
        mov r1, r0, lsr #24
        sub fp, ip, #4
        mov r4, r0
        mov r0, #8
        bl func
        mov r1, r4, lsr #16
        and r1, r1, #255
        mov r0, #8
        bl func
        mov r4, r4, lsr #8
        and r1, r4, #255
        mov r0, #8
        ldmea fp, {r4, fp, sp, lr}
        b func
        .size foo, .-foo
        .ident "GCC: (GNU) 3.4 20030508 (experimental)"

Is this an improvement from the code you saw (sorry, I don't know arm asm, and there are too many changes from the .s file you provided just to do a diff)?

Thanks,
Dara
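For readers comparing the two listings: in foo, the 3.3 output shifts with asr before the AND, while mainline shifts with lsr. The following is a minimal C sketch, assuming 32-bit registers, of why both give the same masked byte and why the shift by 24 needs no AND at all. The helper names lsr32 and asr32 are hypothetical and only model the two ARM shift kinds.

```c
#include <stdint.h>

/* Logical shift right: vacated high bits are filled with zeros. */
uint32_t lsr32(uint32_t x, unsigned n)
{
    return x >> n;
}

/* Arithmetic shift right for 0 < n < 32: vacated high bits are
   filled with copies of the sign bit (bit 31). */
uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t fill = (x & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u;
    return (x >> n) | fill;
}

/* For n <= 24 the low 8 bits of the shifted value are bits n..n+7 of x,
   so the fill bits never reach the masked byte and
   (asr32(x, n) & 0xff) == (lsr32(x, n) & 0xff).
   For n == 24, lsr32(x, 24) is already at most 0xff, so the AND is
   redundant; this matches the single "mov r1, r0, lsr #24" in both listings. */
```

So the change from asr to lsr does not by itself remove any instruction; the shift by 16 and the shift by 8 are still each followed by a separate AND.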
See Dara's question.
Hello,

>Is this an improvement from the code you saw (sorry, I don't know arm asm, and
>there are too many changes from the .s file you provided just to do a diff)?

No, the problem is still present on branch and mainline (see the differences in the attached examples).

The problematic code is this:
foo:
        ...
        mov r1, r4, asr #16
        and r1, r1, #255
        mov r0, #8
        bl func
        mov r4, r4, asr #8
        and r4, r4, #255
        ...

Solution (as the attached example said):
foo:
        ...
        mov r5, #255
        and r1, r5, r4, asr #16
        mov r0, #8
        bl func
        and r4, r5, r4, asr #8
        ...

Regards,
Gabor Loki
According to the submitter the problem still exists.
Note: to reproduce this on mainline you have to remove the definition of the function func, because func is really a pure function, so the calls to it are removed and the rest of foo is dead code.
With mainline today, gcc produces:

        stmfd sp!, {r4, lr}
        mov r1, r0, lsr #24
        mov r4, r0
        mov r0, #8
        bl func
        mov r1, r4, lsr #16
        and r1, r1, #255
        mov r0, #8
        bl func
        mov r1, r4, lsr #8
        and r1, r1, #255
        mov r0, #8
        ldmfd sp!, {r4, lr}
        b func

The problem still exists. This can't be a problem with the combiner, because the combiner would stop at function call boundaries.
New test case is with func defined extern, as already mentioned in comment #5:

extern void func(char c, int t);

void foo(int u)
{
  func ( 8, (u >> 24) & 0xffL );
  func ( 8, (u >> 16) & 0xffL );
  func ( 8, (u >> 8) & 0xffL );
}

Trunk today (r156595) produces this:

        stmfd sp!, {r4, lr}
        mov r4, r0
        mov r1, r4, lsr #24
        mov r0, #8
        bl func
        mov r1, r4, lsr #16
        mov r0, #8
        and r1, r1, #255
        bl func
        mov r1, r4, lsr #8
        mov r0, #8
        and r1, r1, #255
        ldmfd sp!, {r4, lr}
        b func

Can someone please explain what the expected code is?
Subject: Re: [arm] Combine cannot do its job because immediate operand is used instead of register

On Mon, 2010-02-08 at 16:11 +0000, steven at gcc dot gnu dot org wrote:
> Can someone please explain what the expected code is?

Something like

        stmfd sp!, {r3, r4, r5, lr}
        mov r4, r0
        mov r5, #255
        mov r1, r4, lsr #24
        mov r0, #8
        bl func
        and r1, r5, r4, lsr #16
        mov r0, #8
        bl func
        mov r1, r5, r4, lsr #8
        mov r0, #8
        ldmfd sp!, {r3, r4, r5, lr}
        b func

By putting 255 into a register we add one instruction, but remove 2; so save one overall.

This is one of the rare cases where an immediate is actually more expensive on ARM than using a register.
Subject: Re: [arm] Combine cannot do its job because immediate operand is used instead of register

On Mon, 2010-02-08 at 16:30 +0000, rearnsha at arm dot com wrote:
>         mov r1, r5, r4, lsr #8
>

Should, of course, be AND.
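To make the instruction-count argument concrete, here is a minimal C sketch. It assumes a simplified cost model that counts only the shift, AND, and mask-load instructions for n masked byte extractions; it ignores register pressure and the effect of saving an extra callee-saved register. All function names are hypothetical and not part of GCC.

```c
#include <stdint.h>

/* Immediate form: two instructions per extraction. */
uint32_t extract_imm(uint32_t x, unsigned s)
{
    uint32_t t = x >> s;   /* mov rd, rm, lsr #s */
    return t & 255u;       /* and rd, rd, #255   */
}

/* Register form: the mask already sits in a register (mov r5, #255),
   so a single "and rd, r5, rm, lsr #s" performs shift and mask together. */
uint32_t extract_reg(uint32_t x, unsigned s, uint32_t mask)
{
    return mask & (x >> s);
}

/* Instructions needed for n masked extractions under this model. */
unsigned cost_imm(unsigned n) { return 2u * n; }   /* shift + and, n times     */
unsigned cost_reg(unsigned n) { return n + 1u; }   /* one mask load + n ands   */
```

The test case has two masked extractions (the shifts by 16 and by 8; the shift by 24 needs no mask), so this model gives 4 instructions for the immediate form and 3 for the register form, which is the one-instruction saving described above. With a single extraction both forms cost 2, so the register form only pays off when the mask is reused.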
Still happens on the trunk. Note that armv7-a produces reasonable code, though, as it uses ubfx, which is new for armv7-a:

        push {r4, lr}
        mov r4, r0
        movs r0, #8
        lsrs r1, r4, #24
        bl func
        ubfx r1, r4, #16, #8
        movs r0, #8
        bl func
        ubfx r1, r4, #8, #8
        movs r0, #8
        pop {r4, lr}
        b func
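For readers unfamiliar with ubfx (unsigned bit field extract), here is a minimal C sketch of what the instruction computes: it takes width bits of the source starting at bit lsb and zero-extends them. The helper name ubfx_model is hypothetical, and the sketch assumes lsb + width <= 32.

```c
#include <stdint.h>

/* Model of "ubfx rd, rn, #lsb, #width": extract `width` bits of rn
   starting at bit `lsb`, zero-extended. Assumes lsb + width <= 32. */
uint32_t ubfx_model(uint32_t rn, unsigned lsb, unsigned width)
{
    /* Avoid an out-of-range shift when the full 32 bits are requested. */
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (rn >> lsb) & mask;
}
```

Under this model, ubfx_model(u, 16, 8) is the same value as (u >> 16) & 0xff, which is why one ubfx replaces the shift-plus-AND pair in the listing above.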