[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA
rearnsha at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Dec 17 13:08:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535
--- Comment #6 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Created attachment 31457
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31457&action=edit
Another testcase
Another testcase, but this one has some obvious examples of poor behaviour for
-Os.
In addition to the options used for the previous testcase, this one might need
-fno-strict-aliasing -fno-common -fomit-frame-pointer -fno-strength-reduce
Example one: spilling a value and also keeping a copy of it in a hard reg
over a call.
    mov r5, r1            <= R1 copied to R5
    sub sp, sp, #28
    str r1, [sp, #8]      <= And spilled to the stack
    mov r2, #12
    mov r1, #0
    mov r4, r0
    bl memset
    mov r3, #2
    mov r2, r5            <= Could reload from the stack instead
Example two: multiple reloads just to do arithmetic in a high register:
    ldr r3, [sp, #4]
    mov ip, r3            <= Copying value into high register
    add ip, ip, r5        <= Arithmetic
    mov r3, ip            <= Copying result back to original register
    str r3, [sp, #4]
    ldr r3, [sp, #12]
    mov ip, r3            <= And IP is dead anyway...
In this case,
    mov ip, r3
    add ip, ip, r5
    mov r3, ip
can be replaced entirely with
    add r3, r5
saving two completely unnecessary MOV instructions.
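
As a sanity check on the replacement in example two, here is a minimal Python sketch that models register values only (32-bit wraparound, no condition flags, no encodings) and compares what the two sequences leave in r3. The run() helper and the tuple encoding of instructions are made up for illustration; they are not part of GCC or of the attached testcase.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def val(state, operand):
    """Return an immediate as-is, or look up a register by name."""
    return operand if isinstance(operand, int) else state[operand]


def run(program, regs):
    """Run (op, dst, src...) tuples on a copy of regs; return the new state."""
    state = dict(regs)
    for op, dst, *srcs in program:
        if op == "mov":
            state[dst] = val(state, srcs[0]) & MASK
        elif op == "add":
            if len(srcs) == 1:  # two-operand form: dst = dst + src
                srcs = [dst, srcs[0]]
            state[dst] = (val(state, srcs[0]) + val(state, srcs[1])) & MASK
        else:
            raise ValueError(f"unsupported op: {op}")
    return state


# The sequence the compiler emitted.
ORIGINAL = [("mov", "ip", "r3"), ("add", "ip", "ip", "r5"), ("mov", "r3", "ip")]
# The suggested replacement.
SHORTER = [("add", "r3", "r5")]

START = {"r3": 100, "r5": 23, "ip": 0}  # arbitrary starting values
print(run(ORIGINAL, START)["r3"], run(SHORTER, START)["r3"])
```

The two sequences do leave different values in IP, but that does not matter here because the next instruction (mov ip, r3) overwrites it.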
Third, related case:
    mov r1, #12
    mov ip, r1
    add ip, ip, r4
    mov r1, ip
This could be done either as
    mov r1, #12
    add r1, r4
    mov ip, r1
or
    mov r1, r4
    add r1, #12
    mov ip, r1
both saving one instruction, or even two if the value doesn't really need
copying to a high reg.
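
To check that both shorter forms of the third case compute the same thing as the original, here is a small Python sketch that models register values only (32-bit wraparound, no condition flags, no encodings). The run() helper and the tuple encoding are hypothetical and exist only for this illustration.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def val(state, operand):
    """Return an immediate as-is, or look up a register by name."""
    return operand if isinstance(operand, int) else state[operand]


def run(program, regs):
    """Run (op, dst, src...) tuples on a copy of regs; return the new state."""
    state = dict(regs)
    for op, dst, *srcs in program:
        if op == "mov":
            state[dst] = val(state, srcs[0]) & MASK
        elif op == "add":
            if len(srcs) == 1:  # two-operand form: dst = dst + src
                srcs = [dst, srcs[0]]
            state[dst] = (val(state, srcs[0]) + val(state, srcs[1])) & MASK
        else:
            raise ValueError(f"unsupported op: {op}")
    return state


# Four instructions, as emitted.
EMITTED = [("mov", "r1", 12), ("mov", "ip", "r1"),
           ("add", "ip", "ip", "r4"), ("mov", "r1", "ip")]
# First alternative: load the constant, then add R4.
ALT_A = [("mov", "r1", 12), ("add", "r1", "r4"), ("mov", "ip", "r1")]
# Second alternative: copy R4, then add the constant.
ALT_B = [("mov", "r1", "r4"), ("add", "r1", 12), ("mov", "ip", "r1")]

START = {"r1": 0, "r4": 30, "ip": 0}  # arbitrary starting values
for name, seq in (("emitted", EMITTED), ("alt A", ALT_A), ("alt B", ALT_B)):
    end = run(seq, START)
    print(name, len(seq), "insns", "r1 =", end["r1"], "ip =", end["ip"])
```

All three leave the same value in both R1 and IP, so either alternative is a drop-in replacement as far as register contents go. If IP turns out to be dead afterwards, the final mov ip, r1 could go as well.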