After the CE3 and SCHED2 phases, GCC does not optimize conditionally executed expressions. Sometimes the generated code contains two conditionally executed instructions that compute the same value under complementary conditions (e.g. on ARM: "movle r3, #12" and "movgt r3, #12"). These instructions could be combined into one unconditional computation. The problem generally occurs when the source code contains the same statement in both branches of an if-statement.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o comp-cond.s comp-cond.c
int a,c;
void foo(int b)
{
  if (c > 13) {
    a = 12;
  } else {
    a = 12;
    c = b + 3;
  }
}

--- arm code ---
foo:
	ldr	r1, .L4
	ldr	r3, [r1, #0]
	ldr	r2, .L4+4
	cmp	r3, #13
	add	r0, r0, #3
	movgt	r3, #12		<- OLD
	movle	r3, #12		<- OLD
	strgt	r3, [r2, #0]	<- OLD
	strle	r3, [r2, #0]	<- OLD
	strle	r0, [r1, #0]
	mov	pc, lr

--- possible solution ---
foo:
	ldr	r1, .L4
	ldr	r3, [r1, #0]
	ldr	r2, .L4+4
	cmp	r3, #13
	add	r0, r0, #3
	mov	r3, #12		<- NEW
	str	r3, [r2, #0]	<- NEW
	strle	r0, [r1, #0]
	mov	pc, lr
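
For reference, the merge requested above can be written out by hand at the source level. This is only a sketch of the intended semantics, not a claim about what any GCC pass does or would emit for it; foo_original, foo_hoisted, same_result and check_hoisting are names made up for this illustration.

```c
#include <assert.h>

int a, c;

/* Shape from the report: the store to 'a' appears in both arms. */
void foo_original(int b)
{
    if (c > 13) {
        a = 12;
    } else {
        a = 12;
        c = b + 3;
    }
}

/* Hand-hoisted form: the common store is performed once, unconditionally,
   and only the store to 'c' stays conditional. */
void foo_hoisted(int b)
{
    a = 12;
    if (c <= 13)
        c = b + 3;
}

/* Hypothetical helper: run both versions from the same starting state
   and return 1 if they leave 'a' and 'c' identical. */
int same_result(int c0, int b)
{
    int a1, c1;

    a = 0; c = c0; foo_original(b); a1 = a; c1 = c;
    a = 0; c = c0; foo_hoisted(b);
    return a == a1 && c == c1;
}

/* Hypothetical helper: compare both versions on either side of the
   c > 13 boundary for a small range of inputs. */
void check_hoisting(void)
{
    int c0, b;

    for (c0 = 10; c0 <= 16; c0++)
        for (b = -5; b <= 5; b++)
            assert(same_result(c0, b));
}
```

Calling check_hoisting() exercises both versions around the c > 13 boundary. The hoisted form corresponds to the "possible solution" listing: one unconditional mov/str pair, with only the store to c left conditional.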
This is a dup of bug 5738 which has more analysis.

*** This bug has been marked as a duplicate of 5738 ***
Not a duplicate of bug 5738 (?)

I don't think this can be handled by "GCSE code hoisting". The previous example has a common subexpression that is independent of anything else, so the code could be hoisted or delayed. But common code could depend on, and be depended upon by, non-common code in ways that tend to block CSE. Look at this example and the addgt and addle instructions below.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o comp-cond.s comp-cond.c
int x,y;
void foo(int b)
{
  if (b > 13) {
    x = y + 0xFF;
  } else {
    y = x + 0xFF;
  }
}

--- arm code ---
foo:
	ldr	r1, .L6
	ldr	r2, .L6+4
	cmp	r0, #13
	ldrgt	r3, [r2, #0]
	ldrle	r3, [r1, #0]
	addgt	r3, r3, #255	<-- DUPLICATE OF BELOW
	addle	r3, r3, #255	<-- replace both with: add r3, r3, #255
	strgt	r3, [r1, #0]
	strle	r3, [r2, #0]
	bx	lr

This is in gcc-4.2.1. I don't know the phase order of gcc, but note that this common code is generated by gcc in the ARM code generator somewhere, which may be too late for CSE? It looks like a simple peephole optimization could clean this up though.
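
To make the intended result concrete, here is a hand-written source-level equivalent of the second example in which the addition occurs only once, between a conditional load and a conditional store. It is a sketch of the semantics only: foo_original, foo_merged, same_result and check_merge are hypothetical names, and nothing here shows that gcc-4.2.1 or any other version would emit a single unconditional add for it.

```c
#include <assert.h>

int x, y;

/* Shape from the report: the "+ 0xFF" appears in both arms, but its
   operand and its destination differ between the arms. */
void foo_original(int b)
{
    if (b > 13) {
        x = y + 0xFF;
    } else {
        y = x + 0xFF;
    }
}

/* Hand-merged form: only the choice of source and destination is
   conditional; the addition itself is unconditional. */
void foo_merged(int b)
{
    int *src = (b > 13) ? &y : &x;
    int *dst = (b > 13) ? &x : &y;

    *dst = *src + 0xFF;
}

/* Hypothetical helper: run both versions from the same starting state
   and return 1 if they leave 'x' and 'y' identical. */
int same_result(int b, int x0, int y0)
{
    int x1, y1;

    x = x0; y = y0; foo_original(b); x1 = x; y1 = y;
    x = x0; y = y0; foo_merged(b);
    return x == x1 && y == y1;
}

/* Hypothetical helper: compare both versions on either side of the
   b > 13 boundary. */
void check_merge(void)
{
    int b;

    for (b = 10; b <= 16; b++)
        assert(same_result(b, 1, 1000));
}
```

The merged form mirrors the listing above: ldrgt/ldrle select the operand, strgt/strle select the destination, and the add in between is the same in both cases, which is why it does not need a condition.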