Bug 11820 - Unoptimized complementary conditional instructions
Status: RESOLVED DUPLICATE of bug 5738
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 3.4.0
Importance: P3 minor
Target Milestone: 3.4.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-08-06 07:01 UTC by Gábor Lóki
Modified: 2008-06-06 16:08 UTC

See Also:
Host: i686-pc-linux-gnu
Target: arm-unknown-elf
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Description Gábor Lóki 2003-08-06 07:01:10 UTC
After the CE3 and SCHED2 passes, GCC does not further optimize conditionally
executed instructions.
The generated code sometimes contains two conditionally executed instructions
that compute the same value under complementary conditions (e.g. on ARM:
"movle r3, #12" and "movgt r3, #12"). Such a pair could be combined into
one unconditional instruction.
The problem generally occurs when the source code contains the same statement
in both branches of an if-statement.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o comp-cond.s comp-cond.c
int a,c;
void foo(int b)
{
  if (c > 13)
  {
    a = 12;
  }
  else
  {
    a = 12;
    c = b + 3;
  }
}

--- arm code ---
foo:
 ldr r1, .L4
 ldr r3, [r1, #0]
 ldr r2, .L4+4
 cmp r3, #13
 add r0, r0, #3
 movgt r3, #12 <- OLD
 movle r3, #12 <- OLD
 strgt r3, [r2, #0] <- OLD
 strle r3, [r2, #0] <- OLD
 strle r0, [r1, #0]
 mov pc, lr 

--- possible solution ---
foo:
 ldr r1, .L4
 ldr r3, [r1, #0]
 ldr r2, .L4+4
 cmp r3, #13
 add r0, r0, #3
 mov r3, #12 <- NEW
 str r3, [r2, #0] <- NEW
 strle r0, [r1, #0]
 mov pc, lr
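
In source terms, the proposed output corresponds to hoisting the common store out of both branches. The following is only a minimal sketch of that equivalence, not what GCC does internally: foo_hoisted and check_equivalent are hypothetical names introduced for illustration, and foo is copied from the example above.

```c
#include <assert.h>

int a, c;

/* Original form from the report: the same store appears in both branches. */
void foo(int b)
{
  if (c > 13)
  {
    a = 12;
  }
  else
  {
    a = 12;
    c = b + 3;
  }
}

/* Hypothetical hand-hoisted form: the common store is done unconditionally,
   and only the store to c stays conditional. */
void foo_hoisted(int b)
{
  a = 12;
  if (c <= 13)
    c = b + 3;
}

/* Runs both forms from the same starting state and checks that they leave
   the globals a and c with the same values. */
void check_equivalent(int c0, int b)
{
  int a1, c1;

  a = 0; c = c0;
  foo(b);
  a1 = a; c1 = c;

  a = 0; c = c0;
  foo_hoisted(b);
  assert(a == a1 && c == c1);
}
```

Calling check_equivalent with values of c on both sides of 13 exercises both branches.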
Comment 1 Andrew Pinski 2003-08-06 12:16:56 UTC
This is a dup of bug 5738 which has more analysis.

*** This bug has been marked as a duplicate of 5738 ***
Comment 2 derek white 2008-06-06 16:08:05 UTC
Not a duplicate of bug 5738 (?)

I don't think this can be handled by "GCSE code hoisting". The previous example has the common subexpression independent of anything else, so the code could be hoisted or delayed. But common code can both depend on, and be depended upon by, non-common code in ways that tend to block CSE.

Look at this example and the addgt and addle instructions below.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o comp-cond.s comp-cond.c
int x,y;
void foo(int b) {
  if (b > 13) {
    x = y + 0xFF;
  } else {
    y = x + 0xFF;
  }
}

--- arm code ---
foo:
  ldr	r1, .L6
  ldr	r2, .L6+4
  cmp	r0, #13
  ldrgt	r3, [r2, #0]
  ldrle	r3, [r1, #0]
  addgt	r3, r3, #255   <-- DUPLICATE OF BELOW
  addle	r3, r3, #255   <-- replace both with: add r3, r3, #255
  strgt	r3, [r1, #0]
  strle	r3, [r2, #0]
  bx	lr

This is in gcc-4.2.1.

I don't know GCC's pass order, but note that this common code is generated somewhere in the ARM code generator, which may be too late for CSE. It looks like a simple peephole optimization could clean this up, though.
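
To make the peephole idea concrete, here is a rough sketch over a toy instruction list. It is an assumption-laden illustration, not GCC code: struct insn, enum cond, complementary and merge_complementary are hypothetical names, instructions are compared as plain text, only the GT/LE pair is modelled, and the instructions are assumed not to set the condition flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these types are hypothetical and unrelated to GCC's RTL. */
enum cond { AL, GT, LE };   /* AL = always (unconditional) */

struct insn {
  enum cond cond;
  char op[8];         /* mnemonic without condition suffix, e.g. "add" */
  char operands[24];  /* operand text, e.g. "r3, r3, #255" */
};

/* True if exactly one of the two conditions holds for any flag state. */
static int complementary(enum cond x, enum cond y)
{
  return (x == GT && y == LE) || (x == LE && y == GT);
}

/* Scans adjacent pairs; when two neighbours are textually identical and
   execute under complementary conditions, keeps a single unconditional copy.
   Works in place and returns the new instruction count.
   Assumption: none of the instructions modify the condition flags. */
size_t merge_complementary(struct insn *code, size_t n)
{
  size_t out = 0;

  for (size_t i = 0; i < n; i++) {
    if (i + 1 < n
        && complementary(code[i].cond, code[i + 1].cond)
        && strcmp(code[i].op, code[i + 1].op) == 0
        && strcmp(code[i].operands, code[i + 1].operands) == 0) {
      code[out] = code[i];
      code[out].cond = AL;   /* one unconditional instruction replaces the pair */
      out++;
      i++;                   /* skip the second half of the pair */
    } else {
      code[out++] = code[i];
    }
  }
  assert(out <= n);
  return out;
}
```

Applied to the six conditional instructions in the listing above, this would merge only the addgt/addle pair; the ldr and str pairs use different operands and stay conditional.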