Improved Global Constant/Copy Propagation

March 10, 1999

We are pleased to announce that Cygnus has donated improvements to the global constant propagation optimization.

Previously the global constant/copy propagation optimizer would never propagate a constant or copy into a conditional jump. As a result the compiler often missed important optimizations. This is best illustrated with a simple example:

int length, width, radius;
enum figure {RECTANGLE, CIRCLE};
  int area = 0, volume = 0, height;
  enum figure kind = RECTANGLE;
  for (height = 0; height < 10; height++) {
    if (kind == RECTANGLE) {
      area += length * width;
      volume += length * width * height;
    } else 
    if (kind == CIRCLE) {
      area += 3.14 * radius * radius;
      volume += 3.14 * radius * radius * height;
  process(area, volume);

Careful examination of that function shows that the two if statements are known to be true and false respectively.

The left column shows the loop from the example generated by GCC before these improvements. The right column shows the loop after the global cprop improvements (-O2 hppa1.1-hp-hpux10 target).

L$0006					L$0006
        comib,<>,n 0,%r22,L$0007
        addl %r26,%r20,%r26			addl %r25,%r21,%r25
        bl L$0005,0
        addl %r25,%r21,%r25
        comib,<>,n 1,%r22,L$0005
        stw %r19,-16(0,%r30)
        fcnvxf,sgl,dbl %fr27L,%fr26
        fldws -16(0,%r30),%fr22L
        stw %r26,-16(0,%r30)
        fcnvxf,sgl,dbl %fr22L,%fr25
        fldws -16(0,%r30),%fr23L
        fmpy,dbl %fr26,%fr28,%fr22
        stw %r25,-16(0,%r30)
        fcnvxf,sgl,dbl %fr23L,%fr24
        fmpy,dbl %fr22,%fr26,%fr22
        fldws -16(0,%r30),%fr26L
        fmpyadd,dbl %fr22,%fr25,%fr25,%fr22,%fr24
        fcnvxf,sgl,dbl %fr26L,%fr23
        fcnvfxt,dbl,sgl %fr24,%fr24L
        fstws %fr24L,-16(0,%r30)
        fadd,dbl %fr23,%fr25,%fr23
        ldw -16(0,%r30),%r26
        fcnvfxt,dbl,sgl %fr23,%fr23L
        fstws %fr23L,-16(0,%r30)
        ldw -16(0,%r30),%r25
        ldo 1(%r19),%r19
        comib,>= 9,%r19,L$0006			addib,>= -1,%r20,L$0006
        addl %r21,%r20,%r21			addl %r21,%r19,%r21

The code in the left hand column has two conditional branches which are compile-time computable, but which were not optimized by the compiler. The right column shows the code after the compiler has been improved to identify the compile-time computable conditional branches.

Note this optimization does not currently work on "cc0" machines such as the x86 or m68k. We hope someone will enhance the code further to support "cc0" machines.