orl vs. orb

Paul Buchheit paul@google.com
Wed Jan 12 23:02:00 GMT 2000


egcs does not appear to be choosing the best instruction for
bitwise ORs with small constants.

Here is a program that demonstrates this problem (by taking
nearly four times as long to run!):

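#include <stdio.h>

int main() {
  const int kMem = 1000000;
  int * mem = new int[kMem];
  int v = 0;

  // Zero the buffer so the timed loop does not read indeterminate values.
  for (int k = 0; k < kMem; k++) mem[k] = 0;
  
  for (int j = 0; j < 500; j++) {
    for (int i = 0; i < kMem; i++) {
#ifdef GO_SLOW
      // Small constant: egcs emits a byte-wide 'orb' on %al.
      v += mem[i] | 6;
#else
      // Large constant: egcs emits a full-width 'orl' on %eax.
      v += mem[i] | 0xaabbccdd;
#endif
    }
  }
  
  delete[] mem;
  return v;  // Use the result so the loop is not optimized away.
}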


beavis:~/code% gcc --version
egcs-2.91.66
beavis:~/code% gcc -Wall -mpentiumpro -O4 oper.cc             
beavis:~/code% time ./a.out
3.47user 0.01system 0:03.47elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (79major+11minor)pagefaults 0swaps
beavis:~/code% gcc -Wall -mpentiumpro -O4 oper.cc -DGO_SLOW   
beavis:~/code% time ./a.out
12.72user 0.01system 0:12.73elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (79major+11minor)pagefaults 0swaps


The only difference in the assembly is 'orl' vs. 'orb'.

--- GO_SLOW asm ---
        movl (%edx),%eax
        orb $6,%al
        addl %eax,%ecx
--- fast asm ---
        movl (%edx),%eax
        orl $-1430532899,%eax
        addl %eax,%ecx
---
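
A likely explanation, which this report does not itself confirm, is a
partial-register stall: on P6-family cores, writing %al and then reading
all of %eax in the following 'addl' forces the processor to wait for the
byte write to retire. Since 'orb $6,%al' and 'orl $6,%eax' compute the
same 32-bit value, the compiler is free to pick the full-width form. The
sketch below only illustrates that equivalence; or_low_byte and
or_full_word are hypothetical helper names, not part of the test program
or of egcs.

```cpp
#include <cstdint>

// Hypothetical helper: models 'orb $imm8,%al'. Only the low byte of the
// 32-bit value is ORed; the upper 24 bits pass through unchanged.
std::uint32_t or_low_byte(std::uint32_t x, std::uint8_t imm) {
  std::uint32_t low = (x & 0xFFu) | imm;   // byte-wide OR on the low 8 bits
  return (x & 0xFFFFFF00u) | low;          // merge back into the full word
}

// Hypothetical helper: models 'orl $imm32,%eax' with the same constant
// zero-extended to 32 bits.
std::uint32_t or_full_word(std::uint32_t x, std::uint8_t imm) {
  return x | static_cast<std::uint32_t>(imm);
}
```

For any constant that fits in 8 bits the two helpers return the same
result, so choosing 'orb' saves a few bytes of encoding without changing
the value computed; on this CPU that saving appears to cost far more in
run time than it gains.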


My system:

beavis:~/code% uname -a   
Linux beavis.corp.google.com 2.2.11 #4 Mon Dec 6 18:56:10 PST 1999 i686 unknown
beavis:~/code% cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 6
model name	: Celeron (Mendocino)
stepping	: 0
cpu MHz		: 434.330085
cache size	: 128 KB
fdiv_bug	: no
hlt_bug		: no
sep_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx osfxsr
bogomips	: 432.54


