This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug regression/44281] New: Global Register variable pessimisation and regression


I am aware that the developers have resolved as WONTFIX an earlier report of
GCC pessimising code that uses global register variables:
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596>

GCC is copying registers for no good reason whatsoever. Below is a very simple
example where gcc 3.3.6 does a better job of optimising the code. Unnecessary
copying of registers may also occur with local register variables.

#include <stdint.h>

register uint64_t global_flag_stack __asm__("rbx");

void push_flag_into_global_reg_var(uint64_t a, uint64_t b) {
  uint64_t flag = (a==b);
  global_flag_stack <<= 8;
  global_flag_stack  |= flag;
}

uint64_t push_flag_into_local_var(uint64_t a, uint64_t b,
                                  uint64_t local_flag_stack) {
  uint64_t flag = (a==b);
  local_flag_stack <<= 8;
  return local_flag_stack | flag;
}

int main() {
}
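
For readers unfamiliar with the idiom, the following is a minimal, portable sketch of the byte-wide flag stack that both functions maintain. push_flag_into_local_var is copied from the test case above; pop_flag is a hypothetical helper that is not part of the report and is included only to show how the pushed flags could be read back.

```c
#include <stdint.h>

/* Same logic as the test case: make room for one byte, then store the
   comparison result (0 or 1) in the low byte of the stack. */
uint64_t push_flag_into_local_var(uint64_t a, uint64_t b,
                                  uint64_t local_flag_stack) {
  uint64_t flag = (a == b);
  local_flag_stack <<= 8;
  return local_flag_stack | flag;
}

/* Hypothetical helper (not in the report): return the most recently
   pushed flag and remove it from the stack. */
uint64_t pop_flag(uint64_t *flag_stack) {
  uint64_t flag = *flag_stack & 0xff;
  *flag_stack >>= 8;
  return flag;
}
```

Starting from an empty stack (0), pushing the results of 1==1, 1==2 and 3==3 in that order leaves 0x010001, and three calls to pop_flag then return 1, 0 and 1.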


gcc-3.3 (GCC) 3.3.6 (Debian 1:3.3.6-15):
$ gcc-3.3 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400478 <push_flag_into_global_reg_var>:
  400478:       31 c0                   xor    eax,eax
  40047a:       48 39 f7                cmp    rdi,rsi
  40047d:       0f 94 c0                sete   al
  400480:       48 c1 e3 08             shl    rbx,0x8
  400484:       48 09 c3                or     rbx,rax
  400487:       c3                      ret    

0000000000400488 <push_flag_into_local_var>:
  400488:       31 c0                   xor    eax,eax
  40048a:       48 39 f7                cmp    rdi,rsi
  40048d:       0f 94 c0                sete   al
  400490:       48 c1 e2 08             shl    rdx,0x8
  400494:       48 09 d0                or     rax,rdx
  400497:       c3                      ret  
...

gcc-4.1 (GCC) 4.1.3 20080704 (prerelease) (Debian 4.1.2-29):
$ gcc-4.1 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400448 <push_flag_into_global_reg_var>:
  400448:       48 89 da                mov    rdx,rbx
  40044b:       31 c0                   xor    eax,eax
  40044d:       48 c1 e2 08             shl    rdx,0x8
  400451:       48 39 f7                cmp    rdi,rsi
  400454:       0f 94 c0                sete   al
  400457:       48 89 d3                mov    rbx,rdx
  40045a:       48 09 c3                or     rbx,rax
  40045d:       c3                      ret    

000000000040045e <push_flag_into_local_var>:
  40045e:       48 c1 e2 08             shl    rdx,0x8
  400462:       31 c0                   xor    eax,eax
  400464:       48 39 f7                cmp    rdi,rsi
  400467:       0f 94 c0                sete   al
  40046a:       48 09 d0                or     rax,rdx
  40046d:       c3                      ret 
...

gcc-4.5 (Debian 4.5.0-1) 4.5.0:
$ gcc-4.5 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400494 <push_flag_into_global_reg_var>:
  400494:       31 d2                   xor    edx,edx
  400496:       48 39 f7                cmp    rdi,rsi
  400499:       48 89 d8                mov    rax,rbx
  40049c:       0f 94 c2                sete   dl
  40049f:       48 c1 e0 08             shl    rax,0x8
  4004a3:       48 89 d3                mov    rbx,rdx
  4004a6:       48 09 c3                or     rbx,rax
  4004a9:       c3                      ret    

00000000004004aa <push_flag_into_local_var>:
  4004aa:       48 89 d0                mov    rax,rdx
  4004ad:       31 d2                   xor    edx,edx
  4004af:       48 c1 e0 08             shl    rax,0x8
  4004b3:       48 39 f7                cmp    rdi,rsi
  4004b6:       0f 94 c2                sete   dl
  4004b9:       48 09 d0                or     rax,rdx
  4004bc:       c3                      ret   
...

The object code that current GCC generates is embarrassing compared with
GCC 3.3.6: push_flag_into_global_reg_var now contains two redundant register
moves. Is it also necessary for GCC 4.5.0 to increase the code footprint of
push_flag_into_local_var when optimising for size (-Os)? It is 19 bytes,
versus 16 bytes with GCC 3.3.6 and 4.1.3.
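
The footprint comparison can be checked directly from the addresses in the listings above. The sketch below is only an illustration of that arithmetic; struct listing and function_size are hypothetical names introduced here, not anything produced by GCC or objdump.

```c
#include <stdint.h>

/* One disassembled function, described by values read off an objdump
   listing. Hypothetical type, used only for this illustration. */
struct listing {
  uint64_t first_addr;    /* address of the first instruction */
  uint64_t last_addr;     /* address of the final instruction (the ret) */
  uint64_t last_insn_len; /* encoded length of that final instruction */
};

/* Size of the function body in bytes: one past the end of the final
   instruction, minus the start address. */
uint64_t function_size(struct listing f) {
  return f.last_addr + f.last_insn_len - f.first_addr;
}
```

With the one-byte ret (c3) as the final instruction, push_flag_into_global_reg_var comes to 16 bytes with GCC 3.3.6 (0x400478 to 0x400487) and 22 bytes with both 4.1.3 (0x400448 to 0x40045d) and 4.5.0 (0x400494 to 0x4004a9). The same calculation applies to push_flag_into_local_var.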


-- 
           Summary: Global Register variable pessimisation and regression
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: adam at consulting dot net dot nz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44281

