Summary: | inefficient use of registers induces size and time overhead | | |
---|---|---|---|
Product: | gcc | Reporter: | willy tarreau <willy> |
Component: | rtl-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | NEW --- | | |
Severity: | enhancement | CC: | gcc-bugs, ian |
Priority: | P2 | Keywords: | missed-optimization, ra |
Version: | 3.3.1 | | |
Target Milestone: | --- | | |
Host: | | Target: | i586-linux-gnu |
Build: | | Known to work: | |
Known to fail: | | Last reconfirmed: | 2011-05-22 16:27:10 |
Bug Depends on: | 15792 | | |
Bug Blocks: | | | |
Description
willy tarreau
2003-08-10 08:38:58 UTC
I can confirm this on the mainline (20030809).

GCC is not really good at optimizing long long's, I have some improvements but it seems not to help in this case.

I filed 15792 to track part of this bug.

I don't know why this was put in waiting but it should not have been.

This has improved (-O2 -fomit-frame-pointer):

test:
        movl    4(%esp), %eax   # 32    *movsi_1/1      [length = 4]
        movl    8(%esp), %edx   # 44    *movsi_1/1      [length = 4]
        orl     %eax, %edx      # 6     *iorsi_1/1      [length = 2]
        addl    $1, %eax        # 35    *addsi_1/1      [length = 3]
        cmpl    $1, %edx        # 38    *cmpsi_1_insn/1 [length = 3]
        sbbl    %edx, %edx      # 39    x86_movsicc_0_m1        [length = 2]
        notl    %edx            # 40    *one_cmplsi2_1  [length = 2]
        andl    %edx, %eax      # 41    *andsi_1/1      [length = 2]
        ret                     # 47    return_internal [length = 1]
        .ident  "GCC: (GNU) 4.3.0 20071102 (experimental)"

With -Os -fomit-frame-pointer we get:

test:
        movl    4(%esp), %edx   # 32    *movsi_1/1      [length = 4]
        xorl    %eax, %eax      # 48    *movsi_xor      [length = 2]
        movl    8(%esp), %ecx   # 43    *movsi_1/1      [length = 4]
        orl     %edx, %ecx      # 7     *iorsi_3        [length = 2]
        je      .L3             # 8     *jcc_1  [length = 2]
        leal    1(%edx), %eax   # 44    *lea_1  [length = 3]
.L3:
        ret                     # 47    return_internal [length = 1]

With -O2/-Os -fomit-frame-pointer -march=pentiumpro:

test:
        movl    4(%esp), %edx   # 32    *movsi_1/1      [length = 4]
        xorl    %eax, %eax      # 46    *movsi_xor      [length = 2]
        leal    1(%edx), %ecx   # 41    *lea_1  [length = 3]
        orl     8(%esp), %edx   # 36    *iorsi_3        [length = 4]
        cmovne  %ecx, %eax      # 38    *movsicc_noc/1  [length = 3]
        ret                     # 44    return_internal [length = 1]

I would probably code it like so:

        movl    4(%esp), %eax   ; 4
        movl    8(%esp), %edx   ; 4
        orl     %eax, %edx      ; 2
        addl    $-1, %edx       ; 3
        adcl    $0, %eax        ; 3
        ret                     ; 1
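
The hand-written sequence at the end of the comment relies on the carry flag: adding -1 to the OR of the two halves produces a carry exactly when that OR is non-zero, and adcl then folds the carry into the low word. The C sketch below models this and compares it with the select form the compiler-generated listings appear to implement. The original test function is not shown in this report, so its behaviour here (0 for a zero argument, otherwise the low 32 bits plus one) is inferred from the assembly and should be read as an assumption; the names test_select, test_carry and check_equivalence are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Select form, as inferred from the listings (an assumption, not the
   reporter's source): 0 if x is zero, else low word + 1. */
uint32_t test_select(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    return x != 0 ? lo + 1u : 0u;
}

/* Model of the or / addl $-1 / adcl $0 sequence. */
uint32_t test_carry(uint64_t x)
{
    uint32_t lo = (uint32_t)x;           /* movl 4(%esp), %eax */
    uint32_t hi = (uint32_t)(x >> 32);   /* movl 8(%esp), %edx */
    uint32_t any = lo | hi;              /* orl  %eax, %edx    */

    /* addl $-1, %edx: the carry out of the 32-bit add is 1 iff any != 0. */
    uint64_t sum = (uint64_t)any + 0xFFFFFFFFu;
    uint32_t carry = (uint32_t)(sum >> 32);

    return lo + carry;                   /* adcl $0, %eax      */
}

/* Spot-check that both forms agree on a few edge cases. */
void check_equivalence(void)
{
    static const uint64_t samples[] = {
        0u, 1u, 0xFFFFFFFFu, 0x100000000u,
        0x8000000000000000u, 0xFFFFFFFFFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(test_select(samples[i]) == test_carry(samples[i]));
}
```

Unsigned types are used so that the wrap-around of the low word is well defined in C; the spot check is not a proof, only a quick sanity test of the edge cases.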