[Bug rtl-optimization/65698] New: Non-optimal code for simple compare function for x86 32-bit target
ysrumyan at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed Apr 8 11:27:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65698
Bug ID: 65698
Summary: Non-optimal code for simple compare function for x86
32-bit target
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
For the attached test case, the inner loop shows the following deficiencies:
1. 2 redundant fills and one spill in the comparison part of the loop. I assume
that only 4 registers are needed to keep the bases of 'v1' and 'v2' and the
indexes 'i1' and 'i2'; one more register is required to keep 'c1' or 's1'.
2. 2 redundant lea instructions to perform multiplication by 2.
Here is the optimal code produced by the icc compiler (with the increment
part deleted):
  2e:   8a 04 3b          mov    (%ebx,%edi,1),%al
  31:   3a 04 3e          cmp    (%esi,%edi,1),%al
  34:   75 53             jne    89 <my_cmp+0x89>
  36:   0f b7 04 5a       movzwl (%edx,%ebx,2),%eax
  3a:   0f b7 2c 72       movzwl (%edx,%esi,2),%ebp
  3e:   3b c5             cmp    %ebp,%eax
  40:   75 47             jne    89 <my_cmp+0x89>
  42:   8a 44 3b 01       mov    0x1(%ebx,%edi,1),%al
  46:   3a 44 3e 01       cmp    0x1(%esi,%edi,1),%al
  4a:   75 3d             jne    89 <my_cmp+0x89>
  4c:   0f b7 44 5a 02    movzwl 0x2(%edx,%ebx,2),%eax
  51:   0f b7 6c 72 02    movzwl 0x2(%edx,%esi,2),%ebp
  56:   3b c5             cmp    %ebp,%eax
  58:   75 2f             jne    89 <my_cmp+0x89>
  5a:   83 c3 02          add    $0x2,%ebx
  ...
  7b:   7f b1             jg     2e <my_cmp+0x2e>
Note also that if we comment out these 2 lines
if (i1 > n) i1 -= n;
if (i2 > n) i2 -= n;
we get optimal code with the gcc compiler.
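
The attached test case is not reproduced in this message, so the following is only a hypothetical sketch of a loop with the shape described above. The names 'my_cmp', 'v1', 'v2', 'i1', 'i2', 'c1', 's1' and 'n' come from the report and the listing. The signature, the element types (guessed from the byte and 16-bit loads), the loop counter 'k', the unroll factor of 2, and the return convention are all assumptions.

```c
/* Hypothetical reconstruction, NOT the attached test case.
 * Assumes v1 and v2 each hold at least n + 2 elements, because the
 * wrap-around check runs only once per unrolled iteration. */
int my_cmp(const unsigned char *v1, const unsigned short *v2,
           int i1, int i2, int n)
{
    int k;

    for (k = n; k > 0; k -= 2) {
        unsigned char c1, c2;
        unsigned short s1, s2;

        /* First unrolled step: byte compare, then 16-bit compare. */
        c1 = v1[i1]; c2 = v1[i2];
        if (c1 != c2) return c1 > c2;
        s1 = v2[i1]; s2 = v2[i2];
        if (s1 != s2) return s1 > s2;
        i1++; i2++;

        /* Second unrolled step (offsets 0x1 and 0x2 in the listing). */
        c1 = v1[i1]; c2 = v1[i2];
        if (c1 != c2) return c1 > c2;
        s1 = v2[i1]; s2 = v2[i2];
        if (s1 != s2) return s1 > s2;
        i1++; i2++;

        /* The two lines whose removal makes gcc produce optimal code. */
        if (i1 > n) i1 -= n;
        if (i2 > n) i2 -= n;
    }
    return 0;
}
```

Compiling a file like this with something like 'gcc -m32 -O2 -S' should make it possible to look for the fills, spill and lea instructions in the loop body, though the exact output will depend on the gcc version and on how close this sketch is to the real attachment.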