Consider this Java method:

  public int shift (int x, int y) { return x << y; }

On x86, with -O3, this becomes:

  _ZN1t5shiftEii:
  .LFB2:
          pushl   %ebp
  .LCFI0:
          movl    %esp, %ebp
  .LCFI1:
          movl    16(%ebp), %ecx
          movl    12(%ebp), %eax
          popl    %ebp
          andl    $31, %ecx
          sall    %cl, %eax
          ret

However, on x86 (starting with the 80286), the shift count is masked to 5 bits by sall itself, so the "andl" instruction is redundant. I haven't looked to see exactly where this buglet might lie.
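For context, the reason the front end emits the AND at all can be sketched in C: Java defines << for any count by using only the low 5 bits of the count (JLS 15.19), whereas a C-level shift by 32 or more is undefined, so the masking must be explicit. A minimal sketch (the function name is illustrative):

```c
#include <assert.h>

/* Java's int << is defined for any count: only the low 5 bits of the
   count are used.  The explicit "& 31" is the AND the Java front end
   generates to get those semantics from C-like shift trees. */
int java_shl (int x, int y)
{
  return x << (y & 31);
}
```

On x86 the sall instruction already applies exactly this mask in hardware, which is why the separate andl is redundant there.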
But note that on most other targets, x << y where y > 31 is undefined, and so is the corresponding SHIFT LEFT tree code.
Yes, that's fine. The Java front end generates trees like <shift op <and op 31>>. I would like the extra "and" to be optimized out somewhere along the way, on a target-specific basis.
On PPC:

  __ZN4test5shiftEii:
          rlwinm  r5,r5,0,27,31
          slw     r3,r4,r5
          blr

This is correct, as r5 could be greater than 31, and shifting by more than 31 is undefined (only 4 bits are really used, but it could do something different than that). So if this were to be done correctly, we would need a target hook for the Java front end. The only reason I bring this up is that we just had a bootstrap failure where we depended on the masking being done by the machine and not in the code.
Then the easy way to do that would be to add a pattern in the machine description for

  (ashift:SI () (and:SI () (const_int 31 [0x1f])))

and let combine do its job. But note we are masking out 4 bits and not 5, so we would still get different answers than what we are expecting.
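Such a pattern might look roughly like the following sketch (x86-style constraints; this is illustrative only, and the real i386.md pattern differs in detail):

  (define_insn "*ashlsi3_and"
    [(set (match_operand:SI 0 "register_operand" "=r")
          (ashift:SI (match_operand:SI 1 "register_operand" "0")
                     (and:SI (match_operand:SI 2 "register_operand" "c")
                             (const_int 31))))]
    ""
    "sall\t%b2, %0")

The idea is simply that combine can match the AND-feeding-shift combination and emit the bare shift, since the hardware applies the same mask.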
Forget that thing about 4 vs. 5 bits; I cannot count. Confirmed.
This will be done properly if SHIFT_COUNT_TRUNCATED is true. The problem is that SHIFT_COUNT_TRUNCATED applies to *all* shifts at the given mode size, which is not true for x86, since HImode shifts are still masked to 31, not 15. Ideally we'd extend Richard Sandiford's new target.shift_truncation_mask such that it can replace SHIFT_COUNT_TRUNCATED. In the short term, Java could examine shift_truncation_mask and avoid creating the AND at the tree level in the first place. Not sure whether "middle-end" or "rtl-opt" is a better category, but this is definitely not target-specific, so "target" doesn't apply.
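The decision the front end would make with such a per-mode mask can be sketched in plain C (the names here are invented for illustration, not actual GCC interfaces):

```c
#include <assert.h>

/* target_mask: the mask the hardware applies to shift counts in the
   mode being shifted (0 if it applies none).
   java_mask: the mask Java semantics require (31 for int, 63 for long).
   The explicit AND is redundant only when the hardware truncation
   matches the Java semantics exactly. */
static int need_explicit_and (unsigned target_mask, unsigned java_mask)
{
  return target_mask != java_mask;
}
```

For example, a 32-bit x86 SImode shift has a hardware mask of 31, so the AND can be dropped; PPC's slw effectively uses 6 bits of the count, so against Java's 5-bit mask the AND must stay.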
Thanks for the response. I'm not too concerned in the short run -- we've had this bug for a long time. I neglected to mention that in Java a long shift (64-bit, i.e. long << int) uses "& 63" (as one would expect). I didn't look to see what kind of code this generates or whether it can be further optimized.
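The analogous defined-behavior wrapper for the long case can be sketched in C (using long long to stand in for Java's 64-bit long; the function name is illustrative):

```c
#include <assert.h>

/* Java masks a long shift count to 6 bits (JLS 15.19); the explicit
   "& 63" makes the same semantics defined in C. */
long long java_shl_long (long long x, int y)
{
  return x << (y & 63);
}
```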
(In reply to comment #2)
> Yes, that's fine.
> The java front end generates trees like <shift op <and op 31>>.
> I would like the extra "and" to be optimized out somewhere
> along the way, on a target-specific basis.

Fixed by tree-ssa:

  $ cat t.c
  int shift (int x, int y)
  {
    int _y = y & 31;
    return x << _y;
  }
  $ ./cc1 -quiet -m32 -O2 t.c -fdump-tree-optimized
  $ cat t.s
          .file   "t.c"
          .text
          .p2align 4,,15
  .globl shift
          .type   shift, @function
  shift:
  .LFB0:
          .cfi_startproc
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          sall    %cl, %eax
          ret
          .cfi_endproc
  .LFE0:
          .size   shift, .-shift
          .ident  "GCC: (GNU) 4.6.0 20110312 (experimental) [trunk revision 170907]"
          .section        .note.GNU-stack,"",@progbits