Bug 31271

Summary: Missing simple optimization
Product: gcc
Reporter: matt
Component: middle-end
Assignee: Not yet assigned to anyone <unassigned>
Status: NEW ---
Severity: enhancement
CC: gcc-bugs, rguenth
Priority: P3
Keywords: missed-optimization
Version: 4.3.0
Target Milestone: 4.7.0
Host: x86_64--netbsd
Target: x86_64--netbsd
Build: x86_64--netbsd
Known to work:
Known to fail: 4.6.4
Last reconfirmed: 2007-03-20 10:39:11

Description matt 2007-03-19 22:22:47 UTC
The following shows a missing easy optimization for GCC:

int
in_canforward(unsigned int in)
{
        if ((in & ~0xffffff0f) == 0xf0 || (in & ~0xffffff0f) == 0xe0)
                return 0;
        return 1;
}

results in (@ -O2):

in_canforward:
        andl    $240, %edi
        cmpl    $240, %edi
        sete    %al
        cmpl    $224, %edi
        sete    %dl
        orl     %edx, %eax
        xorl    $1, %eax
        movzbl  %al, %eax
        ret

Given that 0xf0 and 0xe0 differ by only one bit, there is no reason to test that bit, so the comparison could be: (in & ~0xffffff1f) == 0xe0.  More generally
the optimization is:

given           (x & m) == a0 || (x & m) == a1
where m, a0, and a1 are all constant
let             b = (a0 ^ a1)
then if         (b & (b - 1)) == 0 [b is a power of 2]
rewrite to:     (x & (m & ~b)) == (a0 & ~b)
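
The rewrite can be checked by brute force. The following is a minimal sketch, not GCC code: the function names are hypothetical, and the check is restricted to 6-bit values of x, m, and a0 to keep it fast. It clears bit b from both the mask and the constant, that is, it compares (x & (m & ~b)) against (a0 & ~b).

```c
#include <stdbool.h>
#include <stdint.h>

#define NBITS 6u                 /* assumed width for the exhaustive check */
#define LIMIT (1u << NBITS)

/* Original form: two masked comparisons joined by ||. */
static bool match_two(uint32_t x, uint32_t m, uint32_t a0, uint32_t a1)
{
    return (x & m) == a0 || (x & m) == a1;
}

/* Merged form; only meant for a0 ^ a1 with at most one bit set. */
static bool match_merged(uint32_t x, uint32_t m, uint32_t a0, uint32_t a1)
{
    uint32_t b = a0 ^ a1;
    return (x & (m & ~b)) == (a0 & ~b);
}

/* Returns true if both forms agree for every NBITS-wide x, m, a0
 * and every a1 that differs from a0 in exactly one bit. */
static bool rewrite_holds(void)
{
    for (uint32_t m = 0; m < LIMIT; m++)
        for (uint32_t a0 = 0; a0 < LIMIT; a0++)
            for (uint32_t bit = 0; bit < NBITS; bit++) {
                uint32_t a1 = a0 ^ (1u << bit);
                for (uint32_t x = 0; x < LIMIT; x++)
                    if (match_two(x, m, a0, a1) != match_merged(x, m, a0, a1))
                        return false;
            }
    return true;
}
```

For the example in this report, m = 0xf0, a0 = 0xf0, and a1 = 0xe0 give b = 0x10, so the merged test is (x & 0xe0) == 0xe0.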
Comment 1 Richard Biener 2007-03-20 10:39:11 UTC
Confirmed.  This is neither done at the tree nor at the rtl level.
Comment 2 Andrew Pinski 2021-08-16 01:37:02 UTC
We produce the following in 4.7.0+:

in_canforward(unsigned int):
.LFB0:
        .cfi_startproc
        andl    $224, %edi
        xorl    %eax, %eax
        cmpl    $224, %edi
        setne   %al
        ret


That is:
  D.2201_1 = in_2(D) & 224;
  D.2199_10 = D.2201_1 != 224;

I think we could do slightly better:
((~in_2(D)) & 224) != 0

But only at expand time.
This gives:
        notl    %edi
        xorl    %eax, %eax
        testb   $-32, %dil
        setne   %al

Or for aarch64:
        mov     w8, #224
        bics    wzr, w8, w0
        cset    w0, ne
        ret
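
As a sanity check on the forms discussed in this comment, here is a small C sketch (helper names are hypothetical, not from GCC). It compares the function from the report, the mask-and-compare form `(in & 224) != 224`, and the inverted-test form `(~in & 224) != 0`. Only the low 8 bits of `in` affect the result, so the loop covers 16-bit inputs with the upper half both clear and set.

```c
#include <stdint.h>

/* The function from the report; ~0xffffff0f is 0xf0 for a 32-bit unsigned. */
static int in_canforward_orig(uint32_t in)
{
    if ((in & ~0xffffff0fu) == 0xf0u || (in & ~0xffffff0fu) == 0xe0u)
        return 0;
    return 1;
}

/* Mask-and-compare form: andl $224 / cmpl $224 / setne. */
static int in_canforward_and_cmp(uint32_t in)
{
    return (in & 224u) != 224u;
}

/* Inverted-test form: at least one bit of 224 is clear in `in`. */
static int in_canforward_not_test(uint32_t in)
{
    return (~in & 224u) != 0;
}

/* Returns 1 if all three forms agree on every tested input. */
static int forms_agree(void)
{
    for (uint32_t lo = 0; lo < 0x10000u; lo++) {
        uint32_t inputs[2] = { lo, lo | 0xffff0000u };
        for (int i = 0; i < 2; i++) {
            int expected = in_canforward_orig(inputs[i]);
            if (in_canforward_and_cmp(inputs[i]) != expected ||
                in_canforward_not_test(inputs[i]) != expected)
                return 0;
        }
    }
    return 1;
}
```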
Comment 3 Andrew Pinski 2023-05-20 02:48:38 UTC
(In reply to Andrew Pinski from comment #2)
> 
> I think we could do slightly better
> ((~in_2(D)) & 224) != 0
> 
> But only at expand time.
> This gives:
>         notl    %edi
>         xorl    %eax, %eax
>         testb   $-32, %dil
>         setne   %al

x86_64 produces that in GCC 13 with r13-792-g29ae455901ac71.

> 
> Or for aarch64:
>         mov     w8, #224
>         bics    wzr, w8, w0
>         cset    w0, ne
>         ret

For aarch64, the backend could define an instruction pattern to match:
(set (reg:CC_NZV 66 cc)
    (compare:CC_NZV (and:SI (not:SI (reg:SI 100))
            (const_int 224 [0xe0]))
        (const_int 0 [0])))


Anyway, the original issue was fixed in GCC 4.7.0, and the small improvement for x86_64 is in GCC 13. The aarch64 code generation is currently:
        and     w0, w0, 224
        cmp     w0, 224
        cset    w0, ne
        ret

Which is only slightly worse than what I proposed.