[Bug target/104610] memcmp () == 0 can be optimized better for avx512f

crazylht at gmail dot com gcc-bugzilla@gcc.gnu.org
Tue Feb 22 06:32:44 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Hongtao.liu from comment #4)
> > (In reply to Hongtao.liu from comment #3)
> > > (In reply to Hongtao.liu from comment #2)
> > > > in Gimple, there're
> > > > 
> > > >   _1 = __builtin_memcmp_eq (a_5(D), &t[0], 32);
> > > >   _2 = _1 == 0;
> > > >   _6 = (int) _2;
> > > > 
> > > > 
> > > > So this is about generating vectorized code for
> > > > __builtin_memcmp_eq; I guess we can start with sizes that are a
> > > > multiple of 16 bytes?
> > > > 
> > > There's no optab or target hook that lets the backend participate in this optimization
> But there is a cbranch_optab check in can_compare_p, and i386 supports
> V8SI/V4DI/V4SI/V2DI but not OI/TI. Should we add support for them?
> 
> (define_expand "cbranch<mode>4"
>   [(set (reg:CC FLAGS_REG)
>         (compare:CC (match_operand:VI48_AVX 1 "register_operand")
>                     (match_operand:VI48_AVX 2 "nonimmediate_operand")))
>    (set (pc) (if_then_else
>                (match_operator 0 "bt_comparison_operator"
>                 [(reg:CC FLAGS_REG) (const_int 0)])
>                (label_ref (match_operand 3))

After adding support for cbranchoi4, GCC generates:

_Z1fPc:
.LFB0:
        .cfi_startproc
        vmovdqa .LC1(%rip), %ymm0
        vpxor   (%rdi), %ymm0, %ymm0
        vptest  %ymm0, %ymm0
        sete    %al
        vzeroupper

which is as optimal as what clang/llvm generates.
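
For reference, the test case itself is not quoted in this comment. A minimal sketch of a function consistent with the GIMPLE above (a 32-byte __builtin_memcmp_eq against a local array t) and with the mangled name _Z1fPc might look like the following. The array contents and the helper example_usage are assumptions made for illustration only, not taken from the bug report.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reconstruction: compare 32 bytes at `a` against a constant
// array. The array contents are an assumption; only the size (32) and the
// signature f(char*) follow from the GIMPLE and the symbol _Z1fPc.
int f(char *a)
{
    // 31 characters plus the terminating NUL, so sizeof(t) == 32.
    char t[] = "0123456789012345678901234567890";
    static_assert(sizeof(t) == 32, "one 256-bit vector");
    // Only equality with zero is tested, so the compiler may lower this
    // call to an equality-only comparison instead of a full memcmp.
    return std::memcmp(a, &t[0], sizeof(t)) == 0;
}

// Illustrative helper (not part of the bug report): checks both outcomes.
void example_usage()
{
    char same[32] = "0123456789012345678901234567890";
    char diff[32] = "0123456789012345678901234567891";
    assert(f(same) == 1);
    assert(f(diff) == 0);
}
```

Compiling a function of this shape for an AVX2-capable target (for example with -O2 -mavx512f, which is an assumed flag set) is where the vpxor/vptest sequence shown above would be expected once cbranchoi4 is available.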

