[PATCH] popcount{64,32}c pattern matching fixes (PR tree-optimization/93098)
Richard Biener
rguenther@suse.de
Tue Jan 7 08:07:00 GMT 2020
On Tue, 31 Dec 2019, Jakub Jelinek wrote:
> On Tue, Dec 31, 2019 at 05:47:54PM +0100, Richard Biener wrote:
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > Ok.
>
> Thanks.
>
> > >One thing I haven't done anything about yet is that there is
> > >FAIL: gcc.dg/tree-ssa/popcount4ll.c scan-tree-dump-times optimized
> > >".POPCOUNT" 1
> > >before/after this patch with -m32/-march=skylake-avx512. That is
> > >because the popcountll effective target tests that we don't emit a
> > >call for __builtin_popcountll, which we don't on ia32 skylake-avx512,
> > >but direct_internal_fn_supported_p isn't true - that is because we
> > >expand the double word popcount using 2 word popcounts + addition.
> > >Shall the match.pd case handle that case too by allowing the
> > >optimization even if there is a type with half precision for which
> > >direct_internal_fn_supported_p?
> >
> > You mean emitting a single builtin call, or an add of two ifns?
>
> I meant to do in the match.pd condition what expand_unop will do, i.e.
> -      && direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> -                                         OPTIMIZE_FOR_BOTH))
> +      && (direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> +                                          OPTIMIZE_FOR_BOTH)
> +          /* expand_unop can handle double-word popcount using
> +             two word popcounts and addition.  */
> +          || (TREE_CODE (type) == INTEGER_TYPE
> +              && TYPE_PRECISION (type) == 2 * BITS_PER_WORD
> +              && (optab_handler (popcount_optab, word_mode)
> +                  != CODE_FOR_nothing))))
> or so.
OK, that would work for me (maybe add a predicate to the optabs code
close to the actual expander).
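
For reference, the double-word lowering discussed above (two word popcounts plus an addition) can be sketched in plain C. This is only an illustration of the idea, not the expand_unop code: it assumes a 32-bit word and a 64-bit double word, relies on the GCC/Clang __builtin_popcount builtin, and the names popcount_word and popcount_double_word are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: popcount of a single 32-bit "word".
   Assumes unsigned int is at least 32 bits wide.  */
static int
popcount_word (uint32_t w)
{
  return __builtin_popcount ((unsigned int) w);
}

/* Double-word popcount: split the 64-bit value into its low and
   high words, count each separately, and add the two results.  */
int
popcount_double_word (uint64_t x)
{
  uint32_t lo = (uint32_t) x;          /* low 32 bits */
  uint32_t hi = (uint32_t) (x >> 32);  /* high 32 bits */
  return popcount_word (lo) + popcount_word (hi);
}

/* Small sanity check: the split form agrees with the 64-bit builtin.  */
void
check_double_word_popcount (void)
{
  uint64_t v = 0xF0F0F0F00F0F0F0FULL;
  assert (popcount_double_word (v) == __builtin_popcountll (v));
}
```

With -m32 the word popcount is the part that the hardware instruction covers; the 64-bit case is then just the sum of the two halves.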
Richard.