[PATCH] popcount{64,32}c pattern matching fixes (PR tree-optimization/93098)

Richard Biener rguenther@suse.de
Tue Jan 7 08:07:00 GMT 2020


On Tue, 31 Dec 2019, Jakub Jelinek wrote:

> On Tue, Dec 31, 2019 at 05:47:54PM +0100, Richard Biener wrote:
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > Ok. 
> 
> Thanks.
> 
> > >One thing I haven't done anything about yet is that there is
> > >FAIL: gcc.dg/tree-ssa/popcount4ll.c scan-tree-dump-times optimized
> > >".POPCOUNT" 1
> > >before/after this patch with -m32/-march=skylake-avx512.  That is because
> > >the popcountll effective target tests that we don't emit a call for
> > >__builtin_popcountll, which we don't on ia32 skylake-avx512, but
> > >direct_internal_fn_supported_p isn't true - that is because we expand the
> > >double word popcount using 2 word popcounts + addition.  Shall the match.pd
> > >case handle that case too by allowing the optimization even if there is a
> > >type with half precision for which direct_internal_fn_supported_p?
> > 
> > You mean emitting a single builtin call or an add of two ifns?
> 
> I meant to do in the match.pd condition what expand_unop will do, i.e.
> -	&& direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> -					   OPTIMIZE_FOR_BOTH))
> +	&& (direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> +					    OPTIMIZE_FOR_BOTH)
> +	    /* expand_unop can handle double-word popcount using
> +	       two word popcounts and addition.  */
> +	    || (TREE_CODE (type) == INTEGER_TYPE
> +		&& TYPE_PRECISION (type) == 2 * BITS_PER_WORD
> +		&& (optab_handler (popcount_optab, word_mode)
> +		    != CODE_FOR_nothing))))
> or so.

OK, that would work for me (maybe add a predicate to the optabs code
close to the actual expander).
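
For illustration only, the double-word fallback discussed above amounts to
something like the following at the C level.  This is a minimal sketch, not
GCC code: the names popcount64_via_halves and popcount64_selfcheck are
hypothetical, and it assumes a compiler providing the GCC builtins
__builtin_popcount and __builtin_popcountll.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: popcount of a 64-bit value computed
   as two 32-bit popcounts plus an addition, mirroring the double-word
   expansion described in the thread.  */
static int
popcount64_via_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;          /* low word */
  uint32_t hi = (uint32_t) (x >> 32);  /* high word */
  return __builtin_popcount (lo) + __builtin_popcount (hi);
}

/* Hypothetical self-check: the split version must agree with the
   compiler's own double-word builtin.  */
static void
popcount64_selfcheck (void)
{
  assert (popcount64_via_halves (0) == 0);
  assert (popcount64_via_halves (UINT64_MAX) == 64);
  assert (popcount64_via_halves (0x8000000000000001ULL)
          == __builtin_popcountll (0x8000000000000001ULL));
}
```

The point of the sketch is only that the two-word form yields the same value,
which is why the match.pd condition can accept it when a word_mode popcount
exists.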

Richard.


