[PATCH] teach emit_store_flag to use clz/ctz
Richard Guenther
richard.guenther@gmail.com
Thu May 3 09:57:00 GMT 2012
On Fri, Apr 27, 2012 at 2:43 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> Il 27/04/2012 13:16, Richard Guenther ha scritto:
>> In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two,
>> is != 0 really what you want here? The docs suggest to me
>> that as you are using the optab below you should compare against two, too.
>
> Interesting, first time I hear about this...
>
> ... except that no target sets the macros to 2, and all of them could
> (as far as I could see). Looks like the code trumps the documentation;
> how does this look?
Hmm, I'll let target maintainers have a look at this - it's non-obvious
for arm (the only case I looked at) at least. My guess is that the
difference was to allow __builtin_ctz to be "optimized" and CTZ
to be always well-defined (which incidentially it is not ...!?). Hm.
Richard?
> 2012-04-27 Paolo Bonzini <bonzini@gnu.org>
>
> * optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO
> against 1.
> * doc/tm.texi (Misc): Invert meaning of 1 and 2 for
> CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO.
>
> Index: optabs.c
> ===================================================================
> --- optabs.c (revisione 186903)
> +++ optabs.c (copia locale)
> @@ -2764,7 +2764,7 @@ expand_ffs
> if (!temp)
> goto fail;
>
> - defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2);
> + defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1);
> }
> else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing)
> {
> @@ -2773,7 +2773,7 @@ expand_ffs
> if (!temp)
> goto fail;
>
> - if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2)
> + if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1)
> {
> defined_at_zero = true;
> val = (GET_MODE_PRECISION (mode) - 1) - val;
> Index: doc/tm.texi
> ===================================================================
> --- doc/tm.texi (revisione 186903)
> +++ doc/tm.texi (copia locale)
> @@ -10640,9 +10640,9 @@
> for @code{clz} or @code{ctz} with a zero operand.
> A result of @code{0} indicates the value is undefined.
> If the value is defined for only the RTL expression, the macro should
> -evaluate to @code{1}; if the value applies also to the corresponding optab
> +evaluate to @code{2}; if the value applies also to the corresponding optab
> entry (which is normally the case if it expands directly into
> -the corresponding RTL), then the macro should evaluate to @code{2}.
> +the corresponding RTL), then the macro should evaluate to @code{1}.
> In the cases where the value is defined, @var{value} should be set to
> this value.
>
> plus tweaking this patch to reject 2 as well.
>
>> What about cost considerations? We only seem to have the general
>> "branches are expensive" metric - but ctz/clz may be prohibitely expensive
>> themselves, no?
>
> Yeah, that's a general problem with this kind of tricks. In general
> however clz/ctz is getting less and less expensive, so I don't think
> it is worrisome (at least at the beginning of stage 1). We can add
> rtx_costs checks later.
>
> Among architectures I know, only i386 has an expensive bsf/bsr but
> it also has sete/setne which GCC will use instead of this trick.
>
> Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively
> expensive (Xtensa does, but only in the case when the optab handler
> will not exist). I realize though that this is not a particularly
> good statistic, since the compiler would not generate them out of
> its hat until now.
>
> Paolo
More information about the Gcc-patches
mailing list