[PATCH] teach emit_store_flag to use clz/ctz

Richard Guenther richard.guenther@gmail.com
Thu May 3 09:57:00 GMT 2012


On Fri, Apr 27, 2012 at 2:43 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 27/04/2012 13:16, Richard Guenther wrote:
>> In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two,
>> is != 0 really what you want here?  The docs suggest to me
>> that as you are using the optab below you should compare against two, too.
>
> Interesting, first time I hear about this...
>
> ... except that no target sets the macros to 2, and all of them could
> (as far as I could see).  Looks like the code trumps the documentation;
> how does this look?

Hmm, I'll let target maintainers have a look at this - it's non-obvious
for arm (the only case I looked at) at least.  My guess is that the
difference was to allow __builtin_ctz to be "optimized" and CTZ
to be always well-defined (which incidentally it is not ...!?).  Hm.

Richard?
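
For readers following the macro discussion, here is a minimal C sketch of
the decision expand_ffs has to make.  It is a model, not the optabs.c code:
model_ctz and ffs_via_ctz are hypothetical names, and the software loop only
stands in for a target ctz instruction.  What it tries to show is that
ffs(x) = ctz(x) + 1 needs no fixup only when ctz is defined at zero and the
defined value is -1; in every other case the zero operand has to be handled
separately.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a target ctz instruction.  For a zero
   operand it returns whatever value the target is assumed to define.  */
static int model_ctz(unsigned x, int value_at_zero)
{
    int n = 0;

    if (x == 0)
        return value_at_zero;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Sketch of ffs built on ctz.  defined_at_zero models the result of the
   CTZ_DEFINED_VALUE_AT_ZERO comparison; value_at_zero models the value
   the macro stores.  */
static int ffs_via_ctz(unsigned x, int defined_at_zero, int value_at_zero)
{
    int t;

    if (defined_at_zero && value_at_zero == -1) {
        /* ctz(0) is already -1, so adding 1 gives ffs(0) == 0.  */
        t = model_ctz(x, -1);
    } else {
        /* Otherwise zero must be tested for explicitly.  */
        t = (x == 0) ? -1 : model_ctz(x, value_at_zero);
    }

    assert(t >= -1 && t < (int)(sizeof x * CHAR_BIT));
    return t + 1;
}
```

With these definitions ffs_via_ctz(8, 1, -1) yields 4 and
ffs_via_ctz(0, 0, 0) yields 0.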


> 2012-04-27  Paolo Bonzini  <bonzini@gnu.org>
>
>        * optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO
>        against 1.
>        * doc/tm.texi (Misc): Invert meaning of 1 and 2 for
>        CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO.
>
> Index: optabs.c
> ===================================================================
> --- optabs.c    (revision 186903)
> +++ optabs.c    (working copy)
> @@ -2764,7 +2764,7 @@ expand_ffs
>       if (!temp)
>        goto fail;
>
> -      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2);
> +      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1);
>     }
>   else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing)
>     {
> @@ -2773,7 +2773,7 @@ expand_ffs
>       if (!temp)
>        goto fail;
>
> -      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2)
> +      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1)
>        {
>          defined_at_zero = true;
>          val = (GET_MODE_PRECISION (mode) - 1) - val;
> Index: doc/tm.texi
> ===================================================================
> --- doc/tm.texi (revision 186903)
> +++ doc/tm.texi (working copy)
> @@ -10640,9 +10640,9 @@
>  for @code{clz} or @code{ctz} with a zero operand.
>  A result of @code{0} indicates the value is undefined.
>  If the value is defined for only the RTL expression, the macro should
> -evaluate to @code{1}; if the value applies also to the corresponding optab
> +evaluate to @code{2}; if the value applies also to the corresponding optab
>  entry (which is normally the case if it expands directly into
> -the corresponding RTL), then the macro should evaluate to @code{2}.
> +the corresponding RTL), then the macro should evaluate to @code{1}.
>  In the cases where the value is defined, @var{value} should be set to
>  this value.
>
> plus tweaking this patch to reject 2 as well.
>
>> What about cost considerations?  We only seem to have the general
>> "branches are expensive" metric - but ctz/clz may be prohibitively expensive
>> themselves, no?
>
> Yeah, that's a general problem with this kind of trick.  In general
> however clz/ctz is getting less and less expensive, so I don't think
> it is worrisome (at least at the beginning of stage 1).  We can add
> rtx_costs checks later.
>
> Among the architectures I know, only i386 has an expensive bsf/bsr but
> it also has sete/setne which GCC will use instead of this trick.
>
> Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively
> expensive (Xtensa does, but only in the case when the optab handler
> will not exist).  I realize though that this is not a particularly
> good statistic, since the compiler did not generate them on its
> own until now.
>
> Paolo
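
The trick itself is not spelled out in this message.  One common form,
assumed here purely for illustration, computes x == 0 without a branch on a
32-bit value when clz returns 32 for a zero operand: the count is in the
range 0..32, so shifting it right by 5 leaves 1 only for x == 0.  The names
model_clz32 and is_zero_via_clz are hypothetical, and the loop is a software
stand-in for the instruction.  It also hints at why the defined-at-zero
value matters: if clz(0) is undefined, the shift result means nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit clz instruction that is assumed
   to return 32 for a zero operand.  */
static unsigned model_clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Branch-free x == 0: only a count of 32 has bit 5 set.  */
static unsigned is_zero_via_clz(uint32_t x)
{
    unsigned n = model_clz32(x);

    assert(n <= 32);
    return n >> 5;
}
```

Whether this beats a compare plus a set instruction depends on the target's
clz cost, which is the concern raised in the cost discussion above.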


