This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] teach emit_store_flag to use clz/ctz


Il 27/04/2012 13:16, Richard Guenther ha scritto:
> In optabs.c we compare the CTZ_DEFINED_VALUE_AT_ZERO against two,
> is != 0 really what you want here?  The docs suggest to me
> that as you are using the optab below you should compare against two, too.

Interesting, first time I hear about this...

... except that no target sets the macros to 2, and all of them could
(as far as I could see).  Looks like the code trumps the documentation;
how does this look?

2012-04-27  Paolo Bonzini  <bonzini@gnu.org>

	* optabs.c (expand_ffs): Check CTZ_DEFINED_VALUE_AT_ZERO
	against 1.
	* doc/tm.texi (Misc): Invert meaning of 1 and 2 for
	CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO.

Index: optabs.c
===================================================================
--- optabs.c	(revisione 186903)
+++ optabs.c	(copia locale)
@@ -2764,7 +2764,7 @@ expand_ffs
       if (!temp)
 	goto fail;
 
-      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2);
+      defined_at_zero = (CTZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1);
     }
   else if (optab_handler (clz_optab, mode) != CODE_FOR_nothing)
     {
@@ -2773,7 +2773,7 @@ expand_ffs
       if (!temp)
 	goto fail;
 
-      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 2)
+      if (CLZ_DEFINED_VALUE_AT_ZERO (mode, val) == 1)
 	{
 	  defined_at_zero = true;
 	  val = (GET_MODE_PRECISION (mode) - 1) - val;
Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revisione 186903)
+++ doc/tm.texi	(copia locale)
@@ -10640,9 +10640,9 @@
 for @code{clz} or @code{ctz} with a zero operand.
 A result of @code{0} indicates the value is undefined.
 If the value is defined for only the RTL expression, the macro should
-evaluate to @code{1}; if the value applies also to the corresponding optab
+evaluate to @code{2}; if the value applies also to the corresponding optab
 entry (which is normally the case if it expands directly into
-the corresponding RTL), then the macro should evaluate to @code{2}.
+the corresponding RTL), then the macro should evaluate to @code{1}.
 In the cases where the value is defined, @var{value} should be set to
 this value.
 
plus tweaking this patch to reject 2 as well.

> What about cost considerations?  We only seem to have the general
> "branches are expensive" metric - but ctz/clz may be prohibitely expensive
> themselves, no?

Yeah, that's a general problem with this kind of tricks.  In general
however clz/ctz is getting less and less expensive, so I don't think
it is worrisome (at least at the beginning of stage 1).  We can add
rtx_costs checks later.

Among architectures I know, only i386 has an expensive bsf/bsr but
it also has sete/setne which GCC will use instead of this trick.

Looking at rtx_costs, nothing seems to mark clz/ctz as prohibitively
expensive (Xtensa does, but only in the case when the optab handler
will not exist).  I realize though that this is not a particularly
good statistic, since the compiler would not generate them out of
its hat until now.

Paolo


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]