This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [v3] Add tr1::poisson_distribution


Paolo Carlini <pcarlini@suse.de> writes:

> 	* include/tr1/random.tcc (mersenne_twister<>::operator()): Tweak
> 	a bit for efficiency.

> --- include/tr1/random.tcc	(revision 116148)
> +++ include/tr1/random.tcc	(working copy)
> @@ -285,13 +285,13 @@
>  	{
>  	  const _UIntType __upper_mask = (~_UIntType()) << __r;
>  	  const _UIntType __lower_mask = ~__upper_mask;
> +	  const _UIntType __fx[2] = { 0, __a };
>  
>  	  for (int __k = 0; __k < (__n - __m); ++__k)
>  	    {
>  	      _UIntType __y = ((_M_x[__k] & __upper_mask)
>  			       | (_M_x[__k + 1] & __lower_mask));
> -	      _M_x[__k] = (_M_x[__k + __m] ^ (__y >> 1)
> -			   ^ ((__y & 0x01) ? __a : 0));
> +	      _M_x[__k] = _M_x[__k + __m] ^ (__y >> 1) ^ __fx[__y & 0x01];
>  	    }
>  
>  	  for (int __k = (__n - __m); __k < (__n - 1); ++__k)

I think this will actually be slower on many architectures. Modern
architectures tend to have long-latency loads and stores, but offer a
conditional move to compensate (for example, on alphaev6,
"y & 1 ? a : 0" takes 1 insn/2 cycles, while "fx[y&1]" takes
2 insns/4 cycles). Also, this really looks like something the compiler
should do itself if it is a win...
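
For comparison, the two formulations from the patch can be isolated as a rough sketch. The function names, the fixed 32-bit word type, and the use of the MT19937 coefficient 0x9908b0df for __a are assumptions made for illustration; the real code is templated on _UIntType and __a. The sketch only shows that both forms compute the same value. Which one is faster depends on the target and on what the compiler emits, so it would have to be measured.

```cpp
#include <cstdint>

// Assumed for illustration: the MT19937 value of the template parameter __a.
constexpr std::uint32_t kA = 0x9908b0dfu;

// Form before the patch: pick the coefficient with a conditional
// expression, which a compiler may turn into a conditional move.
inline std::uint32_t twist_select(std::uint32_t x_m, std::uint32_t y)
{
  return x_m ^ (y >> 1) ^ ((y & 0x01) ? kA : 0);
}

// Form after the patch: pick the coefficient by indexing a two-entry
// table, which may require a load from memory.
inline std::uint32_t twist_table(std::uint32_t x_m, std::uint32_t y)
{
  const std::uint32_t fx[2] = { 0, kA };
  return x_m ^ (y >> 1) ^ fx[y & 0x01];
}

// Hypothetical helper: check that both forms agree on a few inputs
// covering both values of the low bit.
inline bool variants_agree()
{
  const std::uint32_t samples[] = { 0u, 1u, 2u, 0x7fffffffu,
                                    0x80000000u, 0xffffffffu };
  for (std::uint32_t x_m : samples)
    for (std::uint32_t y : samples)
      if (twist_select(x_m, y) != twist_table(x_m, y))
        return false;
  return true;
}
```

Inspecting the generated assembly for both functions on the target of interest (for example with g++ -O2 -S) is one way to see whether the compiler already produces the same code for both.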

-- 
	Falk

