This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [v3] Add tr1::poisson_distribution
- From: Falk Hueffner <falk at debian dot org>
- To: Paolo Carlini <pcarlini at suse dot de>
- Cc: "'gcc-patches at gcc dot gnu dot org'" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 15 Aug 2006 08:24:06 +0200
- Subject: Re: [v3] Add tr1::poisson_distribution
- References: <44E131E7.6020307@suse.de>
Paolo Carlini <pcarlini@suse.de> writes:
> * include/tr1/random.tcc (mersenne_twister<>::operator()): Tweak
> a bit for efficiency.
> --- include/tr1/random.tcc (revision 116148)
> +++ include/tr1/random.tcc (working copy)
> @@ -285,13 +285,13 @@
>      {
>        const _UIntType __upper_mask = (~_UIntType()) << __r;
>        const _UIntType __lower_mask = ~__upper_mask;
> +      const _UIntType __fx[2] = { 0, __a };
>
>        for (int __k = 0; __k < (__n - __m); ++__k)
>          {
>            _UIntType __y = ((_M_x[__k] & __upper_mask)
>                             | (_M_x[__k + 1] & __lower_mask));
> -          _M_x[__k] = (_M_x[__k + __m] ^ (__y >> 1)
> -                       ^ ((__y & 0x01) ? __a : 0));
> +          _M_x[__k] = _M_x[__k + __m] ^ (__y >> 1) ^ __fx[__y & 0x01];
>          }
>
>        for (int __k = (__n - __m); __k < (__n - 1); ++__k)
I think this is actually going to be slower on many architectures,
since modern architectures tend to have long-latency loads and stores
but offer a conditional move to compensate (for example, on alphaev6,
"y & 1 ? a : 0" takes 1 insn/2 cycles, while "fx[y&1]" takes 2 insns/4
cycles). Also, it really looks like something the compiler should do
itself if it is a win...
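
To make the comparison concrete, here is a minimal standalone sketch of
the two forms under discussion, plus a third load-free mask variant that
is not in the patch. The function names and the constant kA are
hypothetical stand-ins for illustration (kA uses the MT19937 value of
__a); this is not the libstdc++ code, and which form is fastest has to
be measured per target.

```cpp
#include <cstdint>

// Hypothetical stand-in for the template parameter __a (MT19937 value).
const std::uint32_t kA = 0x9908b0dfu;

// Original form: a select on the low bit, which a compiler can
// typically turn into a conditional move where the target has one.
inline std::uint32_t twist_select(std::uint32_t y)
{
  return (y >> 1) ^ ((y & 0x01) ? kA : 0);
}

// Patched form: index a two-entry table with the low bit.
// This needs a load unless the compiler folds the table away.
inline std::uint32_t twist_table(std::uint32_t y)
{
  static const std::uint32_t fx[2] = { 0, kA };
  return (y >> 1) ^ fx[y & 0x01];
}

// Assumed alternative, not part of the patch: build an all-ones or
// all-zeros mask from the low bit, avoiding both branch and load.
inline std::uint32_t twist_mask(std::uint32_t y)
{
  const std::uint32_t mask = static_cast<std::uint32_t>(0u - (y & 0x01));
  return (y >> 1) ^ (kA & mask);
}
```

All three compute the same value for every input, so the choice only
matters for code generation; comparing the assembly output for each at
-O2 on the target of interest is the easiest way to check.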
--
Falk