This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [v3] Add tr1::poisson_distribution


On Tue, 2006-08-15 at 08:24 +0200, Falk Hueffner wrote:
> Paolo Carlini <pcarlini@suse.de> writes:
> 
> > 	* include/tr1/random.tcc (mersenne_twister<>::operator()): Tweak
> > 	a bit for efficiency.


> > +	  const _UIntType __fx[2] = { 0, __a };
> >  
> >  	  for (int __k = 0; __k < (__n - __m); ++__k)
> >  	    {
> >  	      _UIntType __y = ((_M_x[__k] & __upper_mask)
> >  			       | (_M_x[__k + 1] & __lower_mask));
> > -	      _M_x[__k] = (_M_x[__k + __m] ^ (__y >> 1)
> > -			   ^ ((__y & 0x01) ? __a : 0));
> > +	      _M_x[__k] = _M_x[__k + __m] ^ (__y >> 1) ^ __fx[__y & 0x01];
> >  	    }
> >  
> >  	  for (int __k = (__n - __m); __k < (__n - 1); ++__k)
> 
> I think this is actually going to be slower on many architectures,
> since modern architectures tend to have long latency load/store, but a
> conditional move to compensate (for example on alphaev6, "y & 1 ? a : 0"
> takes 1 insn/2 cycles, and "fx[y&1]" takes 2/4). Also, it really looks
> like something the compiler should do itself if it is a win.

I bet there is a better way to optimize this without a branch or a load.

Something like:
(-(__y & 0x01)) & __a

Oh, I just looked at GCC's output for "(y & 1) ? a : 0", and it already
knows how to convert that into the above, so this change really causes a
de-optimization on 95% of the targets.

Thanks,
Andrew Pinski
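
A minimal sketch of the three forms discussed in this thread, written as
free functions so they can be compared side by side. The function names
(select_ternary, select_table, select_mask, forms_agree) and the use of
std::uint32_t in place of _UIntType are assumptions made for illustration;
they are not taken from the patch or from libstdc++. The sketch only shows
that the forms compute the same value, and says nothing about which one is
faster on a given target.

```cpp
#include <cassert>
#include <cstdint>

// Form used before the patch: conditional select on the low bit of y.
inline std::uint32_t select_ternary(std::uint32_t y, std::uint32_t a)
{
  return (y & 0x01u) ? a : 0u;
}

// Form introduced by the patch: index a two-entry table with the low bit.
inline std::uint32_t select_table(std::uint32_t y, std::uint32_t a)
{
  const std::uint32_t fx[2] = { 0u, a };
  return fx[y & 0x01u];
}

// Form suggested in the reply: no branch and no load.
// (y & 1) is 0 or 1, so its unsigned negation is either 0 or all ones,
// which then masks a down to 0 or leaves it unchanged.
inline std::uint32_t select_mask(std::uint32_t y, std::uint32_t a)
{
  return (-(y & 0x01u)) & a;
}

// Hypothetical helper: checks that all three forms agree for a range of y.
inline bool forms_agree(std::uint32_t a)
{
  for (std::uint32_t y = 0; y < 1024u; ++y)
    {
      const std::uint32_t expected = select_ternary(y, a);
      if (select_table(y, a) != expected || select_mask(y, a) != expected)
        return false;
    }
  return true;
}
```

Calling forms_agree with any value of a, for example the MT19937
constant 0x9908b0df, should return true. To see what a particular compiler
and target actually generate for each form, inspecting the assembly output
(for example with g++ -O2 -S) is more reliable than reasoning from the
source.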

