[PATCH] normal_distribution<double> performance improvement with SSE

Jakub Jelinek jakub@redhat.com
Wed Sep 26 12:12:00 GMT 2012


On Wed, Sep 26, 2012 at 07:16:09AM -0400, Ulrich Drepper wrote:
> Here is a patch to accelerate the __generate function for the
> normal_distribution<double> class.  The speed-up is quite significant,
> the amount depending on which random number engine is used.
> 
> mt19937        +20%
> mt19937_64     +30%
> sfmt19937      +30%
> sfmt19937_64   +30%
> 
> 
> This patch introduces a header with optimizations for <random>.  No
> changes to existing code are needed; this is a straightforward
> specialization.  Tested on x86_64-linux.  More optimizations will follow;
> there is still quite a bit of inefficiency in the existing interfaces.
> OK to commit?

Have you also considered an __AVX__ version handling 4 elements at a time?
Without __AVX2__ one would need to cast __m256i to __m256d for the and/or
operations, as AVX1 doesn't have _mm256_and_si256 or _mm256_or_si256, but
_mm256_and_pd and _mm256_or_pd could be used instead.
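
A minimal sketch of the cast-and-mask idea, not taken from the patch.  It
assumes the and/or operations are used to place random bits in the mantissa
and force the exponent so that each lane becomes a double in [1, 2); that use
is a guess.  The function names and the constants mant_mask and exp_one are
hypothetical.  Only AVX1 intrinsics are used.

```cpp
#include <immintrin.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper: turn four 64-bit words into four doubles in [1, 2)
// using AVX1 only.  The integer vector is reinterpreted as __m256d so the
// floating-point and/or instructions can do the bitwise work.
__attribute__((target("avx")))
void bits_to_unit_doubles_avx(const std::uint64_t in[4], double out[4])
{
    // Low 52 bits: the mantissa field of an IEEE 754 double.
    const __m256i mant_mask = _mm256_set1_epi64x(0x000fffffffffffffLL);
    // Sign 0, biased exponent 0x3ff: the bit pattern of 1.0.
    const __m256i exp_one = _mm256_set1_epi64x(0x3ff0000000000000LL);

    __m256i raw = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in));

    // The casts generate no instructions; they only change the static type.
    __m256d v = _mm256_and_pd(_mm256_castsi256_pd(raw),
                              _mm256_castsi256_pd(mant_mask));
    v = _mm256_or_pd(v, _mm256_castsi256_pd(exp_one));

    _mm256_storeu_pd(out, v);
}

// Scalar reference doing the same bit manipulation on one element.
double bits_to_unit_double_scalar(std::uint64_t bits)
{
    bits = (bits & 0x000fffffffffffffULL) | 0x3ff0000000000000ULL;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

The target attribute is GCC/Clang specific and lets the function build
without -mavx; with -mavx on the command line it can simply be dropped.
Subtracting 1.0 from each lane afterwards would map the result to [0, 1).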

	Jakub


