[PATCH] normal_distribution<double> performance improvement with SSE
Jakub Jelinek
jakub@redhat.com
Wed Sep 26 12:12:00 GMT 2012
On Wed, Sep 26, 2012 at 07:16:09AM -0400, Ulrich Drepper wrote:
> Here is a patch to accelerate the __generate function for the
> normal_distribution<double> class. The speed-up is quite significant,
> the amount depending on which random number engine is used.
>
>   mt19937       +20%
>   mt19937_64    +30%
>   sfmt19937     +30%
>   sfmt19937_64  +30%
>
> This patch introduces a header with optimizations for <random>. No
> changes to existing code are needed; this is a straightforward
> specialization. Tested on x86_64-linux. More optimizations will follow,
> as there is still quite a bit of inefficiency in the existing interfaces.
> OK to commit?
Have you also considered an __AVX__ version that handles 4 elements at a time?
Without __AVX2__ one would need to cast __m256i to __m256d for the and/or
operations, since AVX1 doesn't have _mm256_and_si256 or _mm256_or_si256, but
_mm256_and_pd and _mm256_or_pd could be used instead.
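
To illustrate the cast, here is a rough sketch, not taken from the patch.
The function names and the mask values are assumptions made up for this
example: the masks keep the low 52 bits of each 64-bit lane and or in the
exponent bits of 1.0, which is one common way to turn random bits into
doubles in [1, 2). The patch itself may do something different. The
target attribute is a GCC/Clang extension, used so that the sketch builds
without -mavx.

```cpp
#include <immintrin.h>
#include <cstdint>

// Scalar reference: out[i] = (in[i] & and_mask) | or_mask for 4 lanes.
void mask_bits_scalar(const std::uint64_t* in, std::uint64_t and_mask,
                      std::uint64_t or_mask, std::uint64_t* out)
{
    for (int i = 0; i < 4; ++i)
        out[i] = (in[i] & and_mask) | or_mask;
}

// Hypothetical AVX1-only version. AVX1 has no 256-bit integer and/or,
// so the integer vectors are reinterpreted as __m256d (the casts only
// change the type, not the bits) and combined with _mm256_and_pd and
// _mm256_or_pd.
__attribute__((target("avx")))
void mask_bits_avx1(const std::uint64_t* in, std::uint64_t and_mask,
                    std::uint64_t or_mask, std::uint64_t* out)
{
    const __m256d vand = _mm256_castsi256_pd(
        _mm256_set1_epi64x(static_cast<long long>(and_mask)));
    const __m256d vor = _mm256_castsi256_pd(
        _mm256_set1_epi64x(static_cast<long long>(or_mask)));

    // Unaligned load of four 64-bit integers.
    const __m256i vin =
        _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in));

    // Bitwise (in & and_mask) | or_mask, done in the double domain.
    const __m256d r =
        _mm256_or_pd(_mm256_and_pd(_mm256_castsi256_pd(vin), vand), vor);

    // Cast back to the integer type and store the four lanes.
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out),
                        _mm256_castpd_si256(r));
}
```

With __AVX2__ available, the same thing could presumably be written with
_mm256_and_si256 and _mm256_or_si256 directly and the casts dropped.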
Jakub