This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug libstdc++/66302] Wrong output sequence of double precision uniform C++ RNG distribution


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66302

--- Comment #2 from Andrey Kolesov <andrey.kolesov at intel dot com> ---
(In reply to Jonathan Wakely from comment #1)
> (In reply to Andrey Kolesov from comment #0)
> > Double precision uniform distribution of C++ random number generators from
> > libstdc++ produces sequence which is significantly different from floating
> > point and integer (direct engine) generators.
> > Double precision sequence contains only every second (odd: 1,3,5,7...)
> > element from float and integer sequences. Generally generator output
> > shouldn't depend on output data type up to precision bounds.
> 
> Where does it say that in the standard?
> 
> Your code says:
> 
>     /* All three sequences expected to be equal up to precision bounds */
> 
> Where does the standard say you should expect that?

Right, the C++ standard says that "The algorithms for producing each of the
specified distributions are implementation-defined" (26.5.8.1
[rand.dist.general], paragraph 3).
The standard strictly requires engines to satisfy their defining equations (for
example, for the minstd_rand0 LCG: x[i+1] = (a*x[i] + c) mod m), but it imposes
no such requirement on the distributions based on these engines.
Formally this is not a "bug", I agree; you may close the issue.
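
The engine requirement mentioned above can be checked directly. The following
is a minimal sketch, assuming std::minstd_rand0 as the example engine; the
helper names lcg_step and engine_matches_recurrence are hypothetical and exist
only for illustration. It compares the engine output with the hand-written
recurrence x[i+1] = (a*x[i] + c) mod m.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// One step of the recurrence x[i+1] = (a*x[i] + c) mod m.
// 64-bit arithmetic avoids overflow for the minstd_rand0 parameters.
std::uint64_t lcg_step(std::uint64_t x, std::uint64_t a,
                       std::uint64_t c, std::uint64_t m) {
    return (a * x + c) % m;
}

// Returns true if the first n outputs of std::minstd_rand0 seeded with `seed`
// match the hand-written recurrence. Hypothetical helper for illustration.
bool engine_matches_recurrence(std::uint32_t seed, int n) {
    const std::uint64_t a = std::minstd_rand0::multiplier;  // 16807
    const std::uint64_t c = std::minstd_rand0::increment;   // 0
    const std::uint64_t m = std::minstd_rand0::modulus;     // 2147483647

    // For seeds in [1, m-1] the initial engine state equals the seed itself.
    assert(seed >= 1 && seed < m);

    std::minstd_rand0 engine(seed);
    std::uint64_t x = seed;
    for (int i = 0; i < n; ++i) {
        x = lcg_step(x, a, c, m);
        if (engine() != x) return false;
    }
    return true;
}
```

A call such as engine_matches_recurrence(1, 10000) should hold on any
conforming implementation, because the engine algorithm is fixed by the
standard. No comparable check can be written for a distribution, since its
algorithm is implementation-defined.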

From the perspective of a data scientist or analytics application developer,
the way in which the double precision output of the uniform distribution
generator is produced is questionable.
Consider the following scenario: a data scientist designs a stochastic
model and uses an RNG for Monte Carlo simulations based on the model.
To tune the parameters of the model, he or she needs to fix a seed and, say, a
single precision random number sequence.
During tuning, the researcher realizes that single precision is not
sufficient for the modeling goals and needs to switch to a double precision
sequence produced with the same RNG and seed.
However, switching to double precision with the C++ RNGs will result in
different values of the parameters. You can imagine the amount of effort
necessary to understand what went wrong with the model, tuning, and simulations.
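
To see why the two sequences diverge in such a scenario, it can help to count
how often the engine is invoked per generated value. The following is only a
sketch: CountingEngine and engine_calls_for are hypothetical names introduced
here, std::mt19937 is an assumed engine choice (not necessarily the one used in
the original test case), and the resulting counts are implementation-defined,
so other standard libraries may behave differently.

```cpp
#include <cstddef>
#include <random>

// Wrapper that forwards to std::mt19937 and counts how many times the
// underlying engine is invoked. Hypothetical helper for illustration.
struct CountingEngine {
    using result_type = std::mt19937::result_type;

    std::mt19937 engine;
    std::size_t calls = 0;

    explicit CountingEngine(result_type seed) : engine(seed) {}

    static constexpr result_type min() { return std::mt19937::min(); }
    static constexpr result_type max() { return std::mt19937::max(); }

    result_type operator()() {
        ++calls;
        return engine();
    }
};

// Number of engine invocations consumed while drawing `draws` values of type
// Real from std::uniform_real_distribution<Real> on [0, 1).
template <typename Real>
std::size_t engine_calls_for(std::size_t draws, unsigned seed = 42) {
    CountingEngine counting(seed);
    std::uniform_real_distribution<Real> dist(Real(0), Real(1));
    for (std::size_t i = 0; i < draws; ++i) {
        (void)dist(counting);  // value discarded; only the call count matters
    }
    return counting.calls;
}
```

If a library consumes one 32-bit engine output per float but two per double,
engine_calls_for<double>(n) would come out at about twice
engine_calls_for<float>(n). That would be consistent with the double precision
sequence containing only every second element of the float sequence.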

Pseudo-random generators are indeed deterministic algorithms (almost like other
math functions such as sin or exp) that produce sequences which look random.
But (float)sin(x1) is always equal to (double)sin(x1) up to precision. We could
expect the same behavior from RNGs, though the standard doesn't guarantee it.
Our team is responsible for statistical features, including random number
generators, in the Intel(R) Math Kernel Library. The Intel(R) MKL RNGs were
designed with multiple requirements in mind, including that the double and
single precision versions of the same distribution, relying on the same
algorithm and a fixed seed, produce sequences that are similar up to precision.
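
For users who need this property with the standard C++ facilities today, one
possible workaround is to always draw in double precision and narrow to float
afterwards. The sketch below illustrates that idea only; it is not how
Intel(R) MKL or libstdc++ implement their generators, std::mt19937 is an
assumed engine choice, and all function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Draw n doubles in [0, 1) from a freshly seeded engine.
std::vector<double> draw_double(unsigned seed, std::size_t n) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) out.push_back(dist(engine));
    return out;
}

// Draw n floats by generating doubles and narrowing them, so that the engine
// is consumed in exactly the same way as in draw_double.
std::vector<float> draw_float_via_double(unsigned seed, std::size_t n) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<float> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        float f = static_cast<float>(dist(engine));
        // Narrowing can round a value just below 1.0 up to 1.0f;
        // step back so the result stays inside [0, 1).
        if (f >= 1.0f) f = std::nextafter(1.0f, 0.0f);
        out.push_back(f);
    }
    return out;
}

// True if both sequences have equal length and agree element by element
// to within float machine epsilon.
bool agree_up_to_float_precision(const std::vector<double>& d,
                                 const std::vector<float>& f) {
    if (d.size() != f.size()) return false;
    const double tol = std::numeric_limits<float>::epsilon();
    for (std::size_t i = 0; i < d.size(); ++i) {
        if (std::fabs(d[i] - static_cast<double>(f[i])) > tol) return false;
    }
    return true;
}
```

With the same seed, draw_double and draw_float_via_double consume the engine
identically, so the two sequences differ only by the rounding of the narrowing
conversion. The cost is that the float path does the same amount of engine work
as the double path.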

Does this make sense?
Would it make sense to approach the C++ Standard WG to get their perspective
and understand whether this specific behavior of the generators should be
clearly described in the standard?
