Bug 32812 - random_seed and date_and_time
Summary: random_seed and date_and_time
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libfortran (show other bugs)
Version: 4.3.0
: P3 enhancement
Target Milestone: 4.4.0
Assignee: Francois-Xavier Coudert
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch
Depends on:
Blocks:
 
Reported: 2007-07-18 21:30 UTC by Thomas Koenig
Modified: 2008-03-11 10:50 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-08-10 21:28:07


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Koenig 2007-07-18 21:30:20 UTC
As described in <OfqdnaPWZ-DR5wPbnZ2dnUVZ_qm3nZ2d@comcast.com>,
the interaction between date_and_time and random_seed
when getting a seed from date_and_time.

Currently, we use the first four values of the put argument
to random_seed for the most significant digits.  Unfortunately,
these correspond to the most slowly varying values from
date_and_time.

Some reshuffling could take care of this.
Comment 2 Francois-Xavier Coudert 2007-08-10 19:04:15 UTC
I guess we have to scramble the seed given by the user, something like this:

Index: intrinsics/random.c
===================================================================
--- intrinsics/random.c (revision 127331)
+++ intrinsics/random.c (working copy)
@@ -32,6 +32,7 @@ Boston, MA 02110-1301, USA.  */
 #include "config.h"
 #include "libgfortran.h"
 #include <gthr.h>
+#include <string.h>
 
 extern void random_r4 (GFC_REAL_4 *);
 iexport_proto(random_r4);
@@ -639,6 +640,29 @@ arandom_r16 (gfc_array_r16 *x)
 
 #endif
 
+
+
+static void
+scramble_seed (unsigned char *dest, unsigned char *src, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    dest[(i % 2) * (size / 2) + i / 2] = src[i];
+}
+
+
+static void
+unscramble_seed (unsigned char *dest, unsigned char *src, int size)
+{
+  int i;
+
+  for (i = 0; i < size; i++)
+    dest[i] = src[(i % 2) * (size / 2) + i / 2];
+}
+
+
+
 /* random_seed is used to seed the PRNG with either a default
    set of seeds or user specified set of seeds.  random_seed
    must be called with no argument or exactly one argument.  */
@@ -647,6 +671,7 @@ void
 random_seed (GFC_INTEGER_4 *size, gfc_array_i4 *put, gfc_array_i4 *get)
 {
   int i;
+  unsigned char seed[4*kiss_size];
 
   __gthread_mutex_lock (&random_lock);
 
@@ -654,10 +679,8 @@ random_seed (GFC_INTEGER_4 *size, gfc_ar
     {
       /* From the standard: "If no argument is present, the processor assigns
          a processor-dependent value to the seed."  */
-
       for (i=0; i<kiss_size; i++)
        kiss_seed[i] = kiss_default_seed[i];
-
     }
 
   if (size != NULL)
@@ -673,9 +696,15 @@ random_seed (GFC_INTEGER_4 *size, gfc_ar
       if (((put->dim[0].ubound + 1 - put->dim[0].lbound)) < kiss_size)
         runtime_error ("Array size of PUT is too small.");
 
-      /*  This code now should do correct strides.  */
+      /*  We copy the seed given by the user.  */
       for (i = 0; i < kiss_size; i++)
-       kiss_seed[i] =(GFC_UINTEGER_4) put->data[i * put->dim[0].stride];
+       memcpy (seed + i * sizeof(GFC_UINTEGER_4),
+               &(put->data[(kiss_size - 1 - i) * put->dim[0].stride]),
+               sizeof(GFC_UINTEGER_4));
+
+      /* We put it after scrambling the bytes, to paper around users who
+        provide low-quality seeds.  */
+      scramble_seed ((unsigned char *) kiss_seed, seed, 4*kiss_size);
     }
 
   /* Return the seed to GET data.  */
@@ -689,9 +718,14 @@ random_seed (GFC_INTEGER_4 *size, gfc_ar
       if (((get->dim[0].ubound + 1 - get->dim[0].lbound)) < kiss_size)
        runtime_error ("Array size of GET is too small.");
 
+      /* Unscramble the seed.  */
+      unscramble_seed (seed, (unsigned char *) kiss_seed, 4*kiss_size);
+
       /*  This code now should do correct strides.  */
       for (i = 0; i < kiss_size; i++)
-        get->data[i * get->dim[0].stride] = (GFC_INTEGER_4) kiss_seed[i];
+       memcpy (&(get->data[(kiss_size - 1 - i) * get->dim[0].stride]),
+               seed + i * sizeof(GFC_UINTEGER_4),
+               sizeof(GFC_UINTEGER_4));
     }
 
   __gthread_mutex_unlock (&random_lock);


This doesn't make miracles (i.e. provide you with a good seed when you input a particularly poor one), but at least it makes using the VALUES of DATE_AND_TIME less frustrating (by generating visibly different streams of PRN).
Comment 3 kargls 2007-08-10 19:10:16 UTC
UGH.

I'd rather document gfortran's behavior than add additional hacks
into the compiler.
Comment 4 Francois-Xavier Coudert 2007-08-10 19:23:47 UTC
(In reply to comment #3)
> UGH.
> 
> I'd rather document gfortran's behavior than add additional hacks
> into the compiler.

Hum, I was of the same opinion at first. What somehow convinced me that it might be worth the price is that the use of DATE_AND_TIME to initialize the random seed is actually in the Metcalf book. There is a nice chance that it will end up in most user codes.

But, as you said, it's a hack, I'm not submitting a patch yet. I'd like to have Thomas' opinion first.
Comment 5 kargls 2007-08-10 20:47:52 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > UGH.
> > 
> > I'd rather document gfortran's behavior than add additional hacks
> > into the compiler.
> 
> Hum, I was of the same opinion at first. What somehow convinced me that it
> might be worth the price is that the use of DATE_AND_TIME to initialize the
> random seed is actually in the Metcalf book. There is a nice chance that it
> will end up in most user codes.

Ugh**2 :-)

I loaned my copy of M&R to colleague and it never returned.  I can't imagine
that M&R did not add some qualifier to quality of seeds that one gets from
DATE_AND_TIME.

Your attempt to randomize the seeds really isn't all that good because
there is such a limited amount of entropy in year, month, day, and hour.
I have analyzed your algorthim closely enough, so what is the effect
on someone who actually knows how to properly seed gfortran's PRNG.
Are there any degenerate side effects?

The person of c.l.f is simply doing a rather crappy test of randomness.
See my replies.
Comment 6 Francois-Xavier Coudert 2007-08-10 21:28:06 UTC
(In reply to comment #5)
> Your attempt to randomize the seeds really isn't all that good because
> there is such a limited amount of entropy in year, month, day, and hour.

I do agree. There's no way I'm adding entropy just by a bijective function :)

> I have analyzed your algorthim closely enough, so what is the effect
> on someone who actually knows how to properly seed gfortran's PRNG.
> Are there any degenerate side effects?

There shouldn't be any side effect. What I'm doing is that I'm shuffling the bytes between user-provided and internal KISS seed so that the first bytes of the user seed end up distributed in both parts of the KISS seed (used for MSB and LSB of the final random numbers); same thing for the last bytes. In that way, whether the user seed has stronger first or last elements, it will not reflect on the quality of the MSB/LSB generated.

So, since it's just shuffling bytes (and it's a bijective operation, of course), I don't think I'm messing with any of the properties of the generator.
Comment 7 Thomas Koenig 2007-08-12 11:19:07 UTC
> So, since it's just shuffling bytes (and it's a bijective operation, of
> course), I don't think I'm messing with any of the properties of the generator.

I concur.  It might even be better to do bit-wise shuffling to get
the most out of the millisecond-field in date_and_time.  You'll have
to watch for the real(kind=16) case, though.
Comment 8 Francois-Xavier Coudert 2007-08-12 19:50:27 UTC
(In reply to comment #7)
> It might even be better to do bit-wise shuffling to get
> the most out of the millisecond-field in date_and_time.

I thought about that at the beginning, but then it really gets messy :(  We have to draw the line somewhere.

> You'll have
> to watch for the real(kind=16) case, though.

Yes, but since the KISS seeds are gathered together nicely, it shouldn't be too hard.
Comment 9 Francois-Xavier Coudert 2008-03-11 10:49:57 UTC
Subject: Bug 32812

Author: fxcoudert
Date: Tue Mar 11 10:49:13 2008
New Revision: 133104

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133104
Log:
	PR libfortran/32812
	* intrinsics/random.c (scramble_seed, unscramble_seed): New
	functions.
	(random_seed_i4): Scramble the seed the user gives us before
	storing it, and unscramble it when we return it back later.

Modified:
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/intrinsics/random.c

Comment 10 Francois-Xavier Coudert 2008-03-11 10:50:39 UTC
Fixed on mainline.