PR fortran/51727: make module files reproducible, question on C++ in gcc

Tobias Schlüter tobias.schlueter@physik.uni-muenchen.de
Sat Oct 13 13:41:00 GMT 2012


Hi,

first a question also to non-gfortraners: if I want to use std::map, 
where do I "#include <map>"?  In system.h?

Now to the patch-specific part: in this PR, module files are produced 
with random changes because the order in which symbols are written can 
depend on the memory layout.  This patch fixes this by recording which 
symbols need to be written and then processing them in order.  The patch 
doesn't make the more involved effort of putting all symbols into the 
module in an easily predicted order, instead it only makes sure that the 
order remains fixed across the compiler invocations.  The reason why the 
former is difficult is that during the process of writing a symbol, it 
can turn out that other symbols will have to be written as well (say, 
because they appear in array specifications).  Since the module-writing 
code determines which symbols to output while actually writing the file, 
recording all the symbols that need to be written before writing to the 
file would mean a lot of surgery.

I'm putting forward two patches.  One uses a C++ map to very concisely 
build up and handle the ordered list of symbols.  This has three problems:
1) gfortran maintainers may not want C++isms (even though in this case
    it's very localized, and in my opinion very transparent), and
2) it can't be backported to old release branches which are still
    compiled as C.  Joost expressed interested in a backport.
3) I don't know where to #include <map> (see above)
Therefore I also propose a patch where I added the necessary ~50 lines 
of boilerplate code and added the necessary traversal function to use 
gfortran's GFC_BBT to maintain the ordered tree of symbols.

Both patches pass the testsuite and Joost confirms that they fix the 
problem with CP2K.  I also verified with a few examples that they both 
produce identical .mod files as they should.

Is the C++ patch, modified to do the #include correctly, ok for the 
trunk?  If not, the C-only patch?  Can I put the C-only patch on the 
release branches?  And which?

Cheers,
- Tobi
-------------- next part --------------
2012-10-13  Tobias Schlüter  <tobi@gcc.gnu.org>

	PR fortran/51727
	* module.c (sorted_pointer_info): New.
	(gfc_get_sorted_pointer_info): New.
	(free_sorted_pointer_info_tree): New.
	(compare_sorted_pointer_info): New.
	(find_symbols_to_write): New.
	(write_symbol1_recursion): New.
	(write_symbol1): Collect symbols that need writing, output in order.
	(write_generic): Traverse tree in order.

diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 5cfc335..4cfcae4 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -5150,32 +5150,122 @@ write_symbol0 (gfc_symtree *st)
 }
 
 
-/* Recursive traversal function to write the secondary set of symbols
-   to the module file.  These are symbols that were not public yet are
-   needed by the public symbols or another dependent symbol.  The act
-   of writing a symbol can modify the pointer_info tree, so we cease
-   traversal if we find a symbol to write.  We return nonzero if a
-   symbol was written and pass that information upwards.  */
+/* Type for the temporary tree used when writing secondary symbols.  */
+
+struct sorted_pointer_info
+{
+  BBT_HEADER (sorted_pointer_info);
+
+  pointer_info *p;
+};
+
+#define gfc_get_sorted_pointer_info() XCNEW (sorted_pointer_info)
+
+/* Recursively traverse the temporary tree, free its contents.  */
+
+static void
+free_sorted_pointer_info_tree (sorted_pointer_info *p)
+{
+  if (!p)
+    return;
+
+  free_sorted_pointer_info_tree (p->left);
+  free_sorted_pointer_info_tree (p->right);
+
+  free (p);
+}
+
+/* Comparison function for the temporary tree.  */
 
 static int
-write_symbol1 (pointer_info *p)
+compare_sorted_pointer_info (void *_spi1, void *_spi2)
 {
-  int result;
+  sorted_pointer_info *spi1, *spi2;
+  spi1 = (sorted_pointer_info *)_spi1;
+  spi2 = (sorted_pointer_info *)_spi2;
 
+  if (spi1->p->integer < spi2->p->integer)
+    return -1;
+  if (spi1->p->integer > spi2->p->integer)
+    return 1;
+  return 0;
+}
+
+
+/* Finds the symbols that need to be written and collects them in the
+   sorted_pi tree so that they can be traversed in an order
+   independent of memory addresses.  */
+
+static void
+find_symbols_to_write(sorted_pointer_info **tree, pointer_info *p)
+{
+  if (!p)
+    return;
+
+  if (p->type == P_SYMBOL && p->u.wsym.state == NEEDS_WRITE)
+    {
+      sorted_pointer_info *sp = gfc_get_sorted_pointer_info();
+      sp->p = p; 
+ 
+      gfc_insert_bbt (tree, sp, compare_sorted_pointer_info);
+   }
+
+  find_symbols_to_write (tree, p->left);
+  find_symbols_to_write (tree, p->right);
+}
+
+
+/* Recursive function that traverses the tree of symbols that need to be
+   written and writes them in order.  */
+
+static void
+write_symbol1_recursion (sorted_pointer_info *sp)
+{
+  if (!sp)
+    return;
+
+  write_symbol1_recursion (sp->left);
+
+  pointer_info *p1 = sp->p;
+  gcc_assert (p1->type == P_SYMBOL && p1->u.wsym.state == NEEDS_WRITE);
+
+  p1->u.wsym.state = WRITTEN;
+  write_symbol (p1->integer, p1->u.wsym.sym);
+ 
+  write_symbol1_recursion (sp->right);
+}
+
+
+/* Write the secondary set of symbols to the module file.  These are
+   symbols that were not public yet are needed by the public symbols
+   or another dependent symbol.  The act of writing a symbol can add
+   symbols to the pointer_info tree, so we return nonzero if a symbol
+   was written and pass that information upwards.  The caller will
+   then call this function again until nothing was written.  It uses
+   the utility functions and a temporary tree to ensure a reproducible
+   ordering of the symbol output and thus the module file.  */
+
+static int
+write_symbol1 (pointer_info *p)
+{
   if (!p)
     return 0;
 
-  result = write_symbol1 (p->left);
+  /* Put symbols that need to be written into a tree sorted on the
+     integer field.  */
 
-  if (!(p->type != P_SYMBOL || p->u.wsym.state != NEEDS_WRITE))
-    {
-      p->u.wsym.state = WRITTEN;
-      write_symbol (p->integer, p->u.wsym.sym);
-      result = 1;
-    }
+  sorted_pointer_info *spi_root = NULL;
+  find_symbols_to_write (&spi_root, p);
+
+  /* No symbols to write, return.  */
+  if (!spi_root)
+    return 0;
+
+  /* Otherwise, write and free the tree again.  */
+  write_symbol1_recursion (spi_root);
+  free_sorted_pointer_info_tree (spi_root);
 
-  result |= write_symbol1 (p->right);
-  return result;
+  return 1;
 }
 
 
@@ -5205,19 +5295,18 @@ write_generic (gfc_symtree *st)
     return;
 
   write_generic (st->left);
-  write_generic (st->right);
 
   sym = st->n.sym;
-  if (!sym || check_unique_name (st->name))
-    return;
-
-  if (sym->generic == NULL || !gfc_check_symbol_access (sym))
-    return;
+  if (sym && !check_unique_name (st->name)
+      && sym->generic && gfc_check_symbol_access (sym))
+    {
+      if (!sym->module)
+	sym->module = module_name;
 
-  if (sym->module == NULL)
-    sym->module = module_name;
+      mio_symbol_interface (&st->name, &sym->module, &sym->generic);
+    }
 
-  mio_symbol_interface (&st->name, &sym->module, &sym->generic);
+  write_generic (st->right);
 }
 
 
-------------- next part --------------
2012-10-10  Tobias Schlüter  <tobi@gcc.gnu.org>

	    PR fortran/51727
	    * module.c (pointer_info_to_map): New.
	    (write_symbol1): Write symbols in order of integer field.
	    (write_generic): Traverse tree in left-to-right order.

diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 5cfc335..6d37144 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -78,6 +78,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cpp.h"
 #include "tree.h"
 
+#include <map>
+
 #define MODULE_EXTENSION ".mod"
 
 /* Don't put any single quote (') in MOD_VERSION, 
@@ -5150,6 +5152,23 @@ write_symbol0 (gfc_symtree *st)
 }
 
 
+/* Recursively collects the symbols that need to be written, stores
+   them into a map sorted on their integer.  */
+
+static void
+find_symbols_to_write(std::map<int, pointer_info*>& m, pointer_info *p)
+{
+  if (!p)
+    return;
+
+  if (p->type == P_SYMBOL && p->u.wsym.state == NEEDS_WRITE)
+    m[p->integer] = p;
+
+  find_symbols_to_write (m, p->left);
+  find_symbols_to_write (m, p->right);
+}
+
+
 /* Recursive traversal function to write the secondary set of symbols
    to the module file.  These are symbols that were not public yet are
    needed by the public symbols or another dependent symbol.  The act
@@ -5157,24 +5176,30 @@ write_symbol0 (gfc_symtree *st)
    traversal if we find a symbol to write.  We return nonzero if a
    symbol was written and pass that information upwards.  */
 
-static int
+static bool
 write_symbol1 (pointer_info *p)
 {
-  int result;
-
   if (!p)
     return 0;
 
-  result = write_symbol1 (p->left);
+  int result = 0;
+
+  /* Put symbols into a map, traversal is then sorted.  */
+  std::map <int, pointer_info *> mpointers;
+  find_symbols_to_write (mpointers, p);
 
-  if (!(p->type != P_SYMBOL || p->u.wsym.state != NEEDS_WRITE))
+  for (std::map <int, pointer_info *>::iterator it = mpointers.begin();
+       it != mpointers.end(); it++)
     {
-      p->u.wsym.state = WRITTEN;
-      write_symbol (p->integer, p->u.wsym.sym);
+      pointer_info *p1 = it->second;
+      gcc_assert (p1 && it->first == p1->integer);
+      gcc_assert (p1->type == P_SYMBOL && p1->u.wsym.state == NEEDS_WRITE);
+
+      p1->u.wsym.state = WRITTEN;
+      write_symbol (p1->integer, p1->u.wsym.sym);
       result = 1;
     }
 
-  result |= write_symbol1 (p->right);
   return result;
 }
 
@@ -5205,19 +5230,18 @@ write_generic (gfc_symtree *st)
     return;
 
   write_generic (st->left);
-  write_generic (st->right);
 
   sym = st->n.sym;
-  if (!sym || check_unique_name (st->name))
-    return;
-
-  if (sym->generic == NULL || !gfc_check_symbol_access (sym))
-    return;
+  if (sym && !check_unique_name (st->name)
+      && sym->generic && gfc_check_symbol_access (sym))
+    {
+      if (!sym->module)
+	sym->module = module_name;
 
-  if (sym->module == NULL)
-    sym->module = module_name;
+      mio_symbol_interface (&st->name, &sym->module, &sym->generic);
+    }
 
-  mio_symbol_interface (&st->name, &sym->module, &sym->generic);
+  write_generic (st->right);
 }
 
 


More information about the Gcc-patches mailing list