Bug 51727 - Changing module files
Summary: Changing module files
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 4.7.0
: P3 normal
Target Milestone: ---
Assignee: Tobias Schlüter
URL: http://gcc.gnu.org/ml/fortran/2012-10...
Keywords:
Depends on:
Blocks: 25708
  Show dependency treegraph
 
Reported: 2012-01-02 00:19 UTC by Mikael Morin
Modified: 2012-11-29 07:30 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-10-11 00:00:00


Attachments
Candidate patch (352 bytes, patch)
2012-10-06 11:41 UTC, Tobias Schlüter
Details | Diff
bad module (1.30 KB, application/octet-stream)
2012-10-06 12:46 UTC, Joost VandeVondele
Details
good module (1.32 KB, application/octet-stream)
2012-10-06 12:47 UTC, Joost VandeVondele
Details
Patch to output generic interfaces in defined order (407 bytes, patch)
2012-10-10 09:48 UTC, Tobias Schlüter
Details | Diff
Patch for testing (1.18 KB, text/plain)
2012-10-11 13:03 UTC, Tobias Schlüter
Details
Patch for testing (1.24 KB, patch)
2012-10-11 13:29 UTC, Tobias Schlüter
Details | Diff
patch that doesn't use c++ (1.28 KB, patch)
2012-10-13 11:29 UTC, Tobias Schlüter
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mikael Morin 2012-01-02 00:19:15 UTC
Reported at http://gcc.gnu.org/ml/fortran/2011-12/msg00148.html

For a given fortran source file with module(s), repeated compilations may not create always the same module file.  The URL above provides an example.

It seems that the symbols are not output in the same order.  In the example provided, there are two module variants, the difference between them being that one single symbol is at a different place in the file ( as reported at http://gcc.gnu.org/ml/fortran/2011-12/msg00163.html ).

Possible causes:
 - unitialized memory, though valgrind reported nothing for me.
 - tree rebalancing.  Module I/O manages structures to be serialized using a global tree ordered (when writing) by structures' addresses.  Could an unrepeatable order just mean an unrepeatable address?.
Comment 1 Joost VandeVondele 2012-10-05 18:16:41 UTC
Also reported here:

http://gcc.gnu.org/ml/gcc/2012-10/msg00075.html

this is the source of recompilation cascades sometimes seen in CP2K as well.

I'm wondering if a very naive hack like sorting .mod content (like in cat old.mod 1 | sort -s > new.mod) could not paper over this problem sufficiently well to make it irrelevant in reality.
Comment 2 Tobias Schlüter 2012-10-05 19:35:36 UTC
I'm not in a position to test, but write_generic in module.c makes no effort to traverse the tree in a left to right order, i.e. the line
  write_generic ((gfc_symtree *)st->right);
should be moved to the bottom of the function.
Comment 3 Tobias Schlüter 2012-10-06 11:41:23 UTC
Created attachment 28372 [details]
Candidate patch

Here's a patch which fixes one case of module contents not being written in a defined order.  I cannot test this right now because of bootstrap failures.  There may be other places in the module writing code, where structures aren't traversed in a defined order.

2012-10-06  Tobias Schlüter  <tobi@gcc.gnu.org>

	PR fortran/51727
	* module.c (write_generic): Traverse tree in left-to-right order.
Comment 4 Joost VandeVondele 2012-10-06 12:42:13 UTC
(In reply to comment #3)

> 
> 2012-10-06  Tobias Schlüter  <tobi@gcc.gnu.org>
> 
>     PR fortran/51727
>     * module.c (write_generic): Traverse tree in left-to-right order.

If tested that this patch fixes the problem for omp_lib.mod, so would likely also fix recompilation cascades..

However, testing it on CP2K I'm finding that compilation fails with this patch (and passes without), so something seems wrong. The difference between the generated modules is rather large.
Comment 5 Joost VandeVondele 2012-10-06 12:46:36 UTC
Created attachment 28373 [details]
bad module
Comment 6 Joost VandeVondele 2012-10-06 12:47:19 UTC
Created attachment 28374 [details]
good module
Comment 7 Joost VandeVondele 2012-10-06 12:48:39 UTC
The main difference between 'good' and 'bad' seems to be the 'header' lines 

bad:
()

(('arch_topology' 'machine_architecture_types' 2))

()

good:
()

(('arch_topology' 'machine_architecture_types' 2) ('ma_mp_type'
'machine_architecture_types' 3) ('ma_process' 'machine_architecture_types'
4) ('machine_output' 'machine_architecture_types' 5) ('thread_inf'
'machine_architecture_types' 6))

()
Comment 8 Joost VandeVondele 2012-10-06 12:52:09 UTC
(In reply to comment #3)
> Created attachment 28372 [details]
> Candidate patch

actually... looking at the patch, don't you need to deal with the if statements that return ?
Comment 9 Simon Baldwin 2012-10-08 09:32:55 UTC
(In reply to comment #1)
> Also reported here:
> 
> http://gcc.gnu.org/ml/gcc/2012-10/msg00075.html

To add a little more detail, here is one diff seen between two .mod files:

$ diff omp_lib.mod.first omp_lib.mod
2c2
< MD5:39fa3d71e0f92c936ff97e590e8d74cf -- If you edit this, you'll get what you deserve.
---
> MD5:893e632666abe58013fd14806b18696f -- If you edit this, you'll get what you deserve.
174,175d173
< 20 'lock' '' '' 19 ((VARIABLE INOUT UNKNOWN-PROC UNKNOWN UNKNOWN 0 0
< DUMMY) (INTEGER 4 0 0 0 INTEGER ()) 0 0 () () 0 () () () 0 0)
177a176,177
> 20 'lock' '' '' 19 ((VARIABLE INOUT UNKNOWN-PROC UNKNOWN UNKNOWN 0 0
> DUMMY) (INTEGER 4 0 0 0 INTEGER ()) 0 0 () () 0 () () () 0 0)

This is an ordering difference in the output of write_symbol1, which traverses a pointer tree whose compare function compares pointers with greaterthan and lessthan.  The tree is augmented during traversal, so there are repeated calls to write_symbol1 along with a 'written' marker at each node.  Ordering by integer 'tag' assigned to each pointer may be problematic because new tags are assigned during traversals, meaning that the sequence in which tags arrive is itself partially a function of the pointer ordering.


> I'm wondering if a very naive hack like sorting .mod content (like in cat
> old.mod 1 | sort -s > new.mod) could not paper over this problem sufficiently
> well to make it irrelevant in reality.

Perhaps, but it would have to be less naive than this in practice.  A .mod file is structured into sections that a simple sort could fragment.  It also contains an MD5 checksum of its contents, but I'm not sure how important this is.  If it's not critical, removing it might be an acceptable price to pay for simpler code elsewhere.
Comment 10 Tobias Schlüter 2012-10-10 09:10:42 UTC
(In reply to comment #8)
> (In reply to comment #3)
> > Created attachment 28372 [details]
> > Candidate patch
> 
> actually... looking at the patch, don't you need to deal with the if statements
> that return ?

Yes, you're right.  I'm now in a position to actually test, and I'll try a modified patch.

(and I somehow forgot to add myself to CC, so I only now see your comments, sorry about that)
Comment 11 Tobias Schlüter 2012-10-10 09:48:28 UTC
Created attachment 28410 [details]
Patch to output generic interfaces in defined order

Here's a correct version of the previous patch.  Maybe it's useful, but as Simon correctly points out, the real issue appears to be due to the way the pointer_info tree is built.
Comment 12 Tobias Schlüter 2012-10-11 12:37:06 UTC
I have a patch which is currently running the testsuite.
Comment 13 Tobias Schlüter 2012-10-11 13:03:25 UTC
Created attachment 28424 [details]
Patch for testing

This patch should make output deterministic.  Joost, can you please try it out?  I would also be interested in the compile-time impact on module-heavy code.  Because the pointer_info tree can change during writing, I sometimes have to sort the tree several times, and I have no idea if this can become important in files with large modules.

Actually, now that I wrote this, I can already think of a trivial speedup, which I will implement before the final submission.  Nevertheless, I would still be interested if the change disappears in the noise.
Comment 14 Tobias Schlüter 2012-10-11 13:29:57 UTC
Created attachment 28425 [details]
Patch for testing

Ok, the testsuite doesn't take so much time, and the modified test passed immediately.  Here is the patch, sped up by only including the relevant symbols in the map.

NB even with this patch, the symbols will not necessarily appear in numeric order in the module file, because dependencies may only become apparent during the writing of the symbols.  Nevertheless, the order is now deterministic.
Comment 15 Tobias Schlüter 2012-10-11 14:01:27 UTC
I'm sorry that I'm spamming your inboxes, but I only now read the comment in front of write_symbol1, and it says something that I was wondering about all this time, so I want to at least have this recorded.  The comment says:
"""
   The act
   of writing a symbol can modify the pointer_info tree, so we cease
   traversal if we find a symbol to write.
"""
This has not been the case since 2007, even though it appears to make a lot of sense: if symbols are added to the balanced binary tree during symbol writing, it can happen that the layout of the tree changes.  I don't think elements can be removed from the tree during symbol writing, so the code wouldn't fail (as for all we know it didn't), but it would traverse the tree in an unpredictable order.  With my patch, the situation certainly doesn't become worse, but it would be preferable to have a two-step mechanism, where first all necessary symbols are identified, and then written instead of the current intermingled process.  This would also allow writing the symbols in strictly numerical order.  I don't know how much surgery this would require.
Comment 16 Simon Baldwin 2012-10-13 08:08:30 UTC
(In reply to comment #15)
> ...
> This has not been the case since 2007, even though it appears to make a lot of
> sense...

I noticed the same thing while investigating this.  write_symbol1 visits every node and writes all unwritten ones, marking them written.  Because writing a node may add new nodes, write_symbol1's caller iterates it until no more writes occur.

Before reporting the issue I tried several speculative solutions, with no real success.  I experimented with the two-step mechanism you mention, but could not convince myself that it would solve the problem.  A full tree built before really writing any node still seemed to leave something ordered by (unpredictable) pointer and with integer "tags" that are partially a function of the pointers they represent.  I'm not quite seeing how your patch fixes this; it orders writes by tag *within* write_symbol1 calls, but not *across* them.

It is hard to reproduce this problem at all.  I see it in only one of multiple gcc configurations.  It is also hard to evaluate possible solutions -- adding a failed "non-solution" can change the relationship between pointers enough to make the symptoms disappear while not solving the actual problem.
Comment 17 Ollie Wild 2012-10-13 08:08:49 UTC
I'm on vacation until Mon, Oct. 15.

For compiler related questions, please email c-compiler-team@google.com.

If you need to contact a manager, please email lp-mgrs@google.com.

Thanks,
Ollie
Comment 18 Joost VandeVondele 2012-10-13 08:13:14 UTC
(In reply to comment #14)
> Created attachment 28425 [details]
> Patch for testing


thanks... now repeated CP2K compiles give identical '.mod's, and of course also
omp_lib.mod is fixed. This will very much improve the user experience for those
working on large code bases.

since you're using C++, I guess a backport to older branches is out of
question....
Comment 19 Tobias Schlüter 2012-10-13 08:31:39 UTC
Simon,
I don't think the 'integer's are functions of the pointers once you process the symbols in a defined order.  The non-determinism was caused by traversing the pointer_info tree based on the non-deterministic pointers.  With the patch, the tree is copied to a map sorted by the integers.  In the first iteration of write_symbol1 these are given by the traversal of the namespace, which is deterministic.  Since we now process these symbols in a deterministic order, the integers of the symbols added during each iteration are also deterministic.  I might be missing something, of course.

Joost,
it wouldn't be much work to implement the same thing using GFC_BBT instead of a std::map.  It would require a typedef and one would have to cleanup the tree manually, but if fixing old versions of gcc is valuable, I can do it.  I might have to do this anyway, as I don't know if gfortran maintainers like C++isms (even though in this particular case it's probably non-controversial).
Comment 20 Simon Baldwin 2012-10-13 09:26:35 UTC
(In reply to comment #19)
> ...Since we now process these symbols in a deterministic order,
> the integers of the symbols added during each iteration are also deterministic.

Thanks for the note and the fix.  I guess my worry is that the order tags are assigned to symbols is a function of pointer ordering.  I see the issue with what look like anonymous symbols, but can't convince myself it couldn't occur with something less anonymous.

Suppose two anonymous symbols 'X' differentiated only by their addresses 101 and 102.  After tagging there are two X's, call them X1 and X2, that will produce two lines that are identical apart from tag.  Ordering by tag makes the order the same even if you reverse their addresses.  However, two named symbols 'A' and 'B' at 101 and 102 will become A1 and B2 and produce two lines that differ in both tag and other fields, written in tag order as A first and B second.  If you reverse their addresses the tags are assigned differently, A2 and B1.  Here B has the smaller tag and so is written before A.

At this stage though, I'm out of my depth on fortran, so it may well be that I'm the one missing something.
Comment 21 Tobias Schlüter 2012-10-13 09:32:10 UTC
Hm, I don't know about anonymous symbols.  If they exist and end up in modules (which I honestly don't know), I would hope that they would obtain their number from some counter, which should still be deterministic as it would be incremented as code and symbols are processed (which is hopefully all deterministic).

My idea would be: submit the patch (maybe a non-C++ version as explained above), if the bug ever re-appears we'll need a more invasive solution.
Comment 22 Tobias Schlüter 2012-10-13 11:29:41 UTC
Created attachment 28440 [details]
patch that doesn't use c++

Here's a patch that works essentially the same way, but doesn't use C++.  It's about 50 lines longer than the other patch and produces identical module files in the small examples.  It's not yet ready for submission, as I haven't yet added all the necessary comments, also maybe some variables could be named better, but I would do this if you think it's worthy fixing this on the old branches as well.

I would then submit the C++ patch for the trunk and the C version for the older branches.
Comment 23 Joost VandeVondele 2012-10-13 12:28:12 UTC
(In reply to comment #22)
> Created attachment 28440 [details]
> patch that doesn't use c++

I've tested the patch with (an older version of) the 4.7 branch, and it works fine for CP2K.
Comment 24 Joost VandeVondele 2012-10-13 12:45:11 UTC
(In reply to comment #23)

> I've tested the patch with (an older version of) the 4.7 branch, and it works
> fine for CP2K.

it doesn't apply cleanly to 4.6, so no testing there unfortunately.
Comment 25 Joost VandeVondele 2012-10-15 14:14:12 UTC
Just to provide some additional numbers on how important this patch is for practical development (and of course to +1 on backports).... for a 'typical code change' on a CP2K tree (add an unused local variable to a subroutine) the speedup due to avoided recompilation (on a 32 core server) can be obtained from the following compile timings (repeatable for various tries):

4.6(unpatched)
real    1m45.064s

4.7(patched)
real    0m14.958s

I really think this is a pretty substantial bug fix of an existing feature.
Comment 26 Joost VandeVondele 2012-10-28 11:11:19 UTC
The patch has been posted some time ago, with an OK for trunk..

http://gcc.gnu.org/ml/fortran/2012-10/msg00067.html

Maybe it is a good time to commit before the next stage starts ?
Comment 27 Tobias Schlüter 2012-10-28 11:15:24 UTC
There were concerns about error handling in out-of-memory conditions which I addressed in a separate patch <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01470.html>.  This patch got no reply so far.  May plan was to submit the non-C++ patch if the patch for C++ error handling is not approved during stage 1.
Comment 28 Tobias Schlüter 2012-11-08 15:46:13 UTC
Author: tobi
Date: Thu Nov  8 15:46:07 2012
New Revision: 193329

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193329
Log:
PR fortran/51727
* module.c (sorted_pointer_info): New.
(gfc_get_sorted_pointer_info): New.
(free_sorted_pointer_info_tree): New.
(compare_sorted_pointer_info): New.
(find_symbols_to_write): New.
(write_symbol1_recursion): New.
(write_symbol1): Collect symbols that need writing, output in order.
(write_generic): Traverse tree in order.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/module.c
Comment 29 Tobias Schlüter 2012-11-08 22:21:48 UTC
I committed the C-only version of the patch as the issues mentioned in comment #27 couldn't be addressed before stage3.
Comment 30 Joost VandeVondele 2012-11-09 07:31:28 UTC
(In reply to comment #29)
> I committed the C-only version of the patch as the issues mentioned in comment
> #27 couldn't be addressed before stage3.

Thanks Tobi!

I have been using your C-only patch for a couple of weeks now for the 4.7 branch, and it is greatly improving our edit/compile-cycles. For one of my students, it yields an effective 10x speedup in building CP2K after a typical code change, greatly facilitating the programming project he is on. I would suggest that after a couple of weeks on trunk, this should be reconsidered again for backporting to 4.7.
Comment 31 Tobias Schlüter 2012-11-09 09:43:56 UTC
(In reply to comment #30)
> (In reply to comment #29)
> > I committed the C-only version of the patch as the issues mentioned in comment
> > #27 couldn't be addressed before stage3.
> 
> Thanks Tobi!
> 
> I have been using your C-only patch for a couple of weeks now for the 4.7
> branch, and it is greatly improving our edit/compile-cycles. For one of my
> students, it yields an effective 10x speedup in building CP2K after a typical
> code change, greatly facilitating the programming project he is on. I would
> suggest that after a couple of weeks on trunk, this should be reconsidered
> again for backporting to 4.7.

Good to hear that.  If you can use the additional free time to walk over to my brother's office, then please say 'Hi' to him.  Otherwise the faculty meeting will have to do :-)

As for the backport, I think the patch is absolutely risk-free, and it should have been approved for 4.7 even though it doesn't fulfill the formal requirements.  I think it's worth another try once it's not only saved ethz.ch a lot of electricity but also survived for some time on the 4.8 branch.  Please ping the patch in a few weeks so it's not forgotten.  I also put a note into my calendar.
Comment 32 Joost VandeVondele 2012-11-09 10:05:18 UTC
> If you can use the additional free time to walk over to my
> brother's office, then please say 'Hi' to him.  Otherwise the faculty meeting
> will have to do :-)

Let's call it a small world... I will meet him next week.
Comment 33 Joost VandeVondele 2012-11-29 07:30:58 UTC
(In reply to comment #31)
> As for the backport, I think the patch is absolutely risk-free, and it should
> have been approved for 4.7 even though it doesn't fulfill the formal
> requirements. Please ping the patch in a few weeks so it's not forgotten.

Ping....