Bug 25708

Summary: [F95] Module loading is not good at all
Product: gcc Reporter: Andrew Pinski <pinskia>
Component: fortran Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: gcc-bugs, janus, jb, Joost.VandeVondele, jvdelisle2, P.Schaffnit, paul.richard.thomas, w6ws
Priority: P3 Keywords: compile-time-hog
Version: 4.2.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2007-01-15 10:27:52
Bug Depends on: 40958, 51727    
Bug Blocks: 21130, 25391, 30285, 32817    

Description Andrew Pinski 2006-01-07 05:05:46 UTC
Doing something like:
function f()
use modulef
end function
function g()
use modulef
end function
function h()
use modulef
end function
function i()
use modulef
end function

-----
causes us to load the module 4 times, which is just wrong.
Comment 1 Andrew Pinski 2006-01-07 05:10:08 UTC
Looking at the profile for PR 21130 makes me think fixing this bug will also fix that one.
Oh, it is just as bad if module modulef is declared in the same file: we have to save it and then reload it, which is just a waste of time.

To Paul:
Also, fixing up the derived types after the fact is just wrong; that was PR 25391.
Comment 2 Paul Thomas 2006-01-09 07:36:57 UTC
Subject: RE:  Module loading is not good at all

Andrew,

> ------- Comment #1 from pinskia at gcc dot gnu dot org  
> 2006-01-07 05:10 -------
> Looking at the profile for PR 21130 makes me think fixing 
> this bug will also
> fix that one.
> Oh. it is just as bad if we have module modulef declared in 
> the same file as we
> have to save it and then reload it which is just a waste of time. 
>

I have proposed to introduce module namespaces that are built just once per compiled file per module; either from source or a mod file.  Subsequent usage can lift the symbol information from the appropriate namespace.
 
> To Paul:
> Also fixing up the derived types after the fact is just wrong 
> that was PR
> 25391.
> 

What makes you say that?  We have a choice about the way that this should be done:

(i) Catch equal derived types at the read_module + match stage;
(ii) At resolution; or
(iii) At the translation stage.

I came to the conclusion that (iii) was the most economical, in terms of code, and the simplest to implement.  There is nothing "wrong" in having separate symbols for the same object in different scopes, just as long as the tree declaration is the same.

Paul
Comment 3 Francois-Xavier Coudert 2006-10-02 11:23:54 UTC
Confirmed and marked as an enhancement. After all, it's working :)
Comment 4 Francois-Xavier Coudert 2006-10-31 07:00:40 UTC
(In reply to comment #2)
> I have proposed to introduce module namespaces that are built just once per
> compiled file per module; either from source or a mod file.  Subsequent usage
> can lift the symbol information from the appropriate namespace.

I like that idea. I've looked at the code in symbol.c, and don't see clearly how to get it done (where do you get the master symtree from, eg?)
Comment 5 Paul Thomas 2006-10-31 08:29:37 UTC
Subject: RE:  Module loading is not good at all

FX

> -----Original Message-----
> From: fxcoudert at gcc dot gnu dot org 
> [mailto:gcc-bugzilla@gcc.gnu.org]
> Sent: Tuesday 31 October 2006 08:01
>
> I like that idea. I've looked at the code in symbol.c, and 
> don't see clearly
> how to get it done (where do you get the master symtree from, eg?)
> 

I haven't really thought about this enough yet but.....

(i) I think that a namespace will have to be built for each module.  This will allow the initial part of read_module to be left completely unmodified. *grin*

(ii) The namespaces should be contained in a linked list or binary tree; the structures should be something like

typedef struct gfc_module_namespace
{
  const char *name;                    /* Module name.  */
  struct gfc_module_namespace *next;   /* Next module in the list.  */
  /* Maybe some attributes?  */
}
gfc_module_namespace;

I think that a list should be sufficient, since numbers of modules are likely to remain limited for real-life codes.... aren't they?  Or is your 25Mb monster an indication that I am wrong?

(iii) As the modules get USEd, new symtrees are added to the current namespace and I think that it is OK to point to the symbols in the module namespaces. This would simplify derived type association a lot.  I have not the foggiest idea what to do about interfaces; they are on my list of urgent things to try to understand.  I am about to create a meta-bug of interface PRs. I had a notion to look at modules after I had had a stab at that.

(There is a PR on doubly USEd interfaces that I had a stab at in the airport last Friday - I got absolutely nowhere; even though I thought that I was touching the right places, it had no effect on the fault!  This is one and the same problem of doubly USEd variables that I fixed by explicitly going through and checking each against the other.  To do likewise for interfaces requires some understanding!  I note, however, that these problems are fixed in g95 and there is no sign whatsoever of the corresponding fixes... It is a crying shame that Andy Vaught is not on-side.) 

(iv) As at present, the symtree carries the local name and the symbol the true name - that's why I think that the symbols can be shared.

Regards

Paul

(v) Clearly, a clean-up at the end of the compilation of the file will have to be done but all the mechanisms to do that exist already.

As I say, the problems are a) interfaces, b) interfaces and c) interfaces.

Ciao

Paul
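
The list in (ii), combined with the build-once idea from comment 2, could look roughly like the sketch below. It is only an illustration under stated assumptions: module_namespace, use_module, free_modules and the load_count field are made-up names, not gfortran's, and actually reading a .mod file is reduced to setting a counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-file registry of module namespaces.  */
typedef struct module_namespace
{
  char *name;                      /* Module name, as written in USE.  */
  struct module_namespace *next;   /* Next module in the list.  */
  int load_count;                  /* Times the module was really read.  */
}
module_namespace;

static module_namespace *module_list = NULL;

/* Return the namespace for NAME, building it on the first USE only.
   Returns NULL if memory runs out.  */
module_namespace *
use_module (const char *name)
{
  module_namespace *m;
  size_t len;

  assert (name != NULL);
  for (m = module_list; m != NULL; m = m->next)
    if (strcmp (m->name, name) == 0)
      return m;                    /* Already built: reuse it.  */

  len = strlen (name) + 1;
  m = malloc (sizeof *m);
  if (m == NULL)
    return NULL;
  m->name = malloc (len);
  if (m->name == NULL)
    {
      free (m);
      return NULL;
    }
  memcpy (m->name, name, len);
  m->load_count = 1;               /* Stands in for reading the .mod file.  */
  m->next = module_list;
  module_list = m;
  return m;
}

/* Clean-up at the end of the compilation of the file, as in (v).  */
void
free_modules (void)
{
  while (module_list != NULL)
    {
      module_namespace *next = module_list->next;
      free (module_list->name);
      free (module_list);
      module_list = next;
    }
}
```

With this shape, the four USE statements of the original example would hit the loader once and the list three times. Interfaces, renaming and ONLY lists are deliberately left out.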

Comment 6 Paul Thomas 2007-01-15 10:27:52 UTC
This is the one that I said that I would take on for the next few months.  I will try to implement the manifesto in #5 and have the F2003 specification for sub-modules with me.

Paul
Comment 7 Janne Blomqvist 2007-04-25 09:41:50 UTC
BTW, here are some slides describing NAG's experience; they use lazy symbol lookup combined with caching, and claim it is up to 1000 times faster than non-lazy (which gfortran uses AFAICS).

http://www.fortran.bcs.org/2007/jubilee/f50.pdf
Comment 8 Paul Thomas 2007-05-02 08:41:40 UTC
(In reply to comment #7)
> BTW, here's some slides describing NAG:s experience, they use lazy symbol
> lookup combined with caching, and claim it is up to 1000 times faster than
> non-lazy (which gfortran uses AFAICS).
> http://www.fortran.bcs.org/2007/jubilee/f50.pdf
Janne,

I had noted that, when I read Cohen's talk.  Your comment led me to research lazy and non-lazy symbols and to have a think about how they might apply to module.c.

What I had a mind to do was to load and decode the .mod file into its own namespace. Then, use association consists of copying and, if needed, renaming the symbols into the target namespace.  This requires that the formal and other secondary namespaces be scanned to see if they are required.

The lazy symbol mechanism would retain the existing pointer_info tree for each module, resetting the NEEDED and USED flags before use.  The existing loading mechanism could then be recycled.

It is my belief that building module namespaces will involve the least work and will be the most efficient way to load symbols but I will need to analyse this more thoroughly.  In either case, symbols have to be copied.

Watch this space - I am working on it, albeit slowly.

Paul
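
For comparison, the general idea of lazy lookup with caching can be sketched as below. This is not how NAG or module.c does it; lazy_symbol, lookup_symbol and the length field (a stand-in for a decoded symbol) are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical index entry: the raw text is kept as read from the
   module file and only decoded when somebody asks for the symbol.  */
typedef struct lazy_symbol
{
  const char *name;   /* True name of the symbol.  */
  const char *raw;    /* Undecoded text from the module file.  */
  int decoded;        /* Nonzero once the symbol has been decoded.  */
  size_t length;      /* Stand-in for the decoded symbol.  */
}
lazy_symbol;

/* Pretend to decode S; a real compiler would build a symbol here.  */
static void
decode_symbol (lazy_symbol *s)
{
  s->length = strlen (s->raw);
  s->decoded = 1;
}

/* Look NAME up in TABLE (N entries).  The symbol is decoded on the
   first request and the cached result is returned afterwards.
   Returns NULL if NAME is not in the table.  */
lazy_symbol *
lookup_symbol (lazy_symbol *table, size_t n, const char *name)
{
  size_t i;

  assert (table != NULL && name != NULL);
  for (i = 0; i < n; i++)
    if (strcmp (table[i].name, name) == 0)
      {
        if (!table[i].decoded)
          decode_symbol (&table[i]);
        return &table[i];
      }
  return NULL;
}
```

Symbols that are never referenced are never decoded, which is where the claimed saving would come from. The linear search is only for brevity.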

Comment 9 Tobias Burnus 2010-09-29 12:48:16 UTC
Cf. also thread started at http://gcc.gnu.org/ml/fortran/2010-09/threads.html#00499
Comment 10 Jerry DeLisle 2011-04-03 18:32:40 UTC
I would like to promote this one.  I have run into an application that is taking about 2 to 3 minutes to compile while other compilers can do so in a matter of seconds.  The delta here is huge.

I would also like to take a crack at it. Paul, any progress on your end?
Comment 11 Paul Thomas 2011-04-04 11:49:52 UTC
(In reply to comment #10)
> I would like to promote this one.  I have run into an application that is
> taking about 2 to 3 minutes to compile while other compilers can do so in a
> matter of seconds.  The delta here is huge.
> 
> I would also like to take a crack at it. Paul any progress on your end?

Dear Jerry,

I have to confess that I have not thought about it for quite a while.

It strikes me that it would be useful to profile module reading to find out what takes the time.  In the case of the original example, it is clear that repeating the same use statement cannot help; however, that raises the question of why it is so slow each time.  Is it the IO, the lexing, or.....?

An associated issue is the size of module files.  Clearly, where a module uses another module, we could help by inserting use statements in the module file.

Cheers

Paul
Comment 12 Tobias Burnus 2011-04-04 12:27:28 UTC
(In reply to comment #11)
> An associated issue is the size of module files.

Joost suggested cutting down the string tags: "d" instead of "dimension", "al" instead of "allocatable", or something like that. That would help with the file size, the I/O (also: I/O caching), and the string parsing.

> Clearly, where a module uses
> another module, we could help by inserting use statements in the module file.

There are pros and cons to doing so. For binary-only code, it is much more convenient to ship only a single ".mod" instead of several - that was at least a complaint I have heard about NAG's compiler. Though, having by default a simple "use" in the .mod file would help.

Another thing which could help is to read a .mod file only once per translation unit (source file) instead of every time a USE statement is encountered.
Comment 13 Jerry DeLisle 2011-04-05 04:47:35 UTC
Believe it or not, I have an 8.2 megabyte .mod file.  We are working on splitting this up into more files.  The module(s) contain interfaces for existing C libraries which are quite large.

Since I have a good test case, I can use it to profile as well as do some basic things to reduce the file size. I will take assignment for a while.
Comment 14 Jerry DeLisle 2011-04-05 05:24:46 UTC
Here is an strace summary.  There is something afoot here:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.41    0.052854           3     17229     17218 stat
 18.41    0.017886           8      2277           write
  7.24    0.007028           2      4651           times
  6.55    0.006364           4      1473           brk
  6.18    0.006000        6000         1           unlink
  5.65    0.005488           8       664           read
  0.98    0.000948         948         1           execve
  0.23    0.000227           3        68        52 open
  0.20    0.000190           4        45           mmap
  0.16    0.000152          19         8           mprotect
  0.00    0.000000           0        16           close
  0.00    0.000000           0        17           fstat
  0.00    0.000000           0         3           lseek
  0.00    0.000000           0        11           munmap
  0.00    0.000000           0         5           rt_sigaction
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.097137                 26471     17271 total
Comment 15 Janne Blomqvist 2011-04-05 05:34:57 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > An associated issue is the size of module files.
> 
> Joost suggested to cut down the string tags "d" instead of "dimension", "al"
> instead of "allocatable" or something like that. That would help with the file
> size, the I/O (also: I/O caching), and the string parsing.

FWIW, this is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958

Another option would be to compress the mod file with zlib; there is a copy in the gcc tree already.

There also seems to be a bit of redundancy in the module files, e.g. things like null atoms "()" which I guess can be removed? I'm not sure how big an effect this would have, though.
 
> > Clearly, where a module uses
> > another module, we could help by inserting use statements in the module file.
> 
> There are pro and cons doing so. For binary-only code, it is much more
> convenient to ship only a single ".mod" instead of several - that was at least
> a complaint I have heard about NAG's compiler. Though, having by default a
> simple "use" in the .mod file would help.

To reiterate my comments from 

http://gcc.gnu.org/ml/fortran/2010-09/msg00512.html 

The Go developers claim that including the transitive dependencies is a major reason why Go compilation is fast. See pages 9-10 of

http://assets.en.oreilly.com/1/event/45/Another%20Go%20at%20Language%20Design%20Presentation.pdf

In the Fortran case, additionally by doing it this way one can take care of ONLY and renamed symbols when generating the .mod file.

> Another thing which could help is to read a .mod file only once per translation
> unit (source file) instead of every time a USE statement is encountered.

I think this is the major issue, yes.
Comment 16 Jerry DeLisle 2011-04-10 17:47:07 UTC
Paul, taking this one if you do not mind.

I have run some tests.  I replaced the long strings in the various minit invocations with a 2 or 3 character mnemonic. For a particularly large module the size of the .mod file created is:

un-patched ->         8210691 bytes

gzipped ->             724854 bytes

patched ->            6280606 bytes

patched and zipped ->  714777 bytes

Compressing the file works quite well.  There is no particular advantage to changing the minit strings if one is going to compress the file.  The question is then what the time cost of actually compressing and decompressing is.

The above just deals with raw file size and I think compression is a good idea.  If we leave the strings alone, the files could still be decompressed manually to look at them for debugging purposes.
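
Put as percentages of the un-patched size, the figures above work out to roughly 8.8% for gzip alone, 76.5% for the shorter mnemonics alone, and 8.7% for both together. A small helper makes the arithmetic explicit; size_percent is a made-up name used only for this illustration.

```c
#include <assert.h>

/* Return PART as a percentage of WHOLE (both sizes in bytes).  */
double
size_percent (unsigned long part, unsigned long whole)
{
  assert (whole != 0);
  return 100.0 * (double) part / (double) whole;
}
```

For example, size_percent (724854, 8210691) gives the gzip-only figure, and size_percent (714777, 8210691) the patched-and-zipped one; the two differ by about a tenth of a percentage point.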

The next thing I would like to consider is some sort of module caching. Let's say we create a module namespace for each module file.  I would suggest allowing a fixed number of these to conserve memory usage.  Modules that are USEd repeatedly would be retained, and ones not used would be freed if more slots are needed.  I am thinking of some sort of least recently used scheme.
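
A least recently used scheme with a fixed number of slots could be sketched like this. Everything here is an assumption for illustration: CACHE_SLOTS, MODULE_NAME_LEN, cache_slot and cache_use are hypothetical names, and loading a module namespace is reduced to storing its name.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 2          /* Hypothetical limit on cached modules.  */
#define MODULE_NAME_LEN 64     /* Hypothetical maximum name length.  */

typedef struct cache_slot
{
  char name[MODULE_NAME_LEN];  /* Empty string marks a free slot.  */
  unsigned long last_used;     /* Tick of the most recent USE.  */
}
cache_slot;

static cache_slot slots[CACHE_SLOTS];
static unsigned long tick = 0;

/* Record a USE of NAME.  Return 1 on a cache hit, 0 if the module had
   to be (re)loaded, evicting the least recently used slot if needed.  */
int
cache_use (const char *name)
{
  int i, victim = 0;

  assert (name != NULL && name[0] != '\0');
  assert (strlen (name) < MODULE_NAME_LEN);
  tick++;
  for (i = 0; i < CACHE_SLOTS; i++)
    if (strcmp (slots[i].name, name) == 0)
      {
        slots[i].last_used = tick;
        return 1;
      }

  /* Miss: free slots have tick 0, so they are chosen before any
     occupied slot; otherwise the oldest tick loses.  */
  for (i = 1; i < CACHE_SLOTS; i++)
    if (slots[i].last_used < slots[victim].last_used)
      victim = i;
  strcpy (slots[victim].name, name);
  slots[victim].last_used = tick;
  return 0;
}
```

A real implementation would also have to free the evicted namespace and would probably want a larger, tunable slot count.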

Another thing I wonder about is how efficiently we retrieve from this namespace. (I have more to study on the internals of module.c)
Comment 17 Joost VandeVondele 2011-11-30 19:50:37 UTC
Janne's lseek patch:
http://gcc.gnu.org/ml/fortran/2011-11/msg00251.html
has further nice results on CP2K (CP2K_2009-05-01.f90)

Thomas (trunk):
 92.08    4.963429           0  19557182           lseek
  5.91    0.318514           1    523064           read
  0.61    0.032888           3     11208           munmap
  0.37    0.020212           2     11969       757 open
  0.24    0.012753           1     11212           close
  0.21    0.011314           1     10533        21 stat
  0.17    0.009117           0     25154           mmap
  0.16    0.008425           0     56715           write
  0.15    0.008353           1     12138           brk
  0.05    0.002811           0     11211           fstat
  0.02    0.001068           2       684           rename
Janne (trunk+patch):
 77.60    1.316715           0   5265206           lseek
  9.12    0.154767           0    466059           read
  4.07    0.069073           0    242658           madvise
  2.77    0.046965           4     11969        74 open
  1.82    0.030845           3     11891           munmap
  1.47    0.024943          36       684           unlink
  0.72    0.012244           1     11895           close
  0.63    0.010689           1     10533        21 stat
  0.56    0.009533           0     56715           write
  0.51    0.008707           0     25837           mmap
  0.40    0.006794           1     12117           brk
  0.15    0.002542           0     11894           fstat
Comment 18 Joost VandeVondele 2011-12-01 07:29:25 UTC
Janne's latest patch now effectively 'removes' lseek:

 26.84    0.108906           0    242658           madvise
 20.12    0.081608           0    459999           read
 19.27    0.078198           0    512288           lseek
 12.33    0.050038          73       684           unlink
  5.99    0.024315           2     11969        74 open
  4.57    0.018544           2     11891           munmap

(512288 down from >198000000 a few days ago).
Comment 19 Janne Blomqvist 2011-12-01 14:12:45 UTC
Author: jb
Date: Thu Dec  1 14:12:37 2011
New Revision: 181879

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181879
Log:
PR 25708 Avoid seeking when parsing strings and when peeking.

2011-12-01  Janne Blomqvist  <jb@gcc.gnu.org>

	PR fortran/25708
	* module.c (parse_string): Read string into resizable array
	instead of parsing twice and seeking.
	(peek_atom): New implementation avoiding seeks.
	(require_atom): Save and set column and line explicitly for error
	handling.


Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/module.c
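
The parse_string change described in the log, reading into a resizable array in one pass rather than scanning twice and seeking back, can be illustrated with the sketch below. It is not the committed code: read_quoted_string is a hypothetical stand-in that reads from a memory buffer, and the quoting convention (single-quote delimiters, a doubled quote for a literal quote) is an assumption about the .mod format.

```c
#include <assert.h>
#include <stdlib.h>

/* Read a single-quoted string starting at *P, which must point at the
   opening quote, into a growing heap buffer in one pass.  A doubled
   quote stands for one literal quote.  On success, advance *P past the
   closing quote and return the NUL-terminated string (caller frees).
   Return NULL on allocation failure or an unterminated string.  */
char *
read_quoted_string (const char **p)
{
  size_t cap = 16, len = 0;
  const char *s;
  char *buf, *tmp;

  assert (p != NULL && *p != NULL && **p == '\'');
  s = *p + 1;
  buf = malloc (cap);
  if (buf == NULL)
    return NULL;

  for (;;)
    {
      char c = *s++;

      if (c == '\0')
        {
          free (buf);
          return NULL;          /* Unterminated string.  */
        }
      if (c == '\'')
        {
          if (*s != '\'')
            break;              /* Closing quote.  */
          s++;                  /* Doubled quote: keep one.  */
        }
      if (len + 1 >= cap)       /* Keep room for the final NUL.  */
        {
          cap *= 2;
          tmp = realloc (buf, cap);
          if (tmp == NULL)
            {
              free (buf);
              return NULL;
            }
          buf = tmp;
        }
      buf[len++] = c;
    }

  buf[len] = '\0';
  *p = s;
  return buf;
}
```

Because the length is not needed up front, nothing has to be re-read, which is the property that removes the lseek calls counted in comments 17 and 18.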
Comment 20 Jerry DeLisle 2011-12-03 13:28:30 UTC
I don't have time now and others are making good progress.  Unassigning myself.
Comment 21 Joost VandeVondele 2012-08-24 14:00:40 UTC
I did another timing experiment on compiling CP2K. I found that on my server, compiling with -fsyntax-only is as fast as just compiling at -O0. I believe the reason for this is that module reading is dominating the compile time. In CP2K each module is included only once per file, so I think it is the efficiency of reading the module that matters most. My guess would be that the human-readable format of the .mod file is the source of most of the inefficiency. Is it still important to the development of gfortran that the .mod file is in this form? If I count the number of times a module is used, and multiply that by the size, I have about 1Gb of .mod files being parsed per CP2K compile (for about 35Mb of Fortran).
Comment 22 Joost VandeVondele 2013-03-29 08:33:31 UTC
Improved in part by

http://gcc.gnu.org/ml/fortran/2013-03/msg00143.html

as r197124 for 4.9.0
Comment 23 Dominique d'Humieres 2013-06-24 06:59:59 UTC
Was marked as ASSIGNED, but actually "Not yet assigned to anyone". Set to NEW.
Comment 24 Dominique d'Humieres 2015-09-08 09:35:44 UTC
I wonder if this PR has not been fixed by recent changes.
Comment 25 Dominique d'Humieres 2015-10-10 09:11:48 UTC
> I wonder if this PR has not been fixed by recent changes.

No feedback; closing as FIXED.