doing something like:
Causes us to load in the module 4 times which is just wrong.
Looking at the profile for PR 21130 makes me think fixing this bug will also fix that one.
Oh, it is just as bad if module modulef is declared in the same file, as we have to save it and then reload it, which is just a waste of time.
Also, fixing up the derived types after the fact is just wrong; that was PR 25391.
Subject: RE: Module loading is not good at all
> ------- Comment #1 from pinskia at gcc dot gnu dot org
> 2006-01-07 05:10 -------
> Looking at the profile for PR 21130 makes me think fixing
> this bug will also
> fix that one.
> Oh. it is just as bad if we have module modulef declared in
> the same file as we
> have to save it and then reload it which is just a waste of time.
I have proposed to introduce module namespaces that are built just once per compiled file per module; either from source or a mod file. Subsequent usage can lift the symbol information from the appropriate namespace.
> To Paul:
> Also fixing up the derived types after the fact is just wrong
> that was PR
What makes you say that? We have a choice about the way that this should be done:
(i) Catch equal derived types at the read_module + match stage;
(ii) At resolution; or
(iii) At the translation stage.
I came to the conclusion that (iii) was the most economical, in terms of code, and the simplest to implement. There is nothing "wrong" in having separate symbols for the same object in different scopes, just as long as the tree declaration is the same.
Confirmed and marked as an enhancement. After all, it's working :)
(In reply to comment #2)
> I have proposed to introduce module namespaces that are built just once per
> compiled file per module; either from source or a mod file. Subsequent usage
> can lift the symbol information from the appropriate namespace.
I like that idea. I've looked at the code in symbol.c, and don't see clearly how to get it done (where do you get the master symtree from, eg?)
Subject: RE: Module loading is not good at all
> -----Original Message-----
> From: fxcoudert at gcc dot gnu dot org
> Sent: Tuesday, 31 October 2006 08:01
> I like that idea. I've looked at the code in symbol.c, and
> don't see clearly
> how to get it done (where do you get the master symtree from, eg?)
I haven't really thought about this enough yet but.....
(i) I think that a namespace will have to be built for each module. This will allow the initial part of read_module to be left completely unmodified. *grin*
(ii) The namespaces should be contained in a linked list or binary tree; the structures should be something like
typedef struct gfc_module_namespace
{
  const char *name;
  struct gfc_module_namespace *next;
  /* maybe some attributes? */
}
gfc_module_namespace;
I think that a list should be sufficient, since numbers of modules are likely to remain limited for real-life codes.... aren't they? Or is your 25Mb monster an indication that I am wrong?
(iii) As the modules get USEd, new symtrees are added to the current namespace and I think that it is OK to point to the symbols in the module namespaces. This would simplify derived type association a lot. I have not the foggiest idea what to do about interfaces; they are on my list of urgent things to try to understand. I am about to create a meta-bug of interface PRs. I had a notion to look at modules after I had had a stab at that.
(There is a PR on doubly USEd interfaces that I had a stab at in the airport last Friday - I got absolutely nowhere; even though I thought that I was touching the right places, it had no effect on the fault! This is one and the same problem of doubly USEd variables that I fixed by explicitly going through and checking each against the other. To do likewise for interfaces requires some understanding! I note, however, that these problems are fixed in g95 and there is no sign whatsoever of the corresponding fixes... It is a crying shame that Andy Vaught is not on-side.)
(iv) As at present, the symtree carries the local name and the symbol the true name - that's why I think that the symbols can be shared.
(v) Clearly, a clean-up at the end of the compilation of the file will have to be done but all the mechanisms to do that exist already.
As I say, the problems are a) interfaces, b) interfaces and c) interfaces.
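To make point (ii) a little more concrete, here is a minimal sketch of a per-file list of module namespaces that is searched before anything is built. Only the `name` and `next` fields come from the proposal above; `module_namespace`, `get_module_namespace`, `free_module_namespaces` and the `modules_built` counter are hypothetical names used for illustration, and the place where a real compiler would parse the .mod file (or the module source) is only marked by a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-module namespace record.  Only `name` and `next`
   are taken from the proposal; a real one would also own the symbols.  */
typedef struct module_namespace
{
  char *name;                     /* module name, e.g. "modulef" */
  struct module_namespace *next;  /* next module built for this file */
}
module_namespace;

/* Modules already built while compiling the current file.  */
static module_namespace *module_list = NULL;

/* Counts how often a namespace had to be built (illustration only).  */
static int modules_built = 0;

/* Return the namespace for NAME, building it on the first request only.
   Later USE statements for the same module reuse the existing entry.  */
module_namespace *
get_module_namespace (const char *name)
{
  module_namespace *ns;

  assert (name != NULL);

  for (ns = module_list; ns != NULL; ns = ns->next)
    if (strcmp (ns->name, name) == 0)
      return ns;                  /* already built: nothing to reload */

  ns = malloc (sizeof *ns);
  if (ns == NULL)
    return NULL;
  ns->name = malloc (strlen (name) + 1);
  if (ns->name == NULL)
    {
      free (ns);
      return NULL;
    }
  strcpy (ns->name, name);

  /* A real implementation would read the .mod file (or take the
     symbols from the module source in the same file) here.  */
  modules_built++;

  ns->next = module_list;
  module_list = ns;
  return ns;
}

/* Clean-up at the end of the compilation of the file, as in (v).  */
void
free_module_namespaces (void)
{
  while (module_list != NULL)
    {
      module_namespace *next = module_list->next;
      free (module_list->name);
      free (module_list);
      module_list = next;
    }
}
```

With this shape, four USE statements for the same module cost one build and three list lookups. The sketch says nothing about interfaces, which remain the hard part.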
This is the one that I said that I would take on for the next few months. I will try to implement the manifesto in #5 and have the F2003 specification for sub-modules with me.
BTW, here are some slides describing NAG's experience: they use lazy symbol lookup combined with caching, and claim it is up to 1000 times faster than non-lazy lookup (which gfortran uses, AFAICS).
(In reply to comment #7)
> BTW, here's some slides describing NAG:s experience, they use lazy symbol
> lookup combined with caching, and claim it is up to 1000 times faster than
> non-lazy (which gfortran uses AFAICS).
I had noted that, when I read Cohen's talk. Your comment led me to research lazy and non-lazy symbols and to have a think about how they might apply to module.c.
What I had in mind was to load and decode the .mod file into its own namespace. Use association then consists of copying and, if needed, renaming the symbols into the target namespace. This requires that the formal and other secondary namespaces be scanned to see if they are required.
The lazy symbol mechanism would retain the existing pointer_info tree for each module, resetting the NEEDED and USED flags before use. The existing loading mechanism could then be recycled.
It is my belief that building module namespaces will involve the least work and will be the most efficient way to load symbols but I will need to analyse this more thoroughly. In either case, symbols have to be copied.
Watch this space - I am working on it, albeit slowly.
Cf. also thread started at http://gcc.gnu.org/ml/fortran/2010-09/threads.html#00499
I would like to promote this one. I have run into an application that is taking about 2 to 3 minutes to compile while other compilers can do so in a matter of seconds. The delta here is huge.
I would also like to take a crack at it. Paul any progress on your end?
(In reply to comment #10)
> I would like to promote this one. I have run into an application that is
> taking about 2 to 3 minutes to compile while other compilers can do so in a
> matter of seconds. The delta here is huge.
> I would also like to take a crack at it. Paul any progress on your end?
I have to confess that I have not thought about it for quite a while.
It strikes me that it would be useful to profile module reading to find out what takes the time. In the case of the original example, it is clear that repeating the same use statement cannot help; however, that raises the question of why it is so slow each time. Is it the IO, the lexing, or.....?
An associated issue is the size of module files. Clearly, where a module uses another module, we could help by inserting use statements in the module file.
(In reply to comment #11)
> An associated issue is the size of module files.
Joost suggested to cut down the string tags "d" instead of "dimension", "al" instead of "allocatable" or something like that. That would help with the file size, the I/O (also: I/O caching), and the string parsing.
> Clearly, where a module uses
> another module, we could help by inserting use statements in the module file.
There are pros and cons to doing so. For binary-only code, it is much more convenient to ship only a single ".mod" instead of several - that was at least a complaint I have heard about NAG's compiler. Though, having by default a simple "use" in the .mod file would help.
Another thing which could help is to read a .mod file only once per translation unit (source file) instead of every time a USE statement is encountered.
Believe it or not, I have an 8.2 megabyte .mod file. We are working on splitting this up into more files. The module(s) contain interfaces for existing c libraries which are quite large.
Since I have a good test case, I can use it to profile as well as do some basic things to reduce the file size. I will take assignment for a while.
Here is an strace summary. There is something afoot here:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.41    0.052854           3     17229     17218 stat
 18.41    0.017886           8      2277           write
  7.24    0.007028           2      4651           times
  6.55    0.006364           4      1473           brk
  6.18    0.006000        6000         1           unlink
  5.65    0.005488           8       664           read
  0.98    0.000948         948         1           execve
  0.23    0.000227           3        68        52 open
  0.20    0.000190           4        45           mmap
  0.16    0.000152          19         8           mprotect
  0.00    0.000000           0        16           close
  0.00    0.000000           0        17           fstat
  0.00    0.000000           0         3           lseek
  0.00    0.000000           0        11           munmap
  0.00    0.000000           0         5           rt_sigaction
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.097137                 26471     17271 total
(In reply to comment #12)
> (In reply to comment #11)
> > An associated issue is the size of module files.
> Joost suggested to cut down the string tags "d" instead of "dimension", "al"
> instead of "allocatable" or something like that. That would help with the file
> size, the I/O (also: I/O caching), and the string parsing.
FWIW, this is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958
Another option would be to compress the mod file with zlib; there is a copy in the gcc tree already.
There also seems to be a bit of redundancy in the module files, e.g. things like null atoms "()" which I guess can be removed? I'm not sure how big an effect this would have, though.
> > Clearly, where a module uses
> > another module, we could help by inserting use statements in the module file.
> There are pro and cons doing so. For binary-only code, it is much more
> convenient to ship only a single ".mod" instead of several - that was at least
> a complaint I have heard about NAG's compiler. Though, having by default a
> simple "use" in the .mod file would help.
To reiterate my comments from
the Go developers claim that including the transitive dependencies is a major reason why Go compilation is fast. See pages 9-10 on
In the Fortran case, additionally by doing it this way one can take care of ONLY and renamed symbols when generating the .mod file.
> Another thing which could help is to read a .mod file only once per translation
> unit (source file) instead of every time a USE statement is encountered.
I think this is the major issue, yes.
Paul, taking this one if you do not mind.
I have run some tests. I replaced the long strings in the various minit invocations with a 2 or 3 character mnemonic. For a particularly large module the size of the .mod file created is:
un-patched -> 8210691 bytes
gzipped -> 724854 bytes
patched -> 6280606 bytes
patched and zipped -> 714777 bytes
Compressing the file is quite good. There is no particular advantage to changing the minit strings if one is going to compress the file. The question is then what cost in time do we have of actually compressing and decompressing.
The above just deals with raw file size and I think compression is a good idea. If we leave the strings alone, the files could still be decompressed manually and inspected for debugging purposes.
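For reference, the relative sizes behind that conclusion can be worked out from the four byte counts quoted above. The short sketch below does the arithmetic; the helper names `percent_of` and `report_mod_sizes` are made up for this illustration. The gzipped file comes out at roughly 8.8% of the un-patched one, the patched file at roughly 76.5%, and the patched-and-zipped file at roughly 98.6% of the plain gzipped one, i.e. the shorter strings save only about 1.4% once compression is applied.

```c
#include <assert.h>
#include <stdio.h>

/* Size of AFTER expressed as a percentage of BEFORE.  */
double
percent_of (long after, long before)
{
  assert (before > 0);
  return 100.0 * (double) after / (double) before;
}

/* Print the relative sizes for the measurements quoted in the comment.  */
void
report_mod_sizes (void)
{
  const long unpatched = 8210691;       /* bytes, plain .mod file */
  const long gzipped = 724854;          /* bytes, un-patched + gzip */
  const long patched = 6280606;         /* bytes, short mnemonics */
  const long patched_gzipped = 714777;  /* bytes, mnemonics + gzip */

  printf ("gzip alone:        %.2f%% of original\n",
          percent_of (gzipped, unpatched));
  printf ("mnemonics alone:   %.2f%% of original\n",
          percent_of (patched, unpatched));
  printf ("mnemonics + gzip:  %.2f%% of gzip alone\n",
          percent_of (patched_gzipped, gzipped));
}
```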
The next thing I would like to consider is some sort of module caching. Let's say we create a module namespace for each module file. I would suggest allowing only a fixed number of these, to limit memory usage. Modules that are USEd repeatedly would be retained, and the ones not used recently would be freed when more room is needed. I am thinking of some sort of least recently used (LRU) scheme.
Another thing I wonder about is how efficiently do we retrieve from this name space. (I have more to study on the internals of module.c)
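As a rough illustration of the bounded cache described above, here is a minimal least-recently-used sketch. Everything in it is an assumption made for the example: the slot count, the name length limit, the names `cached_module` and `use_module`, and the `total_loads` counter that stands in for actually reading a .mod file. A real cache would hold a namespace per slot and free it on eviction.

```c
#include <assert.h>
#include <string.h>

#define MODULE_CACHE_SLOTS 4   /* hypothetical limit on cached modules */
#define MODULE_NAME_MAX 64     /* hypothetical maximum name length */

typedef struct cached_module
{
  char name[MODULE_NAME_MAX];  /* empty string marks a free slot */
  unsigned long last_used;     /* tick of the most recent USE, 0 if free */
}
cached_module;

static cached_module module_cache[MODULE_CACHE_SLOTS];
static unsigned long cache_tick = 0;
static int total_loads = 0;    /* how many times a module was (re)read */

/* Record a USE of module NAME.  On a hit the slot is only refreshed;
   on a miss the least recently used slot is evicted and refilled.  */
cached_module *
use_module (const char *name)
{
  cached_module *victim = &module_cache[0];
  int i;

  assert (name != NULL && name[0] != '\0');
  assert (strlen (name) < MODULE_NAME_MAX);

  for (i = 0; i < MODULE_CACHE_SLOTS; i++)
    {
      cached_module *m = &module_cache[i];

      if (strcmp (m->name, name) == 0)
        {
          m->last_used = ++cache_tick;  /* hit: no file access needed */
          return m;
        }
      if (m->last_used < victim->last_used)
        victim = m;                     /* oldest (or free) slot so far */
    }

  /* Miss: this is where the .mod file would be read and the old
     namespace in the victim slot released.  */
  strcpy (victim->name, name);
  victim->last_used = ++cache_tick;
  total_loads++;
  return victim;
}
```

A linear scan is enough for a handful of slots. How fast symbols can be retrieved from a cached namespace is a separate question that this sketch does not address.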
Janne's lseek patch:
has further nice results on CP2K (CP2K_2009-05-01.f90)
 92.08    4.963429           0  19557182           lseek
  5.91    0.318514           1    523064           read
  0.61    0.032888           3     11208           munmap
  0.37    0.020212           2     11969       757 open
  0.24    0.012753           1     11212           close
  0.21    0.011314           1     10533        21 stat
  0.17    0.009117           0     25154           mmap
  0.16    0.008425           0     56715           write
  0.15    0.008353           1     12138           brk
  0.05    0.002811           0     11211           fstat
  0.02    0.001068           2       684           rename

 77.60    1.316715           0   5265206           lseek
  9.12    0.154767           0    466059           read
  4.07    0.069073           0    242658           madvise
  2.77    0.046965           4     11969        74 open
  1.82    0.030845           3     11891           munmap
  1.47    0.024943          36       684           unlink
  0.72    0.012244           1     11895           close
  0.63    0.010689           1     10533        21 stat
  0.56    0.009533           0     56715           write
  0.51    0.008707           0     25837           mmap
  0.40    0.006794           1     12117           brk
  0.15    0.002542           0     11894           fstat
Janne's latest patch now effectively 'removes' lseek:
 26.84    0.108906           0    242658           madvise
 20.12    0.081608           0    459999           read
 19.27    0.078198           0    512288           lseek
 12.33    0.050038          73       684           unlink
  5.99    0.024315           2     11969        74 open
  4.57    0.018544           2     11891           munmap
(512288 down from >198000000 a few days ago).
Date: Thu Dec 1 14:12:37 2011
New Revision: 181879
PR 25708 Avoid seeking when parsing strings and when peeking.
2011-12-01 Janne Blomqvist <firstname.lastname@example.org>
* module.c (parse_string): Read string into resizable array
instead of parsing twice and seeking.
(peek_atom): New implementation avoiding seeks.
(require_atom): Save and set column and line explicitly for error
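The idea named in the first ChangeLog entry, reading a string into a resizable array in one pass rather than scanning it twice and seeking back, can be sketched roughly as follows. This is not the code from the revision: `read_quoted_string` is a hypothetical stand-alone function, and it assumes that strings are delimited by single quotes with a doubled quote standing for a literal quote.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a quoted string from FP in a single pass, growing the buffer as
   needed.  The opening quote must already have been consumed.  Assumes
   (for this sketch) that '' inside the string means one literal quote.
   Returns a malloc'd NUL-terminated string, or NULL on error or if the
   input ends before the closing quote.  */
char *
read_quoted_string (FILE *fp)
{
  size_t cap = 16, len = 0;
  char *buf;
  int c;

  assert (fp != NULL);

  buf = malloc (cap);
  if (buf == NULL)
    return NULL;

  for (;;)
    {
      c = getc (fp);
      if (c == EOF)
        {
          free (buf);               /* unterminated string */
          return NULL;
        }
      if (c == '\'')
        {
          int next = getc (fp);
          if (next != '\'')
            {
              /* Closing quote: push back the one character we peeked
                 at instead of seeking in the file.  */
              if (next != EOF)
                ungetc (next, fp);
              break;
            }
          /* Doubled quote: fall through and store a single quote.  */
        }
      if (len + 1 >= cap)           /* keep room for the final NUL */
        {
          char *tmp = realloc (buf, cap * 2);
          if (tmp == NULL)
            {
              free (buf);
              return NULL;
            }
          buf = tmp;
          cap *= 2;
        }
      buf[len++] = (char) c;
    }

  buf[len] = '\0';
  return buf;
}
```

The only look-ahead needed is a single character, which `ungetc` covers, so no `lseek` calls are generated while parsing a string.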
I don't have time now and others are making good progress. Unassigning myself.
I did another timing experiment on compiling CP2K. I found that on my server, compiling with -fsyntax-only takes as long as compiling at -O0. I believe the reason for this is that module reading is dominating the compile time. In CP2K each module is included only once per file, so I think it is the efficiency of reading the module that matters most. My guess would be that the human-readable format of the .mod file is the source of most of the inefficiency. Is it still important to the development of gfortran that the .mod file is in this form? If I count the number of times a module is used, and multiply that by its size, I have about 1 GB of .mod files being parsed per CP2K compile (for about 35 MB of Fortran).
Improved in part by
as r197124 for 4.9.0
Was marked as ASSIGNED, but actually "Not yet assigned to anyone". Set to NEW.
I wonder if this PR has not been fixed by recent changes.
> I wonder if this PR has not been fixed by recent changes.
No feedback, Closing as FIXED.