I notice that the module files gfortran generates are really large, and believe that this could be improved fairly easily, which would reduce disk usage and presumably improve compile time.

The observation is that a compilation of CP2K generates 130Mb of .mod files for 27Mb of sources. Doing a 'cat *.mod > modall' and using that file for analysis, I find that 'bzip2 modall' yields a 6Mb file, so roughly 20-fold compression.

Compile time also seems to include a large fraction of system time, which is presumably disk access. With a single process on a fast RAID disk, I get the following timings:

> gfortran -c -fsyntax-only -ftime-report CP2K_2009-05-01.f90

Execution times (seconds)
 garbage collection    :   8.20 ( 7%) usr   0.34 ( 1%) sys   9.33 ( 4%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 rebuild jump labels   :   1.13 ( 1%) usr   0.11 ( 0%) sys   1.32 ( 1%) wall       7 kB ( 0%) ggc
 parser                :  80.34 (64%) usr  27.25 (88%) sys 157.87 (74%) wall 1272352 kB (34%) ggc
 inline heuristics     :   4.58 ( 4%) usr   0.42 ( 1%) sys   5.63 ( 3%) wall      38 kB ( 0%) ggc
 tree gimplify         :   7.69 ( 6%) usr   0.71 ( 2%) sys   9.73 ( 5%) wall  724206 kB (19%) ggc
 tree eh               :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   1.55 ( 1%) usr   0.18 ( 1%) sys   2.05 ( 1%) wall  388675 kB (10%) ggc
 tree CFG cleanup      :   0.46 ( 0%) usr   0.03 ( 0%) sys   0.53 ( 0%) wall    1901 kB ( 0%) ggc
 dominance computation :   0.25 ( 0%) usr   0.03 ( 0%) sys   0.42 ( 0%) wall       0 kB ( 0%) ggc
 expand                :  19.47 (16%) usr   1.82 ( 6%) sys  24.43 (11%) wall 1348665 kB (36%) ggc
 varconst              :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.06 ( 0%) wall     263 kB ( 0%) ggc
 final                 :   0.16 ( 0%) usr   0.02 ( 0%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
 symout                :   0.03 ( 0%) usr   0.02 ( 0%) sys   0.05 ( 0%) wall      32 kB ( 0%) ggc
 TOTAL                 : 124.67           31.08          212.75            3736334 kB

so a relatively large fraction of compile time is system time in the parser (CPU usage during compilation is also rather far from 100%).
Presumably a parallel compile would be even more impacted.

While I don't have a good general solution, a nearly 50% improvement in mod file size can easily be obtained by by-hand compression:

cat modall | sed "s/UNKNOWN/U/g" | sed "s/INTEGER/I/g" | sed "s/VARIABLE/V/g" | \
  sed "s/PROCEDURE/P/g" | sed "s/DERIVED/D/g" | sed "s/CHARACTER/C/g" | \
  sed "s/SUBROUTINE/S/g" | sed "s/FUNCTION/F/g" | sed "s/ASSUMED_SHAPE/AS/g" | \
  sed "s/REAL/R/g" | sed "s/CONSTANT/CST/g" | sed "s/DIMENSION/M/g" | \
  sed "s/DUMMY/Y/g" | sed "s/LOGICAL/L/g" | sed "s/EXPLICIT/X/g" | \
  sed "s/INTENT/T/g" | sed "s/ACCESS/XS/g" | sed "s/POINTER/PT/g" | \
  sed "s/DEFERRED/FR/g" > modall.new

This yields a 75Mb file (so roughly half of the original). Presumably that would reduce the time needed to read the mod files by about half, and the result might be faster to parse. The downside is that the module files become somewhat less human-readable, but that can't be their primary purpose. Looking at module.c, such a change would be relatively easy to implement.
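For anyone wanting to try the abbreviation idea on their own files, here is a minimal Python sketch that applies the same keyword table, in the same order as the sed pipeline above. The sample text is made up for illustration and is not real gfortran module output, and the name `abbreviate` is hypothetical.

```python
# Keyword table copied from the sed pipeline, applied in the same order.
ABBREVIATIONS = [
    ("UNKNOWN", "U"), ("INTEGER", "I"), ("VARIABLE", "V"), ("PROCEDURE", "P"),
    ("DERIVED", "D"), ("CHARACTER", "C"), ("SUBROUTINE", "S"), ("FUNCTION", "F"),
    ("ASSUMED_SHAPE", "AS"), ("REAL", "R"), ("CONSTANT", "CST"),
    ("DIMENSION", "M"), ("DUMMY", "Y"), ("LOGICAL", "L"), ("EXPLICIT", "X"),
    ("INTENT", "T"), ("ACCESS", "XS"), ("POINTER", "PT"), ("DEFERRED", "FR"),
]


def abbreviate(text: str) -> str:
    """Replace each long keyword by its short form, one pass per keyword."""
    for long_name, short_name in ABBREVIATIONS:
        text = text.replace(long_name, short_name)
    return text


# Invented sample, only loosely shaped like a module file (assumption).
SAMPLE = (
    "(UNKNOWN-PROCEDURE SUBROUTINE UNKNOWN-INTENT) "
    "(VARIABLE REAL DIMENSION DUMMY POINTER)\n"
) * 100

shortened = abbreviate(SAMPLE)
print(f"{len(SAMPLE)} bytes -> {len(shortened)} bytes")
```

To measure a real case, read the concatenated `modall` file instead of `SAMPLE` and compare the two lengths.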
Yup
It looks like module.c also contains minit calls that would need to be modified. But for a 50% saving in module size, that still seems a localized effort.
A link to the single file CP2K version mentioned below:

http://dl.dropbox.com/u/40478020/CP2K_2009-05-01.f90.gz

gfortran -c CP2K_2009-05-01.f90

needs about 5m38s on my machine, 122Mb of mod files with current 4.7.
Just for reference, compiling CP2K_2009-05-01.f90 results in 684 modules, stracing yields something like 12000 calls to open, and 148'847'399 calls to lseek. Clearly anything reducing the number of seeks is likely to have a good impact on compile time.

For this particular case, caching modules would help a lot as well. However, our usual pattern is to have a single module per file, with all use statements at the top of the module. Caching would be of little help for this style. An efficient encoding of the information in the module would help. The idea of writing the module compressed, and decompressing it as a big string to memory for reading and parsing, seems appealing to me.

Concerning a change of format, it would be important to keep one of gfortran's nice features, that is, the ability to use the modification time of the .mod files to avoid recompilation cascades. If .mod files contained references to other .mod files (instead of containing the info directly), this property might be at risk.
(In reply to comment #4)
> Just for reference, compiling CP2K_2009-05-01.f90 results in 684 modules,
> stracing yields something like 12000 calls to open, and 148'847'399 calls to
> lseek.

With Thomas' patch (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958) the number of seeks drops to 19'557'182, which is quite an improvement. In the trace output, there are still very long sequences of identical lseek calls, without any other intermediate call.
Author: tkoenig
Date: Tue Nov 29 17:49:24 2011
New Revision: 181810

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181810
Log:
2011-11-29  Thomas Koenig  <tkoenig@gcc.gnu.org>

    PR fortran/40958
    * module.c (prev_module_line): New variable.
    (prev_module_column): New variable.
    (prev_character): New variable.
    (module_char): Update the new variables.
    (module_unget_char): New function.
    (parse_string): Use module_unget_char.
    (parse_integer): Likewise.
    (parse_name): Likewise.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/module.c
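To illustrate why a one-character "unget" helps, here is a minimal Python sketch of the general pushback technique. It is not the gfortran code (which is C and also tracks line and column); `CountingFile`, `PushbackReader`, `parse_with_seek` and `parse_with_unget` are hypothetical names, and the toy tokenizer only extracts integers.

```python
import io


class CountingFile(io.BytesIO):
    """In-memory stand-in for a .mod file that counts explicit seek() calls."""

    def __init__(self, data: bytes):
        super().__init__(data)
        self.seeks = 0

    def seek(self, *args):
        self.seeks += 1
        return super().seek(*args)


def parse_with_seek(f):
    """Read integers; rewind with seek() after reading one char too many."""
    values = []
    while (c := f.read(1)):
        if c.isdigit():
            digits = c
            while True:
                pos = f.tell()
                c = f.read(1)
                if not c.isdigit():
                    f.seek(pos)  # put the lookahead character back
                    break
                digits += c
            values.append(int(digits))
    return values


class PushbackReader:
    """Wraps a file and remembers at most one character that was 'ungot'."""

    def __init__(self, f):
        self.f = f
        self.pushed = None

    def getc(self):
        if self.pushed is not None:
            c, self.pushed = self.pushed, None
            return c
        return self.f.read(1)

    def ungetc(self, c):
        self.pushed = c


def parse_with_unget(f):
    """Same tokenizer, but the lookahead character is kept in memory."""
    reader = PushbackReader(f)
    values = []
    while (c := reader.getc()):
        if c.isdigit():
            digits = c
            while True:
                c = reader.getc()
                if not c.isdigit():
                    reader.ungetc(c)
                    break
                digits += c
            values.append(int(digits))
    return values


DATA = b"(12 345 6)"
a, b = CountingFile(DATA), CountingFile(DATA)
print(parse_with_seek(a), "seeks:", a.seeks)
print(parse_with_unget(b), "seeks:", b.seeks)
```

Both tokenizers return the same values; only the first one issues a seek per token, which is the kind of call that shows up in the strace counts.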
see part 2/3 in the message here: http://gcc.gnu.org/ml/fortran/2013-03/msg00125.html
Patch for compressing module files with zlib at http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00554.html .
Author: jb
Date: Tue Mar 26 22:08:17 2013
New Revision: 197124

URL: http://gcc.gnu.org/viewcvs?rev=197124&root=gcc&view=rev
Log:
PR 25708 Use a temporary buffer when parsing module files.

2013-03-27  Janne Blomqvist  <jb@gcc.gnu.org>

    PR fortran/25708
    * module.c (module_locus): Use long for position.
    (module_content): New variable.
    (module_pos): Likewise.
    (prev_character): Remove.
    (bad_module): Free data instead of closing mod file.
    (set_module_locus): Use module_pos.
    (get_module_locus): Likewise.
    (module_char): use buffer rather than stdio file.
    (module_unget_char): Likewise.
    (read_module_to_tmpbuf): New function.
    (gfc_use_module): Call read_module_to_tmpbuf.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/module.c
Author: jb
Date: Wed Apr 17 10:19:40 2013
New Revision: 198023

URL: http://gcc.gnu.org/viewcvs?rev=198023&root=gcc&view=rev
Log:
PR 40958 Compress module files with zlib.

frontend ChangeLog:

2013-04-17  Janne Blomqvist  <jb@gcc.gnu.org>

    PR fortran/40958
    * scanner.h: New file.
    * Make-lang.in: Dependencies on scanner.h.
    * scanner.c (gfc_directorylist): Move to scanner.h.
    * module.c: Don't include md5.h, include scanner.h and zlib.h.
    (MOD_VERSION): Add comment about backwards compatibility.
    (module_fp): Change type to gzFile.
    (ctx): Remove.
    (gzopen_included_file_1): New function.
    (gzopen_included_file): New function.
    (gzopen_intrinsic_module): New function.
    (write_char): Use gzputc.
    (read_crc32_from_module_file): New function.
    (read_md5_from_module_file): Remove.
    (gfc_dump_module): Use gz* functions instead of stdio, check gzip
    crc32 instead of md5.
    (read_module_to_tmpbuf): Use gz* functions instead of stdio.
    (gfc_use_module): Use gz* functions.

testsuite ChangeLog:

2013-04-17  Janne Blomqvist  <jb@gcc.gnu.org>

    PR fortran/40958
    * lib/gcc-dg.exp (scan-module): Uncompress module file before
    scanning.
    * gfortran.dg/module_md5_1.f90: Remove.

Added:
    trunk/gcc/fortran/scanner.h
Removed:
    trunk/gcc/testsuite/gfortran.dg/module_md5_1.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/Make-lang.in
    trunk/gcc/fortran/module.c
    trunk/gcc/fortran/scanner.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/lib/gcc-dg.exp
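The ChangeLog above mentions checking the gzip crc32 instead of an md5 when dumping a module. As a rough Python sketch of how such a check can keep an unchanged .mod file (and hence its modification time) untouched: the gzip format stores the CRC32 of the uncompressed data in the last 8 bytes of the file, followed by the size, both little-endian. The function names and the temporary-file naming below are assumptions for illustration, not the gfortran implementation.

```python
import gzip
import os
import struct
import tempfile
import zlib


def read_crc32_from_gzip(path: str) -> int:
    """Return the CRC32 from the gzip trailer (last 8 bytes: CRC32, ISIZE)."""
    with open(path, "rb") as f:
        f.seek(-8, os.SEEK_END)
        crc, _isize = struct.unpack("<II", f.read(8))
    return crc


def dump_module(path: str, content: bytes) -> bool:
    """Write `content` gzip-compressed to `path` unless the existing file
    already holds data with the same CRC32. Returns True if a file was written."""
    if os.path.exists(path) and read_crc32_from_gzip(path) == zlib.crc32(content):
        return False  # unchanged: keep the old file and its mtime
    tmp = path + ".tmp"  # hypothetical temporary name
    with gzip.open(tmp, "wb") as f:
        f.write(content)
    os.replace(tmp, path)
    return True


with tempfile.TemporaryDirectory() as d:
    mod = os.path.join(d, "example.mod")
    print(dump_module(mod, b"module contents v1"))  # first write
    print(dump_module(mod, b"module contents v1"))  # identical, skipped
    print(dump_module(mod, b"module contents v2"))  # changed, rewritten
```

Skipping the write when nothing changed is what lets make-style builds use .mod timestamps to avoid recompilation cascades.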
With these patches in, parallel compilation of multi-file cp2k becomes significantly faster. Time for a full build goes from 70s to 50s. I think that in a parallel build the IO bottleneck (bandwidth) was significant, while this is now much improved. The effect will likely be even larger on mounted filesystems.
Joost, is it fixed after revision 198023? If yes, could you close the PR as FIXED?
(In reply to Dominique d'Humieres from comment #12)
> Joost, is it fixed after revision 198023? If yes, could you close the PR as
> FIXED?

With the introduction of the temp buffer for parsing modules, the excessive lseek() calls have been eliminated, and with the introduction of compressed module files, the total size of module files on disk has been reduced by an order of magnitude for large projects.

However, the fundamental(?) issue of module sizes growing exponentially with deep module hierarchies still remains. The solution to that is to not include transitive dependencies, which in turn would require a module cache for good performance. Whether that is worth doing, and who is willing and able to do it, is unclear.
(In reply to Janne Blomqvist from comment #13)

I believe a lot of progress has been made indeed.

> However, the fundamental(?) issue of module sizes growing exponentially with
> deep module hierarchies still remains. The solution to that is to not
> include transitive dependencies, which in turn would require a module cache
> for good performance. Whether that is worth doing, and who is willing and
> able to do it, is unclear.

Note that there could also be disadvantages to that solution. For example, dependencies for a given .F would be more difficult to find (i.e. not just from the USE statements). I'm not sure what implications that would have e.g. for 'smart' recompilation (i.e. based on time stamps of .mod files). The module cache would also not work very well for the (common, I guess) case of having a single module per file, with all USE statements at the top. It might thus be that the current state is the sweet spot.
> However, the fundamental(?) issue of module sizes growing exponentially
> with deep module hierarchies still remains. The solution to that is to
> not include transitive dependencies, which in turn would require a module
> cache for good performance. Whether that is worth doing, and who is willing
> and able to do it, is unclear.

Would it not be simpler to tell the users what they should do to avoid this issue? If yes, what would be the basic rules?
(In reply to Dominique d'Humieres from comment #15)
> > However, the fundamental(?) issue of module sizes growing exponentially
> > with deep module hierarchies still remains. The solution to that is to
> > not include transitive dependencies, which in turn would require a module
> > cache for good performance. Whether that is worth doing, and who is willing
> > and able to do it, is unclear.
>
> Would not it be simpler to tell the users what they should do to avoid this
> issue? If yes, what would be the basic rules?

I doubt that this is the right answer. The user wants to write maintainable and portable code. The paradigm of object-oriented programming will more often lead to deeper module hierarchies than simple code does. You'd have a hard time telling users that gfortran requires them to flatten those hierarchies when other compilers don't (assuming that the others perform acceptably).
> However, the fundamental(?) issue of module sizes growing exponentially with
> deep module hierarchies still remains. The solution to that is to not
> include transitive dependencies, which in turn would require a module cache
> for good performance. Whether that is worth doing, and who is willing and
> able to do it, is unclear.

See pr60780 for an example.
Created attachment 33703
Proposed patch to fix module equivalence duplicates

Here is a proposed fix for the problem related to equivalence statements in nested/recursive use (see also PR 60780). I've run 'make check-fortran' on rev. 216017 with and without this patch, and had a FAIL on gfortran.dg/typebound_operator_3.f03 (as discussed in PR 63404) both ways.
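To show how duplicated equivalences can blow up in a layered hierarchy, and what a duplicate check on load buys, here is a toy Python model. It is only a sketch under simplifying assumptions: each level USEs two modules of the level below, every module re-exports the equivalence list it loaded, and an equivalence is represented as a plain tuple. The names `load_equivs` and `equiv_count_at_depth` are hypothetical and do not mirror gfortran's data structures.

```python
def load_equivs(used_modules, dedupe):
    """Merge the equivalence lists of all USEd modules into one list."""
    loaded = []
    for equivs in used_modules:
        for eq in equivs:
            if dedupe and eq in loaded:
                continue  # already loaded via another module
            loaded.append(eq)
    return loaded


def equiv_count_at_depth(depth, dedupe):
    """Level 0 declares one EQUIVALENCE; each higher level USEs two
    modules of the level below, both carrying the same list."""
    level = [("a", "b")]
    for _ in range(depth):
        level = load_equivs([level, level], dedupe)
    return len(level)


for depth in (0, 5, 10):
    print(depth,
          equiv_count_at_depth(depth, dedupe=False),
          equiv_count_at_depth(depth, dedupe=True))
```

In this model the list doubles at every level without the check and stays at a single entry with it.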
Author: kargl
Date: Fri Jun 5 16:54:53 2015
New Revision: 224159

URL: https://gcc.gnu.org/viewcvs?rev=224159&root=gcc&view=rev
Log:
2015-06-03  Russell Whitesides  <russelldub@gmail.com>
            Steven G. Kargl  <kargl@gcc.gnu.org>

    PR fortran/40958
    PR fortran/60780
    PR fortran/66377
    * module.c (load_equiv): Add check for loading duplicate EQUIVALENCEs
    from different modules. Eliminate the pruning of unused
    equivalence-objects

2015-06-03  Steven G. Kargl  <kargl@gcc.gnu.org>

    PR fortran/66377
    gfortran.dg/equiv_9.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/equiv_9.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/module.c
    trunk/gcc/testsuite/ChangeLog
Author: kargl
Date: Fri Jun 5 20:40:35 2015
New Revision: 224171

URL: https://gcc.gnu.org/viewcvs?rev=224171&root=gcc&view=rev
Log:
2015-06-03  Russell Whitesides  <russelldub@gmail.com>
            Steven G. Kargl  <kargl@gcc.gnu.org>

    PR fortran/40958
    PR fortran/60780
    PR fortran/66377
    * module.c (load_equiv): Add check for loading duplicate EQUIVALENCEs
    from different modules. Eliminate the pruning of unused
    equivalence-objects

2015-06-03  Steven G. Kargl  <kargl@gcc.gnu.org>

    PR fortran/66377
    gfortran.dg/equiv_9.f90: New test.

Added:
    branches/gcc-5-branch/gcc/testsuite/gfortran.dg/equiv_9.f90
Modified:
    branches/gcc-5-branch/gcc/fortran/ChangeLog
    branches/gcc-5-branch/gcc/fortran/module.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
I suggest we close this PR, as the situation has significantly improved. Please open new PR(s) for specific issues related to module size.