Currently only the files from one profiling run can be used for PGO. Especially for MPI programs, it would be nice if several folders containing profiling files could be merged, or if several directories could be used together for -fprofile-use. For saving the profiling files, it would be great if the folder name could contain an environment variable or could be set by an environment variable. Thus I suggest that one could either say:

-fprofile-dir /some/path/%q{SOME_ENV}   # same syntax as valgrind

or

export GCC_PROFILE_DIR=/some/path/$SOME_ENV

This would be very useful because MPI implementations provide the MPI rank as an environment variable. Thus, with this suggestion, one could store the profile of each MPI rank in a different folder.
A tool to merge multiple gcda files shouldn't be very difficult to write. I don't think this should be done by the compiler itself; that would greatly complicate things. But a separate tool, gcov-merge say, would work, and this isn't a big job to create using libgcov (and gcov-dump as an example). You'd also be able to merge profile information from different directories. Would something like the above work for you?
We have one internally at Cavium which is designed to run afterwards and merge a few gcda files. It is designed for how we run multi-core programs and write a gcda file for each run. And there is one here: http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00423.html
(In reply to comment #1)
> A tool to merge multiple gcda files shouldn't be very difficult to write. I
> don't think this should be done by the compiler itself, that would greatly
> complicate things. But a separate tool, gcov-merge say, would work, and this
> isn't a big job to create using libgcov (and gcov-dump as an example). You'd
> also be able to merge profile information from different directories.
>
> Would something like the above work for you?

But VC and the Intel Compiler can automatically merge all PGO information. Will GCC get similar behavior?
-fprofile-dir= is already implemented.
(In reply to comment #3)
> (In reply to comment #1)
> > A tool to merge multiple gcda files shouldn't be very difficult to write. I
> > don't think this should be done by the compiler itself, that would greatly
> > complicate things. But a separate tool, gcov-merge say, would work, and this
> > isn't a big job to create using libgcov (and gcov-dump as an example). You'd
> > also be able to merge profile information from different directories.
> >
> > Would something like the above work for you?
>
> But for VC and Intel Compiler
> they can auto merge all PGO information.
> Will we make gcc to have the similar behavior?

xunxun, GCC does merge profile information from different runs into one gcda file. It works differently from ICC in that ICC produces one .dyn file per test run and uses profmerge to merge multiple .dyn files into a summary file. GCC does this merging from multiple runs automatically. What GCC does not do is merge multiple gcda files (which would be the equivalent of merging multiple pgopti.dpi files with ICC).

The issue in this problem report is that with MPI there will be multiple images of the same program running simultaneously. The different images can't share the same set of gcda files (you'd have races), so each image generates its own set of gcda files. For that, a new merge tool is necessary. Ideally, this tool would also run transparently. One way to do this could be to take multiple arguments for -fprofile-dir and merge profile info from each directory.
(In reply to comment #2)
> We have one internally at Cavium which is designed to run afterwards and merge
> a few gcda file. It is designed for how we run multi-core programs and write a
> gcda file for each run.

And now, of course, you're going to contribute that? ;-)

> And there one here:
> http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00423.html

This merges results for files without their own gcno file but mentioned more than once in gcda files for multiple source files (e.g. for inline functions in headers). You can't merge multiple gcda files for one source file, but the patch does provide the infrastructure to support this.
Created attachment 27869 [details]
Patch for adding merge-gcda

Here is the patch which adds merge-gcda. I didn't add any testcases, as it is currently designed only for how Cavium's Simple-exec works, in that each core writes out its own gcda file.
(In reply to comment #7)
> Created attachment 27869 [details]
> Patch for adding merge-gcda

I am changing the copyright over to the FSF, based on the fact that Cavium (Networks) has a blanket copyright assignment in place. I just forgot to do it in the patch itself.
I think a merge tool would be a good partial solution. As far as I can see, what would still be missing for user-friendly usage is a mechanism to guarantee that all pre-merged files are saved with different names, so that different processes don't overwrite each other's output files. In the case of MPI, one would want to have the MPI rank as part of the output folder to guarantee unique file names. Thus my suggestion to support -fprofile-dir /some/path/%q{SOME_ENV}, where SOME_ENV would be the environment variable containing the MPI rank. Without being able to make the output path depend on an environment variable, one would be required to write wrapper scripts, and that might not even be possible in all cases.
> so that different processes don't overwrite each others output files.

They don't overwrite each other; rather, they are merged together at write-out time.
Steven wrote that they are not merged but that race conditions occur. That is also what I observed. To clarify: Message Passing Interface (MPI) is a parallelization method which executes the same binary multiple times in parallel (with support for messages for communication). Merging the output into one file at runtime would require file locking (often over network file systems) and would not scale, because MPI applications are often run with more than 10000 (or even more than 1M) parallel processes simultaneously.
(In reply to comment #9)
> I think a tool to merge would be a good partial solution.

We will go with the tool solution. I'll take care of the tool before GCC 4.8, if that's OK with apinski. I think we shouldn't have a new tool, though. I'd prefer to teach the gcov program to do it instead. What would you prefer?

> As far as I can see what would still be missing for user-friendly usage, is a
> mechanism to guarantee that all pre-merged files are saved with different
> names, so that different processes don't overwrite each others output files.

Deeply buried in the GCC manuals is this section: http://gcc.gnu.org/onlinedocs/gcc-4.7.1/gcc/Cross_002dprofiling.html

With the right combination of GCOV_PREFIX_STRIP and GCOV_PREFIX, it should be possible to send the gcda files to unique directories per MPI rank. But I think that a more practical solution is necessary. (I also don't know how these environment variables interact with -fprofile-dir. I doubt anyone looked into this before now...)

I like the %q (and %p) variables from Valgrind, and I don't think it's very hard to add support for them in libgcov. (http://valgrind.org/docs/manual/manual-core.html)
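For reference, here is a minimal sketch of the path rewriting that the Cross-profiling section describes: GCOV_PREFIX is prepended to the absolute .gcda path fixed at compile time, after GCOV_PREFIX_STRIP leading directory components have been dropped. The helper name gcda_target is hypothetical and only models the documented behavior. What happens when the strip count exceeds the number of directories is a guess here.

```shell
#!/bin/sh
# gcda_target PREFIX STRIP PATH
# Hypothetical helper (not part of GCC): prints where a .gcda file would land
# for a given GCOV_PREFIX / GCOV_PREFIX_STRIP pair, following the manual.
gcda_target() {
    prefix=$1   # value of GCOV_PREFIX
    strip=$2    # value of GCOV_PREFIX_STRIP
    path=$3     # absolute path compiled into the object file

    i=0
    while [ "$i" -lt "$strip" ]; do
        path=${path#/}                      # drop the leading slash
        case $path in
            */*) path=/${path#*/} ;;        # drop one directory component
            *)   path=/$path; break ;;      # assumption: keep the file name
        esac
        i=$((i + 1))
    done
    printf '%s%s\n' "$prefix" "$path"
}

# Example from the manual: /user/build/foo.gcda with prefix /target/run
# and one stripped component.
gcda_target /target/run 1 /user/build/foo.gcda

# Per-rank use still needs a wrapper. The line below assumes Open MPI, which
# exports OMPI_COMM_WORLD_RANK; other MPI implementations use other names.
#   mpirun -np 4 sh -c 'GCOV_PREFIX=/tmp/prof/rank$OMPI_COMM_WORLD_RANK exec ./a.out'
```

The wrapper in the last comment is exactly the extra step that a %q-style expansion would make unnecessary.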
The issue is quite old; however, it's probably still valid. Implementing something similar to what valgrind does with '%p' and '%q{VAR}' is an elegant solution. I can work on that for GCC 8 if there's interest.
Created attachment 41481 [details]
Patch candidate

I'm attaching a patch that supports the following expansions of the -fprofile-dir option value (or of the arguments of -fprofile-generate and -fprofile-use):

%w - expands at compile time to the working directory; it's handy when one wants to preserve the tree hierarchy of gcda files corresponding to another build directory
%p - expands at run time to the PID
%q{ENV} - expands at run time to the value of the environment variable 'ENV'

Having that, I guess we can eventually drop GCOV_PREFIX_STRIP and GCOV_PREFIX, as one can use -fprofile-dir="%q{PREFIX}" and then e.g. set PREFIX="../my/folder/". Feel free to comment on the patch.
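To make the run-time part of the proposal concrete, here is a rough shell model of the %p and %q{ENV} expansion. It is only an illustration: the real expansion is done in C inside libgcov, the helper name expand_profile_dir is hypothetical, and treating an unset variable as an empty string is an assumption. The compile-time %w variable is not modelled.

```shell
#!/bin/sh
# expand_profile_dir TEMPLATE
# Hypothetical helper: expands %p (PID of this shell) and %q{VAR}
# (value of environment variable VAR) in TEMPLATE and prints the result.
expand_profile_dir() {
    rest=$1
    out=
    while [ -n "$rest" ]; do
        case $rest in
            %p*)
                out=$out$$
                rest=${rest#"%p"}
                ;;
            %q\{*\}*)
                rest=${rest#"%q{"}
                name=${rest%%\}*}          # text up to the first '}'
                rest=${rest#*\}}           # text after the first '}'
                case $name in
                    ''|*[!A-Za-z0-9_]*)
                        echo "expand_profile_dir: bad variable name" >&2
                        return 1
                        ;;
                esac
                eval "value=\${$name-}"    # assumption: unset expands to ""
                out=$out$value
                ;;
            *)
                # Copy one literal character.
                remainder=${rest#?}
                out=$out${rest%"$remainder"}
                rest=$remainder
                ;;
        esac
    done
    printf '%s\n' "$out"
}

# Example: SOME_ENV stands in for whatever variable carries the MPI rank.
SOME_ENV=7
export SOME_ENV
expand_profile_dir '/some/path/%q{SOME_ENV}'
```

With such an expansion, each MPI rank would write its gcda files below its own directory, with no wrapper script around the binary.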
Adding Andrew. May I ask for your opinion about the suggested patch/approach?
I don't see any feedback, so I'm leaving the PR then ...
I found this bug while searching for a way to solve exactly this problem, so for the record: it sounds like a very good and useful addition. Thank you!
(In reply to Petr Špaček from comment #17)
> I found this bug while searching for a way to solve exactly this problem, so
> for the record: It sounds like very good and useful addition. Thank you!

Good to hear. Unfortunately, the patch can only land in GCC 9.x. Is that acceptable for you?
Sure, I would be happy with any version, thank you! For people who want to generate code coverage reports for parallel executions, beware of https://github.com/linux-test-project/lcov/issues/37.
(In reply to Petr Špaček from comment #19)
> Sure, I would be happy with any version, thank you!
>
> For people who want to generate code coverage reports for parallel
> executions, beware of https://github.com/linux-test-project/lcov/issues/37.

Good. I will do it in the stage1 timeframe of GCC 9.
Author: marxin
Date: Tue Jun 5 12:10:22 2018
New Revision: 261199

URL: https://gcc.gnu.org/viewcvs?rev=261199&root=gcc&view=rev
Log:
Support variables in expansion of -fprofile-generate option (PR gcov-profile/47618).

2018-06-05  Martin Liska  <mliska@suse.cz>

    PR gcov-profile/47618
    * doc/invoke.texi: Document how -fprofile-dir format
    is extended.

2018-06-05  Martin Liska  <mliska@suse.cz>

    PR gcov-profile/47618
    * libgcov-driver-system.c (replace_filename_variables): New
    function.
    (gcov_exit_open_gcda_file): Use it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/invoke.texi
    trunk/libgcc/ChangeLog
    trunk/libgcc/libgcov-driver-system.c
Implemented.
(In reply to Andrew Pinski from comment #7)
> Created attachment 27869 [details]
> Patch for adding merge-gcda
>
> here is the patch which adds merge-gcda . I don't add any testcases as it
> is currently designed only for how Cavium's Simple-exec works in that each
> core writes out its own gcda file.

I recently found this bug due to a similar problem. It looks like there are two parts to the work for this problem:

1. A new GCC feature to guarantee that all pre-merged files are saved with different names for different instances of the same process.
2. A merge tool to merge all the gcda files afterwards.

From my understanding, the patch for part 1 has been committed to GCC 9. How about the patch for part 2? Has it been committed?
(In reply to qinzhao from comment #23)
> (In reply to Andrew Pinski from comment #7)
> > Created attachment 27869 [details]
> > Patch for adding merge-gcda
> >
> > here is the patch which adds merge-gcda . I don't add any testcases as it
> > is currently designed only for how Cavium's Simple-exec works in that each
> > core writes out its own gcda file.
>
> I recently found this bug due to a similar problem. looks like that there
> are two parts of work for this problem:
>
> 1. GCC's new feature to guarantee that all pre-merged files are saved with
> different names for different instances of the same process.
> 2. a merge tool to merge all the gcda files afterwards.
>
> from my understanding, the patch for the above 1 has been committed into
> GCC9.

Yes.

> How about the patch for the above 2? has it been committed?

It has been there for a while, please take a look at:

$ gcov-tool merge --help
merge: unrecognized option '--help'
Merge subcomand usage:  merge [options] <dir1> <dir2>         Merge coverage file contents
    -o, --output <dir>                  Output directory
    -v, --verbose                       Verbose mode
    -w, --weight <w1,w2>                Set weights (float point values)
(In reply to Martin Liška from comment #24)
> > How about the patch for the above 2? has it been committed?
>
> It has been there for a while, please take a look at:
>
> $ gcov-tool merge --help
> merge: unrecognized option '--help'
> Merge subcomand usage: merge [options] <dir1> <dir2> Merge coverage
> file contents
> -o, --output <dir> Output directory
> -v, --verbose Verbose mode
> -w, --weight <w1,w2> Set weights (float point values)

Two more questions on this merge tool:

1. It can only merge two directories at a time. So, for multiple directories, say n, do we have to invoke gcov-tool merge n-1 times in order to merge all of them?
2. The Intel compiler's (icc) profmerge is able to merge all the .dyn files under one directory. Does GCC have such functionality currently?
(In reply to qinzhao from comment #25)
> (In reply to Martin Liška from comment #24)
> > > How about the patch for the above 2? has it been committed?
> >
> > It has been there for a while, please take a look at:
> >
> > $ gcov-tool merge --help
> > merge: unrecognized option '--help'
> > Merge subcomand usage: merge [options] <dir1> <dir2> Merge coverage
> > file contents
> > -o, --output <dir> Output directory
> > -v, --verbose Verbose mode
> > -w, --weight <w1,w2> Set weights (float point values)
>
> two more questions on this merge tool:
> 1. it can only merge two directories at one time. So, for multiple
> directories, for example "n", we have to invoke gcov-tool merge n-1 times in
> order to merge all of them?

Yep. I guess one can write a simple bash script that does that.

> 2. Intel compiler (icc)'s profmerge is able to merge all the .dyn files
> under one directory, does gcc have such functionality currently?

We have folder-based merging, where we search for all .gcda files and merge them into a destination folder.
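A minimal sketch of such a script, for n directories and n-1 calls to gcov-tool merge. The function name merge_many and the GCOV_TOOL override are hypothetical conveniences, not part of GCC. Intermediate results go to a temporary directory because it is not assumed here that gcov-tool can safely use one of its input directories as the output.

```shell
#!/bin/sh
# merge_many OUTDIR DIR1 DIR2 [DIR3 ...]
# Hypothetical wrapper: folds the directories pairwise with
# `gcov-tool merge <dir1> <dir2> -o <dir>`, as in the usage text above.
merge_many() {
    out=$1
    shift
    if [ "$#" -lt 2 ]; then
        echo "merge_many: need at least two input directories" >&2
        return 1
    fi
    tool=${GCOV_TOOL:-gcov-tool}      # override only for testing or dry runs
    tmp=$(mktemp -d) || return 1

    acc=$1                            # running merge result
    shift
    remaining=$#
    for dir in "$@"; do
        remaining=$((remaining - 1))
        if [ "$remaining" -eq 0 ]; then
            next=$out                 # the last merge writes the final result
        else
            next=$tmp/step$remaining
        fi
        "$tool" merge "$acc" "$dir" -o "$next" || { rm -rf "$tmp"; return 1; }
        acc=$next
    done
    rm -rf "$tmp"
}

# Example (directory names are placeholders):
#   merge_many merged rank0 rank1 rank2 rank3
```

All inputs get the default weight in this sketch; different -w values would have to be passed on each step.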
> --- Comment #26 from Martin Liška <marxin at gcc dot gnu.org> ---
>
>> 2. Intel compiler (icc)'s profmerge is able to merge all the .dyn files
>> under one directory, does gcc have such functionality currently?
>
> We have folder-base merging where we search for all .gcda files and we merge
> them to a destination folder.

Could you please point me to the command that does this? Thanks.
(In reply to Martin Liška from comment #26)
> (In reply to qinzhao from comment #25)
> > (In reply to Martin Liška from comment #24)
> > > > How about the patch for the above 2? has it been committed?
> > >
> > > It has been there for a while, please take a look at:
> > >
> > > $ gcov-tool merge --help
> > > merge: unrecognized option '--help'
> > > Merge subcomand usage: merge [options] <dir1> <dir2> Merge coverage
> > > file contents
> > > -o, --output <dir> Output directory
> > > -v, --verbose Verbose mode
> > > -w, --weight <w1,w2> Set weights (float point values)
> >
> > two more questions on this merge tool:
> > 1. it can only merge two directories at one time. So, for multiple
> > directories, for example "n", we have to invoke gcov-tool merge n-1 times in
> > order to merge all of them?
>
> Yep. I guess one can write a simple bash script that does that.
>
> > 2. Intel compiler (icc)'s profmerge is able to merge all the .dyn files
> > under one directory, does gcc have such functionality currently?
>
> We have folder-base merging where we search for all .gcda files and we merge
> them to a destination folder.

$ echo "int main() {return 0;}" >> main.c && gcc --coverage main.c && ./a.out
$ mkdir a && mkdir b && cp main.gcda a && cp main.gcda b
$ gcov-tool merge a b -o a+b -v
reading file: ./main.gcda
tag one function id=108032747
reading file: ./main.gcda
tag one function id=108032747
$ ls a+b
main.gcda
$ gcov-dump a+b/main.gcda
a+b/main.gcda:data:magic `gcda':version `A83*'
a+b/main.gcda:stamp 2031787297
a+b/main.gcda: a3000000: 22:PROGRAM_SUMMARY checksum=0x33c369a8
a+b/main.gcda: counts=1, runs=1, sum_all=2, run_max=2, sum_max=2
a+b/main.gcda: counter histogram:
a+b/main.gcda: 2: num counts=1, min counter=2, cum_counter=2
a+b/main.gcda: 01000000: 3:FUNCTION ident=108032747, lineno_checksum=0x3b5ee2be, cfg_checksum=0xdb5de9e8
a+b/main.gcda: 01a10000: 2:COUNTERS arcs 1 counts
> > 1. it can only merge two directories at one time. So, for multiple
> > directories, for example "n", we have to invoke gcov-tool merge n-1 times in
> > order to merge all of them?
>
> Yep. I guess one can write a simple bash script that does that.

I've added one at https://github.com/yugr/maintainer-scripts/blob/master/gcov-tool-many