This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Convert more passes to new dump framework
- From: Teresa Johnson <tejohnson at google dot com>
- To: Xinliang David Li <davidxl at google dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Dehao Chen <dehao at google dot com>, Sharad Singhai <singhai at google dot com>
- Date: Fri, 30 Aug 2013 12:51:09 -0700
- Subject: Re: [PATCH] Convert more passes to new dump framework
- Authentication-results: sourceware.org; auth=none
- References: <CAAe5K+W=brFLz3kZYFXTYAE+T2yJVPF=z+vp+uKh4d+9vQXJKQ at mail dot gmail dot com> <20130806123714 dot GD3166 at virgil dot suse> <CAAe5K+VSDkg5sKB2nvraHFv66ZKGO84W6obYj+sBfGZ7F3-hpw at mail dot gmail dot com> <20130806160102 dot GE3166 at virgil dot suse> <CAAe5K+VCwEUz1BWAnAvPrSBT3DHsu1P-eF4-nQb9+aFyDyRORw at mail dot gmail dot com> <CAAe5K+Vu0en=WFJp5bw-Vu=n05LYGxvmmfctnZSWeS-5PKT=Rg at mail dot gmail dot com> <CAFiYyc17odz-dL10BcNcKc8B_QNE2KB538dunz45=1Kn67d-gw at mail dot gmail dot com> <CAAe5K+W2rXph_7YkB1DgyXK55s4XQVtV+yvGY3q30-1DEkSK7A at mail dot gmail dot com> <CAFiYyc1DUmGZ1qeBL8mKRGy_1bd1whTW_XmmkOc-PU4WmGMoEg at mail dot gmail dot com> <CAAe5K+WBuaShbbA_hre_LK90NoShJ8yCRF0NLLnGcoQuZ6P-Nw at mail dot gmail dot com> <CAFiYyc2GNvLbsuWeBAsXHTt1kBY_ksDeXN0Spb9XwhuYd4MP0w at mail dot gmail dot com> <CAAkRFZLGvTPzFFLMif1Qw+1w8uUDa7mA00Ur7YOvLSFYTDp84w at mail dot gmail dot com>
On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li <firstname.lastname@example.org> wrote:
> Except that in this form, the dump will be extremely large and not
> suitable for very large applications.
Yes. I did some measurements for both a fairly large source file that
is heavily optimized with LIPO and for a simple toy example that has
some inlining. For the large source file, the output from
-fdump-ipa-inline=stderr was almost 100x the line count of the
-fopt-info output. For the toy source file it was 43x. The size of the
-details output was 250x and 100x, respectively, which is untenable
for a large app.
The issue I am having here is that I want a more verbose message, not
a more voluminous set of messages. Using either -fopt-info-all or
-fdump-ipa-inline to provoke the more verbose inline message will give
me a much greater volume of output.
One compromise could be to emit the more verbose inliner message under
a param (and a more concise "foo inlined into bar" by default with
-fopt-info). Or we could do some variant of what David talks about
> Besides, we might also want to
> use the same machinery (dump_printf_loc etc) for dump file dumping.
> The current behavior of using '-details' to turn on opt-info-all
> messages for dump files is not desirable.
Interestingly, this doesn't even work. When I do
-fdump-ipa-inline-details=stderr (with my patch containing the inliner
messages) I am not getting those inliner messages emitted to stderr.
Even though in dumpfile.c "details" is set to (TDF_DETAILS |
MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
sure why, but will need to debug this.
> How about the following:
> 1) add a new dump_kind modifier so that when that modifier is
> specified, the messages won't go to the alt_dumpfile (controlled by
> -fopt-info), but only to primary dump file. With this, the inline
> messages can be dumped via:
> dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .....)
(you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )
Typically OR-ing together flags like this indicates dump under any of
those conditions. But we could implement special handling for
OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
the primary dump file, and only under the other conditions specified
in the flag (here under "-optimized")
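
A minimal sketch of the special handling described above, assuming the
modifier is a separate bit that is masked out before the usual "any of
these kinds" test. The flag values and helper names are hypothetical and
are not GCC's actual definitions; OPT_DUMP_FILE_ONLY is only the proposal
from this thread and does not exist in dumpfile.c.

```c
#include <assert.h>

/* Hypothetical bit values for illustration only; not GCC's real ones.  */
enum
{
  MSG_OPTIMIZED_LOCATIONS = 1 << 0,
  MSG_MISSED_OPTIMIZATION = 1 << 1,
  MSG_NOTE = 1 << 2,
  OPT_DUMP_FILE_ONLY = 1 << 3	/* Proposed modifier, not a message kind.  */
};

/* Bits that name message kinds, as opposed to modifiers.  */
#define MSG_KIND_MASK \
  (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE)

/* Nonzero if a message tagged MSG_FLAGS should reach the primary dump
   file, given the kinds ENABLED for that file.  OR-ed kinds keep their
   usual meaning: any enabled kind is enough.  */
static int
goes_to_dump_file (int msg_flags, int enabled)
{
  assert ((enabled & ~MSG_KIND_MASK) == 0);
  return (msg_flags & MSG_KIND_MASK & enabled) != 0;
}

/* Nonzero if the same message should also reach the -fopt-info stream.
   The modifier suppresses this stream entirely; otherwise the test is
   the same as for the dump file.  */
static int
goes_to_opt_info (int msg_flags, int enabled)
{
  assert ((enabled & ~MSG_KIND_MASK) == 0);
  if (msg_flags & OPT_DUMP_FILE_ONLY)
    return 0;
  return (msg_flags & MSG_KIND_MASK & enabled) != 0;
}
```

With only MSG_OPTIMIZED_LOCATIONS enabled on both streams, a message
tagged (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) would reach the
dump file but not the -fopt-info stream under this scheme.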
> 2) add more flags in -fdump- support:
> -fdump-ipa-inline-opt --> turn on opt-info messages only
> -fdump-ipa-inline-optall --> turn on opt-info-all messages
According to the documentation (see the -fdump-tree- documentation),
the above are already supposed to be there (-optimized, -missed, -note
and -optall). However, specifying any of these gives a warning like:
cc1: warning: ignoring unknown option ‘optimized’ in
‘-fdump-ipa-inline’ [enabled by default]
Probably because none is listed in the dump_options array in dumpfile.c.
However, I don't think there is currently a way to use -fdump- options
and *only* get one of these, as much of the current dump output is
emitted whenever there is a dump_file defined. Until everything is
migrated to the new framework it may be difficult to get this to work.
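
A rough sketch of how the unknown-option warning could arise, assuming
the -fdump- suffixes are looked up by name in a single table and that
the opt-info names are simply missing from it. The struct, table and
function names are hypothetical stand-ins, not the real dumpfile.c
declarations, and the bit values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for illustration; not GCC's actual values.  */
enum
{
  SKETCH_DETAILS = 1 << 0,
  SKETCH_OPTIMIZED = 1 << 1,
  SKETCH_MISSED = 1 << 2,
  SKETCH_NOTE = 1 << 3
};

struct sketch_option
{
  const char *name;
  int value;
};

/* Stand-in for the table consulted for -fdump-* suffixes.  "details"
   pulls in the message kinds, but "optimized" has no entry of its own.  */
static const struct sketch_option sketch_dump_options[] = {
  {"details",
   SKETCH_DETAILS | SKETCH_OPTIMIZED | SKETCH_MISSED | SKETCH_NOTE},
  {NULL, 0}
};

/* Return the flag bits for NAME, or 0 if NAME is not in TABLE.  A caller
   seeing 0 would warn about an unknown option and ignore it.  */
static int
sketch_lookup (const struct sketch_option *table, const char *name)
{
  assert (table != NULL && name != NULL);
  for (; table->name != NULL; table++)
    if (strcmp (table->name, name) == 0)
      return table->value;
  return 0;
}
```

Under these assumptions, adding entries such as "optimized" to the table
would make the suffix parse, but it would not by itself restrict the
dump to those messages.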
> -fdump-tree-pre-ir --> turn on GIMPLE dump only
> -fdump-tree-pre-details --> turn on everything (ir, optall, trace)
> With this, developers can really just use
> -fdump-ipa-inline-opt=stderr for inline messages.
Yes, if we can figure out a good way to get this to work (i.e. only
emit the optimized messages and not the rest of the dump messages).
And unfortunately to get them all you need to specify
-fdump-rtl-all-optimized" instead of just -fopt-info. Unless we can
> On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
> <email@example.com> wrote:
>> On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson <firstname.lastname@example.org> wrote:
>>> On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
>>> <email@example.com> wrote:
>>>>>>> New patch below that removes this global variable, and also outputs
>>>>>>> the node->symbol.order (in square brackets after the function name so
>>>>>>> as to not clutter it). Inline messages with profile data look like:
>>>>>>> test.c:8:3: note: foobar  (99999000) inlined into foo  (1000)
>>>>>>> with call count 99999000 (via inline instance bar  (99999000))
>>>>>> Ick. This looks both redundant and cluttered. This is supposed to be
>>>>>> understandable by GCC users, not only GCC developers.
>>>>> The main part that is only useful/understandable to gcc developers is
>>>>> the node->symbol.order in square brackets, requested by Martin. One
>>>>> possibility is that I could put that part under a param, disabled by
>>>>> default. We have something similar on the google branches that emits
>>>>> LIPO module info in the message, enabled via a param.
>>>> But we have _dump files_ for that. That's the developer-consumed
>>>> form of opt-info. -fopt-info is purely user sugar and for usual translation
>>>> units it shouldn't exceed a single terminal full of output.
>>> But as a developer I don't want to have to parse lots of dump files
>>> for a summary of the major optimizations performed (e.g. inlining,
>>> unrolling) for an application, unless I am diving into the reasons for
>>> why or why not one of those optimizations occurred in a particular
>>> location. I really do want a summary emitted to stderr so that it is
>>> easily searchable/summarizable for the app as a whole.
>>> For example, some of the apps I am interested in have thousands of
>>> input files, and trying to collect and parse dump files for each and
>>> every one is overwhelming (it probably would be even if my input files
>>> numbered in the hundreds). What has been very useful is having these
>>> high level summary messages of inlines and unrolls emitted to stderr
>>> by -fopt-info. Then it is easy to search and sort by hotness to get a
>>> feel for things like what inlines are missing when moving to a new
>>> compiler, or compiling a new version of the source, for example. Then
>>> you know which files to focus on and collect dump files for.
>> I thought we can direct dump files to stderr now? So, just use
>> and grep its contents.
>>>>> I'd argue that the other information (the profile counts, emitted only
>>>>> when using -fprofile-use, and the inline call chains) are useful if
>>>>> you want to understand whether and how critical inlines are occurring.
>>>>> I think this is the type of information that users focused on
>>>>> optimizations, as well as gcc developers, want when they use
>>>>> -fopt-info. Otherwise it is difficult to make sense of the inline
>>>> Well, I doubt that inline information is interesting to users unless we are
>>>> able to aggressively filter it to what users are interested in. Which IMHO
>>>> isn't possible - users are interested in "I have not inlined this even though
>>>> inlining would severely improve performance" which would indicate a bug
>>>> in the heuristics we can reliably detect and thus it wouldn't be there.
>>> I have interacted with users who are aware of optimizations such as
>>> inlining and unrolling and want to look at that information to
>>> diagnose performance differences when refactoring code or using a new
>>> compiler version. I also think inlining (especially cross-module) is
>>> one example of an optimization that is still being tuned, and user
>>> reports of performance issues related to that have been useful.
>>> I really think that the two groups of people who will find -fopt-info
>>> useful are gcc developers and savvy performance-hungry users. For the
>>> former group the additional info is extremely useful. For the latter
>>> group some of the extra information may not be required (although a
>>> call count is useful for those using profile feedback), but IMO is not
>> well, your proposed output wrecks my 80x24 terminal already due to overly
>> long lines.
>> In the end we may end up with a verbosity level for each sub-set of opt-info
>> messages. Ick.
>>> Teresa Johnson | Software Engineer | firstname.lastname@example.org | 408-460-2413
Teresa Johnson | Software Engineer | email@example.com | 408-460-2413