This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Convert more passes to new dump framework
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Teresa Johnson <tejohnson at google dot com>
- Cc: Xinliang David Li <davidxl at google dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Dehao Chen <dehao at google dot com>, Sharad Singhai <singhai at google dot com>
- Date: Mon, 2 Sep 2013 11:01:50 +0200
- Subject: Re: [PATCH] Convert more passes to new dump framework
On Fri, Aug 30, 2013 at 9:51 PM, Teresa Johnson <firstname.lastname@example.org> wrote:
> On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li <email@example.com> wrote:
>> Except that in this form, the dump will be extremely large and not
>> suitable for very large applications.
> Yes. I did some measurements for both a fairly large source file that
> is heavily optimized with LIPO and for a simple toy example that has
> some inlining. For the large source file, the output from
> -fdump-ipa-inline=stderr was almost 100x the line count of the
> -fopt-info output. For the toy source file it was 43x. The size of the
> -details output was 250x and 100x, respectively. Which is untenable
> for a large app.
> The issue I am having here is that I want a more verbose message, not
> a more voluminous set of messages. Using either -fopt-info-all or
> -fdump-ipa-inline to provoke the more verbose inline message will give
> me a much greater volume of output.
I think we will never reach the state where the dumping is exactly what
each developer wants (because their wants will differ). Developers can
easily post-process the stderr output by piping it through grep.
> One compromise could be to emit the more verbose inliner message under
> a param (and a more concise "foo inlined into bar" by default with
> -fopt-info). Or we could do some variant of what David talks about
>> Besides, we might also want to
>> use the same machinery (dump_printf_loc etc) for dump file dumping.
>> The current behavior of using '-details' to turn on opt-info-all
>> messages for dump files is not desirable.
> Interestingly, this doesn't even work. When I do
> -fdump-ipa-inline-details=stderr (with my patch containing the inliner
> messages) I am not getting those inliner messages emitted to stderr.
> Even though in dumpfile.c "details" is set to (TDF_DETAILS |
> MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
> sure why, but will need to debug this.
>> How about the following:
>> 1) add a new dump_kind modifier so that when that modifier is
>> specified, the messages won't go to the alt_dumpfile (controlled by
>> -fopt-info), but only to primary dump file. With this, the inline
>> messages can be dumped via:
>> dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .....)
> (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )
> Typically OR-ing together flags like this indicates dump under any of
> those conditions. But we could implement special handling for
> OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
> the primary dump file, and only under the other conditions specified
> in the flag (here under "-optimized")
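A minimal sketch of the special handling discussed above, assuming a simplified model in which each message is tested against the flags of the primary dump file and of the alternate (-fopt-info) stream. `OPT_DUMP_FILE_ONLY` is the proposed modifier, not an existing flag; the flag values, `struct dump_targets`, and `route_message` are hypothetical names made up for illustration and are not taken from dumpfile.c.

```c
#include <stdbool.h>

/* Illustrative flag values only; the real definitions live in dumpfile.h.  */
enum
{
  MSG_OPTIMIZED_LOCATIONS = 1 << 0,
  MSG_MISSED_OPTIMIZATION = 1 << 1,
  MSG_NOTE = 1 << 2,
  OPT_DUMP_FILE_ONLY = 1 << 3 /* proposed modifier, hypothetical value */
};

/* All message-kind bits, excluding the modifier.  */
#define MSG_KIND_MASK \
  (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE)

/* Which streams should receive a given message (hypothetical type).  */
struct dump_targets
{
  bool primary; /* the pass dump file selected with -fdump-...  */
  bool alt;     /* the alternate stream selected with -fopt-info  */
};

/* Decide where a message of kind DUMP_KIND goes.  PFLAGS and ALT_FLAGS
   are the kinds enabled for the primary and alternate streams.  The
   modifier is not OR-ed in as one more condition: it only suppresses
   the alternate stream, and the remaining kind bits still have to
   match the primary stream's flags.  */
struct dump_targets
route_message (int dump_kind, int pflags, int alt_flags)
{
  int kinds = dump_kind & MSG_KIND_MASK;
  struct dump_targets t;

  t.primary = (kinds & pflags) != 0;
  t.alt = (dump_kind & OPT_DUMP_FILE_ONLY) == 0 && (kinds & alt_flags) != 0;
  return t;
}
```

With this model, a message tagged `MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY` reaches the primary dump file only when "-optimized" is enabled for it, and never reaches the -fopt-info stream.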
>> 2) add more flags in -fdump- support:
>> -fdump-ipa-inline-opt --> turn on opt-info messages only
>> -fdump-ipa-inline-optall --> turn on opt-info-all messages
> According to the documentation (see the -fdump-tree- documentation),
> the above are already supposed to be there (-optimized, -missed, -note
> and -optall). However, specifying any of these gives a warning like:
> cc1: warning: ignoring unknown option ‘optimized’ in
> ‘-fdump-ipa-inline’ [enabled by default]
> Probably because none is listed in the dump_options array in dumpfile.c.
> However, I don't think there is currently a way to use -fdump- options
> and *only* get one of these, as much of the current dump output is
> emitted whenever there is a dump_file defined. Until everything is
> migrated to the new framework it may be difficult to get this to work.
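As a rough illustration of the missing table entries mentioned above, the sketch below shows a name-to-flag table that also lists "optimized", "missed", "note" and "optall", plus a lookup that returns 0 for an unknown name (the case that would lead to the "ignoring unknown option" warning). The flag values, `struct dump_option`, `dump_options_sketch`, and `lookup_dump_option` are assumptions made for this example; only the composition of "details" is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag values only; the real definitions live in dumpfile.h.  */
enum
{
  TDF_DETAILS = 1 << 0,
  MSG_OPTIMIZED_LOCATIONS = 1 << 1,
  MSG_MISSED_OPTIMIZATION = 1 << 2,
  MSG_NOTE = 1 << 3
};

#define MSG_ALL_KINDS \
  (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE)

/* One suffix accepted after -fdump-<kind>-<pass>- (hypothetical type).  */
struct dump_option
{
  const char *name;
  int value;
};

/* Sketch of an option table that includes the documented MSG_* names.  */
const struct dump_option dump_options_sketch[] = {
  { "details", TDF_DETAILS | MSG_ALL_KINDS },
  { "optimized", MSG_OPTIMIZED_LOCATIONS },
  { "missed", MSG_MISSED_OPTIMIZATION },
  { "note", MSG_NOTE },
  { "optall", MSG_ALL_KINDS },
};

/* Return the flags for NAME, or 0 when NAME is not in the table.  */
int
lookup_dump_option (const char *name)
{
  size_t n = sizeof dump_options_sketch / sizeof dump_options_sketch[0];

  assert (name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (dump_options_sketch[i].name, name) == 0)
      return dump_options_sketch[i].value;
  return 0;
}
```

Recognizing the names is only half of the problem: output that is emitted whenever a dump file exists would still appear until those call sites test the flags.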
>> -fdump-tree-pre-ir --> turn on GIMPLE dump only
>> -fdump-tree-pre-details --> turn on everything (ir, optall, trace)
>> With this, developers can really just use
>> -fdump-ipa-inline-opt=stderr for inline messages.
> Yes, if we can figure out a good way to get this to work (i.e. only
> emit the optimized messages and not the rest of the dump messages).
> And unfortunately to get them all you need to specify
> "-fdump-ipa-all-optimized -fdump-tree-all-optimized
> -fdump-rtl-all-optimized" instead of just -fopt-info. Unless we can
> add -fdump-all-all-optimized.
>> On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
>> <firstname.lastname@example.org> wrote:
>>> On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson <email@example.com> wrote:
>>>> On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
>>>> <firstname.lastname@example.org> wrote:
>>>>>>>> New patch below that removes this global variable, and also outputs
>>>>>>>> the node->symbol.order (in square brackets after the function name so
>>>>>>>> as to not clutter it). Inline messages with profile data look like:
>>>>>>>> test.c:8:3: note: foobar  (99999000) inlined into foo  (1000)
>>>>>>>> with call count 99999000 (via inline instance bar  (99999000))
>>>>>>> Ick. This looks both redundant and cluttered. This is supposed to be
>>>>>>> understandable by GCC users, not only GCC developers.
>>>>>> The main part that is only useful/understandable to gcc developers is
>>>>>> the node->symbol.order in square brackets, requested by Martin. One
>>>>>> possibility is that I could put that part under a param, disabled by
>>>>>> default. We have something similar on the google branches that emits
>>>>>> LIPO module info in the message, enabled via a param.
>>>>> But we have _dump files_ for that. That's the developer-consumed
>>>>> form of opt-info. -fopt-info is purely user sugar and for usual translation
>>>>> units it shouldn't exceed a single terminal full of output.
>>>> But as a developer I don't want to have to parse lots of dump files
>>>> for a summary of the major optimizations performed (e.g. inlining,
>>>> unrolling) for an application, unless I am diving into the reasons for
>>>> why or why not one of those optimizations occurred in a particular
>>>> location. I really do want a summary emitted to stderr so that it is
>>>> easily searchable/summarizable for the app as a whole.
>>>> For example, some of the apps I am interested in have thousands of
>>>> input files, and trying to collect and parse dump files for each and
>>>> every one is overwhelming (it probably would be even if my input files
>>>> numbered in the hundreds). What has been very useful is having these
>>>> high level summary messages of inlines and unrolls emitted to stderr
>>>> by -fopt-info. Then it is easy to search and sort by hotness to get a
>>>> feel for things like what inlines are missing when moving to a new
>>>> compiler, or compiling a new version of the source, for example. Then
>>>> you know which files to focus on and collect dump files for.
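The "search and sort by hotness" workflow described above could be sketched as follows, assuming each inline message sits on a single line and carries the text "with call count N" as in the sample message quoted earlier in the thread. `parse_call_count` and `compare_by_hotness` are hypothetical helper names, and the message format is that of the proposed patch, not a stable interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the number that follows "with call count " in LINE, or -1
   when the marker is absent or is not followed by a number.  */
long long
parse_call_count (const char *line)
{
  static const char marker[] = "with call count ";
  const char *p;
  char *end;
  long long count;

  assert (line != NULL);
  p = strstr (line, marker);
  if (p == NULL)
    return -1;
  p += sizeof marker - 1;
  count = strtoll (p, &end, 10);
  return end == p ? -1 : count;
}

/* qsort comparator over an array of message lines (const char *):
   hottest message first, lines without a call count last.  */
int
compare_by_hotness (const void *a, const void *b)
{
  long long ca = parse_call_count (*(const char *const *) a);
  long long cb = parse_call_count (*(const char *const *) b);

  return (ca < cb) - (ca > cb);
}
```

Collecting the stderr lines of a build into an array of strings and calling `qsort (lines, n, sizeof lines[0], compare_by_hotness)` puts the hottest inlines at the front, which is one way to decide which files deserve full dump files.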
>>> I thought we can direct dump files to stderr now? So, just use
>>> and grep its contents.
>>>>>> I'd argue that the other information (the profile counts, emitted only
>>>>>> when using -fprofile-use, and the inline call chains) are useful if
>>>>>> you want to understand whether and how critical inlines are occurring.
>>>>>> I think this is the type of information that users focused on
>>>>>> optimizations, as well as gcc developers, want when they use
>>>>>> -fopt-info. Otherwise it is difficult to make sense of the inline
>>>>> Well, I doubt that inline information is interesting to users unless we are
>>>>> able to aggressively filter it to what users are interested in. Which IMHO
>>>>> isn't possible - users are interested in "I have not inlined this even though
>>>>> inlining would severely improve performance" which would indicate a bug
>>>>> in the heuristics we can reliably detect and thus it wouldn't be there.
>>>> I have interacted with users who are aware of optimizations such as
>>>> inlining and unrolling and want to look at that information to
>>>> diagnose performance differences when refactoring code or using a new
>>>> compiler version. I also think inlining (especially cross-module) is
>>>> one example of an optimization that is still being tuned, and user
>>>> reports of performance issues related to that have been useful.
>>>> I really think that the two groups of people who will find -fopt-info
>>>> useful are gcc developers and savvy performance-hungry users. For the
>>>> former group the additional info is extremely useful. For the latter
>>>> group some of the extra information may not be required (although a
>>>> call count is useful for those using profile feedback), but IMO is not
>>> well, your proposed output wrecks my 80x24 terminal already due to overly
>>> long lines.
>>> In the end we may end up with a verbosity level for each sub-set of opt-info
>>> messages. Ick.
>>>> Teresa Johnson | Software Engineer | email@example.com | 408-460-2413