[PATCH] Convert more passes to new dump framework

Richard Biener richard.guenther@gmail.com
Thu Aug 29 13:58:00 GMT 2013


Ok.

Richard.

On Thu, Aug 29, 2013 at 3:18 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, Aug 28, 2013 at 9:07 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson <tejohnson@google.com> wrote:
>>> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson <tejohnson@google.com> wrote:
>>>>> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson <tejohnson@google.com> wrote:
>>>>>> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor <mjambor@suse.cz> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:
>>>>>>>> On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor <mjambor@suse.cz> wrote:
>>>>>>>> > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:
>>>>>>>> >> This patch ports messages to the new dump framework,
>>>>>>>> >
>>>>>>>> > It would be great if this new framework was documented somewhere.  I lost
>>>>>>>> > track of what was agreed it would be and from the uses in the
>>>>>>>> > vectorizer I was never quite sure how to utilize it in other passes.
>>>>>>>>
>>>>>>>> Cc'ing Sharad who implemented this - Sharad, is this documented on a
>>>>>>>> wiki or elsewhere?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>>
>>>>>>>> >
>>>>>>>> > I'd also like to point out two other minor things inline:
>>>>>>>> >
>>>>>>>> > [...]
>>>>>>>> >
>>>>>>>> >> 2013-08-06  Teresa Johnson  <tejohnson@google.com>
>>>>>>>> >>             Dehao Chen  <dehao@google.com>
>>>>>>>> >>
>>>>>>>> >>         * dumpfile.c (dump_loc): Add column number to output, make newlines
>>>>>>>> >>         consistent.
>>>>>>>> >>         * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
>>>>>>>> >>         * ipa-inline-transform.c (clone_inlined_nodes):
>>>>>>>> >>         (cgraph_node_opt_info): New function.
>>>>>>>> >>         (cgraph_node_call_chain): Ditto.
>>>>>>>> >>         (dump_inline_decision): Ditto.
>>>>>>>> >>         (inline_call): Invoke dump_inline_decision.
>>>>>>>> >>         * doc/invoke.texi: Document optall -fopt-info flag.
>>>>>>>> >>         * profile.c (read_profile_edge_counts): Use new dump framework.
>>>>>>>> >>         (compute_branch_probabilities): Ditto.
>>>>>>>> >>         * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
>>>>>>>> >>         when pass not in any opt group.
>>>>>>>> >>         * value-prof.c (check_counter): Use new dump framework.
>>>>>>>> >>         (find_func_by_funcdef_no): Ditto.
>>>>>>>> >>         (check_ic_target): Ditto.
>>>>>>>> >>         * coverage.c (get_coverage_counts): Ditto.
>>>>>>>> >>         (coverage_init): Setup new dump framework.
>>>>>>>> >>         * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
>>>>>>>> >>         * ipa-inline.h (is_in_ipa_inline): Declare.
>>>>>>>> >>
>>>>>>>> >>         * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>>>>>>>> >>         * testsuite/gcc.dg/pr26570.c: Ditto.
>>>>>>>> >>         * testsuite/gcc.dg/pr32773.c: Ditto.
>>>>>>>> >>         * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> > [...]
>>>>>>>> >
>>>>>>>> >> Index: ipa-inline-transform.c
>>>>>>>> >> ===================================================================
>>>>>>>> >> --- ipa-inline-transform.c      (revision 201461)
>>>>>>>> >> +++ ipa-inline-transform.c      (working copy)
>>>>>>>> >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>>>>>>> >>  }
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> +#define MAX_INT_LENGTH 20
>>>>>>>> >> +
>>>>>>>> >> +/* Return NODE's name and profile count, if available.  */
>>>>>>>> >> +
>>>>>>>> >> +static const char *
>>>>>>>> >> +cgraph_node_opt_info (struct cgraph_node *node)
>>>>>>>> >> +{
>>>>>>>> >> +  char *buf;
>>>>>>>> >> +  size_t buf_size;
>>>>>>>> >> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 0);
>>>>>>>> >> +
>>>>>>>> >> +  if (!bfd_name)
>>>>>>>> >> +    bfd_name = "unknown";
>>>>>>>> >> +
>>>>>>>> >> +  buf_size = strlen (bfd_name) + 1;
>>>>>>>> >> +  if (profile_info)
>>>>>>>> >> +    buf_size += (MAX_INT_LENGTH + 3);
>>>>>>>> >> +
>>>>>>>> >> +  buf = (char *) xmalloc (buf_size);
>>>>>>>> >> +
>>>>>>>> >> +  strcpy (buf, bfd_name);
>>>>>>>> >> +
>>>>>>>> >> +  if (profile_info)
>>>>>>>> >> +    sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, node->count);
>>>>>>>> >> +  return buf;
>>>>>>>> >> +}
>>>>>>>> >
>>>>>>>> > I'm not sure if output of this function is aimed only at the user or
>>>>>>>> > if it is supposed to be used by gcc developers as well.  If the
>>>>>>>> > latter, an incredibly useful thing is to also dump node->symbol.order
>>>>>>>> > too.  We usually dump it after "/" sign separating it from node name.
>>>>>>>> > It is invaluable when examining decisions in C++ code where you can
>>>>>>>> > have lots of clones of a node (and also because existing dumps print
>>>>>>>> > it, it is easy to combine them).
>>>>>>>>
>>>>>>>> The output is useful both for power users doing performance tuning of
>>>>>>>> their application and for gcc developers. Adding the id is not so
>>>>>>>> useful for the former, but I agree that it is very useful for compiler
>>>>>>>> developers. In fact, in the google branch version we emit more verbose
>>>>>>>> information (the lipo module id and the funcdef_no) to help uniquely
>>>>>>>> identify the routines and to aid in post-processing by humans and
>>>>>>>> tools. So it is probably useful to add something similar here too. Is
>>>>>>>> the node->symbol.order more or less unique than the funcdef_no? I see
>>>>>>>> that you added a patch a few months ago to print the
>>>>>>>> node->symbol.order in the function header, and it also has the
>>>>>>>> advantage as you note of matching up with existing ipa dumps.
>>>>>>>
>>>>>>> node->symbol.order is unique and if I remember correctly, it is not
>>>>>>> even recycled.  Clones, inline clones, thunks, every symbol table node
>>>>>>> gets its own symbol order so it should be more unique than funcdef_no.
>>>>>>> On the other hand it may be a bit cryptic for users but at the same
>>>>>>> time it is only one number.
>>>>>>
>>>>>> Ok, I am going to go ahead and add this to the output.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> >
>>>>>>>> > [...]
>>>>>>>> >
>>>>>>>> >> Index: ipa-inline.c
>>>>>>>> >> ===================================================================
>>>>>>>> >> --- ipa-inline.c        (revision 201461)
>>>>>>>> >> +++ ipa-inline.c        (working copy)
>>>>>>>> >> @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3.  If not see
>>>>>>>> >>  static int overall_size;
>>>>>>>> >>  static gcov_type max_count;
>>>>>>>> >>
>>>>>>>> >> +/* Global variable to denote if it is in ipa-inline pass. */
>>>>>>>> >> +bool is_in_ipa_inline = false;
>>>>>>>> >> +
>>>>>>>> >>  /* Return false when inlining edge E would lead to violating
>>>>>>>> >>     limits on function unit growth or stack usage growth.
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> > In this age of removing global variables, are you sure you need this?
>>>>>>>> > The only user of this seems to be a function that is only being called
>>>>>>>> > from inline_call... can that ever happen when not inlining?  If you
>>>>>>>> > plan to use this function also elsewhere, perhaps the callers will
>>>>>>>> > know whether we are inlining or not and can provide this in a
>>>>>>>> > parameter?
>>>>>>>>
>>>>>>>> This is to distinguish early inlining from ipa inlining.
>>>>>>>
>>>>>>> Oh, right, I did not realize that the IPA part was the important bit
>>>>>>> of the name.
>>>>>>>
>>>>>>>> The volume of
>>>>>>>> early inlining messages is too high to be on for the default setting
>>>>>>>> of -fopt-info, and they are usually not as interesting for performance
>>>>>>>> tuning. The dumper will only emit the early inline messages under a
>>>>>>>> more verbose setting (MSG_NOTE):
>>>>>>>>       dump_printf_loc (is_in_ipa_inline ? MSG_OPTIMIZED_LOCATIONS : MSG_NOTE ...
>>>>>>>> The other way I can see to distinguish this would be to check the
>>>>>>>> always_inline_functions_inlined flag on the caller's function. It
>>>>>>>> could also be possible to pass down a flag from the callers of
>>>>>>>> inline_call, but at least one caller (flatten_functions) is shared
>>>>>>>> between early and late inlining, so the flag needs to be passed
>>>>>>>> through that as well. WDYT?
>>>>>>>
>>>>>>> Did you mean flatten_function?  It already has a bool "early"
>>>>>>> parameter.  But I can see that being able to quickly figure out
>>>>>>> whether we are in early inliner or ipa inliner without much hassle is
>>>>>>> useful enough to justify a global variable a month ago, however I
>>>>>>> suppose we should not be introducing them now and so you'd have to put
>>>>>>> such stuff into... well, you'd probably have to put into the universe
>>>>>>> object somewhere because it is basically shared between two passes.
>>>>>>> Another option, even though somewhat hackish, would be to look at
>>>>>>> current_pass and see which pass it is.  I don't know, do what is
>>>>>>> easier or what you like more, just be aware of the problem.
>>>>>>
>>>>>> After thinking about this some more, I think passing down an early
>>>>>> flag from callers is the cleanest way to go.
>>>>>>
>>>>>> I'll fix these and post a new patch later today.
>>>>>
>>>>> New patch below that removes this global variable, and also outputs
>>>>> the node->symbol.order (in square brackets after the function name so
>>>>> as to not clutter it). Inline messages with profile data look like:
>>>>>
>>>>> test.c:8:3: note: foobar [0] (99999000) inlined into foo [2] (1000)
>>>>> with call count 99999000 (via inline instance bar [3] (99999000))
>>>>
>>>> Ick.  This looks both redundant and cluttered.  This is supposed to be
>>>> understandable by GCC users, not only GCC developers.
>>>
>>> The main part that is only useful/understandable to gcc developers is
>>> the node->symbol.order in square brackets, requested by Martin. One
>>> possibility is that I could put that part under a param, disabled by
>>> default. We have something similar on the google branches that emits
>>> LIPO module info in the message, enabled via a param.
>>>
>>> I'd argue that the other information (the profile counts, emitted only
>>> when using -fprofile-use, and the inline call chains) are useful if
>>> you want to understand whether and how critical inlines are occurring.
>>> I think this is the type of information that users focused on
>>> optimizations, as well as gcc developers, want when they use
>>> -fopt-info. Otherwise it is difficult to make sense of the inline
>>> information.
>>>
>>>>
>>>>> (without FDO the counts in parentheses and the call count would not be
>>>>> included).
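A minimal sketch of the message fragment described above ("name [order]", plus " (count)" only when profile data is present), assuming a hypothetical `sketch_node` struct in place of the real cgraph node. All `sketch_` names are invented for illustration and are not part of the patch. The string is formatted in one `snprintf` call so the destination buffer is never also passed as a `%s` argument, which C leaves undefined for `sprintf`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Room for a decimal 64-bit integer including its sign, mirroring the
   patch's MAX_INT_LENGTH.  */
#define SKETCH_MAX_INT_LENGTH 20

/* Hypothetical stand-in for the few node fields the message uses.  */
struct sketch_node
{
  const char *name;   /* NULL when no name is available */
  int order;          /* plays the role of node->symbol.order */
  long long count;    /* profile count */
};

/* Return a malloc'ed "name [order]" label, with " (count)" appended only
   when HAVE_PROFILE is nonzero.  The caller frees the result.  */
char *
sketch_node_label (const struct sketch_node *node, int have_profile)
{
  const char *name;
  size_t size;
  char *buf;
  int written;

  assert (node != NULL);
  name = node->name ? node->name : "unknown";

  /* Name, two integers, and the fixed punctuation plus the NUL.  */
  size = strlen (name) + 2 * SKETCH_MAX_INT_LENGTH + 8;
  buf = (char *) malloc (size);
  if (buf == NULL)
    abort ();

  if (have_profile)
    written = snprintf (buf, size, "%s [%d] (%lld)",
                        name, node->order, node->count);
  else
    written = snprintf (buf, size, "%s [%d]", name, node->order);

  /* The size computation above should make truncation impossible.  */
  assert (written >= 0 && (size_t) written < size);
  return buf;
}
```

Called with the name "foobar", order 0 and count 99999000, this yields the "foobar [0] (99999000)" fragment from the example message when profile data is enabled.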
>>>>>
>>>>> Ok for trunk?
>>>>
>>>> Let's split this patch.
>>>
>>> Ok.
>>>
>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>>
>>>>> 2013-08-06  Teresa Johnson  <tejohnson@google.com>
>>>>>             Dehao Chen  <dehao@google.com>
>>>>>
>>>>>         * dumpfile.c (dump_loc): Output column number, make newlines consistent.
>>>>
>>>> I don't like column numbers; they are generally not of much use.
>>>
>>> I added these here to get consistency with other messages (notes
>>> emitted via inform(), warnings, errors). Plus the dg-message testing
>>> was failing for the test cases that parse this output, since it
>>> expects the column to exist.
>>
>> The above change (output column number) and the changes in the
>> testsuite go with the change you have approved below (due to moving
>> some profile messages to the new framework). Ok to commit these along
>> with that approved portion?
>
> Richard is this part ok since it goes with the part you approved below?
>
> Thanks,
> Teresa
>
>>
>> Thanks,
>> Teresa
>>
>>>
>>>> Does
>>>> 'make newlines consistent' avoid all the spurious vertical spacing I see with
>>>> -fopt-info?
>>>
>>> Well, it helps get us there. The problem was that before, since
>>> dump_loc was not consistently emitting newlines, the calls had to emit
>>> their own newlines manually in the string to ensure there was a
>>> newline at all. I was thinking that once this is fixed I could go back
>>> and clean up all those calls by removing the newlines in the string. I
>>> could split this part into a separate patch and do both at once.
>>>
>>> However, after thinking about this some more this morning, I am
>>> wondering whether it is better to remove the newline emission
>>> completely from dump_loc and rely on the caller to put the newline in
>>> the string. The reason is that there are 2 high level interfaces to
>>> the new dump infrastructure, dump_printf() and dump_printf_loc(). Only
>>> the latter invokes dump_loc and gets the newline at the start of the
>>> message. The typical usage seems to be to start a message via
>>> dump_printf_loc, and then use dump_printf to emit parts of the message
>>> (thus not requiring a newline), but I think it may lead to problems to
>>> rely on this assumption.
>>>
>>> So if you agree, I will simply remove the newline altogether from
>>> dump_loc, and ensure that all clients of dump_printf/dump_printf_loc
>>> include a newline char as appropriate in the string they pass.
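A rough sketch of the convention proposed above, assuming hypothetical `sketch_dump_printf` and `sketch_dump_printf_loc` helpers that write to an in-memory buffer rather than GCC's real dump streams. The location helper emits only the "file:line:col: note: " prefix with no newline of its own, so whoever finishes the message supplies the trailing "\n".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory dump stream standing in for the dump files.  */
static char sketch_out[512];

/* Append formatted text to sketch_out; overlong output is truncated.  */
static void
sketch_vappend (const char *fmt, va_list ap)
{
  size_t used = strlen (sketch_out);
  assert (used < sizeof sketch_out);
  vsnprintf (sketch_out + used, sizeof sketch_out - used, fmt, ap);
}

/* Stand-in for dump_printf: emits only the caller's text.  */
void
sketch_dump_printf (const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  sketch_vappend (fmt, ap);
  va_end (ap);
}

/* Stand-in for dump_printf_loc: emits the location prefix with no
   leading or trailing newline, then the caller's text.  */
void
sketch_dump_printf_loc (const char *file, int line, int col,
                        const char *fmt, ...)
{
  va_list ap;
  sketch_dump_printf ("%s:%d:%d: note: ", file, line, col);
  va_start (ap, fmt);
  sketch_vappend (fmt, ap);
  va_end (ap);
}
```

With this split, a message started by the location helper and continued by the plain helper stays on one line, and only the final piece carries the newline.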
>>>
>>>>
>>>>>         * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
>>>>
>>>> Good change - please split this out (with the related changes) and commit it.
>>>
>>> Ok, thanks. Will do.
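A small sketch of how the new group bit behaves, under stated assumptions: the `SKETCH_` names are invented here, the LOOP, INLINE, VEC and OTHER values copy the dumpfile.h hunk in this patch, and the NONE and IPA values are assumed because that hunk does not show them. A pass with no group falls back to OTHER, so it is reached by a request for all groups but not by a request for a specific group.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_OPTGROUP_NONE    0          /* assumed value */
#define SKETCH_OPTGROUP_IPA     (1 << 1)   /* assumed value */
#define SKETCH_OPTGROUP_LOOP    (1 << 2)
#define SKETCH_OPTGROUP_INLINE  (1 << 3)
#define SKETCH_OPTGROUP_VEC     (1 << 4)
#define SKETCH_OPTGROUP_OTHER   (1 << 5)
#define SKETCH_OPTGROUP_ALL     (SKETCH_OPTGROUP_IPA | SKETCH_OPTGROUP_LOOP \
                                 | SKETCH_OPTGROUP_INLINE \
                                 | SKETCH_OPTGROUP_VEC \
                                 | SKETCH_OPTGROUP_OTHER)

/* Map a pass with no group onto OTHER, as the passes.c hunk does.  */
int
sketch_effective_optgroup (int pass_flags)
{
  assert ((pass_flags & ~SKETCH_OPTGROUP_ALL) == 0);
  return pass_flags == SKETCH_OPTGROUP_NONE
         ? SKETCH_OPTGROUP_OTHER : pass_flags;
}

/* True when a pass with PASS_FLAGS should dump for REQUESTED groups.  */
bool
sketch_dumps_enabled (int pass_flags, int requested)
{
  return (sketch_effective_optgroup (pass_flags) & requested) != 0;
}
```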
>>>
>>>>
>>>>>         * ipa-inline-transform.c (cgraph_node_opt_info): New function.
>>>>>         (cgraph_node_call_chain): Ditto.
>>>>>         (dump_inline_decision): Ditto.
>>>>>         (inline_call): Invoke dump_inline_decision, new parameter.
>>>>
>>>> The inline stuff should be split and re-sent, it's non-obvious to me (extra
>>>> function parameters are not documented for example).  I'd rather have
>>>> inline_and_report_call () for example instead of an extra bool parameter.
>>>> But let's iterate over this once it's split out.
>>>
>>> Ok, I will send this separately. I guess we could have a separate
>>> interface inline_and_report_call that is a wrapper around inline_call
>>> and simply invokes the dumper. Note that flatten_function will need to
>>> conditionally call one of the two interfaces based on the value of its
>>> bool early parameter though.
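One possible shape for the wrapper discussed above, as a sketch only: `sketch_edge`, `sketch_inline_call`, `sketch_inline_and_report_call` and `sketch_flatten_one` are hypothetical stand-ins, and the report goes to a static buffer instead of the dump framework. It shows the reporting wrapper delegating to the plain routine, and a shared caller choosing between the two from its `early` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a call-graph edge.  */
struct sketch_edge
{
  const char *callee;
  const char *caller;
  bool inlined;
};

/* Last report produced; real code would go through the dump framework.  */
static char sketch_report[128];

/* Stand-in for inline_call: does the transformation, reports nothing.  */
bool
sketch_inline_call (struct sketch_edge *e)
{
  assert (e != NULL && !e->inlined);
  e->inlined = true;
  return true;
}

/* Wrapper: record the decision, then defer to the plain routine.  */
bool
sketch_inline_and_report_call (struct sketch_edge *e)
{
  assert (e != NULL);
  assert (strlen (e->callee) + strlen (e->caller) + 16
          < sizeof sketch_report);
  snprintf (sketch_report, sizeof sketch_report, "%s inlined into %s\n",
            e->callee, e->caller);
  return sketch_inline_call (e);
}

/* A caller shared by early and late inlining picks the interface.  */
bool
sketch_flatten_one (struct sketch_edge *e, bool early)
{
  return early ? sketch_inline_call (e)
               : sketch_inline_and_report_call (e);
}
```

The trade-off is that the inlining routine keeps its existing signature, at the cost of one extra branch in any caller that serves both passes.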
>>>
>>>>
>>>>>         * doc/invoke.texi: Document optall -fopt-info flag.
>>>>>         * profile.c (read_profile_edge_counts): Use new dump framework.
>>>>>         (compute_branch_probabilities): Ditto.
>>>>>         * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER
>>>>>         when pass not in any opt group.
>>>>>         * value-prof.c (check_counter): Use new dump framework.
>>>>>         (find_func_by_funcdef_no): Ditto.
>>>>>         (check_ic_target): Ditto.
>>>>>         * coverage.c (get_coverage_counts): Ditto.
>>>>>         (coverage_init): Setup new dump framework.
>>>>
>>>> These pieces look good to me.
>>>>
>>>>>         * ipa-inline.c (recursive_inlining): New inline_call parameter.
>>>>>         (inline_small_functions): Ditto.
>>>>>         (flatten_function): Ditto.
>>>>>         (ipa_inline): Ditto.
>>>>>         (inline_always_inline_functions): Ditto.
>>>>>         (early_inline_small_functions): Ditto.
>>>>>         * ipa-inline.h: Ditto.
>>>>>
>>>>>         * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>>>>>         * testsuite/gcc.dg/pr26570.c: Ditto.
>>>>>         * testsuite/gcc.dg/pr32773.c: Ditto.
>>>>>         * testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto.
>>>>
>>>> Why?  Just remove the stray dg- annotations that deal with the unwanted output?
>>>
>>> Because there are dg-message annotations that want to confirm this output.
>>>
>>> Teresa
>>>
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>         * testsuite/gcc.dg/inline-dump.c: New test.
>>>>>
>>>>> Index: dumpfile.c
>>>>> ===================================================================
>>>>> --- dumpfile.c  (revision 201461)
>>>>> +++ dumpfile.c  (working copy)
>>>>> @@ -257,16 +257,18 @@ dump_open_alternate_stream (struct dump_file_info
>>>>>  void
>>>>>  dump_loc (int dump_kind, FILE *dfile, source_location loc)
>>>>>  {
>>>>> -  /* Currently vectorization passes print location information.  */
>>>>>    if (dump_kind)
>>>>>      {
>>>>> +      /* Ensure dump message starts on a new line.  */
>>>>> +      fprintf (dfile, "\n");
>>>>>        if (LOCATION_LOCUS (loc) > BUILTINS_LOCATION)
>>>>> -        fprintf (dfile, "\n%s:%d: note: ", LOCATION_FILE (loc),
>>>>> -                 LOCATION_LINE (loc));
>>>>> +        fprintf (dfile, "%s:%d:%d: note: ", LOCATION_FILE (loc),
>>>>> +                 LOCATION_LINE (loc), LOCATION_COLUMN (loc));
>>>>>        else if (current_function_decl)
>>>>> -        fprintf (dfile, "\n%s:%d: note: ",
>>>>> +        fprintf (dfile, "%s:%d:%d: note: ",
>>>>>                   DECL_SOURCE_FILE (current_function_decl),
>>>>> -                 DECL_SOURCE_LINE (current_function_decl));
>>>>> +                 DECL_SOURCE_LINE (current_function_decl),
>>>>> +                 DECL_SOURCE_COLUMN (current_function_decl));
>>>>>      }
>>>>>  }
>>>>>
>>>>> Index: dumpfile.h
>>>>> ===================================================================
>>>>> --- dumpfile.h  (revision 201461)
>>>>> +++ dumpfile.h  (working copy)
>>>>> @@ -97,8 +97,9 @@ enum tree_dump_index
>>>>>  #define OPTGROUP_LOOP        (1 << 2)   /* Loop optimization passes */
>>>>>  #define OPTGROUP_INLINE      (1 << 3)   /* Inlining passes */
>>>>>  #define OPTGROUP_VEC         (1 << 4)   /* Vectorization passes */
>>>>> +#define OPTGROUP_OTHER       (1 << 5)   /* All other passes */
>>>>>  #define OPTGROUP_ALL        (OPTGROUP_IPA | OPTGROUP_LOOP | OPTGROUP_INLINE \
>>>>> -                              | OPTGROUP_VEC)
>>>>> +                              | OPTGROUP_VEC | OPTGROUP_OTHER)
>>>>>
>>>>>  /* Define a tree dump switch.  */
>>>>>  struct dump_file_info
>>>>> Index: ipa-inline-transform.c
>>>>> ===================================================================
>>>>> --- ipa-inline-transform.c      (revision 201461)
>>>>> +++ ipa-inline-transform.c      (working copy)
>>>>> @@ -192,6 +192,111 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>>>>  }
>>>>>
>>>>>
>>>>> +#define MAX_INT_LENGTH 20
>>>>> +
>>>>> +/* Return NODE's name and profile count, if available.  */
>>>>> +
>>>>> +static const char *
>>>>> +cgraph_node_opt_info (struct cgraph_node *node)
>>>>> +{
>>>>> +  char *buf;
>>>>> +  size_t buf_size;
>>>>> +  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 0);
>>>>> +
>>>>> +  if (!bfd_name)
>>>>> +    bfd_name = "unknown";
>>>>> +
>>>>> +  buf_size = strlen (bfd_name) + 1;
>>>>> +  if (profile_info)
>>>>> +    buf_size += (MAX_INT_LENGTH + 3);
>>>>> +  buf_size += MAX_INT_LENGTH;
>>>>> +
>>>>> +  buf = (char *) xmalloc (buf_size);
>>>>> +
>>>>> +  strcpy (buf, bfd_name);
>>>>> +  //sprintf (buf, "%s/%i", buf, node->symbol.order);
>>>>> +  sprintf (buf, "%s [%i]", buf, node->symbol.order);
>>>>> +
>>>>> +  if (profile_info)
>>>>> +    sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, node->count);
>>>>> +  return buf;
>>>>> +}
>>>>> +
>>>>> +
>>>>> +/* Return CALLER's inlined call chain. Save the cgraph_node of the ultimate
>>>>> +   function that the caller is inlined to in FINAL_CALLER.  */
>>>>> +
>>>>> +static const char *
>>>>> +cgraph_node_call_chain (struct cgraph_node *caller,
>>>>> +                       struct cgraph_node **final_caller)
>>>>> +{
>>>>> +  struct cgraph_node *node;
>>>>> +  const char *via_str = " (via inline instance";
>>>>> +  size_t current_string_len = strlen (via_str) + 1;
>>>>> +  size_t buf_size = current_string_len;
>>>>> +  char *buf = (char *) xmalloc (buf_size);
>>>>> +
>>>>> +  buf[0] = 0;
>>>>> +  gcc_assert (caller->global.inlined_to != NULL);
>>>>> +  strcat (buf, via_str);
>>>>> +  for (node = caller; node->global.inlined_to != NULL;
>>>>> +       node = node->callers->caller)
>>>>> +    {
>>>>> +      const char *name = cgraph_node_opt_info (node);
>>>>> +      current_string_len += (strlen (name) + 1);
>>>>> +      if (current_string_len >= buf_size)
>>>>> +       {
>>>>> +         buf_size = current_string_len * 2;
>>>>> +         buf = (char *) xrealloc (buf, buf_size);
>>>>> +       }
>>>>> +      strcat (buf, " ");
>>>>> +      strcat (buf, name);
>>>>> +    }
>>>>> +  strcat (buf, ")");
>>>>> +  *final_caller = node;
>>>>> +  return buf;
>>>>> +}
>>>>> +
>>>>> +
>>>>> +/* Dump the inline decision of EDGE.  */
>>>>> +
>>>>> +static void
>>>>> +dump_inline_decision (struct cgraph_edge *edge, bool early)
>>>>> +{
>>>>> +  location_t locus;
>>>>> +  const char *inline_chain_text;
>>>>> +  const char *call_count_text;
>>>>> +  struct cgraph_node *final_caller = edge->caller;
>>>>> +
>>>>> +  if (final_caller->global.inlined_to != NULL)
>>>>> +    inline_chain_text = cgraph_node_call_chain (final_caller, &final_caller);
>>>>> +  else
>>>>> +    inline_chain_text = "";
>>>>> +
>>>>> +  if (edge->count > 0)
>>>>> +    {
>>>>> +      const char *call_count_str = " with call count ";
>>>>> +      char *buf = (char *) xmalloc (strlen (call_count_str) + MAX_INT_LENGTH);
>>>>> +      sprintf (buf, "%s"HOST_WIDEST_INT_PRINT_DEC, call_count_str,
>>>>> +              edge->count);
>>>>> +      call_count_text = buf;
>>>>> +    }
>>>>> +  else
>>>>> +    {
>>>>> +      call_count_text = "";
>>>>> +    }
>>>>> +
>>>>> +  locus = gimple_location (edge->call_stmt);
>>>>> +  dump_printf_loc (early ? MSG_NOTE : MSG_OPTIMIZED_LOCATIONS,
>>>>> +                   locus,
>>>>> +                   "%s inlined into %s%s%s\n",
>>>>> +                   cgraph_node_opt_info (edge->callee),
>>>>> +                   cgraph_node_opt_info (final_caller),
>>>>> +                   call_count_text,
>>>>> +                   inline_chain_text);
>>>>> +}
>>>>> +
>>>>> +
>>>>>  /* Mark edge E as inlined and update callgraph accordingly.  UPDATE_ORIGINAL
>>>>>     specify whether profile of original function should be updated.  If any new
>>>>>     indirect edges are discovered in the process, add them to NEW_EDGES, unless
>>>>> @@ -205,7 +310,8 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>>>>  bool
>>>>>  inline_call (struct cgraph_edge *e, bool update_original,
>>>>>              vec<cgraph_edge_p> *new_edges,
>>>>> -            int *overall_size, bool update_overall_summary)
>>>>> +            int *overall_size, bool update_overall_summary,
>>>>> +             bool early)
>>>>>  {
>>>>>    int old_size = 0, new_size = 0;
>>>>>    struct cgraph_node *to = NULL;
>>>>> @@ -218,6 +324,9 @@ inline_call (struct cgraph_edge *e, bool update_or
>>>>>    bool predicated = inline_edge_summary (e)->predicate != NULL;
>>>>>  #endif
>>>>>
>>>>> +  if (dump_enabled_p ())
>>>>> +    dump_inline_decision (e, early);
>>>>> +
>>>>>    /* Don't inline inlined edges.  */
>>>>>    gcc_assert (e->inline_failed);
>>>>>    /* Don't even think of inlining inline clone.  */
>>>>> Index: doc/invoke.texi
>>>>> ===================================================================
>>>>> --- doc/invoke.texi     (revision 201461)
>>>>> +++ doc/invoke.texi     (working copy)
>>>>> @@ -6234,6 +6234,9 @@ Enable dumps from all loop optimizations.
>>>>>  Enable dumps from all inlining optimizations.
>>>>>  @item vec
>>>>>  Enable dumps from all vectorization optimizations.
>>>>> +@item optall
>>>>> +Enable dumps from all optimizations. This is a superset of
>>>>> +the optimization groups listed above.
>>>>>  @end table
>>>>>
>>>>>  For example,
>>>>> Index: profile.c
>>>>> ===================================================================
>>>>> --- profile.c   (revision 201461)
>>>>> +++ profile.c   (working copy)
>>>>> @@ -432,8 +432,8 @@ read_profile_edge_counts (gcov_type *exec_counts)
>>>>>                     if (flag_profile_correction)
>>>>>                       {
>>>>>                         static bool informed = 0;
>>>>> -                       if (!informed)
>>>>> -                         inform (input_location,
>>>>> +                       if (dump_enabled_p () && !informed)
>>>>> +                         dump_printf_loc (MSG_NOTE, input_location,
>>>>>                                   "corrupted profile info: edge count exceeds maximal count");
>>>>>                         informed = 1;
>>>>>                       }
>>>>> @@ -692,10 +692,11 @@ compute_branch_probabilities (unsigned cfg_checksu
>>>>>         {
>>>>>           /* Inconsistency detected. Make it flow-consistent. */
>>>>>           static int informed = 0;
>>>>> -         if (informed == 0)
>>>>> +         if (dump_enabled_p () && informed == 0)
>>>>>             {
>>>>>               informed = 1;
>>>>> -             inform (input_location, "correcting inconsistent profile data");
>>>>> +             dump_printf_loc (MSG_NOTE, input_location,
>>>>> +                              "correcting inconsistent profile data");
>>>>>             }
>>>>>           correct_negative_edge_counts ();
>>>>>           /* Set bb counts to the sum of the outgoing edge counts */
>>>>> Index: passes.c
>>>>> ===================================================================
>>>>> --- passes.c    (revision 201461)
>>>>> +++ passes.c    (working copy)
>>>>> @@ -524,6 +524,11 @@ pass_manager::register_one_dump_file (struct opt_p
>>>>>    flag_name = concat (prefix, name, num, NULL);
>>>>>    glob_name = concat (prefix, name, NULL);
>>>>>    optgroup_flags |= pass->optinfo_flags;
>>>>> +  /* For any passes that do not have an optgroup set, and which are not
>>>>> +     IPA passes setup above, set the optgroup to OPTGROUP_OTHER so that
>>>>> +     any dump messages are emitted properly under -fopt-info(-optall).  */
>>>>> +  if (optgroup_flags == OPTGROUP_NONE)
>>>>> +    optgroup_flags = OPTGROUP_OTHER;
>>>>>    id = dump_register (dot_name, flag_name, glob_name, flags, optgroup_flags);
>>>>>    set_pass_for_id (id, pass);
>>>>>    full_name = concat (prefix, pass->name, num, NULL);
>>>>> Index: value-prof.c
>>>>> ===================================================================
>>>>> --- value-prof.c        (revision 201461)
>>>>> +++ value-prof.c        (working copy)
>>>>> @@ -585,9 +585,11 @@ check_counter (gimple stmt, const char * name,
>>>>>                : DECL_SOURCE_LOCATION (current_function_decl);
>>>>>        if (flag_profile_correction)
>>>>>          {
>>>>> -         inform (locus, "correcting inconsistent value profile: "
>>>>> -                 "%s profiler overall count (%d) does not match BB count "
>>>>> -                  "(%d)", name, (int)*all, (int)bb_count);
>>>>> +          if (dump_enabled_p ())
>>>>> +            dump_printf_loc (MSG_MISSED_OPTIMIZATION, locus,
>>>>> +                             "correcting inconsistent value profile: %s "
>>>>> +                             "profiler overall count (%d) does not match BB "
>>>>> +                             "count (%d)", name, (int)*all, (int)bb_count);
>>>>>           *all = bb_count;
>>>>>           if (*count > *all)
>>>>>              *count = *all;
>>>>> @@ -1209,9 +1211,11 @@ find_func_by_funcdef_no (int func_id)
>>>>>    int max_id = get_last_funcdef_no ();
>>>>>    if (func_id >= max_id || cgraph_node_map[func_id] == NULL)
>>>>>      {
>>>>> -      if (flag_profile_correction)
>>>>> -        inform (DECL_SOURCE_LOCATION (current_function_decl),
>>>>> -                "Inconsistent profile: indirect call target (%d) does not exist", func_id);
>>>>> +      if (flag_profile_correction && dump_enabled_p ())
>>>>> +        dump_printf_loc (MSG_MISSED_OPTIMIZATION,
>>>>> +                         DECL_SOURCE_LOCATION (current_function_decl),
>>>>> +                         "Inconsistent profile: indirect call target (%d) "
>>>>> +                         "does not exist", func_id);
>>>>>        else
>>>>>          error ("Inconsistent profile: indirect call target (%d) does not exist", func_id);
>>>>>
>>>>> @@ -1235,8 +1239,10 @@ check_ic_target (gimple call_stmt, struct cgraph_n
>>>>>       return true;
>>>>>
>>>>>     locus =  gimple_location (call_stmt);
>>>>> -   inform (locus, "Skipping target %s with mismatching types for icall ",
>>>>> -           cgraph_node_name (target));
>>>>> +   if (dump_enabled_p ())
>>>>> +     dump_printf_loc (MSG_MISSED_OPTIMIZATION, locus,
>>>>> +                      "Skipping target %s with mismatching types for icall ",
>>>>> +                      cgraph_node_name (target));
>>>>>     return false;
>>>>>  }
>>>>>
>>>>> Index: coverage.c
>>>>> ===================================================================
>>>>> --- coverage.c  (revision 201461)
>>>>> +++ coverage.c  (working copy)
>>>>> @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
>>>>>  #include "langhooks.h"
>>>>>  #include "hash-table.h"
>>>>>  #include "tree-iterator.h"
>>>>> +#include "tree-pass.h"
>>>>>  #include "cgraph.h"
>>>>>  #include "dumpfile.h"
>>>>>  #include "diagnostic-core.h"
>>>>> @@ -341,11 +342,13 @@ get_coverage_counts (unsigned counter, unsigned ex
>>>>>      {
>>>>>        static int warned = 0;
>>>>>
>>>>> -      if (!warned++)
>>>>> -       inform (input_location, (flag_guess_branch_prob
>>>>> -                ? "file %s not found, execution counts estimated"
>>>>> -                : "file %s not found, execution counts assumed to be zero"),
>>>>> -               da_file_name);
>>>>> +      if (!warned++ && dump_enabled_p ())
>>>>> +       dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
>>>>> +                         (flag_guess_branch_prob
>>>>> +                          ? "file %s not found, execution counts estimated"
>>>>> +                          : "file %s not found, execution counts assumed to "
>>>>> +                            "be zero"),
>>>>> +                         da_file_name);
>>>>>        return NULL;
>>>>>      }
>>>>>
>>>>> @@ -369,21 +372,25 @@ get_coverage_counts (unsigned counter, unsigned ex
>>>>>         warning_at (input_location, OPT_Wcoverage_mismatch,
>>>>>                     "the control flow of function %qE does not match "
>>>>>                     "its profile data (counter %qs)", id, ctr_names[counter]);
>>>>> -      if (warning_printed)
>>>>> +      if (warning_printed && dump_enabled_p ())
>>>>>         {
>>>>> -        inform (input_location, "use -Wno-error=coverage-mismatch to tolerate "
>>>>> -                "the mismatch but performance may drop if the function is hot");
>>>>> +          dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
>>>>> +                           "use -Wno-error=coverage-mismatch to tolerate "
>>>>> +                           "the mismatch but performance may drop if the "
>>>>> +                           "function is hot");
>>>>>
>>>>>           if (!seen_error ()
>>>>>               && !warned++)
>>>>>             {
>>>>> -             inform (input_location, "coverage mismatch ignored");
>>>>> -             inform (input_location, flag_guess_branch_prob
>>>>> -                     ? G_("execution counts estimated")
>>>>> -                     : G_("execution counts assumed to be zero"));
>>>>> +             dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
>>>>> +                               "coverage mismatch ignored");
>>>>> +             dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
>>>>> +                               flag_guess_branch_prob
>>>>> +                               ? G_("execution counts estimated")
>>>>> +                               : G_("execution counts assumed to be zero"));
>>>>>               if (!flag_guess_branch_prob)
>>>>> -               inform (input_location,
>>>>> -                       "this can result in poorly optimized code");
>>>>> +               dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
>>>>> +                                 "this can result in poorly optimized code");
>>>>>             }
>>>>>         }
>>>>>
>>>>> @@ -1103,6 +1110,11 @@ coverage_init (const char *filename)
>>>>>    int len = strlen (filename);
>>>>>    int prefix_len = 0;
>>>>>
>>>>> +  /* Since coverage_init is invoked very early, before the pass
>>>>> +     manager, we need to set up the dumping explicitly. This is
>>>>> +     similar to the handling in finish_optimization_passes.  */
>>>>> +  dump_start (pass_profile.pass.static_pass_number, NULL);
>>>>> +
>>>>>    if (!profile_data_prefix && !IS_ABSOLUTE_PATH (filename))
>>>>>      profile_data_prefix = getpwd ();
>>>>>
>>>>> @@ -1145,6 +1157,8 @@ coverage_init (const char *filename)
>>>>>           gcov_write_unsigned (bbg_file_stamp);
>>>>>         }
>>>>>      }
>>>>> +
>>>>> +  dump_finish (pass_profile.pass.static_pass_number);
>>>>>  }
>>>>>
>>>>>  /* Performs file-level cleanup.  Close notes file, generate coverage
>>>>> Index: ipa-inline.c
>>>>> ===================================================================
>>>>> --- ipa-inline.c        (revision 201461)
>>>>> +++ ipa-inline.c        (working copy)
>>>>> @@ -1322,7 +1322,7 @@ recursive_inlining (struct cgraph_edge *edge,
>>>>>            reset_edge_growth_cache (curr);
>>>>>         }
>>>>>
>>>>> -      inline_call (curr, false, new_edges, &overall_size, true);
>>>>> +      inline_call (curr, false, new_edges, &overall_size, true, false);
>>>>>        lookup_recursive_calls (node, curr->callee, heap);
>>>>>        n++;
>>>>>      }
>>>>> @@ -1612,7 +1612,8 @@ inline_small_functions (void)
>>>>>             fprintf (dump_file, " Peeling recursion with depth %i\n", depth);
>>>>>
>>>>>           gcc_checking_assert (!callee->global.inlined_to);
>>>>> -         inline_call (edge, true, &new_indirect_edges, &overall_size, true);
>>>>> +         inline_call (edge, true, &new_indirect_edges, &overall_size, true,
>>>>> +                       false);
>>>>>           if (flag_indirect_inlining)
>>>>>             add_new_edges_to_heap (edge_heap, new_indirect_edges);
>>>>>
>>>>> @@ -1733,7 +1734,7 @@ flatten_function (struct cgraph_node *node, bool e
>>>>>                  xstrdup (cgraph_node_name (callee)),
>>>>>                  xstrdup (cgraph_node_name (e->caller)));
>>>>>        orig_callee = callee;
>>>>> -      inline_call (e, true, NULL, NULL, false);
>>>>> +      inline_call (e, true, NULL, NULL, false, early);
>>>>>        if (e->callee != orig_callee)
>>>>>         orig_callee->symbol.aux = (void *) node;
>>>>>        flatten_function (e->callee, early);
>>>>> @@ -1852,7 +1853,8 @@ ipa_inline (void)
>>>>>                                    inline_summary (node->callers->caller)->size);
>>>>>                         }
>>>>>
>>>>> -                     inline_call (node->callers, true, NULL, NULL, true);
>>>>> +                     inline_call (node->callers, true, NULL, NULL, true,
>>>>> +                                   false);
>>>>>                       if (dump_file)
>>>>>                         fprintf (dump_file,
>>>>>                                  " Inlined into %s which now has %i size\n",
>>>>> @@ -1925,7 +1927,7 @@ inline_always_inline_functions (struct cgraph_node
>>>>>         fprintf (dump_file, "  Inlining %s into %s (always_inline).\n",
>>>>>                  xstrdup (cgraph_node_name (e->callee)),
>>>>>                  xstrdup (cgraph_node_name (e->caller)));
>>>>> -      inline_call (e, true, NULL, NULL, false);
>>>>> +      inline_call (e, true, NULL, NULL, false, true);
>>>>>        inlined = true;
>>>>>      }
>>>>>    if (inlined)
>>>>> @@ -1977,7 +1979,7 @@ early_inline_small_functions (struct cgraph_node *
>>>>>         fprintf (dump_file, " Inlining %s into %s.\n",
>>>>>                  xstrdup (cgraph_node_name (callee)),
>>>>>                  xstrdup (cgraph_node_name (e->caller)));
>>>>> -      inline_call (e, true, NULL, NULL, true);
>>>>> +      inline_call (e, true, NULL, NULL, true, true);
>>>>>        inlined = true;
>>>>>      }
>>>>>
>>>>> Index: ipa-inline.h
>>>>> ===================================================================
>>>>> --- ipa-inline.h        (revision 201461)
>>>>> +++ ipa-inline.h        (working copy)
>>>>> @@ -228,7 +228,8 @@ void free_growth_caches (void);
>>>>>  void compute_inline_parameters (struct cgraph_node *, bool);
>>>>>
>>>>>  /* In ipa-inline-transform.c  */
>>>>> -bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, int *, bool);
>>>>> +bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, int *,
>>>>> +                  bool, bool);
>>>>>  unsigned int inline_transform (struct cgraph_node *);
>>>>>  void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *);
>>>>>
>>>>> Index: testsuite/gcc.dg/pr40209.c
>>>>> ===================================================================
>>>>> --- testsuite/gcc.dg/pr40209.c  (revision 201461)
>>>>> +++ testsuite/gcc.dg/pr40209.c  (working copy)
>>>>> @@ -1,5 +1,5 @@
>>>>>  /* { dg-do compile } */
>>>>> -/* { dg-options "-O2 -fprofile-use" } */
>>>>> +/* { dg-options "-O2 -fprofile-use -fopt-info" } */
>>>>>
>>>>>  void process(const char *s);
>>>>>
>>>>> Index: testsuite/gcc.dg/pr26570.c
>>>>> ===================================================================
>>>>> --- testsuite/gcc.dg/pr26570.c  (revision 201461)
>>>>> +++ testsuite/gcc.dg/pr26570.c  (working copy)
>>>>> @@ -1,5 +1,5 @@
>>>>>  /* { dg-do compile } */
>>>>> -/* { dg-options "-O2 -fprofile-generate -fprofile-use" } */
>>>>> +/* { dg-options "-O2 -fprofile-generate -fprofile-use -fopt-info" } */
>>>>>
>>>>>  unsigned test (unsigned a, unsigned b)
>>>>>  {
>>>>> Index: testsuite/gcc.dg/pr32773.c
>>>>> ===================================================================
>>>>> --- testsuite/gcc.dg/pr32773.c  (revision 201461)
>>>>> +++ testsuite/gcc.dg/pr32773.c  (working copy)
>>>>> @@ -1,6 +1,6 @@
>>>>>  /* { dg-do compile } */
>>>>> -/* { dg-options "-O -fprofile-use" } */
>>>>> -/* { dg-options "-O -m4 -fprofile-use" { target sh-*-* } } */
>>>>> +/* { dg-options "-O -fprofile-use -fopt-info" } */
>>>>> +/* { dg-options "-O -m4 -fprofile-use -fopt-info" { target sh-*-* } } */
>>>>>
>>>>>  void foo (int *p)
>>>>>  {
>>>>> Index: testsuite/g++.dg/tree-ssa/dom-invalid.C
>>>>> ===================================================================
>>>>> --- testsuite/g++.dg/tree-ssa/dom-invalid.C     (revision 201461)
>>>>> +++ testsuite/g++.dg/tree-ssa/dom-invalid.C     (working copy)
>>>>> @@ -1,7 +1,7 @@
>>>>>  // PR tree-optimization/39557
>>>>>  // invalid post-dom info leads to infinite loop
>>>>>  // { dg-do run }
>>>>> -// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fno-rtti" }
>>>>> +// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fopt-info -fno-rtti" }
>>>>>
>>>>>  struct C
>>>>>  {
>>>>> Index: testsuite/gcc.dg/inline-dump.c
>>>>> ===================================================================
>>>>> --- testsuite/gcc.dg/inline-dump.c      (revision 0)
>>>>> +++ testsuite/gcc.dg/inline-dump.c      (revision 0)
>>>>> @@ -0,0 +1,11 @@
>>>>> +/* Verify that -fopt-info can output correct inline info.  */
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Wall -fopt-info-inline=stderr -O2 -fno-early-inlining" } */
>>>>> +static inline int leaf() {
>>>>> +  int i, ret = 0;
>>>>> +  for (i = 0; i < 10; i++)
>>>>> +    ret += i;
>>>>> +  return ret;
>>>>> +}
>>>>> +static inline int foo(void) { return leaf(); } /* { dg-message "note: leaf .*inlined into bar .*via inline instance foo.*\n" } */
>>>>> +int bar(void) { return foo(); }
>>>>>>
>>>>>> Thanks,
>>>>>> Teresa
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Martin
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413


