This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Help with rich_location and GIMPLE stmts


On Thu, May 18, 2017 at 01:22:02PM +0200, Martin Liška wrote:
> On 05/16/2017 09:14 PM, David Malcolm wrote:
> > On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
> >> Hi.
> >>
> >> I sent this email to David some time ago, but it should be probably
> >> answered
> >> on gcc mailing list.
> > 
> >> I have idea one to improve gcov tool and I'm interested in more
> >> precise locations for gimple
> >> statements. For gcov purpose, we dump location in ipa-profile pass,
> >> which is an early IPA
> >> pass and this data is used by gcov tool to map statements (blocks) to
> >> lines of code.
> >>
> >> I did a small experiment on the place we emit the location data:
> >> 		  inform (gimple_location (stmt), "output_location");
> >>
> >> and it shows for:
> >> $ cat m2.c
> >> unsigned int
> >> UuT (void)
> >> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
> >> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
> >> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
> >> return ret; }
> >>
> >> int
> >> main (int argc, char **argv)
> >> {
> >>   UuT ();
> >>   return 0;
> >> }
> >>
> >> $ gcc --coverage m2.c
> >> m2.c: In function ‘main’:
> >> m2.c:8:3: note: output_location
> >>    UuT ();
> >>    ^~~~~~
> >> # .MEM_2 = VDEF <.MEM_1(D)>
> >> UuT ();
> >> m2.c:9:10: note: output_location
> >>    return 0;
> >>           ^
> >> _3 = 0;
> >> m2.c: In function ‘UuT’:
> >> m2.c:3:16: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                 ^~~~~~~~
> >> true_var_3 = 1;
> >> m2.c:3:43: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                            ^~~~~~~~~
> >> false_var_4 = 0;
> >> m2.c:3:71: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>   ^~~
> >> ret_5 = 0;
> >> m2.c:3:83: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>               ^
> >> if (true_var_3 != 0)
> >> m2.c:3:114: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                              ^
> >> if (false_var_4 != 0)
> >> m2.c:3:145: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>    ~~~~^~~~~
> >> ret_7 = 111;
> >> m2.c:3:182: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>                                         ~~~~^~~~~
> >> ret_6 = 999;
> >> m2.c:3:215: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>                                                                      
> >>         ^~~
> >> _8 = ret_2;
> >> m2.c:3:215: note: output_location
> >> # VUSE <.MEM_9(D)>
> >> return _8;
> >>
> >> Which is not optimal, for some assignments I see just LHS 
> >> (false_var_4 = 0),

Note that

  unsigned int false_var = 0;

is not an assignment-expression, it's an initialization.  Only the
'0' here is parsed as an assignment-expression, but in this case
set_c_expr_source_range isn't called.

> > 
> > My first though was: are there assignments for which this isn't the
> > case?  The only one I see is the:
> >   ret = 999;
> >   ~~~~^~~~~
> > 
> > Are the locations for these assignments coming through from the
> > frontend?
> 
> Hi.
> 
> Actually not all, the default assignments are created in gimplifier and 
> location is assigned from DECL_EXPR:
> 
> (gdb) p debug_tree(*expr_p)
>  <decl_expr 0x7ffff6988c80
>     type <void_type 0x7ffff6878f18 void VOID
>         align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18
>         pointer_to_this <pointer_type 0x7ffff68800a8>>
>     side-effects
>     arg 0 <var_decl 0x7ffff7f9ae10 true_var
>         type <integer_type 0x7ffff6878690 unsigned int public unsigned SI
>             size <integer_cst 0x7ffff6860f18 constant 32>
>             unit size <integer_cst 0x7ffff6860f30 constant 4>
>             align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst 0x7ffff6860f00 4294967295>
>             pointer_to_this <pointer_type 0x7ffff6885dc8>>
>         used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>         align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff698b258 1>
>         chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type 0x7ffff6878690 unsigned int>
>             used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>             align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>>
>     /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23>
> 
> That explains why only LHS of these assignments is selected.
> 
> > 
> > I believe that in the C frontend these are assignment-expression, and
> > hence handled by c_parser_expr_no_commas; in particular the use of
> > op_location and the call:
> > 
> >   set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
> > 
> > ought to be setting up the caret of the assignment to be on the
> > operator token, and for the start/finish to range from the start of the
> > lhs to the end of the rhs i.e. what we see for:
> > 
> >   ret = 999;
> >   ~~~~^~~~~
> 
> Yep, MODIFY_EXPRs created in FE go this way and it's fine.
> 
> > 
> > 
> >> for return statements only a returned value is displayed. 
> > 
> > Is this running on SSA form?  If so, I wonder if you're running into
> > something like this:
> > 
> >   retval_N = PHI <lots of values>;
> >   return retval_N;
> > 
> > where it's using the location of that "return retval_N;" for all of the
> > return statements in the function, rather than the individual source
> > locations.
> 
> Yep, but we properly assign each assignment to a SSA name that's going to
> be merged in exit BB by PHI node:
> 
> _8 = ret_2;
> /tmp/m2.c:7:8: note: output_location
>  return ret; }
>         ^~~
> 
> Here the location comes from c_finish_return function where location
> comes from a value that's returned.
> 
> > 
> >> For conditions, only condition beginning is showed.
> >> Is this known behavior or do I miss
> >> something?
> > 
> > c_parser_if_statement has:
> > 
> >   loc = c_parser_peek_token (parser)->location;
> > 
> > which is that of the open-paren.  Maybe we should build a location
> > covering the range of the "if ( expression )" part of the if-statement?
> 
> Adding Marek as C FE maintainer to reply the question.

I suppose we could do better and I'd probably highlight just the expression
part of "if ( expression )".  But not sure how many use cases this range
location would have.

	Marek


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]