This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Help with rich_location and GIMPLE stmts
On 05/19/2017 02:14 PM, Marek Polacek wrote:
> On Thu, May 18, 2017 at 01:22:02PM +0200, Martin Liška wrote:
>> On 05/16/2017 09:14 PM, David Malcolm wrote:
>>> On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
>>>> Hi.
>>>>
>>>> I sent this email to David some time ago, but it should be probably
>>>> answered
>>>> on gcc mailing list.
>>>
>>>> I have idea one to improve gcov tool and I'm interested in more
>>>> precise locations for gimple
>>>> statements. For gcov purpose, we dump location in ipa-profile pass,
>>>> which is an early IPA
>>>> pass and this data is used by gcov tool to map statements (blocks) to
>>>> lines of code.
>>>>
>>>> I did a small experiment on the place we emit the location data:
>>>> inform (gimple_location (stmt), "output_location");
>>>>
>>>> and it shows for:
>>>> $ cat m2.c
>>>> unsigned int
>>>> UuT (void)
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
>>>> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
>>>> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
>>>> return ret; }
>>>>
>>>> int
>>>> main (int argc, char **argv)
>>>> {
>>>> UuT ();
>>>> return 0;
>>>> }
>>>>
>>>> $ gcc --coverage m2.c
>>>> m2.c: In function ‘main’:
>>>> m2.c:8:3: note: output_location
>>>> UuT ();
>>>> ^~~~~~
>>>> # .MEM_2 = VDEF <.MEM_1(D)>
>>>> UuT ();
>>>> m2.c:9:10: note: output_location
>>>> return 0;
>>>> ^
>>>> _3 = 0;
>>>> m2.c: In function ‘UuT’:
>>>> m2.c:3:16: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>> ^~~~~~~~
>>>> true_var_3 = 1;
>>>> m2.c:3:43: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>> ^~~~~~~~~
>>>> false_var_4 = 0;
>>>> m2.c:3:71: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>> ^~~
>>>> ret_5 = 0;
>>>> m2.c:3:83: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>> ^
>>>> if (true_var_3 != 0)
>>>> m2.c:3:114: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>> ^
>>>> if (false_var_4 != 0)
>>>> m2.c:3:145: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>>
>>>> ~~~~^~~~~
>>>> ret_7 = 111;
>>>> m2.c:3:182: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>>
>>>> ~~~~^~~~~
>>>> ret_6 = 999;
>>>> m2.c:3:215: note: output_location
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>
>>>>
>>>>
>>>> ^~~
>>>> _8 = ret_2;
>>>> m2.c:3:215: note: output_location
>>>> # VUSE <.MEM_9(D)>
>>>> return _8;
>>>>
>>>> Which is not optimal, for some assignments I see just LHS
>>>> (false_var_4 = 0),
>
> Note that
>
> unsigned int false_var = 0;
>
> is not an assignment-expression, it's an initialization. Only the
> '0' here is parsed as an assignment-expression, but in this case
> set_c_expr_source_range isn't called.
Hello.
Yes, I noticed that it's not called.
>
>>>
>>> My first though was: are there assignments for which this isn't the
>>> case? The only one I see is the:
>>> ret = 999;
>>> ~~~~^~~~~
>>>
>>> Are the locations for these assignments coming through from the
>>> frontend?
>>
>> Hi.
>>
>> Actually not all, the default assignments are created in gimplifier and
>> location is assigned from DECL_EXPR:
>>
>> (gdb) p debug_tree(*expr_p)
>> <decl_expr 0x7ffff6988c80
>> type <void_type 0x7ffff6878f18 void VOID
>> align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18
>> pointer_to_this <pointer_type 0x7ffff68800a8>>
>> side-effects
>> arg 0 <var_decl 0x7ffff7f9ae10 true_var
>> type <integer_type 0x7ffff6878690 unsigned int public unsigned SI
>> size <integer_cst 0x7ffff6860f18 constant 32>
>> unit size <integer_cst 0x7ffff6860f30 constant 4>
>> align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst 0x7ffff6860f00 4294967295>
>> pointer_to_this <pointer_type 0x7ffff6885dc8>>
>> used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>> align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff698b258 1>
>> chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type 0x7ffff6878690 unsigned int>
>> used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>> align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>>
>> /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23>
>>
>> That explains why only LHS of these assignments is selected.
>>
>>>
>>> I believe that in the C frontend these are assignment-expression, and
>>> hence handled by c_parser_expr_no_commas; in particular the use of
>>> op_location and the call:
>>>
>>> set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
>>>
>>> ought to be setting up the caret of the assignment to be on the
>>> operator token, and for the start/finish to range from the start of the
>>> lhs to the end of the rhs i.e. what we see for:
>>>
>>> ret = 999;
>>> ~~~~^~~~~
>>
>> Yep, MODIFY_EXPRs created in FE go this way and it's fine.
>>
>>>
>>>
>>>> for return statements only a returned value is displayed.
>>>
>>> Is this running on SSA form? If so, I wonder if you're running into
>>> something like this:
>>>
>>> retval_N = PHI <lots of values>;
>>> return retval_N;
>>>
>>> where it's using the location of that "return retval_N;" for all of the
>>> return statements in the function, rather than the individual source
>>> locations.
>>
>> Yep, but we properly assign each assignment to a SSA name that's going to
>> be merged in exit BB by PHI node:
>>
>> _8 = ret_2;
>> /tmp/m2.c:7:8: note: output_location
>> return ret; }
>> ^~~
>>
>> Here the location comes from c_finish_return function where location
>> comes from a value that's returned.
>>
>>>
>>>> For conditions, only condition beginning is showed.
>>>> Is this known behavior or do I miss
>>>> something?
>>>
>>> c_parser_if_statement has:
>>>
>>> loc = c_parser_peek_token (parser)->location;
>>>
>>> which is that of the open-paren. Maybe we should build a location
>>> covering the range of the "if ( expression )" part of the if-statement?
>>
>> Adding Marek as C FE maintainer to reply the question.
>
> I suppose we could do better and I'd probably highlight just the expression
> part of "if ( expression )". But not sure how many use cases this range
> location would have.
Works for me. I guess it can take some time to improve locations of GIMPLE expressions.
Anyhow, I can start enhancing gcov tool even with current locations. Having that, we can
provide users following kind of information:
1|int b, c, d, e;
2|
3|int main()
4|{
^1
5| int a = b < 1 ? (c < 3 ? d : c) : e;
^1 ^1 ^0 ^0
6| return a;
7|}
Where '^0' means the block (statements) are not executed.
Martin
>
> Marek
>