[PATCH] gcov: Mark BBs that do not correspond to a line in source code (PR gcov-profile/79891).

Martin Liška mliska@suse.cz
Tue Mar 14 12:14:00 GMT 2017


On 03/14/2017 12:55 PM, Martin Liška wrote:
> On 03/14/2017 11:48 AM, Richard Biener wrote:
>> On Tue, 14 Mar 2017, Martin Liška wrote:
>>
>>> On 03/14/2017 11:30 AM, Richard Biener wrote:
>>>> On Tue, 14 Mar 2017, Martin Liška wrote:
>>>>
>>>>> On 03/14/2017 11:13 AM, Richard Biener wrote:
>>>>>> On Tue, 14 Mar 2017, Martin Liška wrote:
>>>>>>
>>>>>>> On 03/14/2017 10:12 AM, Richard Biener wrote:
>>>>>>>> On Tue, 14 Mar 2017, Martin Liška wrote:
>>>>>>>>
>>>>>>>>> On 03/14/2017 09:13 AM, Richard Biener wrote:
>>>>>>>>>> On Tue, 14 Mar 2017, Martin Liška wrote:
>>>>>>>>>>
>>>>>>>>>>> On 03/13/2017 04:16 PM, Richard Biener wrote:
>>>>>>>>>>>> On Mon, 13 Mar 2017, Martin Liška wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 03/13/2017 02:53 PM, Richard Biener wrote:
>>>>>>>>>>>>>> On Mon, 13 Mar 2017, Martin Liška wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 03/13/2017 02:01 PM, Richard Biener wrote:
>>>>>>>>>>>>>>>> On Mon, 13 Mar 2017, Richard Biener wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, 10 Mar 2017, Martin Liška wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As briefly discussed in the PR, there are BB that do not correspond to a real
>>>>>>>>>>>>>>>>>> line in source
>>>>>>>>>>>>>>>>>> code. profile.c emits locations for all BBs that have a gimple statement
>>>>>>>>>>>>>>>>>> belonging to a line.
>>>>>>>>>>>>>>>>>> I hope these should be marked in gcov utility and not added in --all-block
>>>>>>>>>>>>>>>>>> mode to counts of lines.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Patch survives make check RUNTESTFLAGS="gcov.exp".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for review and feedback.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Humm, the patch doesn't seem to change the BBs associated with a line
>>>>>>>>>>>>>>>>> but rather somehow changes how we compute counts (the patch lacks a
>>>>>>>>>>>>>>>>> description of how you arrived at it).  I expected the line-to-BB
>>>>>>>>>>>>>>>>> assignment process to be changed (whereever that is...).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, each basic block must belong to a source line. It's how gcov
>>>>>>>>>>>>>>> iterates all blocks (via lines).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ah, ok, looking at where output_location is called on I see we do not
>>>>>>>>>>>>>>>> assign any line to that block.  But still why does
>>>>>>>>>>>>>>>> line->has_block (arc->src) return true?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Good objection! Problematic that  4->5 edge really comes from the same line.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   <bb 4> [0.00%]:
>>>>>>>>>>>>>>>   ret_7 = 111;
>>>>>>>>>>>>>>>   PROF_edge_counter_10 = __gcov0.UuT[0];
>>>>>>>>>>>>>>>   PROF_edge_counter_11 = PROF_edge_counter_10 + 1;
>>>>>>>>>>>>>>>   __gcov0.UuT[0] = PROF_edge_counter_11;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   <bb 5> [0.00%]:
>>>>>>>>>>>>>>>   # ret_1 = PHI <ret_5(3), ret_7(4)>
>>>>>>>>>>>>>>>   goto <bb 7>; [0.00%]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, but that's basically meaningless, unless not all edges come from the
>>>>>>>>>>>>>> same line?  I see nowhere where we'd explicitely assign lines to
>>>>>>>>>>>>>> edges so it must be gcov "reconstructing" this somewhere.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's why I added the another flag. We stream locations for basic blocks via
>>>>>>>>>>>>> 'output_location' function. And assignment blocks to lines happens here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> static void
>>>>>>>>>>>>> add_line_counts (coverage_t *coverage, function_t *fn)
>>>>>>>>>>>>> {
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>       if (!ix || ix + 1 == fn->num_blocks)
>>>>>>>>>>>>> 	/* Entry or exit block */;
>>>>>>>>>>>>>       else if (flag_all_blocks)
>>>>>>>>>>>>> 	{
>>>>>>>>>>>>> 	  line_t *block_line = line;
>>>>>>>>>>>>>
>>>>>>>>>>>>> 	  if (!block_line)
>>>>>>>>>>>>> 	    block_line = &sources[fn->src].lines[fn->line];
>>>>>>>>>>>>>
>>>>>>>>>>>>> 	  block->chain = block_line->u.blocks;
>>>>>>>>>>>>> 	  block_line->u.blocks = block;
>>>>>>>>>>>>> 	}
>>>>>>>>>>>>>
>>>>>>>>>>>>> where line is always changes when we reach a BB that has a source line assigned. Thus it's changed
>>>>>>>>>>>>> for BB 4 and that's why BB 5 is added to same line.
>>>>>>>>>>>>
>>>>>>>>>>>> Ah, so this means we should "clear" the current line for BB 5 in
>>>>>>>>>>>> output_location?  At least I don't see how your patch may not regress
>>>>>>>>>>>> some cases where the line wasn't output as an optimization?
>>>>>>>>>>>
>>>>>>>>>>> The new flag on block is kind of clearing that the BB is artificial and in fact does not
>>>>>>>>>>> belong to the line. I didn't want to do a bigger refactoring how blocks are iterated via lines.
>>>>>>>>>>>
>>>>>>>>>>> Can you be please more specific about such a case?
>>>>>>>>>>
>>>>>>>>>> in profile.c I see
>>>>>>>>>>
>>>>>>>>>>   if (name_differs || line_differs)
>>>>>>>>>>     {
>>>>>>>>>>       if (!*offset)
>>>>>>>>>>         {
>>>>>>>>>>           *offset = gcov_write_tag (GCOV_TAG_LINES);
>>>>>>>>>>           gcov_write_unsigned (bb->index);
>>>>>>>>>>           name_differs = line_differs=true;
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> so if line_differs is false we might not output GCOV_TAG_LINES either
>>>>>>>>>> because 1) optimization, less stuff output, 2) the block has no line.
>>>>>>>>>> Looks like we can't really distinguish those.
>>>>>>>>>
>>>>>>>>> Agree with that.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Not sure how on the input side we end up associating a BB with
>>>>>>>>>> a line if nothing was output for it though.
>>>>>>>>>>
>>>>>>>>>> That is, with your change don't we need
>>>>>>>>>>
>>>>>>>>>> Index: gcc/profile.c
>>>>>>>>>> ===================================================================
>>>>>>>>>> --- gcc/profile.c       (revision 246082)
>>>>>>>>>> +++ gcc/profile.c       (working copy)
>>>>>>>>>> @@ -941,8 +941,6 @@ output_location (char const *file_name,
>>>>>>>>>>    name_differs = !prev_file_name || filename_cmp (file_name, 
>>>>>>>>>> prev_file_name);
>>>>>>>>>>    line_differs = prev_line != line;
>>>>>>>>>>  
>>>>>>>>>> -  if (name_differs || line_differs)
>>>>>>>>>> -    {
>>>>>>>>>>        if (!*offset)
>>>>>>>>>>         {
>>>>>>>>>>           *offset = gcov_write_tag (GCOV_TAG_LINES);
>>>>>>>>>> @@ -950,6 +948,9 @@ output_location (char const *file_name,
>>>>>>>>>>           name_differs = line_differs=true;
>>>>>>>>>>         }
>>>>>>>>>>  
>>>>>>>>>> +  if (name_differs || line_differs)
>>>>>>>>>> +    {
>>>>>>>>>> +
>>>>>>>>>>        /* If this is a new source file, then output the
>>>>>>>>>>          file's name to the .bb file.  */
>>>>>>>>>>        if (name_differs)
>>>>>>>>>>
>>>>>>>>>> to resolve this ambiguity?  That is, _always_ emit GCOV_TAG_LINES
>>>>>>>>>> for a BB?  So then a BB w/o GCOV_TAG_LINES does _not_ have any
>>>>>>>>>> lines associated.
>>>>>>>>>
>>>>>>>>> That should revolve it. Let me find and example where we do not emit
>>>>>>>>> GCOV_TAG_LINES jsut because there's not difference in lines.
>>>>>>>>
>>>>>>>> sth like
>>>>>>>>
>>>>>>>>  a = b < 1 ? (c < 3 ? d : c);
>>>>>>>>
>>>>>>>> or even
>>>>>>>>
>>>>>>>>  if (..) { ... } else { ... }
>>>>>>>
>>>>>>> These samples work, however your patch would break situations like:
>>>>>>>
>>>>>>>         1:   10:int main ()
>>>>>>>         -:   11:{
>>>>>>>         -:   12:  int i;
>>>>>>>         -:   13:
>>>>>>>        22:   14:  for (i = 0; i < 10; i++)	/* count(11) */
>>>>>>>        10:   15:    noop ();			/* count(10) */
>>>>>>>
>>>>>>> where 22 is summed as (1+10+11), which kind of makes sense as it contains
>>>>>>> of 3 statements.
>>>>>>
>>>>>> 22 is with my patch or without?  I think 22 makes no sense.
>>>>>>
>>>>>> Richard.
>>>>>
>>>>> With your patch.
>>>>
>>>> I see.  As said, I have zero (well, now some little ;)) knowledge
>>>> about gcov.
>>>
>>> :) I'll continue twiddling with that because even loop-less construct
>>> like:
>>>
>>>         1:    1:int foo(int b, int c, int d)
>>>         -:    2:{
>>>         5:    3:  int a = b < 1 ? (c < 3 ? d : c) : a;
>>>         2:    4:  return a;
>>>         -:    5:}
>>>
>>> gives bogus output with your patch (which I believe does proper thing).
>>
>> Reading into the code (yes, it really seems it's for caching purposes
>> given we walk BBs in "random" order) I also observe
> 
> Huh, yeah. Currently line count is a sum of all basic blocks that are emitted
> by profile.c with GCOV_TAG_LINES. That explains why considered loop has count == 11:
> 
> /tmp/gcov-1.gcno:	block 2:`/home/marxin/Programming/gcc/gcc/testsuite/gcc.misc-tests/gcov-1.c':10, 14
> /tmp/gcov-1.gcno:	block 4:`/home/marxin/Programming/gcc/gcc/testsuite/gcc.misc-tests/gcov-1.c':14
> 
> where blocks 2 and 4 are:
> 
>   <bb 2> [0.00%]:
>   i_3 = 0;
>   goto <bb 4>; [0.00%]
> 
> ...
> 
>   <bb 4> [0.00%]:
>   # i_1 = PHI <i_3(2), i_7(3)>
>   if (i_1 <= 9)
>     goto <bb 3>; [0.00%]
>   else
>     goto <bb 5>; [0.00%]
> 
> The same happens to int a = b < 1 ? (c < 3 ? d : c) : a;
> 
> /tmp/gcov2.gcno:	block 2:`/tmp/gcov2.c':1, 3
> 
>   <bb 2> [0.00%]:
>   if (b_3(D) <= 0)
>     goto <bb 3>; [0.00%]
>   else
>     goto <bb 7>; [0.00%]
> 
> That showed a caching of locations actually magically handles loops and ternary operations.
> I'm still wondering how should be defined line count for a multiple statements happening
> on the line? Having that we can find a proper solution.
> 
> Martin

Out of curiosity, there's another example that's broken:

        1:   10:int main ()
        -:   11:{
        -:   12:  int i;
        -:   13:
       12:   14:  for (i = 0;
        -:   15:       i < 10;
       10:   16:       i++)	/* count(11) */
       10:   17:    noop ();			/* count(10) */
        -:   18:
        1:   19:  return 0;			/* count(1) */
        -:   20:}

Martin

> 
>>
>>           for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>             {
>>               gimple *stmt = gsi_stmt (gsi);
>>               if (!RESERVED_LOCATION_P (gimple_location (stmt)))
>>                 output_location (gimple_filename (stmt), gimple_lineno 
>> (stmt),
>>                                  &offset, bb);
>>
>> should use expand_location and then look at the spelling location,
>> otherwise we'll get interesting effects with macro expansion?
>>
>>             }
>>
>> Richard.
>>
>>> Martin
>>>
>>>
>>>>
>>>> Richard.
>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hope Nathan will find time to provide review as he's familiar with content of gcov.c.
>>>>>>>>>>>
>>>>>>>>>>> Martin
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> OTOH I don't know much about gcov format.
>>>>>>>>>>>>
>>>>>>>>>>>> Richard.
>>>>>>>>>>>>
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 



More information about the Gcc-patches mailing list