Bug 108658 - [GCOV] Function entry is not recorded in a function containing an infinite loop from another thread depending on the optimization level
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: gcov-profile
Version: 12.2.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-03 15:12 UTC by Sebastian Huber
Modified: 2023-02-09 10:23 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-02-06 00:00:00


Description Sebastian Huber 2023-02-03 15:12:26 UTC
Consider the following test code:

idle.c

void *idle(void *ignored)
{
  while (1) {
    /* Do nothing */
  }

  return 0;
}

main.c

#include <pthread.h>
#include <unistd.h>

void *idle(void *ignored);

int main(void)
{
  pthread_t th;
  pthread_create(&th, NULL, idle, NULL);
  sleep(1);
  return 0;
}

This sequence of commands shows that the idle() function entry is not recorded for -O2 and -Og:

gcc-12 -O2 --coverage -c main.c
rm -f *.gc??
gcc-12 -O2 --coverage -c idle.c
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:0.00% of 1
Creating 'idle.c.gcov'

Lines executed:0.00% of 1
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
    #####:    1:void *idle(void *ignored)
        -:    2:{
        -:    3:  while (1) {
        -:    4:    /* Do nothing */
        -:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}
rm -f *.gc??
gcc-12 -Og --coverage -c idle.c
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:50.00% of 2
Creating 'idle.c.gcov'

Lines executed:50.00% of 2
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
    #####:    1:void *idle(void *ignored)
        -:    2:{
472195650:    3:  while (1) {
        -:    4:    /* Do nothing */
        -:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}
rm -f *.gc??
gcc-12 -O0 --coverage -c idle.c
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:100.00% of 2
Creating 'idle.c.gcov'

Lines executed:100.00% of 2
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
472440920:    1:void *idle(void *ignored)
        -:    2:{
472440920:    3:  while (1) {
        -:    4:    /* Do nothing */
        -:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}

In my view the line count for -O0 is also wrong: line 1 should have a count of 1, since idle() is entered exactly once.
Comment 1 Andrew Pinski 2023-02-03 15:23:00 UTC
Try compiling with -pthread as well. Otherwise the instrumentation code assumes the program is single-threaded.
Comment 2 Andrew Pinski 2023-02-03 16:36:47 UTC
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Instrumentation-Options.html#index-fprofile-update

> The GCC driver automatically selects ‘prefer-atomic’ when -pthread is present in the command line.

The default for -fprofile-update= is single.
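
For illustration, the difference between the two update modes can be sketched in plain C. This is a minimal sketch and not the code GCC emits: `counter`, `bump_single` and `bump_atomic` are hypothetical names, and the use of the `__atomic_fetch_add` builtin with relaxed ordering is an assumption about what an atomic counter update roughly amounts to.

```c
#include <stdint.h>

/* Hypothetical stand-in for a gcov arc counter such as __gcov0.idle[0]. */
static int64_t counter;

/* Roughly the "single" mode: a plain load/add/store. The optimizer is
   free to keep the value in a register, and concurrent updates from
   several threads can be lost. Returns the value after the increment. */
int64_t bump_single(void)
{
  counter = counter + 1;
  return counter;
}

/* Roughly the "atomic" mode (assumption): one atomic read-modify-write.
   __atomic_fetch_add returns the old value, so add 1 to report the
   value after the increment. */
int64_t bump_atomic(void)
{
  return __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED) + 1;
}
```

Calling `bump_single()` and `bump_atomic()` from a single thread gives the same running total; the two only differ in what the compiler and other threads are allowed to do with the update.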
Comment 3 Sebastian Huber 2023-02-03 17:39:48 UTC
Thanks for the hint, however, adding -pthread or -fprofile-update=atomic doesn't change anything.
Comment 4 Sebastian Huber 2023-02-03 18:28:22 UTC
What is interesting is that -g changes the behaviour. I guess there is an error in the mapping of the profiling counter to the associated source code lines.

gcc-12 -O2 --coverage -c main.c -g
rm -f *.gc??
gcc-12 -pthread -fprofile-update=atomic -O2 --coverage -c idle.c -g
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:66.67% of 3
Creating 'idle.c.gcov'

Lines executed:66.67% of 3
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
    #####:    1:void *idle(void *ignored)
        -:    2:{
213413784:    3:  while (1) {
        -:    4:    /* Do nothing */
213413784:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}
rm -f *.gc??
gcc-12 -pthread -fprofile-update=atomic -Og --coverage -c idle.c -g
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:66.67% of 3
Creating 'idle.c.gcov'

Lines executed:66.67% of 3
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
    #####:    1:void *idle(void *ignored)
        -:    2:{
214569562:    3:  while (1) {
        -:    4:    /* Do nothing */
214569562:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}
rm -f *.gc??
gcc-12 -pthread -fprofile-update=atomic -O0 --coverage -c idle.c -g
gcc-12 -pthread --coverage main.o idle.o
./a.out
gcov-12 idle.c
File 'idle.c'
Lines executed:100.00% of 2
Creating 'idle.c.gcov'

Lines executed:100.00% of 2
cat idle.c.gcov
        -:    0:Source:idle.c
        -:    0:Graph:idle.gcno
        -:    0:Data:idle.gcda
        -:    0:Runs:1
214896204:    1:void *idle(void *ignored)
        -:    2:{
214896204:    3:  while (1) {
        -:    4:    /* Do nothing */
        -:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}
Comment 5 Martin Liška 2023-02-06 09:20:33 UTC
Well, with -O2 it's the dead-code elimination pass that removes the GCOV counter stores:

void * idle (void * ignored)
{
  long int __gcov0.idle_I_lsm.4;
  long int PROF_edge_counter_2;

  <bb 2> [local count: 10631108]:
  __gcov0.idle_I_lsm.4_7 = __gcov0.idle[0];

  <bb 3> [local count: 1073741824]:
  # __gcov0.idle_I_lsm.4_6 = PHI <__gcov0.idle_I_lsm.4_7(2), PROF_edge_counter_2(4)>
  PROF_edge_counter_2 = __gcov0.idle_I_lsm.4_6 + 1;

  <bb 4> [local count: 1073741824]:
  goto <bb 3>; [100.00%]

}

after:

Eliminating unnecessary statements:
Deleting : PROF_edge_counter_2 = __gcov0.idle_I_lsm.4_6 + 1;

Deleting : __gcov0.idle_I_lsm.4_6 = PHI <__gcov0.idle_I_lsm.4_7(2), _2(4)>

Deleting : __gcov0.idle_I_lsm.4_7 = __gcov0.idle[0];

Removed 2 of 2 statements (100%)
Removed 1 of 2 PHI nodes (50%)
Merging blocks 3 and 4
fix_loop_structure: fixing up loops for function
void * idle (void * ignored)
{
  long int __gcov0.idle_I_lsm.4;

  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  goto <bb 3>; [100.00%]

}

@Richi: is this expected behavior?
Comment 6 Richard Biener 2023-02-06 09:56:49 UTC
The relevant optimization happens in loop invariant motion, which applies store motion to

void * idle (void * ignored)
{
  long int PROF_edge_counter_1;
  long int PROF_edge_counter_2;

  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  PROF_edge_counter_1 = __gcov0.idle[0];
  PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
  __gcov0.idle[0] = PROF_edge_counter_2;
  goto <bb 3>; [100.00%]

producing

void * idle (void * ignored)
{
  long int __gcov0.idle_I_lsm.4;
  long int PROF_edge_counter_1;
  long int PROF_edge_counter_2;

  <bb 2> [local count: 10631108]:
  __gcov0.idle_I_lsm.4_7 = __gcov0.idle[0];

  <bb 3> [local count: 1073741824]:
  # __gcov0.idle_I_lsm.4_6 = PHI <__gcov0.idle_I_lsm.4_7(2), __gcov0.idle_I_lsm.4_8(4)>
  PROF_edge_counter_1 = __gcov0.idle_I_lsm.4_6;
  PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
  __gcov0.idle_I_lsm.4_8 = PROF_edge_counter_2;

  <bb 4> [local count: 1073741824]:
  goto <bb 3>; [100.00%]

That's not wrong, I think.  With -fprofile-update=atomic that doesn't
happen, but the atomic update call never gets a location assigned; instead
we rely on the stmt-begin/end notes here?

void * idle (void * ignored)
{
  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:
  __atomic_fetch_add_8 (&__gcov0.idle[0], 1, 0);
  goto <bb 3>; [100.00%]
Comment 7 Richard Biener 2023-02-06 09:57:59 UTC
-fno-move-loop-stores disables the store motion.
Comment 8 Martin Liška 2023-02-09 10:23:47 UTC
(In reply to Richard Biener from comment #7)
> -fno-move-loop-stores disables the store motion.

OK, so I can confirm that both -fno-move-loop-stores and -fprofile-update=atomic lead to properly collected counters with -O2:

$ rm *gcda ; gcc pr108658.c -O2 idle.c --coverage -fprofile-update=atomic && ./a.out && gcov-dump -l a-idle.gcda
a-idle.gcda:data:magic `gcda':version `B30 '
a-idle.gcda:stamp 900763911
a-idle.gcda:checksum 3617524158
a-idle.gcda:  a1000000:   8:OBJECT_SUMMARY runs=1, sum_max=711808344
a-idle.gcda:  01000000:  12:FUNCTION ident=2013603264, lineno_checksum=0x5f7f7dbf, cfg_checksum=0xc48fabfe
a-idle.gcda:    01a10000:   8:COUNTERS arcs 1 counts
a-idle.gcda:                   0: 711842716 

However, we still end up with an execution count of zero for the problematic line:

gcov -t a-idle.gcda
        -:    0:Source:idle.c
        -:    0:Graph:a-idle.gcno
        -:    0:Data:a-idle.gcda
        -:    0:Runs:1
    #####:    1:void *idle(void *ignored)
        -:    2:{
        -:    3:  while (1) {
        -:    4:    /* Do nothing */
        -:    5:  }
        -:    6:
        -:    7:  return 0;
        -:    8:}

That's caused by the many empty blocks present when the note file is created:

(gdb) pcfun
void * idle (void * ignored)
{
  <bb 2> [local count: 10631108]:

  <bb 3> [local count: 1073741824]:

  <bb 4> [local count: 1073741824]:
  goto <bb 3>; [100.00%]

}

$ gcov-dump -l a-idle.gcno
a-idle.gcno:note:magic `gcno':version `B30 '
a-idle.gcno:stamp 900904516
a-idle.gcno:checksum 0
a-idle.gcno:cwd: /home/marxin/Programming/testcases
a-idle.gcno:  01000000:  52:FUNCTION ident=2013603264, lineno_checksum=0x5f7f7dbf, cfg_checksum=0xc48fabfe, `idle' idle.c:1:7-8:1
a-idle.gcno:    01410000:   4:BLOCKS 5 blocks
a-idle.gcno:    01430000:  12:ARCS 1 arcs
a-idle.gcno:                  block 0: 2:0005(tree,fall)
a-idle.gcno:    01430000:  12:ARCS 1 arcs
a-idle.gcno:                  block 2: 3:0005(tree,fall)
a-idle.gcno:    01430000:  12:ARCS 1 arcs
a-idle.gcno:                  block 3: 4:0004(fall)
a-idle.gcno:    01430000:  12:ARCS 1 arcs
a-idle.gcno:                  block 4: 3:0005(tree,fall)
a-idle.gcno:    01450000:  31:LINES
a-idle.gcno:                  block 2:`idle.c':1

So with -O2 we only track the function entry (idle.c:1), which belongs to block 2, but the looping happens in BBs 3->4, so we lose the tracking.
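
One possible way to see why block 2 ends up with a count of zero is a simplified flow model of the .gcno dump above. This is only a sketch: it assumes that counts for the un-instrumented "tree" arcs are derived from the single measured arc 3->4 by requiring each block's incoming and outgoing counts to balance. `entry_arc_count` is a hypothetical helper, not a gcov function.

```c
#include <stdint.h>

/* Simplified model of the CFG from the .gcno dump:
     0 -> 2 -> 3 -> 4 -> 3
   Only arc 3->4 carries a measured counter; the other arcs are
   "tree" arcs whose counts have to be derived. */
int64_t entry_arc_count(int64_t arc_3_to_4)
{
  /* Block 4 has one incoming and one outgoing arc, so 4->3 == 3->4. */
  int64_t arc_4_to_3 = arc_3_to_4;

  /* Block 3: incoming (2->3 plus 4->3) is assumed equal to the
     outgoing 3->4, which leaves nothing for the arc from block 2. */
  int64_t arc_2_to_3 = arc_3_to_4 - arc_4_to_3;

  return arc_2_to_3;
}
```

Under this model, `entry_arc_count(711842716)` (the counter value from the .gcda dump) yields 0, independent of how often the loop ran: a function that is entered but never left does not satisfy the balance assumption, so the single entry is not visible in the derived counts.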