Bug 63572 - [6/7/8 Regression] ICF breaks user debugging experience
Summary: [6/7/8 Regression] ICF breaks user debugging experience
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: debug (show other bugs)
Version: 5.0
: P4 normal
Target Milestone: 5.5
Assignee: Martin Liška
URL:
Keywords:
Depends on: 63566
Blocks: 63571 65303
  Show dependency treegraph
 
Reported: 2014-10-17 09:05 UTC by Jakub Jelinek
Modified: 2016-09-01 14:18 UTC (History)
15 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-10-17 00:00:00


Attachments
Gold ICF dwarfdump (4.40 KB, text/html)
2014-10-17 13:21 UTC, Martin Liška
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2014-10-17 09:05:57 UTC
+++ This bug was initially created as a clone of Bug #63566 +++

On -O2 -g:
#define A \
  x += y * z; \
  y = (y << ((z & 2) + 1)) \
      ^ (y >> (__SIZEOF_INT__ * __CHAR_BIT__ - (z & 2) - 1)); \
  z *= 7;
#define B A A A A A A A A A A
#define C B B B B B B B B B B

static unsigned int
f1 (unsigned int x, unsigned int y, unsigned int z)
{
  C
  return x + y + z;
}

static unsigned int
f2 (unsigned int x, unsigned int y, unsigned int z)
{
  C
  return x + y + z;
}

__attribute__((noinline, noclone)) unsigned int
f3 (unsigned int x, unsigned int y, unsigned int z)
{
  return f1 (x, z, y) + 6;
}

__attribute__((noinline, noclone)) unsigned int
f4 (unsigned int x, unsigned int y, unsigned int z)
{
  return f2 (y, x, z) + 7;
}

__attribute__((noinline, noclone, used)) unsigned int
f5 (unsigned int x, unsigned int y, unsigned int z)
{
  return f1 (2 * x, z / 2, y + 3) - 6;
}

__attribute__((noinline, noclone, used)) unsigned int
f6 (unsigned int x, unsigned int y, unsigned int z)
{
  return f2 (y + 2, x | 1, z / 73) + 1;
}

int
main ()
{
  unsigned int x = f3 (0x173214, 0x182172, 0x9314);
  unsigned int y = f4 (0x173214, 0x182172, 0x9314);
#if __SIZEOF_INT__ * __CHAR_BIT__ == 32
  if (x != 0xd8e56f78U || y != 0x494c6699U)
    __builtin_abort ();
#endif
  return 0;
}

GCC emits incomplete DW_TAG_GNU_call_site DIEs for the f1 calls (that have been optimized into ICF alias to f2) - no DW_AT_abstract_origin is provided,
and does not emit any DW_TAG_subprogram for f1 at all, just f2.
The question is how exactly do we want to emit the DW_TAG_subprogram for icf_merged aliases.  Name/abstract origin/function type/names of arguments must be it's own, I guess ICF will happily merge functions where the arguments have e.g. different argument names, or different DECL_ABSTRACT_ORIGIN etc.
My question is mainly about DW_TAG_lexical_block/DW_TAG_inlined_subroutine, but as even names of local variables can differ, or it can have different functions inlined, I bet we need to fully populate DW_TAG_subprogram for the ICF aliases,
and have some mapping between BLOCKs in the ICF alias and BLOCKs in the kept definition, so that we can reconstruct ranges.  Then, when we emit DW_TAG_subprogram details for the kept original function definition when finalizing its RTL, we also need to handle similarly all the ICF aliases.
Which is going to be non-fun.  Also, supposedly not just that, but we also need a way to track multiple locations, not just blocks.
E.g. in one function, first 3 statements can have one location_t, say line 31,
and last 3 statements can have another location_t, say line 38.  Now, in another
function ICF merged with that, the similar first 2 statements can have one location_t, say line 47, the second 2 statements can have location_t line 49 and last 2 statements location_t line 53.  So, it might be impossible to create
e.g. a pointer_map from one set of locations to another set, we'd need to analyze the correspondence between location_t's in between all the spots in the function (gimple_location, EXPR_LOCATION) where they appear, and perhaps create some artificial locations to make sure there is a 1-1 mapping between all ICF merged functions.  Another problem is that we emit these days .debug_line using .file/.loc directives and gas constructs that.  Not sure if we can have multiple .debug_line proglets describing the same area of code (perhaps we need some DWARF extension for that?) and if we had that possibility, the question is how to emit it.
As for the DW_AT_abstract_origin of DW_TAG_GNU_call_site, that should probably be just very easy, if there is a DIE for the ICF merged alias, supposedly it will be found normally.

And, then we'll certainly need some GDB changes (and systemtap/elfutils etc.) - 
if some code range has multiple DW_TAG_subprogram DIEs covering that range,
e.g. if you want to put a breakpoint into a line in one of the functions
Comment 1 Richard Biener 2014-10-17 09:20:52 UTC
For debugging most important is to get

(gdb) b foo

still work when foo was merged with bar and the program now calls bar

Similarly nice (but probably impossible) is

(gdb) b foo.c:23

with foo.c:23 inside foo.  Of course foo and bar, even if considered equal
by ICF, are by no means equivalent lexically.  So foo.c:23 could at most
break at the start of bar.  (or give a useful diagnostic from gdb)
Comment 2 Jakub Jelinek 2014-10-17 09:29:20 UTC
For the different locations and blocks, I meant something like:

struct S { int a; int b; int c; };
struct T { int d; int e; int f; };

static int
f1 (struct S *x)
{
  int g = x->a * 7;
  {
    int h = x->b * 11;
    {
      int i = x->c;
      return g + h + i;
    }
  }
}

static int
f2 (struct T *x)
{
  int j = x->d * 7; int k = x->e * 11;
  int l = x->f;
  return j + k + l;
}

int f3 (struct S *x) { return f1 (x); }
int f4 (struct T *x) { return f2 (x); }
int f5 (struct S *x) { return f1 (x) + 1; }
int f6 (struct T *x) { return f2 (x) + 1; }

(but for some reason ICF doesn't merge this, so we need better testcase).
The point was that with identical .text there doesn't have to be 1:1 mapping between location_t's (and those now include both file:line:column and BLOCK),
in the above example supposedly there is many:1 mapping from f2 to f1 location_t, but generally there could be overlaps etc.
Comment 3 Markus Trippelsdorf 2014-10-17 09:34:54 UTC
*** Bug 63562 has been marked as a duplicate of this bug. ***
Comment 4 Jakub Jelinek 2014-10-17 09:56:22 UTC
Better testcase, where ICF actually happens.

struct S { int a; int b; int c; };

__attribute__((noinline)) static int
f1 (struct S *x)
{
  static int u = 1;
  int g = x->a * 7;
  {
    static int v = 2;
    int h = x->b * 11;
    int i = x->c;
    return g + h + i;
  }
}

__attribute__((noinline)) static int
f2 (struct S *x)
{
  static int w = 3;
  int j = x->a * 7; int k = x->b * 11;
  {
    static int y = 4;
    int l = x->c;
    return j + k + l;
  }
}

__attribute__((noinline)) int f3 (struct S *x) { return f1 (x); }
__attribute__((noinline)) int f4 (struct S *x) { return f2 (x) + 1; }
__attribute__((noinline)) int f5 (struct S *x) { return f1 (x) + 2; }
__attribute__((noinline)) int f6 (struct S *x) { return f2 (x) + 3; }

int
main ()
{
  struct S s = { 1, 2, 3 };
  asm volatile ("" : : "r" (&s) : "memory");
  int a[4];
  a[0] = f3 (&s);
  a[1] = f4 (&s);
  a[2] = f5 (&s);
  a[3] = f6 (&s);
  asm volatile ("" : : "r" (a) : "memory");
  return 0;
}

As for .debug_line, the only solution which doesn't require any extensions would be IMHO to put the icf_merged clone's DW_TAG_subprogram into its own DW_TAG_partial_unit, import it into the DW_TAG_compile_unit where it is needed,
and use a different DW_AT_stmt_list offset in there, and emit part of .debug_line (the one for the icf_merged aliases) manually into .debug_line by the compiler and see whether the assembler will deal with it properly.

Anyway, the ideal user debugging experience IMHO with the above testcase is:
b f1 - debugger finds out that f1 and f2 subprograms have overlapping ranges,
puts a breakpoint into f1==f2 prologue, and when the breakpoint is hit, unwinds, checks if from the backtrace it is possible using DW_TAG_GNU_call_site
figure out which of the functions has been called; if it is, depending on if it is f1 or f2 either honors or ignores the breakpoint; if it isn't possible to uniquely identify what the caller meant to call, honor the breakpoint.
To map instructions back to line info, blocks etc., again, check unwind info/backtrace which function it is, if it is known which one it is, go into .debug_line table corresponding to the CU/PU of the subprogram, otherwise pick one.  Ditto for the DW_TAG_subprogram/DW_TAG_lexical_block/DW_TAG_inlined_subroutine trees.

Does that sound like a plan?
Comment 5 Jakub Jelinek 2014-10-17 10:12:24 UTC
struct S { int a; int b; int c; };
volatile int t;

__attribute__((noinline)) static int
f1 (struct S *x)
{
  static int u = 1;
  int g = x->a * 7;
  t++;
  {
    static int v = 2;
    int h = x->b * 11;
    t++;
    int i = x->c;
    t++;
    return g + h + i;
  }
}

__attribute__((noinline)) static int
f2 (struct S *x)
{
  static int w = 3;
  int j = x->a * 7; t++; int k = x->b * 11; t++;
  {
    static int y = 4;
    int l = x->c;
    t++;
    return j + k + l;
  }
}

__attribute__((noinline)) int f3 (struct S *x) { return f1 (x); }
__attribute__((noinline)) int f4 (struct S *x) { return f2 (x) + 1; }
__attribute__((noinline)) int f5 (struct S *x) { return f1 (x) + 2; }
__attribute__((noinline)) int f6 (struct S *x) { return f2 (x) + 3; }

int
main ()
{
  struct S s = { 1, 2, 3 };
  asm volatile ("" : : "r" (&s) : "memory");
  int a[4];
  a[0] = f3 (&s);
  a[1] = f4 (&s);
  a[2] = f5 (&s);
  a[3] = f6 (&s);
  asm volatile ("" : : "r" (a) : "memory");
  return 0;
}

Even better testcase, which actually has more executable statements, and thus allows to inspect if one sees e.g. the u/v/w/y/g/h/i/j/k/l variables in the scope etc.
Comment 6 Martin Liška 2014-10-17 13:11:06 UTC
There's how gold's ICF works for test attached by Jakub:

gcc --version:
gcc version 5.0.0 20141016 (experimental) (GCC) 

ld --version:
GNU gold (GNU Binutils 2.24.51.20141010) 1.11

$ gcc icf-gdb.c -c -g --function-sections
$ gcc icf-gdb.o -o a.out -Wl,--icf=all,--print-icf-sections
...
ld: ICF folding section '.text.f2' in file 'icf-gdb.o' into '.text.f1' in file 'icf-gdb.o'
...

dwarfdump a.ou shows:
< 1><0x00000059>    DW_TAG_subprogram
                      DW_AT_name                  f1
                      DW_AT_decl_file             0x00000001 /home/marxin/Programming/testcases/icf-gdb.c
                      DW_AT_decl_line             0x00000005
                      DW_AT_prototyped            yes(1)
                      DW_AT_type                  <0x00000052>
                      DW_AT_low_pc                0x00400546
                      DW_AT_high_pc               <offset-from-lowpc>115
                      DW_AT_frame_base            len 0x0001: 9c: DW_OP_call_frame_cfa
                      DW_AT_GNU_all_call_sites    yes(1)
                      DW_AT_sibling               <0x000000e2>


and:
< 1><0x000000e8>    DW_TAG_subprogram
                      DW_AT_name                  f2
                      DW_AT_decl_file             0x00000001 /home/marxin/Programming/testcases/icf-gdb.c
                      DW_AT_decl_line             0x00000015
                      DW_AT_prototyped            yes(1)
                      DW_AT_type                  <0x00000052>
                      DW_AT_low_pc                0x00400546
                      DW_AT_high_pc               <offset-from-lowpc>115
                      DW_AT_frame_base            len 0x0001: 9c: DW_OP_call_frame_cfa
                      DW_AT_GNU_all_call_sites    yes(1)
                      DW_AT_sibling               <0x00000171>

If I tried to put breakpoint in GDB to f2, breakpoint is triggered 4 times with back-trace from all functions f3-f6.
Example:
#0  f1 (x=0x7fffffffda10) at icf-gdb.c:24
#1  0x00000000004005d1 in f3 (x=0x7fffffffda10) at icf-gdb.c:33
#2  0x0000000000400657 in main () at icf-gdb.c:44


Maybe I miss something, but gold also does not support correct DWARF merging.
I will create issue for gold.

Martin
Comment 7 Jakub Jelinek 2014-10-17 13:18:36 UTC
(In reply to Martin Liška from comment #6)
> There's how gold's ICF works for test attached by Jakub:
> 
> gcc --version:
> gcc version 5.0.0 20141016 (experimental) (GCC) 
> 
> ld --version:
> GNU gold (GNU Binutils 2.24.51.20141010) 1.11
> 
> $ gcc icf-gdb.c -c -g --function-sections
> $ gcc icf-gdb.o -o a.out -Wl,--icf=all,--print-icf-sections
> ...
> ld: ICF folding section '.text.f2' in file 'icf-gdb.o' into '.text.f1' in
> file 'icf-gdb.o'
> ...
> 
> dwarfdump a.ou shows:
> < 1><0x00000059>    DW_TAG_subprogram
>                       DW_AT_name                  f1
>                       DW_AT_decl_file             0x00000001
> /home/marxin/Programming/testcases/icf-gdb.c
>                       DW_AT_decl_line             0x00000005
>                       DW_AT_prototyped            yes(1)
>                       DW_AT_type                  <0x00000052>
>                       DW_AT_low_pc                0x00400546
>                       DW_AT_high_pc               <offset-from-lowpc>115
>                       DW_AT_frame_base            len 0x0001: 9c:
> DW_OP_call_frame_cfa
>                       DW_AT_GNU_all_call_sites    yes(1)
>                       DW_AT_sibling               <0x000000e2>
> 
> 
> and:
> < 1><0x000000e8>    DW_TAG_subprogram
>                       DW_AT_name                  f2
>                       DW_AT_decl_file             0x00000001
> /home/marxin/Programming/testcases/icf-gdb.c
>                       DW_AT_decl_line             0x00000015
>                       DW_AT_prototyped            yes(1)
>                       DW_AT_type                  <0x00000052>
>                       DW_AT_low_pc                0x00400546
>                       DW_AT_high_pc               <offset-from-lowpc>115
>                       DW_AT_frame_base            len 0x0001: 9c:
> DW_OP_call_frame_cfa
>                       DW_AT_GNU_all_call_sites    yes(1)
>                       DW_AT_sibling               <0x00000171>
> 
> If I tried to put breakpoint in GDB to f2, breakpoint is triggered 4 times
> with back-trace from all functions f3-f6.
> Example:
> #0  f1 (x=0x7fffffffda10) at icf-gdb.c:24
> #1  0x00000000004005d1 in f3 (x=0x7fffffffda10) at icf-gdb.c:33
> #2  0x0000000000400657 in main () at icf-gdb.c:44
> 
> 
> Maybe I miss something, but gold also does not support correct DWARF merging.
> I will create issue for gold.

Well, the debug info you get after ld ICF merging is better than what GCC creates for ICF merging right now, though still not ideal, but as gold likely doesn't want to rewrite debug info (which is costly operation), it can't do much more than that, while GCC can.

As you can see above, at least you do have two separate DW_TAG_subprogram for f1/f2, vars/lexical blocks/parameters/inline functions in them in a good shape.
Supposedly the DW_TAG_GNU_call_site info is present and correct too, so the debugger has the option to find out in which function it is, but that doesn't necessarily mean gdb has such support present.

What is broken with ld ICF merging is supposedly .debug_line, for both functions you are pointed to the same portion of .debug_line, and that portion
contains two sets of line instructions for the same spots, not sure if it can be considered valid or how would gdb be able to find out which function is which in there.
Comment 8 Martin Liška 2014-10-17 13:21:31 UTC
Created attachment 33747 [details]
Gold ICF dwarfdump
Comment 9 Richard Biener 2014-11-20 12:28:41 UTC
Not sure if we can do anything about this.  ICF breaks this in a similar way
as tail merging does (which makes this kind of regressions older).

I'd say we should suspend this bug ....
Comment 10 Martin Liška 2015-02-09 15:43:50 UTC
As Cary Countant wrote me, gold's ICF has never implemented support for call site tables in GDB. Moreover, when Jakub designed the new call site tags, the design hasn't been changed. In last 4 years, there still hasn't been any demand for it.

Well, I would at least implement emission of a separate DW_TAG_subprogram DIE with correct DW_AT_abstract_origin. As I have no experience with debug info, may I ask Jakub for a hint where to start and how to properly do it?
Comment 11 Jan Hubicka 2015-03-02 17:49:08 UTC
Reported by HJ:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65237

As of r221117,  I still see

FAIL: gcc.dg/guality/sra-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 32 a[0] == 4
FAIL: gcc.dg/guality/sra-1.c   -O2  line 21 a.i == 4
FAIL: gcc.dg/guality/sra-1.c   -O3 -fomit-frame-pointer  line 21 a.i == 4
FAIL: gcc.dg/guality/sra-1.c   -O3 -g  line 21 a.i == 4
FAIL: gcc.dg/guality/sra-1.c   -Os  line 21 a.i == 4

on Linux/x86-32.
Comment 12 Martin Liška 2015-03-04 10:58:42 UTC
(In reply to Jan Hubicka from comment #11)
> Reported by HJ:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65237
> 
> As of r221117,  I still see
> 
> FAIL: gcc.dg/guality/sra-1.c   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  line 32 a[0] == 4
> FAIL: gcc.dg/guality/sra-1.c   -O2  line 21 a.i == 4
> FAIL: gcc.dg/guality/sra-1.c   -O3 -fomit-frame-pointer  line 21 a.i == 4
> FAIL: gcc.dg/guality/sra-1.c   -O3 -g  line 21 a.i == 4
> FAIL: gcc.dg/guality/sra-1.c   -Os  line 21 a.i == 4
> 
> on Linux/x86-32.

I've just run guality testsuite and I'm given identical results with and w/o -fipa-icf. Thus, I guess sra-1.c test isn't caused by ICF.

Thanks,
Martin
Comment 13 Jack Howarth 2015-03-19 18:13:58 UTC
Is this bug going to be suspended for 5.0? If so, it would also allow Bug 65303, "Tracking bug for ICF issues", to be closed as well for 5.0.
Comment 14 Jakub Jelinek 2015-04-22 12:00:32 UTC
GCC 5.1 has been released.
Comment 15 Richard Biener 2015-07-16 09:12:28 UTC
GCC 5.2 is being released, adjusting target milestone to 5.3.
Comment 16 Richard Biener 2015-12-04 10:45:43 UTC
GCC 5.3 is being released, adjusting target milestone.
Comment 17 Richard Biener 2016-06-03 10:05:15 UTC
GCC 5.4 is being released, adjusting target milestone.