Bug 109805 - LTO affecting -fdebug-prefix-map
Summary: LTO affecting -fdebug-prefix-map
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: debug
Version: 14.0
Importance: P3 normal
Target Milestone: ---
Assignee: Richard Biener
URL:
Keywords: lto
Depends on:
Blocks:
 
Reported: 2023-05-11 00:48 UTC by Sergio Durigan Junior
Modified: 2023-11-27 18:44 UTC
CC List: 5 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-05-12 00:00:00


Attachments
prototype (1.44 KB, patch)
2023-05-12 12:21 UTC, Richard Biener

Description Sergio Durigan Junior 2023-05-11 00:48:11 UTC
Hi,

In Ubuntu we use -fdebug-prefix-map to remap a package's build directory (which contains random stuff) into a predictable path under /usr/src.  This is done in order to help debuginfod index the source code for our packages.

Things work very well, but I found a weird corner case involving LTO.  The affected package is vim.  You can see the build logs for it here:

https://launchpadlibrarian.net/665520301/buildlog_ubuntu-mantic-amd64.vim_2%3A9.0.1378-2ubuntu1_BUILDING.txt.gz

As you can notice, we're using -fdebug-prefix-map and LTO for the build.

The problem is that the resulting debuginfo doesn't have the remapped directory.  I can replicate the issue locally, and at first I thought this was either bug #108464 or #87726, but after some digging I'm convinced it's something else.  I've compiled gcc (GCC) 14.0.0 20230510 from the master branch (608e7f3ab47fe746279c552c3574147aa3d8ee76), and I can still reproduce the problem.

A simple reproducer for the problem follows:

$ echo 'int main(){}' > foo.c $ ~/gcc/install/bin/gcc -c foo.c -O2 -g -flto=auto -ffat-lto-objects -fdebug-prefix-map=`pwd`=/aaaaaaa -o foo $ ~/gcc/install/bin/gcc foo -flto=auto -ffat-lto-objects -o bar

A workaround for this bug is to either stop using LTO or explicitly set -fdebug-prefix-map when linking the object.
Comment 1 Sergio Durigan Junior 2023-05-11 00:52:23 UTC
The formatting for the example snippet got messed up.  Here's a fixed version:

$ echo 'int main(){}' > foo.c
$ ~/gcc/install/bin/gcc -c foo.c -O2 -g -flto=auto -ffat-lto-objects -fdebug-prefix-map=`pwd`=/aaaaaaa -o foo
$ ~/gcc/install/bin/gcc foo -flto=auto -ffat-lto-objects -o bar
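
To see which compilation directories end up in the final binary, the DW_AT_comp_dir attributes can be pulled out of a readelf dump.  The following is only a minimal sketch and not part of the original report: comp_dirs is a hypothetical helper name, and it assumes readelf prints the attribute in the usual "DW_AT_comp_dir : (indirect string, offset: 0x..): PATH" form.

```shell
# comp_dirs (hypothetical helper): read `readelf --debug-dump=info` output on
# stdin and print each distinct DW_AT_comp_dir value once.
comp_dirs() {
  awk '/DW_AT_comp_dir/ {
         sub(/^.*DW_AT_comp_dir[ \t]*:[ \t]*/, "")   # drop everything up to the value
         sub(/^\([^)]*\):[ \t]*/, "")                # drop "(indirect string, offset: 0x..):"
         print
       }' | sort -u
}
```

With the reproducer above, something like `readelf --debug-dump=info bar | comp_dirs` would be expected to list the unmapped build directory next to /aaaaaaa whenever the link-time unit was not remapped.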
Comment 2 Andrew Pinski 2023-05-11 02:49:36 UTC
Dup of bug 87726.

*** This bug has been marked as a duplicate of bug 87726 ***
Comment 3 Andrew Pinski 2023-05-11 02:50:28 UTC
Let's reopen this one for a few.
Comment 4 Richard Biener 2023-05-11 06:43:18 UTC
It works for the actual source file translation units for me, it's just the
LTRANS units that have a DW_AT_comp_dir that's not remapped.  It's actually
difficult to do the right thing here and I think the correct thing to do if
you don't like the "bogus" DW_AT_comp_dir is to actually specify
-fdebug-prefix-map at link time.

The reason it's difficult to do the right thing is that you have to
consider

gcc -c t1.c -flto -fdebug-prefix-map=`pwd`=/aaaa
gcc -c t2.c -flto -fdebug-prefix-map=`pwd`=/bbbb
gcc t1.o t2.o

now, what DW_AT_comp_dir should the possibly single LTRANS CU use?

One "fix" might be to emit multiple DWARF CUs for each LTRANS unit and thus
keep the association to the original CUs 1:1 (I have some patches for this
lying around for a few years).  But then we're still mixing CUs by means
of inlining and cloning.

Note the DW_AT_name of the LTRANS CUs is <artificial> (DWARF doesn't allow
to omit it).  What's more "problematic" is that somehow the file list of
the CU contains t.c - it might be worth figuring out how this gets there.

A pragmatic fix could be to detect the case where all LTO inputs had the
same -fdebug-prefix-map specified and carry that over to link time
automatically in lto-wrapper (we are currently not streaming the various
remapping flags).

Can you clarify what the actual problem with the generated dwarf is?
Comment 5 Sergio Durigan Junior 2023-05-11 14:36:29 UTC
(In reply to Richard Biener from comment #4)
> It works for the actual source file translation units for me, it's just the
> LTRANS units that have a DW_AT_comp_dir that's not remapped.  It's actually
> difficult to do the right thing here and I think the correct thing to do if
> you don't like the "bogus" DW_AT_comp_dir is to actually specify
> -fdebug-prefix-map at link time.
>
> The reason it's difficult to do the right thing is that you have to
> consider
>
> gcc -c t1.c -flto -fdebug-prefix-map=`pwd`=/aaaa
> gcc -c t2.c -flto -fdebug-prefix-map=`pwd`=/bbbb
> gcc t1.o t2.o
>
> now, what DW_AT_comp_dir should the possibly single LTRANS CU use?

Thanks for the reply.

I understand the problem here.  But taking the vim package as an example, the remapping was done to a specific directory, for all object files.

> One "fix" might be to emit multiple DWARF CUs for each LTRANS unit and thus
> keep the association to the original CUs 1:1 (I have some patches for this
> lying around for a few years).  But then we're still mixing CUs by means
> of inlining and cloning.
>
> Note the DW_AT_name of the LTRANS CUs is <artificial> (DWARF doesn't allow
> to omit it).  What's more "problematic" is that somehow the file list of
> the CU contains t.c - it might be worth figuring out how this gets there.
>
> A pragmatic fix could be to detect the case where all LTO inputs had the
> same -fdebug-prefix-map specified and carry that over to link time
> automatically in lto-wrapper (we are currently not streaming the various
> remapping flags).

That'd solve the problem I'm seeing, I believe.

> Can you clarify what the actual problem with the generated dwarf is?

If we take the vim package as an example (again), the problem I'm seeing is that -fdebug-prefix-map is being used when compiling all .o files, but the generated binary ends up with the old path in its directory table.  This confuses debuginfod, which uses this information to determine where the source files are located.  It's interesting to note that I also see the new path listed in the DWARF, but for some reason debuginfod/GDB get confused about it.

I found yet another package that seems to be affected by this problem: samba.  Curiously enough, there are other packages that don't seem to be affected, even though they're also being compiled using LTO.
Comment 6 Sergio Durigan Junior 2023-05-11 14:38:04 UTC
As I mentioned in the description, a workaround for this would be to use -fdebug-prefix-map in LDFLAGS as well.  I'm leaning towards implementing this in Ubuntu until we figure out how to properly solve the issue.
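
One way such a workaround could be wired into a build is to copy only the prefix-map options from CFLAGS into LDFLAGS.  This is a rough sketch under assumptions: prefix_map_flags is a hypothetical helper, the CFLAGS value and the /build/vim and /usr/src/vim paths are made up for illustration, and flags are assumed not to contain spaces or glob characters.

```shell
# prefix_map_flags (hypothetical helper): print only the prefix-map options
# found in the flag string $1, in their original order.
# The unquoted $1 relies on word splitting, so paths with spaces or glob
# characters are not handled by this toy version.
prefix_map_flags() {
  _out=""
  for _flag in $1; do
    case $_flag in
      -fdebug-prefix-map=*|-ffile-prefix-map=*) _out="${_out:+$_out }$_flag" ;;
    esac
  done
  printf '%s\n' "$_out"
}

# Illustrative values only; a real package would take these from its build system.
CFLAGS='-O2 -g -flto=auto -ffat-lto-objects -fdebug-prefix-map=/build/vim=/usr/src/vim'
LDFLAGS="-flto=auto $(prefix_map_flags "$CFLAGS")"
```

The link step then sees the same mapping as the compile steps, which is what the manual workaround amounts to.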
Comment 7 Richard Biener 2023-05-12 07:25:33 UTC
OK, so the .debug_line entries appear because of

main:   
.LFB0:  
        .file 1 "t.c"
        .loc 1 1 11

I think I've seen a duplicate bugreport about this where it was suggested
we should apply the remapping when we stream locations.  But that also
has effects on diagnostics we emit during LTRANS.

The pragmatic approach with lto-wrapper would still work, but this shows
why a "proper" solution is even more difficult.
Comment 8 Richard Biener 2023-05-12 12:21:20 UTC
Created attachment 55064 [details]
prototype

This works for me.  The consistency check is not fully implemented and instead
of passing down no -fdebug-prefix-map the patch passes the first but warns:

> ./xgcc -B. t.o t2.o -o t               
lto-wrapper: warning: option -fdebug-prefix-map=/home/rguenther/obj-trunk-g/gcc=/bbb with different values, using /home/rguenther/obj-trunk-g/gcc=/aaa

to make consistency checking work we need to record -fcanon-prefix-map
and the full set of -f{file,debug}-prefix-map options in order (I think
file and debug variants can be considered the same) of the first TU and
compare that to each of the following TUs.
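
As a rough illustration of that comparison (a shell model only, not the lto-wrapper code from the attached patch), each TU's options could be reduced to an ordered signature and compared against the first TU.  The function names and the signature format are hypothetical.

```shell
# prefix_map_signature (hypothetical): reduce one TU's option string to the
# ordered list of its prefix-map options.  The file and debug variants are
# recorded identically, following the idea that they can be considered the same.
# The unquoted $1 relies on word splitting, so this toy version does not
# handle paths containing spaces or glob characters.
prefix_map_signature() {
  _sig=""
  for _opt in $1; do
    case $_opt in
      -fdebug-prefix-map=*|-ffile-prefix-map=*) _sig="${_sig}map:${_opt#*=};" ;;
      -fcanon-prefix-map)                       _sig="${_sig}canon;" ;;
    esac
  done
  printf '%s' "$_sig"
}

# consistent_prefix_maps (hypothetical): succeed and print the shared signature
# only if every TU's signature equals that of the first TU.
consistent_prefix_maps() {
  _first=$(prefix_map_signature "$1")
  for _tu in "$@"; do
    [ "$(prefix_map_signature "$_tu")" = "$_first" ] || return 1
  done
  printf '%s\n' "$_first"
}
```

In this model a TU compiled without any mapping also counts as a mismatch against a TU that has one, which is stricter than only comparing options that were actually given.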

As implemented we only diagnose mismatches of options that are actually
given (because we don't stream the option when not given).

We could also emit a hard error when there's a mismatch.

Note a link-time specified option will simply ignore all options from the compile-time (but only for the link-time unit, the compile-time debug info
has already been generated with the originally specified options).

Not sure what the best behavior is here, any input appreciated.
Comment 9 Sergio Durigan Junior 2023-05-13 19:21:43 UTC
(In reply to Richard Biener from comment #8)
> This works for me.  The consistency check is not fully implemented and
> instead
> of passing down no -fdebug-prefix-map the patch passes the first but warns:
>
> > ./xgcc -B. t.o t2.o -o t
> lto-wrapper: warning: option
> -fdebug-prefix-map=/home/rguenther/obj-trunk-g/gcc=/bbb with different
> values, using /home/rguenther/obj-trunk-g/gcc=/aaa
>
> to make consistency checking work we need to record -fcanon-prefix-map
> and the full set of -f{file,debug}-prefix-map options in order (I think
> file and debug variants can be considered the same) of the first TU and
> compare that to each of the following TUs.

Thanks a lot for the patch.  I tried it locally, and it indeed works for the simple example I posted in the description of this bug.  However, for some reason it doesn't seem to make a difference for the vim compilation.  I'm still seeing a directory table like the following:

Directory table:
      [path(line_strp)]
 0     /home/ubuntu/vim/vim-9.0.1378/src/vim-basic (0)
 1     /usr/include/x86_64-linux-gnu/bits (57)
 2     /usr/include (92)

whereas if I pass -fdebug-prefix-map to LDFLAGS, the directory table becomes:

Directory table:
      [path(line_strp)]
 0     /usr/src/vim-2:9.0.1378-2ubuntu2~ppa1/src/vim-basic (0)
 1     /usr/include/x86_64-linux-gnu/bits (65)
 2     /usr/include (100)

which is what I expected to see.
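
A comparison like the one above can be scripted so that a package build flags leftover build paths automatically.  This is a small sketch under assumptions: unmapped_dirs is a hypothetical helper, it expects directory table lines in the "INDEX PATH (OFFSET)" layout shown above on stdin, and it assumes paths contain no spaces.

```shell
# unmapped_dirs (hypothetical helper): print every directory table entry that
# still starts with the original build directory given as $1.
unmapped_dirs() {
  awk -v build="$1" '
    # A table row looks like: " 0     /some/path (57)"
    $1 ~ /^[0-9]+$/ && $2 ~ /^\// && $NF ~ /^\([0-9]+\)$/ {
      if (index($2, build) == 1) print $2   # entry was not remapped
    }'
}
```

For example, piping the line dump of the final binary into `unmapped_dirs /home/ubuntu/vim` would print the first entry of the first table and nothing for the second.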

> Note a link-time specified option will simply ignore all options from the
> compile-time (but only for the link-time unit, the compile-time debug info
> has already been generated with the originally specified options).

FWIW, I think this bug is related to #108534 (and the related discussion at https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606205.html).
Comment 10 Richard Biener 2023-05-15 06:34:30 UTC
(In reply to Sergio Durigan Junior from comment #9)
> (In reply to Richard Biener from comment #8)
> > This works for me.  The consistency check is not fully implemented and
> > instead
> > of passing down no -fdebug-prefix-map the patch passes the first but warns:
> >
> > > ./xgcc -B. t.o t2.o -o t
> > lto-wrapper: warning: option
> > -fdebug-prefix-map=/home/rguenther/obj-trunk-g/gcc=/bbb with different
> > values, using /home/rguenther/obj-trunk-g/gcc=/aaa
> >
> > to make consistency checking work we need to record -fcanon-prefix-map
> > and the full set of -f{file,debug}-prefix-map options in order (I think
> > file and debug variants can be considered the same) of the first TU and
> > compare that to each of the following TUs.
> 
> Thanks a lot for the patch.  I tried it locally, and it indeed works for the
> simple example I posted in the description of this bug.  However, for some
> reason it doesn't seem to make a difference for the vim compilation.  I'm
> still seeing a directory table like the following:
> 
> Directory table:
>       [path(line_strp)]
>  0     /home/ubuntu/vim/vim-9.0.1378/src/vim-basic (0)
>  1     /usr/include/x86_64-linux-gnu/bits (57)
>  2     /usr/include (92)
> 
> whereas if I pass -fdebug-prefix-map to LDFLAGS, the directory table becomes:
> 
> Directory table:
>       [path(line_strp)]
>  0     /usr/src/vim-2:9.0.1378-2ubuntu2~ppa1/src/vim-basic (0)
>  1     /usr/include/x86_64-linux-gnu/bits (65)
>  2     /usr/include (100)
> 
> which is what I expected to see.

Odd.

> > Note a link-time specified option will simply ignore all options from the
> > compile-time (but only for the link-time unit, the compile-time debug info
> > has already been generated with the originally specified options).
> 
> FWIW, I think this bug is related to #108534 (and the related discussion at
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606205.html).

Yes, that looks related.  Note we do remap the file part of locations
but indeed not the streamed PWD.

When I remap PWD as well I get

 The Directory Table (offset 0x10a, lines 2, columns 1):
  Entry Name
  0     (indirect line string, offset: 0x18): /aaa
  1     (indirect line string, offset: 0xd): ../../../../aaa

 The File Name Table (offset 0x118, lines 2, columns 2):
  Entry Dir     Name
  0     0       (indirect line string, offset: 0x0): <artificial>
  1     1       (indirect line string, offset: 0x1d): t.c

for the toy example.  That's quite odd, but possibly the behavior of the
original intent of the patch you quoted - but it also shows that the
remapping of the streamed PWD is likely wrong?

For the record the above is with just

diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index 0bca530313c..89b602f080b 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -231,7 +231,8 @@ lto_output_location_1 (struct output_block *ob, struct bitpack_d *bp,
            }
          bp_pack_value (bp, stream_pwd, 1);
          if (stream_pwd)
-           bp_pack_string (ob, bp, get_src_pwd (), true);
+           bp_pack_string (ob, bp, remap_debug_filename (get_src_pwd ()),
+                           true);
          bp_pack_string (ob, bp, remapped, true);
          bp_pack_value (bp, xloc.sysp, 1);
        }

r10-6887-gd12153046816f9 did the original bits of remapping and shows we
originally passed through the remapping options.  r11-3096-g3d0af0c997fe42
was Jakub's fix for .debug_line and relative paths vs. changing CWD.
Comment 11 Richard Biener 2023-05-15 09:20:51 UTC
Btw, streaming of the CWD is prone to breakage when it changes between the
preprocessing and compilation stages.  I think at least for C family languages
the CWD would need to be recorded by the preprocessor and made available
via line directives somehow?

This also seems "wrong" without LTO.

> ls t.c tmp
t.c

tmp:
t.h  t.o
> gcc -E t.c -o t.i
> cd tmp
> gcc ../t.i -g -c
> readelf -w t.o
...
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x0): GNU C17 12.2.1 20220830 [revision e927d1cf141f221c5a32574bde0913307e140984] -mtune=generic -march=x86-64 -g
    <10>   DW_AT_language    : 12       (ANSI C99)
    <11>   DW_AT_name        : t.c
    <15>   DW_AT_comp_dir    : (indirect string, offset: 0x71): /tmp/tmp
...
 The Directory Table (offset 0x1c):
  1     tmp

 The File Name Table (offset 0x21):
  Entry Dir     Time    Size    Name
  1     0       0       0       t.c
  2     1       0       0       t.h

and LTO just adds another level of "directory changing" (but for sure a more
common one).
Comment 12 Sergio Durigan Junior 2023-05-16 22:25:10 UTC
Sorry, I have been busy with other things, but I'm paying attention to the developments here.

I still have to test the workaround I suggested (passing -fdebug-prefix-map to LDFLAGS) more broadly, because I think I may have found at least one scenario where it doesn't work.  Something else that's puzzling me is the fact that I don't see this behaviour everywhere; some packages do have the expected DW_AT_comp_dir even after being compiled with LTO enabled.
Comment 13 rguenther@suse.de 2023-05-17 07:23:59 UTC
On Tue, 16 May 2023, sergiodj at sergiodj dot net wrote:

> Sorry, I have been busy with other things, but I'm paying attention to the
> developments here.
>
> I still have to test the workaround I suggested (passing -fdebug-prefix-map to
> LDFLAGS) more broadly, because I think I may have found at least one scenario
> where it doesn't work.  Something else that's puzzling me is the fact that I
> don't see this behaviour everywhere; some packages do have the expected
> DW_AT_comp_dir even after being compiled with LTO enabled.

Yeah, it's clearly odd and we lack testsuite coverage completely.
Having small testcases that show cases that work and cases that do not
would be very useful in understanding the bits and how they do
(not) work together properly.