Bug 94845 - DWARF function name doesn't match demangled name in base type template parameters
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: 9.3.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-29 11:03 UTC by robert@ocallahan.org
Modified: 2024-01-21 15:54 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description robert@ocallahan.org 2020-04-29 11:03:11 UTC
Example:

template <typename T> void func(T s) {}
int main(void) {
  func<short>(-1);
  return 0;
}

$ g++ -g -o ~/tmp/test ~/tmp/test.cc && objdump -g ~/tmp/test|grep func
    <2a>   DW_AT_name        : (indirect string, offset: 0x0): func<short int>
    <31>   DW_AT_linkage_name: (indirect string, offset: 0x10): _Z4funcIsEvT_
$ c++filt _Z4funcIsEvT_
void func<short>(short)

It's unclear why 'short int' appears instead of just 'short'. It would be simpler if they were consistent (and, well, shorter). clang++ generates 'short'.
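
The mismatch can also be checked programmatically. The following is a minimal sketch, assuming a toolchain that provides abi::__cxa_demangle in <cxxabi.h> (GCC and Clang both do). The helper names demangle, function_name_part and dwarf_name_matches are made up for illustration, and function_name_part only handles the simple "<return type> <name>(<params>)" shape shown above.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle
#include <cstddef>
#include <cstdlib>    // std::free
#include <string>

// Demangle an Itanium-ABI symbol; returns an empty string on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string out = (status == 0 && raw) ? raw : "";
    std::free(raw);  // the demangler allocates with malloc
    return out;
}

// Hypothetical helper: cut "func<short>" out of "void func<short>(short)".
// Spaces and parentheses inside template brackets are ignored.
std::string function_name_part(const std::string& demangled) {
    int depth = 0;
    std::size_t start = 0;
    for (std::size_t i = 0; i < demangled.size(); ++i) {
        char c = demangled[i];
        if (c == '<') ++depth;
        else if (c == '>') --depth;
        else if (depth == 0 && c == ' ') start = i + 1;  // end of return type
        else if (depth == 0 && c == '(') return demangled.substr(start, i - start);
    }
    return demangled.substr(start);
}

// True when DW_AT_name agrees with what the demangler prints for the
// DW_AT_linkage_name of the same DIE.
bool dwarf_name_matches(const std::string& dw_at_name, const char* linkage_name) {
    return dw_at_name == function_name_part(demangle(linkage_name));
}
```

With the values from the objdump output above, dwarf_name_matches("func<short int>", "_Z4funcIsEvT_") is false, while the name clang++ emits, "func<short>", matches.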
Comment 1 robert@ocallahan.org 2020-04-29 11:29:45 UTC
One case where this causes problems is implementing a debugger where you want to be able to evaluate expressions containing type names. Type names containing template type parameters that are base types need to be normalized to match the type names in the debuginfo. g++ requires us to normalize those type names in a way that's different from the C++ demangler and from clang++.
Comment 2 Andrew Pinski 2020-04-29 11:50:28 UTC
Hmm, from http://wiki.dwarfstd.org/index.php?title=Best_Practices:
> For template instantiations, the DW_AT_name attribute should contain both the source language name of the object and the template parameters that distinguish one instantiation from another. The resulting string should be in the natural form for the language, and should have a canonical representation (i.e., different producers should generate the same representation). For C++, the string should match that produced by the target platform's canonical demangler; spaces should only be inserted where syntactically required by the compiler.
Comment 3 Andrew Pinski 2020-04-29 11:52:06 UTC
But that is just best practice; it does not mean a consumer of the DWARF does not need to consume slightly different but still correct DWARF.
Comment 4 Andrew Pinski 2020-04-29 12:00:26 UTC
See also PR 81932 where we talked about 2 and 2u
Comment 5 robert@ocallahan.org 2020-04-29 12:19:08 UTC
We do our best to consume what g++ produces, but in the situation of comment #1 that is difficult. Whether or not it's "correct DWARF" is really irrelevant; not matching the demangler causes real problems.

Thanks for the link to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81932. That is very similar to the problem I'm facing. No strategy was decided on there, but it seems to me that matching the demangled name would be a clear step in the right direction. I can't see how it could *hurt*.
Comment 6 Tom Tromey 2021-04-22 19:06:28 UTC
gdb does this canonicalization precisely because the form
in the DWARF cannot be relied upon.
It would be great to remove this, because it is expensive.

One idea for a migration route would be for g++ to promise
to emit the same form that the demangler emits; then
add an attribute to the comp-unit DIE saying that the names
have been canonicalized.  (Or, I suppose gdb could use
producer sniffing; but I'd rather avoid that as much as possible.)
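
On the consumer side, that migration route might look roughly like the sketch below. Everything in it is hypothetical: CompUnit is a simplified stand-in for a parsed comp-unit DIE, and names_canonical_attr stands for the proposed attribute, which does not exist in DWARF today.

```cpp
#include <string>

// Hypothetical, simplified view of a compilation-unit DIE.
struct CompUnit {
    std::string producer;               // DW_AT_producer string
    bool names_canonical_attr = false;  // hypothetical "names are canonical" attribute
};

enum class NameHandling { UseAsIs, Canonicalize };

// Skip the expensive canonicalization only when the producer has
// explicitly promised demangler-style names; otherwise keep doing it.
NameHandling choose_name_handling(const CompUnit& cu) {
    if (cu.names_canonical_attr) return NameHandling::UseAsIs;
    return NameHandling::Canonicalize;
}
```

The producer string is deliberately not consulted here, so the decision never depends on sniffing compiler versions; older debuginfo without the attribute simply takes the existing slow path.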
Comment 7 robert@ocallahan.org 2021-04-22 22:49:00 UTC
So gdb reads DW_AT_name "func<short int>", parses it, reserializes it to "func<short>", and uses that?
Comment 8 Tom Tromey 2021-04-23 01:38:07 UTC
(In reply to robert@ocallahan.org from comment #7)
> So gdb reads DW_AT_name "func<short int>", parses it, reserializes it to
> "func<short>", and uses that?

Yeah.  (Actually it's even worse than that, because at least one
compiler doesn't emit the template parameters in the name, so
in that case gdb will read the children of the DIE to try to
construct this form.)
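
A rough illustration of that reconstruction, not gdb's actual code: Die is a made-up, simplified structure, and only template type parameters are handled (no non-type parameters, no nested-template spacing rules).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified DIE holding just what this sketch needs.
struct Die {
    std::string name;  // DW_AT_name without template arguments, e.g. "func"
    // Type names referenced by the DW_TAG_template_type_parameter children,
    // in declaration order.
    std::vector<std::string> template_type_params;
};

// Build "name<arg1, arg2>" from the bare name plus the child DIEs.
std::string name_with_template_args(const Die& die) {
    if (die.template_type_params.empty()) return die.name;
    std::string out = die.name + "<";
    for (std::size_t i = 0; i < die.template_type_params.size(); ++i) {
        if (i != 0) out += ", ";
        out += die.template_type_params[i];
    }
    out += ">";
    return out;
}
```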

I think the reasoning behind the canonicalization is two-fold.
First, I think we tried to get g++ changed, back in the day,
without success.

Second, gdb has to canonicalize user input anyway, so that
things like "print func<short int>(3)" or "break func<short int>"
work.  And once you have a canonicalizer it is simpler to just
use it to work around the problem.
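
To give a feel for the simplest slice of such a canonicalizer, here is a sketch that only rewrites runs of integer-type keywords into the demangler's spelling. It is nowhere near a real C++ name parser and is not how gdb implements it; the function names are invented, and the spellings other than "short int" are assumptions about what a producer or a user might write.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Read the identifier-like word starting at pos (may be empty).
static std::string read_word(const std::string& s, std::size_t pos) {
    std::size_t end = pos;
    while (end < s.size() && is_ident_char(s[end])) ++end;
    return s.substr(pos, end - pos);
}

static bool is_int_keyword(const std::string& w) {
    return w == "short" || w == "long" || w == "int" || w == "char" ||
           w == "signed" || w == "unsigned";
}

// Spell an integer type the way the demangler does, e.g. "unsigned short".
static std::string spell_int_type(bool is_unsigned, bool is_signed, bool is_char,
                                  int shorts, int longs) {
    if (is_char) return is_unsigned ? "unsigned char" : is_signed ? "signed char" : "char";
    std::string base = shorts ? "short" : longs >= 2 ? "long long" : longs == 1 ? "long" : "int";
    return is_unsigned ? "unsigned " + base : base;
}

// Rewrite every run of integer keywords ("short int", "long unsigned int")
// in a name; all other text is copied through unchanged.
std::string canonicalize_base_types(const std::string& name) {
    std::string out;
    std::size_t i = 0;
    while (i < name.size()) {
        if (!is_ident_char(name[i])) { out += name[i++]; continue; }
        std::string word = read_word(name, i);
        if (!is_int_keyword(word)) { out += word; i += word.size(); continue; }
        bool is_unsigned = false, is_signed = false, is_char = false;
        int shorts = 0, longs = 0;
        for (;;) {
            if (word == "unsigned") is_unsigned = true;
            else if (word == "signed") is_signed = true;
            else if (word == "char") is_char = true;
            else if (word == "short") ++shorts;
            else if (word == "long") ++longs;
            i += word.size();
            // Extend the run only across a single space followed by another keyword.
            if (i + 1 >= name.size() || name[i] != ' ') break;
            word = read_word(name, i + 1);
            if (!is_int_keyword(word)) break;
            ++i;  // consume the separating space
        }
        out += spell_int_type(is_unsigned, is_signed, is_char, shorts, longs);
    }
    return out;
}
```

Applied to both the DWARF name and the user's input, "func<short int>" and "func<short>" end up as the same string, which is the property the lookup needs.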
Comment 9 robert@ocallahan.org 2021-04-23 02:27:12 UTC
That makes sense ... well, except implementing a full C++ parser and reserializer is horrific.
Comment 10 Tom Tromey 2022-10-21 17:48:20 UTC
See also bug #49130 and bug #49537, which we filed when
gdb hit these same problems.