Bug 100437 - libiberty: Support more characters for function clones
Summary: libiberty: Support more characters for function clones
Alias: None
Product: gcc
Classification: Unclassified
Component: demangler
Version: 11.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Depends on:
Reported: 2021-05-05 18:04 UTC by Fangrui Song
Modified: 2021-05-06 06:28 UTC

See Also:
Known to work:
Known to fail:
Last reconfirmed:


Description Fangrui Song 2021-05-05 18:04:33 UTC
In the demangler, the ('.' (alpha|'_')+) ('.' digit+)* scheme implemented for PR40831 accepts decimal numbers in a clone suffix but not hexadecimal ones.
It would be great to support hexadecimal (or a wider character set, e.g. base64).
There are at least two use cases in clang now.

1. In Clang ThinLTO, a local symbol needs to be promoted to a global symbol so that it can be imported into other modules.
  Such a symbol gets a suffix containing a hash (a simple increasing ID scheme cannot avoid collisions), e.g. _ZL5localv.llvm.104029495979337208

  % c++filt <<< _ZL5localv.llvm.104029495979337208
  local() [clone .llvm.104029495979337208]

  # A suffix with mixed digits and letters (e.g. many hexadecimals) doesn't work.
  % c++filt <<< _ZL5localv.llvm.11aa

2. clang -funique-internal-linkage-names -c a.cc  # use clang trunk
  (Improve profile accuracy for local symbols)
  The suffix is a long decimal representation of an MD5 module hash.


If more characters are allowed in the numeric part, clang can switch to a denser encoding, so that shorter symbol names can be used, saving .strtab space.
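
To put rough numbers on the space argument, here is a minimal sketch that compares how long a 128-bit MD5 digest is in decimal, hexadecimal, and base64. The input `module_id` is a hypothetical placeholder, not what clang actually hashes, and the sketch does not claim to reproduce clang's suffix format.

```python
import base64
import hashlib

# Hypothetical module identifier; the real input to the hash is not
# shown in this report.
module_id = b"a.cc"

digest = hashlib.md5(module_id).digest()         # 16 bytes = 128 bits
as_decimal = str(int.from_bytes(digest, "big"))  # at most 39 digits
as_hex = digest.hex()                            # always 32 characters
# Standard base64 alphabet with the trailing '=' padding removed.
as_base64 = base64.b64encode(digest).decode().rstrip("=")  # 22 characters

for name, text in [("decimal", as_decimal), ("hex", as_hex), ("base64", as_base64)]:
    print(f"{name:8} {len(text):2} {text}")
```

Note that the standard base64 alphabet includes '+' and '/', so a real scheme would presumably need a symbol-safe variant.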

I understand that the original digit/letter separation is there to allow multiple clones.
There should be some way to support more characters.
If it is not useful to know that there are 4 clones (as in the example below), could the restriction simply be lifted?

% c++filt <<< _ZL5localv.llvm.aaa.000.bbb.111.ccc.222
local() [clone .llvm] [clone .aaa.000] [clone .bbb.111] [clone .ccc.222]
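
The splitting shown above can be modelled with a short sketch. It follows only the scheme as described in this report, not libiberty's actual code; `split_clones` is a hypothetical helper, and treating "alpha" as `[A-Za-z]` is an assumption.

```python
import re

# One clone suffix as described in this report:
#   '.' (alpha|'_')+  followed by zero or more  '.' digit+  groups.
# Assumption: "alpha" is taken to mean [A-Za-z].
CLONE = re.compile(r"\.[A-Za-z_]+(?:\.[0-9]+)*")

def split_clones(suffix):
    """Return the clone suffixes in order, or None if some part does not fit."""
    clones = []
    pos = 0
    while pos < len(suffix):
        m = CLONE.match(suffix, pos)
        if m is None:
            return None  # leftover text fits neither the name nor the digit form
        clones.append(m.group(0))
        pos = m.end()
    return clones

print(split_clones(".llvm.104029495979337208"))
print(split_clones(".llvm.aaa.000.bbb.111.ccc.222"))
print(split_clones(".llvm.11aa"))
```

Under this model the first suffix is a single clone, the second splits into four clones, and the mixed digit/letter suffix `.llvm.11aa` is rejected because `aa` is left over after `.llvm.11`.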