Type representation in CTF and DWARF
Indu Bhagat
indu.bhagat@oracle.com
Fri Oct 25 03:43:00 GMT 2019
On 10/11/2019 04:41 AM, Jakub Jelinek wrote:
> On Fri, Oct 11, 2019 at 01:23:12PM +0200, Richard Biener wrote:
>>> (coreutils-0.22)
>>> binary | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
>>> ls     |           30616 |              1136 |          21098 |               26240 | 0.62
>>> pwd    |           10734 |               788 |          10433 |               13929 | 0.83
>>> groups |           10706 |               811 |          10249 |               13378 | 0.80
>>>
>>> (emacs-26.3)
>>> binary       | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
>>> emacs-26.3.1 |          674657 |              6402 |         273963 |              273910 | 0.33
>>>
>>> I chose to account for 50% of .debug_str because, at this point, it would be
>>> unfair not to account for them. Actually, one could even argue that up to 70%
>>> of the .debug_str are names of entities. CTF section sizes do include the CTF
>>> string tables.
>>>
>>> Across coreutils, I see a geomean of 0.73 (ratio of
>>> .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
>>> "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
>>> footprint than CTF (with 50% of .debug_str accounted for).
>> I'm not convinced this "improvement" in size is worth maintaining another
>> debug-info format much less since it lacks desirable features right now
>> and thus evaluation is tricky.
>>
>> At least you can improve dwarf size considerably with a low amount of work.
>>
>> I suspect another factor where dwarf is bigger compared to CTF is that dwarf
>> is recording typedef names as well as qualified type variants. But maybe
>> CTF just has a more compact representation for the bits it actually implements.
> Does CTF record automatic variables in functions, or just global variables?
> If only the latter, it would be fair to also disable addition of local
> variable DIEs, lexical blocks. Does CTF record inline functions? Again, if
> not, it would be fair to not emit that either in .debug_info.
> -gno-record-gcc-switches so that the compiler command line is not encoded in
> the debug info (unless it is in CTF).
CTF includes file-scope and global-scope entities. So, CTF for a function
defined or declared at these scopes is available in the .ctf section, even if
the function is inlined.
To avoid generating DWARF for function-local entities, I tweaked
gen_decl_die to exit early when TREE_CODE (DECL_CONTEXT (decl))
is FUNCTION_DECL.
@@ -26374,6 +26374,12 @@ gen_decl_die (tree decl, tree origin, struct vlr_context *ctx,
   if (DECL_P (decl_or_origin) && DECL_IGNORED_P (decl_or_origin))
     return NULL;
 
+  /* Do not generate info for function local decl when -gdwarf-like-ctf is
+     enabled.  */
+  if (debug_dwarf_like_ctf && DECL_CONTEXT (decl)
+      && (TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL))
+    return NULL;
+
   switch (TREE_CODE (decl_or_origin))
     {
     case ERROR_MARK:
For the numbers in today's email:
1. CFLAGS="-g -gdwarf-like-ctf -gno-record-gcc-switches -O2". dwz is run on
   the generated binaries.
2. This time, I wanted to account for .debug_str entities more accurately
   (rather than the flat 50% used previously). Using a small script that
   counts the characters in "path-like" strings, specifically those strings
   that start with a ".", I gathered the data in the column named D5.
(coreutils-0.22)
binary | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
ls     |           14100 |               994 |          16945 |              1328 |               26240 | 0.85
pwd    |            6341 |               632 |           9311 |               596 |               13929 | 0.88
groups |            6410 |               714 |           9218 |               667 |               13378 | 0.85
Geomean across coreutils = 0.84
(emacs-26.3)
binary       | .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | path strings (D5) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D4-D5))
emacs-26.3.1 |          373678 |              3794 |         219048 |              3842 |              273910 | 0.46
> DWARF is a highly extensible format, what exactly is and is not emitted is
> something that consumers can choose.
> Yes, DWARF can be large, but mainly because it provides a lot of
> information, the actual representation has been designed with size concerns
> in mind and newer versions of the standard keep improving that too.
>
> Jakub
Yes.
I set out to provide some numbers on the size impact of CTF vs DWARF,
as it was a legitimate curiosity many of us have had. Comparing compactness or
feature matrices is only one dimension of evaluating the utility of supporting
CTF in the toolchain (including GCC; Binutils and GDB have already accepted
initial CTF support). The other dimension is a user-friendly workflow which
supports current users and eases further adoption and growth.
Indu