[PATCH] Various fixes for DWARF register size computation

Thu Jan 12 16:50:07 GMT 2023

Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Tue, Jan 03, 2023 at 02:25:21PM +0100, Florian Weimer wrote:
>> > Though, I still wonder, because all of this is a hack for a single target
>> > - x86_64-linux -m64 - I think no other target has similar constant
>> > sizes,
>> 
>> Really?  That's odd.
>
> I've tried about 30 cross compilers I had around, I admit it isn't
> exhaustive.
>
>> Is it because other architectures track callee-saved vector registers
>> through unwinding?
>
> I think it is far more than just vector registers, it can be floating point
> registers, or just one integral or special register of a different size etc.
>
>> > Or, if you want to do it on the compiler side, instead of predefining
>> > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__
>> > register conditionally a new builtin, __builtin_dwarf_reg_size,
>> > which would be defined only if -fbuilding-libgcc and the compiler determines
>> > dwarf_reg_size is desirable to be computed inline without a table and
>> > would fold the builtin to say that
>> > index <= 16U ? 8 : 0 on x86_64 -m64,
>> > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc.
>> > and if the expression is too large/complex, wouldn't predefine the builtin.
>> 
>> I think the pre-computation of the size array is useful even for targets
>> where the expression is not so simple, but the array elements are still
>> constants.  A builtin like __builtin_dwarf_reg_size could use a
>> reference to a constant static array, so that we can get rid of the
>> array initialization code in libgcc.  Before we can do that, we need to
>
> I think constant static array might be sometimes an option too, but not
> always, not just because of AArch64, or other potential poly-int arches
> (RISCV?), but also if something could depend on some runtime check.
> It is true that most of the time it is constant and depends just on the
> target or more often on target + options combo (mostly ABI related options).
>
>> figure out if the fully dynamic register sizes on AArch64 with SVE are
>> actually correct—and if we need to fix the non-SVE unwinder to work
>> properly for SVE programs.
>> 
>> So I don't want to revert the size array computation just yet.
>> 
>> > Or, is it actually the use of table that is bad on the unwinder side,
>> > or lack of a small upper bound for what you get from the table?
>> > In that case you could predefine upper bound on the sizes instead (if
>> > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__)
>> > __builtin_unreachable ()).
>> 
>> It also matters what kind of register sizes are used in practice.
>
> Yes, but that is hard to find out easily.  E.g. some registers might be
> only saved/restored in signal frames and nowhere else, others only rarely
> touched but still we'd need their sizes if they are ever used.
>
>> Should I repost this patch with the three nits fixed?  Or should I
>> revert two of the three patches I committed instead?
>
> I lean towards reversion and trying to figure out something that works
> on more than one arch.  It doesn't have to improve all arches (say
> AArch64 is clearly a nightmare that isn't handled correctly even without
> any changes - as noted in the PRs, if libgcc is built without SVE, it will
> hardcode 8, while if it is built with SVE, it will be runtime dependent and
> will be wrong in theory when some HW has 2048 bit SVE vectors - when it is
> 256 bytes), but still watching into what we compile -O2 -nostdinc -fbuilding-libgcc

I'm jumping in here without fully understanding the context, so maybe this
is exactly your point, but: the SIMD/FP DWARF registers are supposed to be
size 8 regardless of which features are enabled.  That's already only half
of the hardware register size for base Armv8-A, since Advanced SIMD registers
are 16 bytes in size.

So yeah, if we're using the hardware register size then something is wrong.

Thanks,
Richard

> static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>
> void
> foo (void)
> {
>   __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table);
> }
>
> on at least 10 most common arches would be useful and optimizing what is
> easily possible would be nice.
>
> 	Jakub