This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [RFC] DW_OP_piece vs. DW_OP_bit_piece on a Register


Sorry for the slow response...

Andreas Arnez <arnez@linux.vnet.ibm.com> writes:
> On Mon, Jan 25 2016, Matthew Fortune wrote:
> 
> > My dwarf knowledge is not brilliant but I have had to recently
> > consider it for MIPS floating point ABI changes aka FPXX and friends.
> > I don't know exactly where this fits in to your whole description but
> > in case it has a bearing on this we now have the following uses of
> DW_OP_piece:
> >
> > 1) double precision data split over two 32-bit FPRs
> > Uses a pair of 32-bit DW_OP_piece (ordered depending on endianness).
> >
> > 2) double precision data in one 64-bit FPR
> > No need for DW_OP_piece.
> >
> > 3) double precision data that may be in two 32-bit FPRs or may be in
> >    one 64-bit FPR depending on hardware mode
> > Uses a single 64-bit DW_OP_piece on the even numbered register.
> 
> Hm, so in 32-bit hardware mode the DWARF consumer is expected to treat
> the DW_OP_piece on the even numbered register as if it were two pieces
> from two consecutive registers?

Yes.

> Or should we rather consider the even
> numbered register to be 64 bit wide, where the right half shadows the
> next odd-numbered register?  If so, I believe you generally want pieces
> from FPRs to be taken from the left, correct?

No I think this is backwards it is the left half that shadows the next
register and pieces are taken from the right. I've attempted a description
below to see if it helps.

I don't believe (in the MIPS case) we could unconditionally view the even
numbered register as 64-bit or 32-bit as the shadowing onto the next
register only exists in some hardware modes.

The size of a register has to be determined from the current hardware mode
and then the logic would be to read as much as possible from the referenced
register and use it as the lower bits of the overall value. Then continue
reading consecutive registers filling the next most significant bits
until the full size of the DW_OP_piece has been read. This for MIPS
FP registers is endian agnostic as the higher numbered register always
has the most significant bits. For GPRs the even numbered register will
provide either the most or least significant bits depending on endian but
we have no reason to use this paradoxical DW_OP_piece for GPRs as they
have compile time deterministic size.

> > I'm guilty of not actually finishing this off and doing the GDB side
> > but the theory seemed OK when I did it! From your description this
> > behaviour best matches DW_OP_piece having ABI dependent behaviour
> > which would make it acceptable. These three variations can potentially
> > exist in the same program albeit that (1) and (3) can't appear in a
> > single shared library or executable. It's all a little strange but the
> > whole floating point MIPS o32 ABI is pretty complex now.
> 
> I don't quite understand why (1) and (3) can not coexist in the same
> shared library or executable.  Can you elaborate a bit?

Oops. Sorry it is (1) and (2) that can't coexist. I'm not sure you
really want to know the gory details but the explanation is below if
you're feeling brave.

The reason these can't co-exist is really just because there would need
to be yet another ABI variant/ELF marker to record the requirements of
such an executable. It would be a combination of FP64A and FP32 and that
would mandate a hardware mode of FR=1 FRE=1 which is the one mode that
we desperately do not want to be in as it uses kernel emulation to make
it all work. The combination of FP64A and FP32 ABIs is supported to
enable some level of transition from original o32 (FP32) through to FP64
without requiring moving everything to FPXX first. We allow this across
a shared library boundary to give just enough support for software to
transition. The aim is to encourage the full transition to FPXX rather
than going through a period of creating binaries that will always need
kernel emulation regardless of the host architecture.
 
> I'm curious about the interaction with vector registers.  AFAIK, vector
> registers on MIPS also embed the FPRs (left or right?).

Probably best to compare NEON with other SIMD. MSA works like NEON on
AArch64 rather than AArch32. I.e. it widens each register to 128-bit
and uses the same DWARF registers as FPRs. A DW_OP_piece therefore
corresponds to part of the 128-bit register. 

> Are the same DWARF register numbers used for both?

Yes. I think we can get away with using the same dwarf numbers as the
FPRs sit in the LSB of the vector register regardless of endian or
double/single data but that is a moot point, see below.

> And when taking a 64-bit DW_OP_piece from a vector register, would
> this piece correspond to the embedded FPR?

Strict architecture definition says no as the register sets do not
necessarily have to be the same. In reality all the implementations I
know of do simply have the FPU and SIMD unit use the same physical
register set. However we would/should never consider a DW_OP_piece
on a vector register to refer to the underlying FPR as there are
situations that the least significant 32-bits of an odd numbered
FPR do NOT correspond to the single precision value in the same
numbered register. This is the key to the FRE mode of execution and
it means we only treat an FP register as having the type/value that was
last written to it; if written as a vector then it should be accessed
as a vector, if written as a float/scalar then accessed as such. This
isn't really anything new as MIPS has always been strict about the
dynamic 'type' of an FPR and accessing then consistently.

A concrete example ($w is just an alias for $f):

LDI.W $w1, 1
MFC1 $4, $f1
COPY_S.W $5, $w1[0]

$4 and $5 can have different values in one of the hardware modes.

> Also, how do you think that the following should be represented in
> DWARF:

I think I've formed an opinion on this but it may be naïve! It matches
4.3 I believe from your original description but I offer a potential
solution to the SPU issue.

Concepts like SPU preferred slots should be represented as different
dwarf registers depending on the notional size of the register that
the dwarf producer assumed. This means that LSB is deterministically
known to be within that preferred slot and 0 is simply the LSB. The
maximum extent of that register is also capped so a piece of the
'word' sized dwarf register can only contain 32-bits; beyond that it
spills to the next register in an ABI dependent manner.
 
What I am saying above is that I think we should step back from the
idea that physical register size matters when generating dwarf and
instead ensure that dwarf registers only refer to the visible/valid
portion of a register. I.e. The dwarf consumer's interpretation of
hardware registers needs to be able to find the right portion of a
physical register for any given dwarf register and the remaining
bits are ignored. This allows any architectural register to extend
in any direction as long as the dwarf consumer is updated, the
producer should be shielded from some detail (which is what I think
I learnt from my MIPS FPXX trauma).

> * Odd-sized bit field in one of a vector register's elements;

I'd go for the bit position to be determined by identifying the
element as <element-bitsize>*<element-number> + <bit position>
where bit position is 0 for the LSB irrespective of endian.

> * odd-sized bit field spilled into an FPR;

Given what we have for MIPS then I'd have to say either a
paradoxical DW_OP_bit_piece where the bits beyond the size of the
specified FPR are taken from the next register in an ABI dependent
manner... or 2+ DW_OP_bit_piece referring the various portions
again having 0 for the LSB regardless of endian.

> * single-precision `complex float' living in two consecutive 64-bit
>   FPRs.

Not sure here... two 32-bit DW_OP_piece but in the presence of SPU
preferred slots this would need to use the dwarf register numbers for
'32-bit' SPU registers or it wouldn't know the correct position for
LSB. You have made me wonder what happens for this case in MIPS FPXX
though!

The more I try and think about this then the more problems I see, I
really don't know if my comments are helpful. Sorry if they're not.

Thanks,
Matthew

> 
> Thanks for your input!
> 
> --
> Andreas


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]