This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: PDP-10 backend for gcc
- To: Michael Meissner <meissner at cygnus dot com>
- Subject: Re: PDP-10 backend for gcc
- From: lars brinkhoff <lars at nocrew dot org>
- Date: 06 Sep 2000 11:15:11 +0200
- Cc: law at cygnus dot com, Alan Lehotsky <lehotsky at tiac dot net>, gcc at gcc dot gnu dot org
- References: <9718.967737237@upchuck> <85u2c0fp61.fsf@junk.nocrew.org> <20000905190538.39634@cse.cygnus.com>
Michael Meissner <meissner@cygnus.com> writes:
> > There are two kinds of pointers:
> >
> > Word pointers. These are 30-bit values in 36-bit words, and are
> > used to point to anything at least one word in size. Integer
> > arithmetic works. I think PSImode will work here.
> >
> > Byte pointers. These are 36-bit values in 36-bit words. The
> > lower 30 bits is a word pointer, and the upper 6 bits is a code
> > to locate the byte (i.e. char or short) within the word pointed
> > to. Since normal ints are 36-bits, SImode can't be used for byte
> > pointers. Maybe a new mode will have to be invented?
>
> As the others have said, GCC internally believes that all pointers
> are byte pointers, and it can just add 1 to a pointer to increment
> to the next byte. That means without reworking the code, you would
> not be able to use any other format, such as using the upper bits
> for the byte offset, etc.
I even found a reference to this in Using and Porting GCC, in the
section "RTL Template":
    `(address (match_operand:M N "address_operand" ""))'
    This complex of expressions is a placeholder for an operand number
    N in a "load address" instruction: an operand which specifies a
    memory location in the usual way, but for which the actual operand
    value used is the address of the location, not the contents of the
    location.

    `address' expressions never appear in RTL code, only in machine
    descriptions. And they are used only in machine descriptions that
    do not use the operand constraint feature. When operand
    constraints are in use, the letter `p' in the constraint serves
    this purpose.

    M is the machine mode of the *memory location being addressed*,
    not the machine mode of the address itself. That mode is always
    the same on a given target machine (it is `Pmode', which normally
    is `SImode'), so there is no point in mentioning it; thus, no
    machine mode is written in the `address' expression. If some day
    support is added for machines in which addresses of different
    kinds of objects appear differently or are used differently (such
    as the PDP-10), different formats would perhaps need different
    machine modes and these modes might be written in the `address'
    expression.

The PDP-10 may need three pointer machine modes, perhaps named like this:

    QIPmode    pointer to char
    HIPmode    pointer to short
    SIPmode    pointer to objects at least an int in size

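To make the three-way split concrete, here is a rough C sketch of how
a pointer mode might be picked from the size of the pointed-to object.
The enum, the function name, and the 18-bit size for short are all
assumptions made for illustration; none of this is existing GCC code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three proposed pointer modes. */
enum pdp10_ptr_mode { QIP_MODE, HIP_MODE, SIP_MODE };

/* Choose a pointer mode from the pointee size in bits.
   Assumed sizes: char = 9 bits, short = 18 bits, int = 36 bits. */
enum pdp10_ptr_mode pointer_mode_for_bits(unsigned pointee_bits)
{
    assert(pointee_bits > 0);
    if (pointee_bits >= 36)
        return SIP_MODE;   /* word pointer: plain integer arithmetic works */
    if (pointee_bits >= 18)
        return HIP_MODE;   /* byte pointer locating a halfword */
    return QIP_MODE;       /* byte pointer locating a 9-bit char */
}
```

Under these assumptions a struct of two ints (72 bits) would still get
the word pointer mode, since it is at least one word in size.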
> If I remember my computer architecture classes, the classical way to
> represent characters on a -10 was to have 5 7-bit fields with a bit
> left over. This violates ISO C, which mandates that all types be a
> whole number of bytes (so that memcpy works), and that a byte be at
> least 8 bits. This would mean you would have to use either 36-bit
> bytes, or 9-bit bytes. The path of least resistance is to have
> 36-bit 'bytes'. This is the path chosen by other word oriented
> machines, such as the C4x.
9-bit chars will probably be used. I don't think that 36-bit chars
would be acceptable. When doing text I/O, the 7-bit characters will
be zero-extended to 9 bits.
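As an illustration of the 9-bit layout, the sketch below packs four
7-bit characters into one 36-bit word, zero-extending each to 9 bits.
A uint64_t stands in for the 36-bit word, and placing the first
character in the most significant bits is an assumption, as are the
function names.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 7-bit characters into a 36-bit word (held in a uint64_t).
   Each character is zero-extended to 9 bits; the first character is
   assumed to land in the most significant 9 bits. */
uint64_t pack_word_9bit(const unsigned char text[4])
{
    uint64_t word = 0;
    for (int i = 0; i < 4; i++) {
        assert(text[i] <= 0x7F);              /* input must be 7-bit */
        word = (word << 9) | (uint64_t)text[i];
    }
    return word;
}

/* Extract the 9-bit char at position index (0 = leftmost). */
unsigned extract_9bit(uint64_t word, int index)
{
    assert(index >= 0 && index < 4);
    return (unsigned)((word >> (9 * (3 - index))) & 0x1FF);
}
```

With this layout the top two bits of every 9-bit char are zero for
plain text, so four characters fit in a word where the classical
7-bit packing holds five.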
> It would be useful to have GCC be able to deal with different sized
> pointers and/or different encodings for the pointers. I have run
> into this on some machines, such as the Mitsubishi D10V (function
> pointers are 16-bit word pointers, while data pointers are 16-bit
> byte pointers), but I suspect you will have to get buy-in from the
> people that work on the front ends (ie, the TREE interface), since
> that is where a lot of the work will need to be done. Given this
> byte pointer-ness has been in the compiler since its inception, I
> suspect you will find undocumented assumptions rife throughout the
> compiler.
I was hoping that the front ends would not be affected much at all.
Say, if the back end is presented with a constant char *, it could
do the necessary translation into a PDP-10 byte pointer. Similarly,
there could be machine description patterns for pointer arithmetic.
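
A rough C sketch of the kind of translation meant here, assuming 9-bit
chars (four per word) and a made-up encoding in which the upper 6 bits
simply hold the byte index within the word. The real PDP-10 encoding
of that field is different and is not reproduced here; the function
names are hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_ADDR_BITS 30
#define WORD_ADDR_MASK ((UINT64_C(1) << WORD_ADDR_BITS) - 1)
#define BYTES_PER_WORD 4    /* assumption: 9-bit chars in a 36-bit word */

/* Turn a linear byte address (the form GCC assumes, where +1 means
   "next byte") into a 36-bit byte pointer held in a uint64_t.
   Hypothetical encoding: upper 6 bits = byte index within the word. */
uint64_t byte_pointer_from_linear(uint64_t linear)
{
    uint64_t word = linear / BYTES_PER_WORD;
    uint64_t code = linear % BYTES_PER_WORD;
    assert(word <= WORD_ADDR_MASK);
    return (code << WORD_ADDR_BITS) | word;
}

/* Inverse translation: byte pointer back to a linear byte address. */
uint64_t byte_pointer_to_linear(uint64_t bp)
{
    uint64_t word = bp & WORD_ADDR_MASK;
    uint64_t code = bp >> WORD_ADDR_BITS;
    return word * BYTES_PER_WORD + code;
}

/* Pointer arithmetic on byte pointers. Adding n to the packed value
   would move n words, not n bytes, so go through the linear form. */
uint64_t byte_pointer_add(uint64_t bp, int64_t n)
{
    int64_t linear = (int64_t)byte_pointer_to_linear(bp) + n;
    assert(linear >= 0);
    return byte_pointer_from_linear((uint64_t)linear);
}
```

Stepping from the last byte of word 0 with byte_pointer_add(bp, 1)
gives the first byte of word 1, whereas adding 1 to the packed value
would give the last byte of word 1. That difference is the case a
machine description pattern for pointer arithmetic would have to
handle.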