This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Status of the MMIX GCC port: runs testsuite. Time to check in?


Hi.  I'll try to address both mmixmasters and GCC people here.

Some of you (at least some of the mmixmasters) are perhaps
interested in knowing the state of the MMIX GCC port.

Summary:
- GCC compiles and works for trivial useless programs.
- I've written "stupid" assembler, linker and archiver
  wrapper-scripts, enough to compile simple programs so I can
  run the GCC test-suite.  Likewise necessary stub headers.
- An ABI is sketched.  The ABI suggested by Knuth does not
  fit exactly, but I'll add support to interface to code written
  for that ABI.
- The GCC test-suite breaks, but not all over the place.
- For the die-hard interested, the GCC code is provided as a
  patch to current (2000-02-20 15:11 CET) GCC CVS at
  <URL:http://bitrange.com/mmix/gcc/gcc-mmix-1.patch.gz>
  and further info linked from <URL:http://bitrange.com/mmix/>.
  I'd like to check the port into GCC as soon as I get an OK.
- The "fake" assembler, linker, archiver, libc, libm and crt0
  stubs and header files are available at
  <URL:http://bitrange.com/mmix/binutils/mmix-fakes-1.tar.gz>.


 First, the ABI.
See Knuth's paper (1 or 2), where an ABI for traditional
function calling is outlined, preferred for efficiency.
However, I'm not convinced that what Knuth mentions there is
(a subset of) an optimal ABI.  Before you cry "heresy!" or
"ignoramus!" read on: 

Knuth proposes, in passing, to pass parameters in registers $0
up to $N (where N < rG), as seen by the called function, with
return values passed back similarly in registers starting at
$0.  From $N and up, the contents on return are either part of a
return value or junk.  Because of the register stack (register
windows, somewhat SPARC-like), the parameters must be stored by
the caller in registers starting at $N, where N is equal to the
number of call-saved registers in the calling function, that is,
registers that will have their value retained over the function
call.

GCC people now probably see at least one problem:  GCC has
support for renumbering of windowed registers (see (3)
FUNCTION_ARG, FUNCTION_INCOMING_ARG, INCOMING_REGNO and
OUTGOING_REGNO).  However, this renumbering is static.
Furthermore, when assigning registers for function parameters,
register allocation hasn't even started yet, so N is unknown.
Guesses or intermediate registers have to be used, or perhaps
registers can be renumbered before output.  While it seems
tricky to get a reasonably good register allocation here, I
think there's a reason (whew!) to pass parameters in other
registers.
 On the other hand I have a funny feeling that Knuth is very
aware of how GCC works and is posing this problem as an
exercise. :-)
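
To make the renumbering problem above concrete, here is a toy
model in C.  It is only a sketch under the simplified description
in this message (the caller stores arguments from its register $N
and up, the callee sees them from $0 and up).  The function names
are made up for illustration; they are not GCC macros.

```c
/* Toy model of argument register numbering; not GCC code.
   Assumption: follows the simplified description in this message,
   where a caller with N call-saved registers places argument i
   in its own register $(N + i), and the callee sees it as $i.  */

/* Register number as seen by the caller (Knuth-style convention).  */
int knuth_outgoing_regno (int n_call_saved, int arg_index)
{
  return n_call_saved + arg_index;
}

/* Register number as seen by the callee: independent of N.  */
int knuth_incoming_regno (int arg_index)
{
  return arg_index;
}

/* A static renumbering would need this difference to be one
   constant for every function in the program.  */
int knuth_window_shift (int n_call_saved, int arg_index)
{
  return knuth_outgoing_regno (n_call_saved, arg_index)
         - knuth_incoming_regno (arg_index);
}
```

For a caller with 3 call-saved registers the shift is 3, for one
with 7 it is 7.  The shift is a per-function quantity that is only
known after register allocation, which is the mismatch with a
static renumbering.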

For any ABI on MMIX, registers $0 and up to rG are preferably
call-saved registers.  I assume here that any good ABI saves
registers to the register stack at the caller, using the
push-and-jump PUSHJ instruction.  This seems the only reasonable
way to save multiple registers in MMIX.  So far I agree with
the ABI in Knuth's paper.  But I like to believe that
parameters are best passed in call-clobbered registers, with
return values returned similarly.
 At least that's what I found in my master's thesis (4).
(It also seems to be common practice for ABIs where parameters
are passed in registers, but common practice is at times
uncorrelated to an optimal solution. ;-)
Admittedly, this is comparing apples and oranges, particularly
since CRIS is (for the most part) a two-operand machine, and MMIX
is (for the most part) a three-operand machine.  And of course,
this is comparing gcc-2.7.2 vs. current CVS.

Nevertheless, it should be obvious (alright, I'm hand-waving :-)
that passing parameters incoming in $0 and up will more often
increase the number of saved registers, compared to passing
parameters in call-clobbered registers.  For the same reason,
return-values should go in call-clobbered registers (from the
caller's view) rather than saved registers.

Here's a proposed partial ABI (for the rest, ask or RTFS).  I
assume rG is 32, but if it seems usable, it will be parametrized
later, using for example "-mrG=...".
- Global registers $32..$199 are call-saved registers.  They
  can be assigned variables using GCC-specific syntax; see the
  GCC info pages.  It seems feasible to eventually provide a way
  to specify them anonymously.  For example 'register int foo
  asm ("GREG");' rather than 'register int foo asm ("$42");',
  letting the assembler and linker associate "foo" with an
  available register.
- Global registers $200..$253 are call-clobbered.
- All integer-type parameters are promoted to 64 bits.
- Parameters are passed (left-to-right) in registers $200..$231
  when they fit in 64 bits, otherwise by reference to a
  read-only copy.
- Register $254 is the stack pointer.
- Register $255 is a call-clobbered register, a temporary for
  short-term use as Knuth writes.  It is never allocated (a
  "fixed" register), but sometimes used in kludgy multi-insn
  patterns and such.  (The binutils port will use it when
  relaxing for out-of-reach operands.) 
- Parameters number 32 and up are passed on stack (the "normal"
  one, not the register stack), aligned to octabytes if they
  fit, else by reference to a read-only copy. 
- Scalar return values are passed back in registers $200 and
  $201.  Structure return values are returned by XXXX
- Register allocation in each function is preferred (loosely,
  RTFS for the truth) in order $253..$200, then $0..$199.
- STRUCT_VALUE_REGNUM is $252
- STATIC_CHAIN_REGNUM is $253
- FRAME_POINTER_REGNUM is $199
- Objects must be put on addresses aligned to their size
  (STRICT_ALIGNMENT).
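
The parameter-passing rules in the list above can be sketched in
C as follows.  This is not code from the port, only a model of
the rules as stated.  The names (classify_arg and friends) are
hypothetical, parameters are assumed to be counted from 0, and a
parameter passed by reference is assumed to use the same slot it
would otherwise have occupied.

```c
/* Sketch of the proposed parameter-passing rules; not GCC code.
   Assumptions: parameters are counted from 0, and a parameter
   passed by reference occupies the slot it would otherwise use.  */

enum arg_place { ARG_IN_REGISTER, ARG_ON_STACK };

struct arg_location
{
  enum arg_place place;
  int regno;          /* Valid only for ARG_IN_REGISTER, else -1.  */
  int by_reference;   /* Nonzero: pointer to a read-only copy.  */
};

#define FIRST_ARG_REGNO 200
#define MAX_REG_ARGS 32      /* Registers $200..$231.  */
#define OCTABYTE_SIZE 8      /* 64 bits.  */

struct arg_location classify_arg (int index, unsigned long size_in_bytes)
{
  struct arg_location loc;

  /* Anything wider than 64 bits is passed by reference.  */
  loc.by_reference = size_in_bytes > OCTABYTE_SIZE;

  if (index < MAX_REG_ARGS)
    {
      loc.place = ARG_IN_REGISTER;
      loc.regno = FIRST_ARG_REGNO + index;
    }
  else
    {
      /* Parameter number 32 and up goes on the "normal" stack.  */
      loc.place = ARG_ON_STACK;
      loc.regno = -1;
    }
  return loc;
}
```

With this model, classify_arg (0, 4) lands in $200,
classify_arg (31, 8) in $231, and classify_arg (32, 8) on the
stack.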

Here's how I figure basic types:
- A char is 8 bits.  It is signed by default.
- A short int is 16 bits.
- An int is 32 bits.
- A long int is 64 bits.
- A long long int is (also) 64 bits.
- A float is 32 bits.
- A double is 64 bits.
- A long double is (also) 64 bits.
- A wide-character is an "unsigned int".
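
For reference, the sizes above can be written down as a small
lookup table in C.  The table and the function mmix_type_bits are
illustrative only and not part of the port; the entry for wchar_t
assumes the wide-character type follows the 32-bit "unsigned int"
listed above.

```c
#include <string.h>

/* Proposed widths in bits, copied from the list above.  The table
   and lookup function are illustrative only, not part of the port.  */
struct type_width { const char *name; int bits; };

static const struct type_width mmix_type_widths[] = {
  { "char", 8 }, { "short", 16 }, { "int", 32 }, { "long", 64 },
  { "long long", 64 }, { "float", 32 }, { "double", 64 },
  { "long double", 64 }, { "wchar_t", 32 }
};

/* Return the proposed width of NAME in bits, or -1 if not listed.  */
int mmix_type_bits (const char *name)
{
  size_t i;
  size_t n = sizeof mmix_type_widths / sizeof mmix_type_widths[0];

  for (i = 0; i < n; i++)
    if (strcmp (mmix_type_widths[i].name, name) == 0)
      return mmix_type_widths[i].bits;
  return -1;
}
```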

A rationale might be in order for the integer sizes: I chose
different sizes for char, short, int and long so people can
easily use these to get the data sizes they expect, without
using GCC attributes.  These settings also somewhat match those
of the Alpha.  I'd really like to change "int" to be 64 bits;
GCC is not good at "augmenting" operations from 32 to 64 bits,
for instance.  Having long long as 128 bits seems more natural,
but will just cause a lot of trouble currently (compiling on a
32-bit host), so I leave it for now.

To be clear, this ABI is not in any way "cast in stone"; it just
lets me go on with the port.  Your input is requested.  Everything
can (and probably will) change several times.  If not before,
then at least after thorough measurements on real programs using
the mmix simulator.

 State of the GCC port.
It seems to emit valid assembly code.  It does not run "Hello,
world", since there's no real libc.  I believe someone is
working on porting newlib.

Regardless of the libc issue, I do not think GCC plus the
fake-scripts plus mmixal is in a shape that it can be used to
measure performance on real-world programs yet.  For example, I
haven't solved the problem with "extern char foo[]; char *bar =
foo + 1;", which is that GCC emits an initialization for "bar"
which is an arithmetic expression with an unknown symbol (foo).
Unfortunately, "mmixal" cannot cope.  I'm not even sure I want
to waste time hacking around it; time is better spent porting
binutils.  There's no C++ support for similar reasons.
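
The troublesome construct, spelled out as a small C file.  This
is a sketch only: in the real problem case "foo" is merely
declared here and defined in another file, while in this version
it is defined locally so the example is self-contained.  The
helper second_char_of_foo is made up for illustration.

```c
/* In the problem case FOO is only declared here ("extern char foo[];")
   and defined in another file.  It is defined locally in this sketch
   so the example is self-contained.  */
char foo[] = "hello";

/* The initializer is an address constant: the address of FOO plus 1.
   With FOO in another file, the compiler can only emit it
   symbolically (symbol plus offset) and leave it to the assembler
   and linker to resolve.  */
char *bar = foo + 1;

/* Return the character BAR points at: the second one in FOO.  */
char second_char_of_foo (void)
{
  return *bar;
}
```

The point is that the value of "bar" is not known at compile
time, so whatever processes the assembly output must be able to
handle an expression involving a symbol defined elsewhere.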

I've not yet implemented, but have added hooks and a framework
for, options named "-mabi=...", with "-mabi=gcc" being the
current default proposed above, and "-mabi=mmixware" for the
Knuth ABI.  The latter is meant to generate code for Knuth's
proposed calling convention, kludgily copying to and from
registers at function entry, exit and call.
 This could be used to interface with assembler code written
using that ABI.  It can later be implemented with a register
renumbering pass, which should be generalized or can at least be
implemented by abusing MACHINE_DEPENDENT_REORG.
Perhaps the Knuth ABI can eventually be added somewhat less
suboptimally, despite the problems described above, so the
different ABIs can be measured satisfactorily.

Most other details are complete although suboptimal in some
areas; test-suite problems *should* be with real bugs rather
than an incomplete port.  Not that the port emits optimal code
yet, but that will improve.
GCC test summary:
# of expected passes		5348
# of unexpected failures	2613
# of unexpected successes	1
# of expected failures		24
# of unresolved testcases	1189
# of unsupported tests		35
/gcc/xgcc version 2.96 20000220 (experimental)
There are instructions at (5) to set up things so you can
execute the gcc test-suite.


 The assembler, linker and archiver & Co.
(This is a bit off-topic for GCC, I'm afraid).
Since Knuth's MMIX assembler mmixal does not emit relocatable
code, I had to take steps to fix up the situation.  I hope to
port binutils soon, but unfortunately there's a delay from words
to usable work.
 In the meantime, there's an assembler that is a shell script
that transforms parameters and calls "cat", an archiver that
inserts begin-file- and end-file- markers and extracts and adds
files using "sed".  Localization of symbols is handled by adding
a " PREFIX P" in each file (they stack up nicely), but declaring
and referencing externally visible symbols prefixed with the
global namespace prefix ":".  The mmixal assembler is called
with "-b 10000", so we probably will not see breakages for
reasons of symbol length for a little while.

The linker "cat"s together files and calls "mmixal" (which is
assumed to be in the path) with the result, so there had better
be no loose ends in the libraries.  With some options, it wraps a
shell script around the uuencoded program; when run, the script
calls the "mmix" simulator as if the program were running natively.
For the running program, the cycle count (mems and oops) from
the "-s" simulator option can be redirected to a file specified
at link time or through an environment variable.  With the help
of crt0, it also passes on a zero or nonzero exit code from the
simulated program, working around the fact that "mmix" does not
provide an exit code.  This way, I can in one blow both run the
GCC test-suite without fiddling with DejaGNU for now, and have
the necessary machinery to run separate programs more easily.

For the planned binutils port, using ELF for object files seems
best, by default providing linked programs in the MMO object
format.  It seems all ELF/DWARF debugging information can be
wrapped into BSPEC/ESPEC pairs, so most debugging information
can be retained if MMO files are translated back to ELF for some
reason.  The GAS port should naturally accept the MMIX assembly
format as well as conventional format; it seems they can
peacefully coexist.  At least I'm sure they can if there's an
initial pseudo-directive to specify the format.


References:
1 MMIX Definition of architecture details, pp 23
  <URL:http://www-cs-faculty.stanford.edu/~knuth/mmix-doc.ps.gz>

2 (More verbose than 1) TAOCP Fascicle Number One, pp 59
  <URL:http://www-cs-faculty.stanford.edu/~knuth/fasc1.ps.gz>

3 Using and Porting GCC
  <URL:http://gcc.gnu.org/onlinedocs/gcc_toc.html>

4 Porting the GNU C Compiler to the CRIS architecture, chapter
  5.7 (not included in "Porting GCC for Dummies").  Available at
  <URL:ftp://ftp.axis.se/pub/axis/tools/cris/misc/rapport.ps.gz>
  (260K, also as .ps, 1.1M)

5 Installation instructions to setup tools for running the gcc
  test-suite.
  <URL:http://bitrange.com/mmix/install.html>

6 The MMIX assembly language and loader format.
  <URL:http://www-cs-faculty.stanford.edu/~knuth/mmixal-intro.ps.gz>

7 MMIX news page.  There be MMIX links here.
  <URL:http://www-cs-faculty.stanford.edu/~knuth/mmix-news.html>

8 MMIXmasters <URL:http://www.mmixmasters.org/>

brgds, H-P


