Better handling for multi-word values?

Bernd Schmidt
Fri Sep 4 14:21:00 GMT 1998

Currently, the code that is generated by GCC for multi-word operations is
somewhat suboptimal.  A multi-word value (which occurs for long longs, and
for structures/arrays that GCC decides to keep in a register) is stored in a
pseudo registers that is wider than a word (e.g. DImode for most 32 bit
targets).  The register allocator must either allocate several consecutive
hard registers for them, or place them into memory.
However, most of these multi-word operations are actually emitted as a sequence
of operations on word-sized parts of the register.  And since it typically
is not required that these parts are allocated to consecutive hard registers,
this requirement often leads to inefficient code, especially (again...) on
a machine like the 386 which has a poor supply of registers.

My question is now, what could be done about it?  I've played with the compiler
a bit yesterday and came up with a half-done patch (which I won't post here,
since it doesn't work properly yet; I can send it on request).  I'll describe
what I did so that people can comment on whether they think it's a good
starting point or not.

I added a new rtx code, MULTIWORD, which represents a value that consists of
multiple word-sized parts.  The idea is that instead of a DImode pseudo, two
SImode pseudos are allocated and put inside of a MULTIWORD:

 [(reg:SI 93)
  (reg:SI 94)])

I tried to make this behave as similar to a DImode pseudo as possible by
modifying general_operand and register_operand to recognize them.  Also, I
changed operand_subword and gen_lowpart to extract the correct subparts of
such an object.  Since pseudo reg rtxs are unique and never duplicated, I
made sure that the same was true of MULTIWORDs, and that validate_change
would reject any change within such an rtx.
A few other modifications in some parts of the tree to rtl conversion were
necessary, but after that the compiler already did something that resembled
the right thing.  The machine description now had the control over whether
an operation required a DImode value.  Patterns like iordi, which describe
an operation performed on word-sized parts can be deleted, the code in
optabs.c will then perform the operation efficiently on the SImode pseudos. 
However, in the i386 case, and surely on other machines as well, some patterns
(e.g. floatdidf) would be hard to express as an operation on several SImode
values.  Leaving such a pattern in the compiler "worked" - reload even managed
to reload a MULTIWORD into a DImode hard reg - but this area is where the main
problems lie.

Reloading a MULTIWORD is more complicated than reloading a DImode pseudo.
Imagine a (match_operand:DI "general_operand" "rm").  This tells reload that
a DImode hard register (implemented as consecutive SImode regs) or a DImode
memory location is acceptable.  Some care needs to be taken that if one of
the pseudos that make up a MULTIWORD is spilled, enough space is allocated
in the frame so that if the other half is spilled, it can be placed adjacent.
That is rather easy.  I believe it could be made to work this way, but I
think it is not desirable.  Even if the pattern in the machine description
operates on DImode values, it may be desirable not to reload it into
consecutive hard registers.  For example, consider the adddi pattern.  It may
be easier to use such a pattern if the machine does not have a double word
add instruction than trying to express in RTL the effects of a simple
add on the carry flag, and how an add with carry instruction uses the flag.
But still the machine allows both halves of the involved DImode values in
arbitrary locations, not necessarily consecutive.  One idea seems obvious:

(set (multiword:DI [(match_operand:SI 0 "general_operand" "=rm")
		    (match_operand:SI 1 "general_operand" "=rm")])
     (plus:DI (multiword:DI [(match_operand:SI 2 "general_operand" "0")
			     (match_operand:SI 3 "general_operand" "1")])
	      (multiword:DI [(match_operand:SI 4 "general_operand" "=rm")
			     (match_operand:SI 5 "general_operand" "=rm")])

Unfortunately, that has different flaws.  We can't really express with the
current constraints that the operation is commutative.  Worse, this pattern
allows four different memory addresses.  If this were used in, it
could lead to a spill failure (four addresses * two registers per address
== more registers than the machine has).

An ideal solution would be able to tell reload that a DImode value can be
reloaded either into a DImode memory location, or into a MULTIWORD containing
independent SImode hard registers.  However, that seems impossible given the
currently available constraints.

I had one idea which seems rather awful to me but I'll describe it anyway:
add a "match_multiword" rtx, which acts mostly as a match_operand, but
also contains embedded match_operands; like this:

(match_multiword:DI "general_operand" "m,X"
   [(match_operand:SI "general_operand" X,r")
    (match_operand:SI "general_operand" X,r")])

which would tell reload to either leave the MULTIWORD in place and reload the
registers within, or reload the whole expression with a MEM.  But this is
rather too hackish, I'm afraid.

Ideas, anyone?


More information about the Gcc mailing list