This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Register-passed arguments copied to the stack?
- From: "Vladimir N. Makarov" <vmakarov at redhat dot com>
- To: Alexandre Courbot <Alexandre dot Courbot at lifl dot fr>
- Cc: gcc at gcc dot gnu dot org
- Date: Fri, 22 Aug 2003 12:00:13 -0400
- Subject: Re: Register-passed arguments copied to the stack?
- References: <200308221248.19246.Alexandre.Courbot@lifl.fr>
Alexandre Courbot wrote:
> Hello everybody,
>
> I've changed my virtual target so it passes arguments into registers instead
> of the stack, to make them easier to handle. Both caller and callee see the
> arguments in the same hard registers, which are reserved exclusively for this
> usage.
>
> But the callee makes something weird: at the beginning of the function, it
> copies all the arguments it received on the stack instead of directly using
> the register arguments. In the sample below, a short is passed through
> register 4 and a byte is passed through register 5. register 1 is the frame
> pointer. The prototype of the function is void func(short arg1, char arg2),
> and here is the RTL dump of the beginning of the function after the reloading
> pass:
>
> (note 16 2 3 0 [bb 0] NOTE_INSN_BASIC_BLOCK)
>
> (insn 3 16 4 0 (set (mem/f:HI (reg/f:SI 1 (null)) [0 arg1+0 S2 A8])
> (reg:HI 4 (null) [ arg1 ])) 5 {*mov_all_modes} (nil)
> (nil))
>
> (insn 4 3 5 0 (set (mem/f:QI (plus:SI (reg/f:SI 1 (null))
> (const_int 2 [0x2])) [0 arg2+0 S1 A8])
> (reg:QI 5 (null) [ arg2 ])) 5 {*mov_all_modes} (nil)
> (nil))
>
> (note 5 4 10 0 NOTE_INSN_FUNCTION_BEG)
>
> Right after that, registers 4 and 5 are no more used and references from the
> frame pointer are used instead. This is not what I want: instead, I'd like
> the argument registers to be used, from the beginning to the end.
>
> Note that this behavior is already visible right after RTL generation and
> can't come from the reloading.
>
> Any hint on what could cause this behavior?
>
The compiler uses virtual (pseudo) registers for scalar variables and
arguments. As I understand it, hard registers 4 and 5 are copied into
pseudo-registers during RTL generation. After that, the global register
allocator decides not to assign hard registers to those pseudo-registers.
Reload follows that assignment and replaces the pseudo-registers with stack
slots.
Why does the compiler use pseudo-registers for arguments during RTL
generation? In the general case it is not reasonable to keep arguments in the
hard registers. Those hard registers could be used for other, more frequently
used variables, and the program could be faster. So it is the register
allocator's responsibility to find a good assignment of hard registers to
pseudo-registers.
Why does the allocator not assign hard registers to the argument
pseudo-registers? Probably because there are other variables with higher
priority. The global allocator in GCC is a usage-count-based register
allocator (one of the simplest kinds of allocator). The priority of (global)
pseudo-registers is calculated in allocno_compare in global.c. The longer the
live range of a pseudo, the lower its priority. The argument pseudos probably
have long live ranges.
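
As a rough illustration of the priority ordering described above, here is a
toy sketch in C. It is not the code from global.c: the struct, the function
names and the plain "references divided by live length" priority are all
assumptions made for the example, and the sketch pretends that every pseudo
conflicts with every other one.

```c
#include <stdlib.h>

/* Hypothetical toy model of a pseudo-register; not a GCC data structure. */
struct toy_pseudo {
    int id;          /* pseudo number */
    int n_refs;      /* how often the pseudo is referenced */
    int live_length; /* number of insns over which it is live (must be > 0) */
    int hard_reg;    /* result: hard register number, or -1 for memory */
};

/* Order by n_refs / live_length, highest first. The comparison is
   cross-multiplied so it stays in integer arithmetic. */
static int toy_compare(const void *a, const void *b)
{
    const struct toy_pseudo *p = a;
    const struct toy_pseudo *q = b;
    long lhs = (long)p->n_refs * q->live_length;
    long rhs = (long)q->n_refs * p->live_length;

    if (lhs != rhs)
        return lhs > rhs ? -1 : 1;
    return p->id - q->id; /* stable tie-break on pseudo number */
}

/* Give the n_hard_regs highest-priority pseudos a hard register each;
   everything else is left in memory (-1). */
void toy_allocate(struct toy_pseudo *v, size_t n, int n_hard_regs)
{
    qsort(v, n, sizeof *v, toy_compare);
    for (size_t i = 0; i < n; i++)
        v[i].hard_reg = (int)i < n_hard_regs ? (int)i : -1;
}
```

With only two hard registers available, a pseudo that is live for a long time
but referenced rarely (as an argument copied at function entry might be) sorts
last in this model and ends up in memory.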
The allocator assigns either a hard register or memory to each pseudo. It
cannot assign *more than one* register to a pseudo because the allocator's
result is stored in the reg_renumber array, which is indexed by pseudo number.
This is a constraint for some register allocation algorithms, and it is not
easy to fix in reload. Reload is a complicated phase that deals with all
possible and impossible combinations of register moves, loads and stores (e.g.
some machines have registers that cannot be stored directly into memory and
need another register for this, or can move one register to another only
through memory). Reload can change the allocator's assignments. Reload is a
consequence of (or a sacrifice for) GCC's portability. GCC has a very
complicated register allocation (regmove/local-alloc/global/reload, and I
think more to come), mainly because of reload (e.g. the regmove optimization
tries to help reload before the register allocator runs).
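
To make the reg_renumber constraint concrete, here is a minimal sketch in C of
a table indexed by pseudo number. The names and sizes are hypothetical and
only mimic the idea; this is not GCC's own declaration or code.

```c
/* Hypothetical sizes for the toy example. */
#define TOY_MAX_REGNO 32

/* One slot per register number: the hard register chosen for that pseudo,
   or -1 when the pseudo has no hard register and lives in memory. */
static short toy_reg_renumber[TOY_MAX_REGNO];

/* Mark every pseudo as "no hard register". */
void toy_init(void)
{
    for (int i = 0; i < TOY_MAX_REGNO; i++)
        toy_reg_renumber[i] = -1;
}

/* Record the allocator's decision for one pseudo. */
void toy_assign(int pseudo, int hard_reg)
{
    toy_reg_renumber[pseudo] = (short)hard_reg;
}

/* Look up the single home of a pseudo. */
int toy_home(int pseudo)
{
    return toy_reg_renumber[pseudo];
}
```

Because there is exactly one slot per pseudo, a second assignment to the same
pseudo simply overwrites the first. A table of this shape has no way to say
"hard register 4 here, stack slot there" for a single pseudo.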
But there is still a solution that permits a variable or argument to live in
different hard registers or in memory at different places. It is live range
splitting, which means using several pseudos for one variable (and adding move
insns to copy one of the variable's pseudos into another). The standard
register allocator does not do this. The new colour-based register allocator
does. Aggressive live range splitting can result in a worse program because of
the many additional moves/loads/stores (although some of them can be
coalesced). In general, a good allocator should do combined live-range
splitting/coalescing/spilling/restoring/rematerialization/register assignment,
taking the CFG into account. Currently many of these tasks are solved
separately (sometimes locally, as in reload) in different phases.
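
The idea of live range splitting can be sketched in a few lines of C. This is
a toy model under simplifying assumptions: a live range is a single interval
of insn numbers, and all names are hypothetical. It is not how any GCC
allocator represents or splits live ranges.

```c
/* Hypothetical toy live range: the pseudo is live from insn `start`
   through insn `end`, inclusive. */
struct toy_range {
    int pseudo;
    int start;
    int end;
};

/* Split `r` before insn `at`. On success, `r` keeps [start, at - 1] and
   `tail` describes a new pseudo covering [at, end]. Returns the number of
   copy insns the split costs: 1 for the move from the old pseudo to the
   new one, or 0 if `at` is not strictly inside the range and nothing
   was changed. */
int toy_split(struct toy_range *r, struct toy_range *tail,
              int new_pseudo, int at)
{
    if (at <= r->start || at > r->end)
        return 0;

    tail->pseudo = new_pseudo;
    tail->start = at;
    tail->end = r->end;
    r->end = at - 1;
    return 1;
}
```

After a split, each of the two pseudos gets its own allocation decision, so
the same variable can sit in a hard register over one part of the function and
in memory (or another register) over the rest. The returned copy count is the
cost that makes aggressive splitting risky.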
So you could try the new register allocator to solve the problem. But the new
register allocator works worse for machines with a small register file (like
x86). IMHO, it is important to take CFG information into account for a small
register file. The colour-based register allocator does not take this
information into account.
Vlad
>
> Unrelated: is there a way to easily retrieve the extra information that is
> visible in the RTL dumps for every rtx from the backend? (i.e. the stuff in
> brackets: [0 arg1+0 S2 A8]) This gives precious information on what kind of
> operand is manipulated, and although I manage to know it by making my own
> mapping from the source tree, this seems to be much better suited.
>
> Thanks in advance,
> Alex.